FITS problem reading binary table with variable length columns

I want to read a certain FITS file (P190mm-PAFBE-FEBEPAR.fits.zip), which is part of a Multi-Beam-FITS measurement set (MBFITS) as used by several radio observatories around the world. The file has a binary table extension with variable length columns. Usually this works fine, but this particular example has some columns with a “1PJ(1)” type (and one row only), which seems to lead to problems when reading with astropy.io.fits:

import astropy
astropy.__version__
# '3.0.4'
from astropy.io import fits
data = fits.getdata('P190mm-PAFBE-FEBEPAR.fits', 1)
data
# FITS_rec([(1, 1)],
#          dtype=(numpy.record, {'names':['USEBAND','NUSEFEED','USEFEED','BESECTS','FEEDTYPE','FEEDOFFX','FEEDOFFY','REFFEED','POLTY','POLA','APEREFF','BEAMEFF','ETAFSS','HPBW','ANTGAIN','TCAL','BOLCALFC','BEGAIN','BOLDCOFF','FLATFIEL','GAINIMAG','GAINELE1','GAINELE2'], 'formats':['>i4','>i4',('>i4', (1, 1)),('>i4', (1, 1)),('>i4', (1, 1)),'>f8','>f8','>i4','S1','>f4',('>f4', (1, 1)),('>f4', (1, 1)),('>f4', (1, 1)),('>f4', (1, 1)),('>f4', (1, 1)),('>f4', (1, 1)),'>f4','>f4',('>f4', (1, 1)),('>f4', (1, 1)),('>f4', (1, 1)),'>f4','>f4'], 'offsets':[0,4,8,16,24,32,40,48,52,53,57,61,65,69,73,77,81,85,89,93,97,101,105], 'itemsize':109}))

This output already shows that the displayed record content (“(1, 1)”) is much smaller than the ‘itemsize’ (109) would suggest. In fact, accessing the first two columns works, but every variable-length column raises an error:

data['USEBAND']
# array([1], dtype=int32)

data['NUSEFEED']
# array([1], dtype=int32)

data['USEFEED']
# IndexError                                Traceback (most recent call last)
# ...
# site-packages/astropy/io/fits/fitsrec.py in _convert_p(self, column, field, recformat)
#     792 
#     793         for idx in range(len(self)):
# --> 794             offset = field[idx, 1] + self._heapoffset
#     795             count = field[idx, 0]
#     796 

# IndexError: index 1 is out of bounds for axis 1 with size 1
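
For context, the failing line indexes `field[idx, 1]`, so it expects every variable-length cell to hold a two-integer descriptor (element count, heap offset), which is how the FITS standard defines 'P' columns. In the dtype printed above, the three `>i4` columns with shape (1, 1) (USEFEED, BESECTS, FEEDTYPE) sit 8 bytes apart, which would be consistent with such a descriptor, yet the shape exposes only one integer. Below is a minimal sketch of parsing such a TFORM and descriptor by hand with the standard library only. `parse_vla_tform` and `decode_descriptor` are hypothetical helper names, not astropy functions, and the cell bytes are made up rather than read from the file:

```python
import re
import struct

# Variable-length TFORM: optional repeat (0 or 1), descriptor type P or Q,
# element type letter, optional "(emax)" maximum element count.
_VLA_TFORM = re.compile(r"^([01]?)([PQ])([A-Z])(?:\((\d*)\))?$")


def parse_vla_tform(tform):
    """Split a TFORM such as '1PJ(1)' into its parts (hypothetical helper)."""
    match = _VLA_TFORM.match(tform.strip())
    if match is None:
        raise ValueError(f"not a variable-length TFORM: {tform!r}")
    repeat, desc, elem, emax = match.groups()
    return {
        "repeat": int(repeat) if repeat else 1,
        "descriptor": desc,
        "element_type": elem,
        "max_elements": int(emax) if emax else None,
    }


def decode_descriptor(raw, descriptor="P"):
    """Return (count, heap_offset) from one table cell (hypothetical helper).

    'P' descriptors are two big-endian 32-bit integers, 'Q' two 64-bit ones.
    """
    fmt = ">ii" if descriptor == "P" else ">qq"
    return struct.unpack(fmt, raw)


info = parse_vla_tform("1PJ(1)")
cell = struct.pack(">ii", 1, 0)  # made-up cell: one element at heap offset 0
count, offset = decode_descriptor(cell, info["descriptor"])
print(info, count, offset)
```

A reader that treated the cell as a single integer would have no `[idx, 1]` entry to index, which matches the IndexError shown above.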

I checked the file with fitsverify, which results in zero warnings and errors.

Thanks a lot for your help!

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 11 (10 by maintainers)

Top GitHub Comments

devonhollowood commented, Jan 22, 2019

I’ve noticed a few more problems besides those listed above. Specifically:

  • Variable-length character arrays are read as the deprecated chararray type, and thus display poorly. In the io.fits interface, they interfere with the table being displayed at all.
  • Tables containing variable-length arrays cannot be written to disk in the table interface, and the io.fits interface writes them incorrectly.
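
To make the second point concrete: as far as the FITS standard goes, writing a `1PA` column correctly means each row stores a (count, offset) descriptor in the main table while the characters themselves go into the heap. Here is a rough sketch of that layout using only the standard library. `pack_vla_strings` is a hypothetical name, the strings are just example values, and this is not how astropy implements writing:

```python
import struct


def pack_vla_strings(values):
    """Build 'P' descriptors and heap bytes for a list of ASCII strings."""
    descriptors = b""
    heap = b""
    for value in values:
        data = value.encode("ascii")
        # Each descriptor: element count, then byte offset into the heap.
        descriptors += struct.pack(">ii", len(data), len(heap))
        heap += data
    return descriptors, heap


descriptors, heap = pack_vla_strings(
    ["The Shape of Water", "Moonlight", "Spotlight"]
)
print(len(descriptors), heap)
```

Each row contributes 8 bytes to the main table regardless of string length. The offsets are relative to the start of the heap, not the start of the file.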

I’ve seen this on both Linux and macOS. Tested with Python versions 3.6.0 and 3.7.2, IPython version 3.7.2, astropy version 3.1.1, and NumPy version 1.16.0.

@saimn I’m not sure if you are still working on this, but if not I’m happy to hack on this and try to submit a patch.


To reproduce:

  1. Use the attached vla-example.fits from astropy-fits-bug.tar.gz, or use this program to generate it.
    #include <fitsio.h>
    
    int main() {
        fitsfile *handle;
        int status = 0;
        fits_create_file(&handle, "!vla-example.fits", &status);
        char *colnames[3] = {"YEAR", "BEST_PICTURE", "BOX_OFFICE_GROSS"};
        char *colforms[3] = {"K", "1PA", "K"};
        fits_create_tbl(
            handle,
            BINARY_TBL, // table type
            3, // reserved rows
            3, // number of columns
            colnames, // column names
            colforms, // column forms
            NULL, // column units
            "BEST_PICTURE_WINNERS", // extension name
            &status
        );
        int year[3] = {2017, 2016, 2015};
        char *best_picture[3] = {"The Shape of Water", "Moonlight", "Spotlight"};
        int gross[3] = {195200000, 65300000, 98300000};
        fits_write_col(
            handle,
            TINT, // data type
            1, // col
            1, // first row
            1, // first element
            3, // number of elements
            year, // value to write
            &status
        );
        for (int i = 0; i < sizeof(best_picture) / sizeof(best_picture[0]); ++i) {
            // fits_write_col behaves a little strangely with VLAs
            // see https://heasarc.gsfc.nasa.gov/fitsio/c/c_user/node29.html
            fits_write_col(handle, TSTRING, 2, i+1, 1, 1, &best_picture[i], &status);
        }
        fits_write_col(handle, TINT, 3, 1, 1, 3, gross, &status);
        fits_close_file(handle, &status);
        if (status) {
            fits_report_error(stdout, status);
        }
        // exit non-zero if any CFITSIO call failed
        return status ? 1 : 0;
    }
    
  2. Try to read it using the io.fits interface.
    In [1]: import astropy

    In [2]: astropy.__version__
    Out[2]: '3.1.1'

    In [3]: from astropy.io import fits

    In [4]: handle = fits.open('vla-example.fits')

    In [5]: t = handle[1].data

    In [6]: t
    Out[6]: ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
        700                 type_pprinters=self.type_printers,
        701                 deferred_pprinters=self.deferred_printers)
    --> 702             printer.pretty(obj)
        703             printer.flush()
        704             return stream.getvalue()
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
        400                         if cls is not object \
        401                                 and callable(cls.__dict__.get('__repr__')):
    --> 402                             return _repr_pprint(obj, self, cycle)
        403 
        404             return _default_pprint(obj, self, cycle)
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
        695     """A pprint that just redirects to the normal repr function."""
        696     # Find newlines and replace them with p.break_()
    --> 697     output = repr(obj)
        698     for idx,output_line in enumerate(output.splitlines()):
        699         if idx:
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/fitsrec.py in __repr__(self)
        478         # Force use of the normal ndarray repr (rather than the new
        479         # one added for recarray in Numpy 1.10) for backwards compat
    --> 480         return np.ndarray.__repr__(self)
        481 
        482     def __getitem__(self, key):
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in _array_repr_implementation(arr, max_line_width, precision, suppress_small, array2string)                                                                                                                             
       1417     elif arr.size > 0 or arr.shape == (0,):
       1418         lst = array2string(arr, max_line_width, precision, suppress_small,
    -> 1419                            ', ', prefix, suffix=suffix)
       1420     else:  # show zero-length shape unless it is (0,)                                                                                                  
       1421         lst = "[], shape=%s" % (repr(arr.shape),)
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in array2string(a, max_line_width, precision, suppress_small, separator, prefix, style, formatter, threshold, edgeitems, sign, floatmode, suffix, **kwarg)                                                              
        688         return "[]"
        689 
    --> 690     return _array2string(a, options, separator, prefix)
        691 
        692 
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in wrapper(self, *args, **kwargs)
        468             repr_running.add(key)
        469             try:
    --> 470                 return f(self, *args, **kwargs)
        471             finally:
        472                 repr_running.discard(key)
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in _array2string(a, options, separator, prefix)
        503     lst = _formatArray(a, format_function, options['linewidth'],
        504                        next_line_prefix, separator, options['edgeitems'],
    --> 505                        summary_insert, options['legacy'])
        506     return lst
        507 
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in _formatArray(a, format_function, line_width, next_line_prefix, separator, edge_items, summary_insert, legacy)                                                                                                        
        816         return recurser(index=(),
        817                         hanging_indent=next_line_prefix,
    --> 818                         curr_width=line_width)
        819     finally:
        820         # recursive closures have a cyclic reference to themselves, which
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in recurser(index, hanging_indent, curr_width)
        770 
        771             for i in range(trailing_items, 1, -1):
    --> 772                 word = recurser(index + (-i,), next_hanging_indent, next_width)
        773                 s, line = _extendLine(
        774                     s, line, word, elem_width, hanging_indent, legacy)
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in recurser(index, hanging_indent, curr_width)
        724 
        725         if axes_left == 0:
    --> 726             return format_function(a[index])
        727 
        728         # when recursing, add a space to align with the [ added, and reduce the
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in __call__(self, x)
       1301         str_fields = [
       1302             format_function(field)
    -> 1303             for field, format_function in zip(x, self.format_functions)
       1304         ]
       1305         if len(str_fields) == 1:
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in <listcomp>(.0)
       1301         str_fields = [
       1302             format_function(field)
    -> 1303             for field, format_function in zip(x, self.format_functions)
       1304         ]
       1305         if len(str_fields) == 1:
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in __call__(self, arr)
       1269     def __call__(self, arr):
       1270         if arr.ndim <= 1:
    -> 1271             return "[" + ", ".join(self.format_function(a) for a in arr) + "]"
       1272         return "[" + ", ".join(self.__call__(a) for a in arr) + "]"
       1273 
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in <genexpr>(.0)
       1269     def __call__(self, arr):
       1270         if arr.ndim <= 1:
    -> 1271             return "[" + ", ".join(self.format_function(a) for a in arr) + "]"
       1272         return "[" + ", ".join(self.__call__(a) for a in arr) + "]"
       1273 
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/numpy/core/arrayprint.py in __call__(self, x)
       1143 
       1144     def __call__(self, x):
    -> 1145         return self.format % x
       1146 
       1147 
    
    TypeError: %d format: a number is required, not str
    
    In [7]: t['BEST_PICTURE']                                                                                                                                      
    Out[7]: 
    _VLF([chararray(['T', 'h', 'e', '', 'S', 'h', 'a', 'p', 'e', '', 'o', 'f', '',
               'W', 'a', 't', 'e', 'r'], dtype='<U1'),
          chararray(['M', 'o', 'o', 'n', 'l', 'i', 'g', 'h', 't'], dtype='<U1'),
          chararray(['S', 'p', 'o', 't', 'l', 'i', 'g', 'h', 't'], dtype='<U1')],
         dtype=object)
    
  3. Try to write it and look at the output
    In [8]: handle.writeto('output.fits')
    
    In [9]: # output.fits contains corrupted data, see attached.
    
  4. Try to read it using the table interface. (Here I’m starting a new ipython session for clarity.)
    In [1]: import astropy

    In [2]: astropy.__version__
    Out[2]: '3.1.1'

    In [3]: from astropy import table

    In [4]: t = table.Table.read('vla-example.fits')

    In [5]: t
    Out[5]: 
    <Table length=3>
     YEAR                              BEST_PICTURE                              BOX_OFFICE_GROSS
    int64                                 object                                      int64      
    ----- ---------------------------------------------------------------------- ----------------
     2017 ['T' 'h' 'e' '' 'S' 'h' 'a' 'p' 'e' '' 'o' 'f' '' 'W' 'a' 't' 'e' 'r']        195200000
     2016                                  ['M' 'o' 'o' 'n' 'l' 'i' 'g' 'h' 't']         65300000
     2015                                  ['S' 'p' 'o' 't' 'l' 'i' 'g' 'h' 't']         98300000
    
  5. Try to write it back out to a FITS file using the table interface.
    In [6]: t.write('output.fits')
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-6-ff1bebe517f2> in <module>
    ----> 1 t.write('output.fits')
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/table/table.py in write(self, *args, **kwargs)
       2592         serialize_method = kwargs.pop('serialize_method', None)
       2593         with serialize_method_as(self, serialize_method):
    -> 2594             io_registry.write(self, *args, **kwargs)
       2595 
       2596     def copy(self, copy_data=True):
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/registry.py in write(data, format, *args, **kwargs)
        558 
        559     writer = get_writer(format, data.__class__)
    --> 560     writer(data, *args, **kwargs)
        561 
        562 
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/connect.py in write_table_fits(input, output, overwrite)
        386     input = _encode_mixins(input)
        387 
    --> 388     table_hdu = table_to_hdu(input, character_as_bytes=True)
        389 
        390     # Check if output file already exists
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/convenience.py in table_to_hdu(table, character_as_bytes)
        495             col.null = fill_value.astype(table[col.name].dtype)
        496     else:
    --> 497         table_hdu = BinTableHDU.from_columns(np.array(table.filled()), header=hdr, character_as_bytes=character_as_bytes)
        498 
        499     # Set units and format display for output HDU
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/hdu/table.py in from_columns(cls, columns, header, nrows, fill, character_as_bytes, **kwargs)
        123         """
        124 
    --> 125         coldefs = cls._columns_type(columns)
        126         data = FITS_rec.from_columns(coldefs, nrows=nrows, fill=fill,
        127                                      character_as_bytes=character_as_bytes)
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/column.py in __init__(self, input, ascii)
       1373         elif isinstance(input, np.ndarray) and input.dtype.fields is not None:
       1374             # Construct columns from the fields of a record array
    -> 1375             self._init_from_array(input)
       1376         elif isiterable(input):
       1377             # if the input is a list of Columns
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/column.py in _init_from_array(self, array)
       1408             cname = array.dtype.names[idx]
       1409             ftype = array.dtype.fields[cname][0]
    -> 1410             format = self._col_format_cls.from_recformat(ftype)
       1411 
       1412             # Determine the appropriate dimensions for items in the column
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/column.py in from_recformat(cls, recformat)
        271         """Creates a column format from a Numpy record dtype format."""
        272 
    --> 273         return cls(_convert_format(recformat, reverse=True))
        274 
        275     @lazyproperty
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/column.py in _convert_format(format, reverse)
       2398 
       2399     if reverse:
    -> 2400         return _convert_record2fits(format)
       2401     else:
       2402         return _convert_fits2record(format)
    
    ~/Programming/matcha/post-pipeline/python/matcha/lib/python3.7/site-packages/astropy/io/fits/column.py in _convert_record2fits(format)
       2361         output_format = repeat + NUMPY2FITS[recformat]
       2362     else:
    -> 2363         raise ValueError('Illegal format `{}`.'.format(format))
       2364 
       2365     return output_format
    
    ValueError: Illegal format `object`.
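
For comparison with step 2, here is a small sketch of what one might expect a reader to return for the BEST_PICTURE column: one string per row, spaces included, rather than an array of single characters. It decodes (count, offset) descriptors against a heap using the standard library only. `read_vla_strings` is a hypothetical helper, and the bytes below are made up to mirror the three titles written by the C program, not taken from the attached file:

```python
import struct


def read_vla_strings(descriptors, heap):
    """Decode each (count, offset) 'P' descriptor into one ASCII string."""
    strings = []
    for count, offset in struct.iter_unpack(">ii", descriptors):
        strings.append(heap[offset:offset + count].decode("ascii"))
    return strings


# Made-up table bytes mirroring the three titles written by the C program.
heap = b"The Shape of WaterMoonlightSpotlight"
descriptors = struct.pack(">6i", 18, 0, 9, 18, 9, 27)
print(read_vla_strings(descriptors, heap))
# prints ['The Shape of Water', 'Moonlight', 'Spotlight']
```

In a real file the heap would start after the main table data (or at the position given by the THEAP keyword), so the offsets would need that base added before slicing.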
    
pllim commented, Sep 18, 2018

complicated… side effects…

Sounds about right for FITS. 😬
