io.fits errors with tables containing strings
The following code contains a set of unit tests for creating FITS binary tables with strings. They all pass under Python 2.7, but three of the five fail under Python 3.5. The code is also available as a public gist, but I don't promise to keep that around unchanged forever, so it is also copied and pasted below.
The tests that fail are:
- Writing a numpy structured array with a column of strings using astropy.io.fits.writeto and reading it back in with astropy.io.fits.getdata: the data are munged. This works when writing the same data converted to a Table object, and also works when creating the BinTableHDU from columns by hand. This test is a demonstration of the failure reported in #5268.
- Writing then reading a numpy structured array containing a 2D column of strings using astropy.io.fits.writeto and astropy.io.fits.getdata: the dtype seems to have mixed up the number of unicode bytes vs. the number of characters to write.
- Writing then reading an astropy.table.Table containing a 2D column of strings: same problem with the dtype.
import unittest
from uuid import uuid4
import os
import numpy as np
from astropy.io import fits
from astropy.table import Table, Column

class _TestTable(unittest.TestCase):
    '''
    Base class for testing FITS tables.

    Subclass this and add setUpClass to define attributes to test:
        data (numpy structured array)
        table (astropy Table)
    '''
    def setUp(self):
        self.testfile = 'test-{}.fits'.format(uuid4())

    def tearDown(self):
        if os.path.exists(self.testfile):
            os.remove(self.testfile)

    def test_fits(self):
        fits.writeto(self.testfile, self.data)
        dx = fits.getdata(self.testfile)
        self.assertEqual(self.data['x'].dtype, dx['x'].dtype)
        self.assertEqual(self.data['y'].dtype, dx['y'].dtype)
        self.assertTrue(np.all(self.data['x'] == dx['x']), 'x: {} != {}'.format(self.data['x'], dx['x']))
        self.assertTrue(np.all(self.data['y'] == dx['y']), 'y: {} != {}'.format(self.data['y'], dx['y']))

    def test_table(self):
        self.table.write(self.testfile)
        tx = Table.read(self.testfile)
        self.assertEqual(self.table['x'].dtype, tx['x'].dtype)
        self.assertEqual(self.table['y'].dtype, tx['y'].dtype)
        self.assertTrue(np.all(self.table['x'] == tx['x']), 'x: {} != {}'.format(self.table['x'], tx['x']))
        self.assertTrue(np.all(self.table['y'] == tx['y']), 'y: {} != {}'.format(self.table['y'], tx['y']))

class TestTable1DColumn(_TestTable):

    @classmethod
    def setUpClass(cls):
        dtype = [
            ('x', (str, 5)),  #- 1D column of 5-character strings
            ('y', (str, 3)),  #- 1D column of 3-character strings
        ]
        data = np.zeros(2, dtype=dtype)
        data['x'] = ['abcde', 'xyz']
        data['y'] = ['A', 'BCX']
        table = Table(data)
        cls.data = data
        cls.table = table

    def test_manual_fits(self):
        '''Construct the BinTableHDU by hand'''
        col1 = fits.Column(name='x', format='5A', array=self.data['x'])
        col2 = fits.Column(name='y', format='3A', array=self.data['y'])
        cols = fits.ColDefs([col1, col2])
        tbhdu = fits.BinTableHDU.from_columns(cols)
        tbhdu.writeto(self.testfile)
        dx = fits.getdata(self.testfile)
        self.assertEqual(self.data['x'].dtype, dx['x'].dtype)
        self.assertEqual(self.data['y'].dtype, dx['y'].dtype)
        self.assertTrue(np.all(self.data['x'] == dx['x']), 'x: {} != {}'.format(self.data['x'], dx['x']))
        self.assertTrue(np.all(self.data['y'] == dx['y']), 'y: {} != {}'.format(self.data['y'], dx['y']))

class TestTable2DColumn(_TestTable):

    @classmethod
    def setUpClass(cls):
        dtype = [
            ('x', (str, 5)),        #- 1D column of 5-character strings
            ('y', (str, 3), (4,)),  #- 2D column; each row is four 3-char strings
        ]
        data = np.zeros(2, dtype=dtype)
        data['x'] = ['abcde', 'xyz']
        data['y'][0] = ['A', 'BC', 'DEF', '123']
        data['y'][1] = ['X', 'YZ', 'PQR', '999']
        table = Table(data)
        cls.data = data
        cls.table = table

    #- I'm not sure how to create 2D column of strings by hand analogous to
    #- the 1D test case; see https://github.com/astropy/astropy/issues/5279
    # def test_manual_fits(self):
    #     '''Construct the BinTableHDU by hand
    #
    #     When these data are successfully written under py2.7, they have keywords
    #     TTYPE1 = 'x '
    #     TFORM1 = '5A '
    #     TTYPE2 = 'y '
    #     TFORM2 = '12A '
    #     TDIM2 = '(3,4) '
    #
    #     Trying to replicate that logic here with dim=(3,4) doesn't work
    #     '''
    #     col1 = fits.Column(name='x', format='5A', array=self.data['x'])
    #     col2 = fits.Column(name='y', format='12A', array=self.data['y'], dim=(3,4))
    #     cols = fits.ColDefs([col1, col2])
    #     tbhdu = fits.BinTableHDU.from_columns(cols)
    #     tbhdu.writeto(self.testfile)
    #     dx = fits.getdata(self.testfile)
    #     self.assertEqual(self.data['x'].dtype, dx['x'].dtype)
    #     self.assertEqual(self.data['y'].dtype, dx['y'].dtype)
    #     self.assertTrue(np.all(self.data['x'] == dx['x']), 'x: {} != {}'.format(self.data['x'], dx['x']))
    #     self.assertTrue(np.all(self.data['y'] == dx['y']), 'y: {} != {}'.format(self.data['y'], dx['y']))

if __name__ == '__main__':
    ### unittest.main()
    t1 = unittest.defaultTestLoader.loadTestsFromTestCase(TestTable1DColumn)
    t2 = unittest.defaultTestLoader.loadTestsFromTestCase(TestTable2DColumn)
    unittest.TextTestRunner(verbosity=2).run(unittest.TestSuite([t1, t2]))
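
For reference, the in-memory layout of the 2D test data can be inspected with numpy alone. This is a small sketch separate from the test script (the variable names `y_dtype`, `y_shape` and `y_row_bytes` are made up for illustration); it only shows what the round trip is expected to preserve, namely an element dtype of 3-character unicode strings and four strings per row.

```python
import numpy as np

# Same dtype as TestTable2DColumn: 'y' holds four 3-character strings per row.
dtype = np.dtype([
    ('x', (str, 5)),
    ('y', (str, 3), (4,)),
])
data = np.zeros(2, dtype=dtype)
data['y'][0] = ['A', 'BC', 'DEF', '123']

y_dtype = data['y'].dtype          # element dtype of the column
y_shape = data['y'].shape          # (rows, strings per row)
y_row_bytes = dtype['y'].itemsize  # 4 strings * 3 chars * 4 bytes per char

print(y_dtype, y_shape, y_row_bytes)
```

The element dtype stays at 3 characters even though each row of 'y' occupies 48 bytes in memory, so a reader that reports a wider string dtype has changed the column's description, not just its storage.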
The unit tests above pass under Python 2.7, but three of them fail under Python 3.5 with:
(py3) py3 $ python test-fits-table.py
test_fits (__main__.TestTable1DColumn) ... WARNING: File may have been truncated: actual file length (8636) is smaller than the expected size (8640) [astropy.io.fits.file]
FAIL
test_manual_fits (__main__.TestTable1DColumn)
Construct the BinTableHDU by hand ... ok
test_table (__main__.TestTable1DColumn) ... ok
test_fits (__main__.TestTable2DColumn) ... WARNING: File may have been truncated: actual file length (8566) is smaller than the expected size (8640) [astropy.io.fits.file]
FAIL
test_table (__main__.TestTable2DColumn) ... FAIL
======================================================================
FAIL: test_fits (__main__.TestTable1DColumn)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test-fits-table.py", line 30, in test_fits
    self.assertTrue(np.all(self.data['x'] == dx['x']), 'x: {} != {}'.format(self.data['x'], dx['x']))
AssertionError: False is not true : x: ['abcde' 'xyz'] != ['abcde' 'zBCX']
======================================================================
FAIL: test_fits (__main__.TestTable2DColumn)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test-fits-table.py", line 29, in test_fits
    self.assertEqual(self.data['y'].dtype, dx['y'].dtype)
AssertionError: dtype('<U3') != dtype('<U12')
======================================================================
FAIL: test_table (__main__.TestTable2DColumn)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test-fits-table.py", line 37, in test_table
    self.assertEqual(self.table['y'].dtype, tx['y'].dtype)
AssertionError: dtype('<U3') != dtype('<U12')
----------------------------------------------------------------------
Ran 5 tests in 0.100s
FAILED (failures=3)
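
A side note on the `<U12` in the two 2D failures: on its own, the number 12 does not say which quantity leaked into the dtype. It is the UCS-4 byte width of one 3-character string (numpy's `U` dtype stores 4 bytes per character), but it is also the character count of a whole row of four 3-character strings, which is what the `12A` TFORM2 quoted in the commented-out test describes. A minimal standard-library sketch of that arithmetic follows; the variable names are made up for illustration, and it does not establish which reading is the actual cause.

```python
# Illustration only: plain Python, no astropy or numpy involved.
row = ['A', 'BC', 'DEF', '123']   # first row of the 2D column 'y'
width = 3                         # declared string width, as in (str, 3)

# Reading 1: bytes per element if each character takes 4 bytes (UCS-4).
# 'utf-32-le' is used so that no byte-order mark is prepended.
ucs4_bytes_per_element = len('DEF'.encode('utf-32-le'))

# Reading 2: characters per row when the 4 elements are packed into one
# fixed-width field.
chars_per_row = width * len(row)

# For comparison: one byte per character if the text is stored as ASCII.
ascii_bytes_per_element = len('DEF'.encode('ascii'))

print(ucs4_bytes_per_element, chars_per_row, ascii_bytes_per_element)
```

Both readings give the same number, so the TFORM and TDIM keywords of the written file would need to be checked to tell them apart.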
EDIT: Added syntax highlighting.
Issue Analytics
- State:
- Created 7 years ago
- Comments:12 (9 by maintainers)
Wow. Just. Dang. Cripes. No wonder no one wants to take over this code.
@weaverba137 here?