ECSV reader fails on datatype=object column created with astropy < 4.3
Description
Reading the following ECSV table
# %ECSV 0.9
# ---
# datatype:
# - {name: source, datatype: object}
# - {name: obstime, datatype: string}
# - {name: ra, datatype: float64}
# - {name: dec, datatype: float64}
# - {name: pa_v3, datatype: float64}
# schema: astropy-2.0
source obstime ra dec pa_v3
None 2016-01-18T15:25:23.776 133.3599835511599 -14.886980131752205 -161.3183607231063
which was written out by a recent (and current) release version of astropy, now fails when read via the current astropy dev:
In [1]: import astropy
In [2]: print(astropy.__version__)
4.3.dev1662+gdf850cb7c
In [4]: from astropy.table import Table
In [5]: table = Table.read("jwst/lib/tests/data/v1_calc_truth.ecsv")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-2e4749b2facf> in <module>
----> 1 table = Table.read("jwst/lib/tests/data/v1_calc_truth.ecsv")
~/miniconda3/envs/dev/lib/python3.9/site-packages/astropy/table/connect.py in __call__(self, *args, **kwargs)
59 descriptions = kwargs.pop('descriptions', None)
60
---> 61 out = registry.read(cls, *args, **kwargs)
62
63 # For some readers (e.g., ascii.ecsv), the returned `out` class is not
~/miniconda3/envs/dev/lib/python3.9/site-packages/astropy/io/registry.py in read(cls, format, cache, *args, **kwargs)
525
526 reader = get_reader(format, cls)
--> 527 data = reader(*args, **kwargs)
528
529 if not isinstance(data, cls):
~/miniconda3/envs/dev/lib/python3.9/site-packages/astropy/io/ascii/connect.py in io_read(format, filename, **kwargs)
16 format = re.sub(r'^ascii\.', '', format)
17 kwargs['format'] = format
---> 18 return read(filename, **kwargs)
19
20
~/miniconda3/envs/dev/lib/python3.9/site-packages/astropy/io/ascii/ui.py in read(table, guess, **kwargs)
367 else:
368 reader = get_reader(**new_kwargs)
--> 369 dat = reader.read(table)
370 _read_trace.append({'kwargs': copy.deepcopy(new_kwargs),
371 'Reader': reader.__class__,
~/miniconda3/envs/dev/lib/python3.9/site-packages/astropy/io/ascii/core.py in read(self, table)
1336
1337 # Get the table column definitions
-> 1338 self.header.get_cols(self.lines)
1339
1340 # Make sure columns are valid
~/miniconda3/envs/dev/lib/python3.9/site-packages/astropy/io/ascii/ecsv.py in get_cols(self, lines)
184 col.dtype = header_cols[col.name]['datatype']
185 if col.dtype not in ECSV_DATATYPES:
--> 186 raise ValueError(f'datatype {col.dtype!r} of column {col.name!r} '
187 f'is not in allowed values {ECSV_DATATYPES}')
188
ValueError: datatype 'object' of column 'source' is not in allowed values ('bool', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'float16', 'float32', 'float64', 'float128', 'string')
Expected behavior
Under the release version (astropy 4.2.1) it reads in fine without issue.
This looks like an issue where numpy object types can no longer be serialized, and of course None is an object for numpy.
A workaround is to edit the ECSV file header by hand, changing the column datatype from object to string, after which everything works fine. But this is a backwards incompatibility in the ECSV reader, and it means that many (some? few? 😂) tables written out by the current release of astropy can no longer be read in version 4.3.
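The hand edit described above can also be scripted. The following is a minimal sketch using only the standard library; the function name `fix_object_datatype` is hypothetical, and it assumes the header uses the inline `{name: ..., datatype: ...}` form shown in the table above. It only rewrites commented header lines and leaves data rows alone.

```python
import re

# Matches an inline-YAML column entry in the ECSV header, e.g.
# "# - {name: source, datatype: object}", capturing the text on
# either side of the word "object".
_OBJECT_DATATYPE = re.compile(r"^(#\s*-\s*\{.*\bdatatype:\s*)object\b(.*)$")


def fix_object_datatype(ecsv_text: str) -> str:
    """Return ecsv_text with 'datatype: object' changed to 'datatype: string'.

    Hypothetical helper (not part of astropy). Only header lines, which
    start with '#', are modified; data rows are returned unchanged.
    """
    fixed_lines = []
    for line in ecsv_text.splitlines():
        if line.startswith("#"):
            line = _OBJECT_DATATYPE.sub(r"\1string\2", line)
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"
```

The returned text can be written back to a file and then read with Table.read as usual. Keep a copy of the original file, since this changes what the header declares about the column.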
Perhaps casting to string and raising a big DeprecationWarning might be better? Or something else? Clearly the general feature of serializing Python objects is not possible in ECSV format, but clearly it was allowed before (and currently) in release versions.
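To make the suggestion concrete, here is a rough sketch of what "cast to string and warn" could look like. It is not astropy's implementation: `resolve_datatype` is a hypothetical helper, and the allowed values are copied from the ValueError message in the traceback above.

```python
import warnings

# Allowed values as listed in the ValueError message in the traceback.
ECSV_DATATYPES = (
    'bool', 'int8', 'int16', 'int32', 'int64',
    'uint8', 'uint16', 'uint32', 'uint64',
    'float16', 'float32', 'float64', 'float128', 'string',
)


def resolve_datatype(col_name: str, datatype: str) -> str:
    """Hypothetical helper: fall back to 'string' for unknown datatypes.

    Known datatypes are returned unchanged. Anything else (for example
    'object') triggers a DeprecationWarning and is read as 'string'
    instead of raising ValueError.
    """
    if datatype in ECSV_DATATYPES:
        return datatype
    warnings.warn(
        f"datatype {datatype!r} of column {col_name!r} is not in allowed "
        f"values {ECSV_DATATYPES}; reading it as 'string'",
        DeprecationWarning,
        stacklevel=2,
    )
    return "string"
```

With this behaviour the table above would load with the source column as strings, and the warning would tell users that the file should be rewritten.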
The break seems to have happened in https://github.com/astropy/astropy/commit/e807dbff9a5c72bdc42d18c7d6712aae69a0bddc (EDIT: #11569)
System Details
>>> import platform; print(platform.platform())
macOS-10.15.7-x86_64-i386-64bit
>>> import sys; print("Python", sys.version)
Python 3.9.5 (default, May 18 2021, 12:31:01)
[Clang 10.0.0 ]
>>> import numpy; print("Numpy", numpy.__version__)
Numpy 1.20.3
>>> import astropy; print("astropy", astropy.__version__)
astropy 4.3.dev1662+gdf850cb7c
>>> import scipy; print("Scipy", scipy.__version__)
Scipy 1.6.3
>>> import matplotlib; print("Matplotlib", matplotlib.__version__)
Matplotlib 3.4.2
Issue Analytics
- Created 2 years ago
- Comments:7 (5 by maintainers)
Top GitHub Comments
Hi all, I just wanted to chime in on this. I encountered this issue as well, but nothing jumped out at me from checking the astropy release notes. So I’d advocate for making this issue more visible to users.
As a complement to the workaround code provided by @taldcroft, this shell command worked for me to fix incorrectly written ECSV files (tested on macOS):
This issue was labeled as Close?. Remove Close? label or this will be closed after 7 days. (This is currently a dry-run.)