No longer able to read BinnedTimeSeries with datetime column saved as ECSV after upgrading from 4.2.1 -> 5.0+
See original GitHub issueDescription
Hi, This commit merged in PR #11569 breaks my ability to read an ECSV file created using Astropy v 4.2.1, BinnedTimeSeries class’s write method, which has a datetime64 column. Downgrading astropy back to 4.2.1 fixes the issue because the strict type checking in line 177 of ecsv.py is not there.
Is there a reason why this strict type checking was added to ECSV? Is there a way to preserve reading and writing of ECSV files created with BinnedTimeSeries across versions? I am happy to make a PR on this if the strict type checking is allowed to be scaled back or we can add datetime64 as an allowed type.
Expected behavior
The file is read into a BinnedTimeSeries
object from ecsv file without error.
Actual behavior
ValueError is produced and the file is not read because ECSV.py does not accept the datetime64 column.
ValueError: datatype 'datetime64' of column 'time_bin_start' is not in allowed values ('bool', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'float16', 'float32', 'float64', 'float128', 'string')
Steps to Reproduce
The file is read using:
BinnedTimeSeries.read('<file_path>', format='ascii.ecsv')
which gives a long error.
The file in question is a binned time series created by astropy.timeseries.aggregate_downsample
. which itself is a binned version of an astropy.timeseries.TimeSeries
instance with some TESS data. (loaded via TimeSeries.from_pandas(Tess.set_index(‘datetime’)). I.e., it has a datetime64 index. The file was written using the classes own .write method in Astropy V4.2.1 from an instance of said class:
myBinnedTimeSeries.write('<file_path>',format='ascii.ecsv',overwrite=True)
I’ll attach a concatenated version of the file (as it contains private data). However, the relevant part from the header is on line 4:
# %ECSV 0.9
# ---
# datatype:
# - {name: time_bin_start, datatype: datetime64}
as you can see, the datatype is datetime64. This works fine with ECSV V0.9 but not V1.0 as some sort of strict type checking was added.
Full error log:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [3], in <module>
---> 49 tsrbin = BinnedTimeSeries.read('../Photometry/tsr_bin.dat', format='ascii.ecsv')
File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/timeseries/binned.py:285, in BinnedTimeSeries.read(self, filename, time_bin_start_column, time_bin_end_column, time_bin_size_column, time_bin_size_unit, time_format, time_scale, format, *args, **kwargs)
230 """
231 Read and parse a file and returns a `astropy.timeseries.BinnedTimeSeries`.
232
(...)
279
280 """
282 try:
283
284 # First we try the readers defined for the BinnedTimeSeries class
--> 285 return super().read(filename, format=format, *args, **kwargs)
287 except TypeError:
288
289 # Otherwise we fall back to the default Table readers
291 if time_bin_start_column is None:
File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/table/connect.py:62, in TableRead.__call__(self, *args, **kwargs)
59 units = kwargs.pop('units', None)
60 descriptions = kwargs.pop('descriptions', None)
---> 62 out = self.registry.read(cls, *args, **kwargs)
64 # For some readers (e.g., ascii.ecsv), the returned `out` class is not
65 # guaranteed to be the same as the desired output `cls`. If so,
66 # try coercing to desired class without copying (io.registry.read
67 # would normally do a copy). The normal case here is swapping
68 # Table <=> QTable.
69 if cls is not out.__class__:
File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/registry/core.py:199, in UnifiedInputRegistry.read(self, cls, format, cache, *args, **kwargs)
195 format = self._get_valid_format(
196 'read', cls, path, fileobj, args, kwargs)
198 reader = self.get_reader(format, cls)
--> 199 data = reader(*args, **kwargs)
201 if not isinstance(data, cls):
202 # User has read with a subclass where only the parent class is
203 # registered. This returns the parent class, so try coercing
204 # to desired subclass.
205 try:
File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/connect.py:18, in io_read(format, filename, **kwargs)
16 format = re.sub(r'^ascii\.', '', format)
17 kwargs['format'] = format
---> 18 return read(filename, **kwargs)
File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/ui.py:376, in read(table, guess, **kwargs)
374 else:
375 reader = get_reader(**new_kwargs)
--> 376 dat = reader.read(table)
377 _read_trace.append({'kwargs': copy.deepcopy(new_kwargs),
378 'Reader': reader.__class__,
379 'status': 'Success with specified Reader class '
380 '(no guessing)'})
382 # Static analysis (pyright) indicates `dat` might be left undefined, so just
383 # to be sure define it at the beginning and check here.
File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/core.py:1343, in BaseReader.read(self, table)
1340 self.header.update_meta(self.lines, self.meta)
1342 # Get the table column definitions
-> 1343 self.header.get_cols(self.lines)
1345 # Make sure columns are valid
1346 self.header.check_column_names(self.names, self.strict_names, self.guessing)
File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/ecsv.py:177, in EcsvHeader.get_cols(self, lines)
175 col.dtype = header_cols[col.name]['datatype']
176 if col.dtype not in ECSV_DATATYPES:
--> 177 raise ValueError(f'datatype {col.dtype!r} of column {col.name!r} '
178 f'is not in allowed values {ECSV_DATATYPES}')
180 # Subtype is written like "int64[2,null]" and we want to split this
181 # out to "int64" and [2, None].
182 subtype = col.subtype
ValueError: datatype 'datetime64' of column 'time_bin_start' is not in allowed values ('bool', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'float16', 'float32', 'float64', 'float128', 'string')
System Details
(For the version that does not work) Python 3.10.2 | packaged by conda-forge | (main, Feb 1 2022, 19:28:35) [GCC 9.4.0] Numpy 1.22.2 pyerfa 2.0.0.1 astropy 5.0.1 Scipy 1.8.0 Matplotlib 3.5.1
(For the version that does work) Python 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0] Numpy 1.20.3 pyerfa 2.0.0.1 astropy 4.2.1 Scipy 1.7.0 Matplotlib 3.4.2
Issue Analytics
- State:
- Created 2 years ago
- Comments:20 (20 by maintainers)
Top GitHub Comments
I would be highly supportive of a backwards compatibility bugfix for V0.9, and then an API change for V5.1 that changes the spec. I would be willing to work on a PR for it.
@weaverba137 @emirkmo - sorry that the updates in ECSV reading are breaking back-compatibility, I am definitely sensitive to that. Perhaps we can do a bug-fix release which checks for ECSV 0.9 (as opposed to 1.0) and silently reads them without warnings. This will work for files written with older astropy.
@weaverba137 -
can you provide an example file with an[EDIT - I saw the example and read the discussion in the linked issue]. Going forward (astropy >= 5.0),object
column?object
columns are written (and read) as described at https://github.com/astropy/astropy-APEs/blob/main/APE6.rst#object-columns. This is limited to object types that can be serialized to standard JSON (without any custom representations).