question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

No longer able to read BinnedTimeSeries with datetime column saved as ECSV after upgrading from 4.2.1 -> 5.0+

See original GitHub issue

Description

Hi, This commit merged in PR #11569 breaks my ability to read an ECSV file created using Astropy v 4.2.1, BinnedTimeSeries class’s write method, which has a datetime64 column. Downgrading astropy back to 4.2.1 fixes the issue because the strict type checking in line 177 of ecsv.py is not there.

Is there a reason why this strict type checking was added to ECSV? Is there a way to preserve reading and writing of ECSV files created with BinnedTimeSeries across versions? I am happy to make a PR on this if the strict type checking is allowed to be scaled back or we can add datetime64 as an allowed type.

Expected behavior

The file is read into a BinnedTimeSeries object from ecsv file without error.

Actual behavior

ValueError is produced and the file is not read because ECSV.py does not accept the datetime64 column. ValueError: datatype 'datetime64' of column 'time_bin_start' is not in allowed values ('bool', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'float16', 'float32', 'float64', 'float128', 'string')

Steps to Reproduce

The file is read using:
BinnedTimeSeries.read('<file_path>', format='ascii.ecsv') which gives a long error.

The file in question is a binned time series created by astropy.timeseries.aggregate_downsample. which itself is a binned version of an astropy.timeseries.TimeSeries instance with some TESS data. (loaded via TimeSeries.from_pandas(Tess.set_index(‘datetime’)). I.e., it has a datetime64 index. The file was written using the classes own .write method in Astropy V4.2.1 from an instance of said class:
myBinnedTimeSeries.write('<file_path>',format='ascii.ecsv',overwrite=True)

I’ll attach a concatenated version of the file (as it contains private data). However, the relevant part from the header is on line 4:

# %ECSV 0.9
# ---
# datatype:
# - {name: time_bin_start, datatype: datetime64}

as you can see, the datatype is datetime64. This works fine with ECSV V0.9 but not V1.0 as some sort of strict type checking was added.

Full error log:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [3], in <module>
---> 49 tsrbin = BinnedTimeSeries.read('../Photometry/tsr_bin.dat', format='ascii.ecsv')

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/timeseries/binned.py:285, in BinnedTimeSeries.read(self, filename, time_bin_start_column, time_bin_end_column, time_bin_size_column, time_bin_size_unit, time_format, time_scale, format, *args, **kwargs)
    230 """
    231 Read and parse a file and returns a `astropy.timeseries.BinnedTimeSeries`.
    232 
   (...)
    279 
    280 """
    282 try:
    283 
    284     # First we try the readers defined for the BinnedTimeSeries class
--> 285     return super().read(filename, format=format, *args, **kwargs)
    287 except TypeError:
    288 
    289     # Otherwise we fall back to the default Table readers
    291     if time_bin_start_column is None:

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/table/connect.py:62, in TableRead.__call__(self, *args, **kwargs)
     59 units = kwargs.pop('units', None)
     60 descriptions = kwargs.pop('descriptions', None)
---> 62 out = self.registry.read(cls, *args, **kwargs)
     64 # For some readers (e.g., ascii.ecsv), the returned `out` class is not
     65 # guaranteed to be the same as the desired output `cls`.  If so,
     66 # try coercing to desired class without copying (io.registry.read
     67 # would normally do a copy).  The normal case here is swapping
     68 # Table <=> QTable.
     69 if cls is not out.__class__:

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/registry/core.py:199, in UnifiedInputRegistry.read(self, cls, format, cache, *args, **kwargs)
    195     format = self._get_valid_format(
    196         'read', cls, path, fileobj, args, kwargs)
    198 reader = self.get_reader(format, cls)
--> 199 data = reader(*args, **kwargs)
    201 if not isinstance(data, cls):
    202     # User has read with a subclass where only the parent class is
    203     # registered.  This returns the parent class, so try coercing
    204     # to desired subclass.
    205     try:

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/connect.py:18, in io_read(format, filename, **kwargs)
     16     format = re.sub(r'^ascii\.', '', format)
     17     kwargs['format'] = format
---> 18 return read(filename, **kwargs)

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/ui.py:376, in read(table, guess, **kwargs)
    374     else:
    375         reader = get_reader(**new_kwargs)
--> 376         dat = reader.read(table)
    377         _read_trace.append({'kwargs': copy.deepcopy(new_kwargs),
    378                             'Reader': reader.__class__,
    379                             'status': 'Success with specified Reader class '
    380                                       '(no guessing)'})
    382 # Static analysis (pyright) indicates `dat` might be left undefined, so just
    383 # to be sure define it at the beginning and check here.

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/core.py:1343, in BaseReader.read(self, table)
   1340 self.header.update_meta(self.lines, self.meta)
   1342 # Get the table column definitions
-> 1343 self.header.get_cols(self.lines)
   1345 # Make sure columns are valid
   1346 self.header.check_column_names(self.names, self.strict_names, self.guessing)

File ~/Apps/miniconda3/envs/py310_latest/lib/python3.10/site-packages/astropy/io/ascii/ecsv.py:177, in EcsvHeader.get_cols(self, lines)
    175 col.dtype = header_cols[col.name]['datatype']
    176 if col.dtype not in ECSV_DATATYPES:
--> 177     raise ValueError(f'datatype {col.dtype!r} of column {col.name!r} '
    178                      f'is not in allowed values {ECSV_DATATYPES}')
    180 # Subtype is written like "int64[2,null]" and we want to split this
    181 # out to "int64" and [2, None].
    182 subtype = col.subtype

ValueError: datatype 'datetime64' of column 'time_bin_start' is not in allowed values ('bool', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'float16', 'float32', 'float64', 'float128', 'string')

System Details

(For the version that does not work) Python 3.10.2 | packaged by conda-forge | (main, Feb 1 2022, 19:28:35) [GCC 9.4.0] Numpy 1.22.2 pyerfa 2.0.0.1 astropy 5.0.1 Scipy 1.8.0 Matplotlib 3.5.1

(For the version that does work) Python 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0] Numpy 1.20.3 pyerfa 2.0.0.1 astropy 4.2.1 Scipy 1.7.0 Matplotlib 3.4.2

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:20 (20 by maintainers)

github_iconTop GitHub Comments

1reaction
emirkmocommented, Feb 18, 2022

I would be highly supportive of a backwards compatibility bugfix for V0.9, and then an API change for V5.1 that changes the spec. I would be willing to work on a PR for it.

1reaction
taldcroftcommented, Feb 18, 2022

@weaverba137 @emirkmo - sorry that the updates in ECSV reading are breaking back-compatibility, I am definitely sensitive to that. Perhaps we can do a bug-fix release which checks for ECSV 0.9 (as opposed to 1.0) and silently reads them without warnings. This will work for files written with older astropy.

@weaverba137 - can you provide an example file with an object column? [EDIT - I saw the example and read the discussion in the linked issue]. Going forward (astropy >= 5.0), object columns are written (and read) as described at https://github.com/astropy/astropy-APEs/blob/main/APE6.rst#object-columns. This is limited to object types that can be serialized to standard JSON (without any custom representations).

Read more comments on GitHub >

github_iconTop Results From Across the Web

datetime format change when save to csv file python
The answer to this problem is a two-step process: 1) Convert the timestamp column to date time ...
Read more >
Formatting Dates for CSV Imports - Administration - Echo360
Use the instructions below to change the date format in your spreadsheet. Basic steps: select the date column, right-click and select Format ...
Read more >
Read and write to files - Michael Fuchs Python
Read csv File from External URL; Skip Last 5 Rows While Importing csv; Read only first 5 or more rows; Read only specific...
Read more >
4 tricks you should know to parse date columns with Pandas ...
4 tricks you should know to parse date columns with Pandas read_csv() · 1. Reading date columns from a CSV file · 2....
Read more >
Import or export text (.txt or .csv) files - Microsoft Support
If Excel doesn't convert a particular column of data to the format that you want, then you can convert the data after you...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found