uninitialized array leads to failure in tz_localize ("UserWarning: Inferring time-zone from CET in column __null_dask_index__ failed, using time-zone-agnostic")
What happened:
Reading a dataframe that was written with tz="CET" does not restore the timezone; the result has a naive DatetimeIndex instead.
The issue is probably due to the use of np.empty instead of np.zeros in https://github.com/dask/fastparquet/blob/master/fastparquet/dataframe.py#L100, which leads to an exception in https://github.com/dask/fastparquet/blob/master/fastparquet/dataframe.py#L103: pandas cannot infer the right timezone because the input is garbage (not initialised to zero).
Minimal Complete Verifiable Example:
Difficult to come up with, as it depends on the content returned by np.empty.
The issue is fixed when replacing np.empty with np.zeros.
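Since the real failure depends on whatever np.empty happens to return, here is a deterministic stand-in rather than a true reproducer. It is only a sketch of one plausible failure mode, not fastparquet's actual code: the helper try_localize and the hand-picked "garbage" value are assumptions made for illustration. A zero-filled int64 buffer (1970-01-01) localizes to CET without trouble, while a value that lands on a wall-clock time skipped by the DST change makes tz_localize raise.

```python
import numpy as np
import pandas as pd


def try_localize(values, tz="CET"):
    """Hypothetical helper: localize int64 nanoseconds, or return the exception."""
    try:
        return pd.DatetimeIndex(values.view("M8[ns]")).tz_localize(tz)
    except Exception as exc:  # the exact exception type varies with the pandas version
        return exc


# What np.zeros would give: every entry is 1970-01-01 00:00, a valid CET wall time.
zeros = np.zeros(3, dtype="i8")
ok = try_localize(zeros)

# Stand-in for uninitialized memory: 2021-03-28 02:30 does not exist in CET,
# because clocks jump from 02:00 to 03:00 on that date.
garbage = np.array(["2021-03-28T02:30"], dtype="M8[ns]").view("i8")
bad = try_localize(garbage)

print(type(ok).__name__, ok.tz)
print(type(bad).__name__)
```

Real uninitialized memory will only sometimes hit such a value (or an ambiguous one at the autumn transition), which would be consistent with the failure being hard to reproduce.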
Issue Analytics
- State:
- Created 3 years ago
- Comments:13 (7 by maintainers)
Top GitHub Comments
cc @jsignell - I forget the dask issue, but it seems we do handle timezones; probably this just isn’t applied to the max/min statistics values.
I don’t think it will fix that - the timezone needs to be applied to the “statistics” values, not to the data values as here.