Natural gaps in timeseries observations

I have hourly energy observations taken on business days only, so 120 hours per week instead of 168. Saturdays and Sundays are always missing, and so are holidays. Seasonality is daily, weekly, and yearly.

I was trying to follow the samples and use TimeSeries.from_dataframe with default settings. I got a lot of NaNs inserted into the DatetimeIndex, which matches the behaviour of pandas DataFrame.asfreq('H'). So with the train/test split train, val = series.split_before(pd.Timestamp('20200101')) I get:

len(data[:'20200101']), len(train), len(data['20200101':]), len(val)
(11856, 17328, 5904, 9480)
data['20200101':]['load']
dt_iso
2020-01-09 00:00:00     801.0410
2020-01-09 01:00:00     790.4990
2020-01-09 02:00:00     770.1160
2020-01-09 03:00:00     770.8910
2020-01-09 04:00:00     774.4680
                         ...    
2021-01-29 19:00:00   1,026.1950
2021-01-29 20:00:00   1,007.2650
2021-01-29 21:00:00     990.8280
2021-01-29 22:00:00     953.9190
2021-01-29 23:00:00     904.5980
Name: load, Length: 5904, dtype: float64
val['load']
                          load
date                          
2020-01-01 00:00:00        nan
2020-01-01 01:00:00        nan
2020-01-01 02:00:00        nan
2020-01-01 03:00:00        nan
2020-01-01 04:00:00        nan
...                        ...
2021-01-29 19:00:00 1,026.1950
2021-01-29 20:00:00 1,007.2650
2021-01-29 21:00:00   990.8280
2021-01-29 22:00:00   953.9190
2021-01-29 23:00:00   904.5980

[9480 rows x 1 columns]
Freq: H

So one can see that the data has been inflated with NaNs.
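
The same inflation can be reproduced with pandas alone. The sketch below uses a synthetic two-week index, not the data from this issue, and assumes only that weekends are dropped. The lowercase "h" alias is used because recent pandas versions prefer it over 'H'.

```python
import numpy as np
import pandas as pd

# Two full weeks of hourly timestamps, Monday 2020-01-06 through Sunday 2020-01-19.
full_index = pd.date_range("2020-01-06", periods=14 * 24, freq="h")

# Keep business days only (Monday=0 .. Friday=4), mimicking the natural gaps.
business_index = full_index[full_index.dayofweek < 5]
data = pd.DataFrame(
    {"load": np.arange(len(business_index), dtype=float)},
    index=business_index,
)

# Reindexing onto a regular hourly grid restores every hour between the first
# and last observation, and fills the hours that were never observed with NaN.
regular = data.asfreq("h")

n_observed = len(data)
n_regular = len(regular)
n_nan = int(regular["load"].isna().sum())
print(n_observed, n_regular, n_nan)
```

Here the 240 observed rows become 288 rows after asfreq, and the 48 hours of the weekend in between are NaN.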

Obviously darts.utils.statistics.plot_acf() and darts.utils.statistics.check_seasonality() do not work with NaNs. plot_acf() gives me a straight line at zero, where there should be AR lags up to 192. check_seasonality() reports:

[2021-03-05 17:39:16,120] INFO | darts.utils.statistics | The ACF has no local maximum for m < max_lag = 24.
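
One possible way to inspect autocorrelation without the NaN padding is to drop the timestamps and compute the ACF on the observed values by position. This is a plain NumPy sketch on synthetic data, not a Darts API and not something suggested in the thread; acf_at_lag is a hypothetical helper. With 24 observed hours per business day, lag 24 corresponds to one business day and lag 120 to one business week, although holidays will shift that alignment.

```python
import numpy as np


def acf_at_lag(values: np.ndarray, lag: int) -> float:
    """Sample autocorrelation of a gap-free 1-D array at a positive lag."""
    centered = values - values.mean()
    return float(
        np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered)
    )


# Synthetic stand-in for gap-free observed values: 8 business weeks of hourly
# data (5 days x 24 h = 120 values per week) with a pure daily cycle.
hours = np.arange(8 * 120)
load = np.sin(2 * np.pi * hours / 24)

acf_daily = acf_at_lag(load, 24)      # one business day apart: strongly positive
acf_half_day = acf_at_lag(load, 12)   # half a day apart: strongly negative
print(round(acf_daily, 3), round(acf_half_day, 3))
```

The trade-off is that the calendar information is lost, so any lag has to be read as "number of observed hours", not elapsed clock time.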

If I pass fill_missing_dates=False to TimeSeries.from_dataframe():

series = TimeSeries.from_dataframe(data, time_col='date', value_cols=['load'], fill_missing_dates=False)

I get:

Could not infer frequency. Are some dates missing? Try specifying 'fill_missing_dates=True'
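
The error is consistent with how pandas treats such an index: no single frequency describes it. Below is a minimal check on a synthetic index. It assumes pd.infer_freq is representative of the inference Darts attempts, which the thread does not confirm.

```python
import pandas as pd

# Regular hourly grid over two weeks, then the same grid without weekends.
full_index = pd.date_range("2020-01-06", periods=14 * 24, freq="h")
business_index = full_index[full_index.dayofweek < 5]

# A regular grid has an inferable frequency; the gapped one does not, because
# the step is 1 hour within the week but 49 hours from Friday 23:00 to Monday 00:00.
inferred_full = pd.infer_freq(full_index)
inferred_gapped = pd.infer_freq(business_index)
print(inferred_full, inferred_gapped)
```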

Passing the freq='H' parameter to the function above does not help either. It is the same problem as with a pandas DatetimeIndex with freq='H'.

In statsmodels SARIMAX models I was able to get rid of the datetime frequency warning by converting the index to a PeriodIndex, which supports gaps:

data.index = pd.DatetimeIndex(data.index).to_period('H')
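
On a tiny synthetic frame, the PeriodIndex conversion mentioned above looks like this. The sketch only shows the pandas step; whether statsmodels then accepts the index without warnings depends on the installed versions and is not verified here. The lowercase "h" alias stands in for the 'H' used in the issue, since newer pandas versions deprecate the uppercase form.

```python
import pandas as pd

# Three hourly observations with a weekend gap between the second and third.
idx = pd.DatetimeIndex(["2020-01-10 22:00", "2020-01-10 23:00", "2020-01-13 00:00"])
data = pd.DataFrame({"load": [1.0, 2.0, 3.0]}, index=idx)

# Convert the timestamps to hourly periods; no rows are added for the gap.
data.index = pd.DatetimeIndex(data.index).to_period("h")

has_period_index = isinstance(data.index, pd.PeriodIndex)
print(has_period_index, len(data))
```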

Please advise: what are my options in Darts for working with time series data that has natural gaps?

Thanks a lot in advance for the attention.

P.S. Some pictures to illustrate the time series and its gaps: [images: Capture-1, Capture]

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (4 by maintainers)

Top GitHub Comments

1 reaction
hrzn commented, Sep 27, 2022

Quoting an earlier comment: "I would also like to know if there are any updates on built-in ways to work with natural gaps. I'm trying to do stock price prediction, but the stock market only operates on weekdays, so I have a problem similar to the one stated in this issue."

Have you tried using a business day frequency ("B")?

0 reactions
optionsraghu commented, Dec 17, 2022

Hi @hrzn @rmk17, has this issue been taken care of in the latest Darts, please? I am also finding it very difficult to read a dataframe that has natural gaps into a TimeSeries object. Imputation on weekends does not make business sense. Is there a way we can tell TimeSeries to ignore the gaps? Pandas is able to look past the gaps, so I am sure Darts can too? Using 'B' as the business day frequency does not help either, BTW. Many thanks for looking.
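
For reference, the pandas "B" alias describes a daily weekday grid. The sketch below (synthetic dates, pandas only) shows two properties that may explain why it does not fit every case in this thread: it steps one day at a time, so it cannot describe hourly observations, and it does not skip public holidays such as 1 January. Whether this is the reason "B" failed for the commenter is an assumption.

```python
import pandas as pd

# Business-day grid across New Year 2020 (Mon 2019-12-30 .. Fri 2020-01-03).
bdays = pd.date_range("2019-12-30", "2020-01-03", freq="B")

# One timestamp per weekday, so five in total.
n_days = len(bdays)

# "B" only knows about weekends: the 1 January holiday is still on the grid.
includes_new_year = pd.Timestamp("2020-01-01") in bdays

# Adding one business day to a Friday lands on the following Monday.
next_after_friday = pd.Timestamp("2020-01-03") + pd.offsets.BDay(1)
print(n_days, includes_new_year, next_after_friday.date())
```

Daily weekday-only data without holidays (or with holidays filled in) fits this grid; hourly data and holiday gaps do not.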
