
AssertionError: filters should not remove entries all entries - check encoder/decoder lengths and lags

See original GitHub issue
  • PyTorch-Forecasting version:
  • PyTorch version: 1.8.1+rocm4.0.1
  • Python version: 3.8.8
  • Operating System: Ubuntu 20.10 (Groovy Gorilla)

Expected behavior

I have data in a pandas dataframe that is characterized by the following:

                 Date          Load  Solar         Wind  time_idx  month   grid
0 2019-05-01 00:00:00  21283.786430    0.0  2414.134105         0      5  ERCOT
1 2019-05-01 01:00:00  20502.851939    0.0  3097.408232         0      5  ERCOT
2 2019-05-01 02:00:00  19936.040922    0.0  2774.369595         0      5  ERCOT
3 2019-05-01 03:00:00  19905.404774    0.0  2889.462763         0      5  ERCOT
4 2019-05-01 04:00:00  20343.393833    0.0  2719.568595         0      5  ERCOT

               Load         Solar          Wind      time_idx
count  10991.000000  10991.000000  10991.000000  10991.000000
mean   25019.883604   3556.769244   1968.367736      6.990993
std     4809.103704   4194.475591   1298.381739      4.327012
min    15591.377319      0.000000      0.000000      0.000000
25%    21556.054368      0.000000    784.979067      3.000000
50%    24070.778150    513.663393   1883.507906      7.000000
75%    27143.606424   7780.020473   3005.799267     11.000000
max    44148.227377  11854.229384   5243.790771     14.000000

I then took the TimeSeriesDataSet from the Stallion data set used in the PyTorch Forecasting Temporal Fusion Transformer tutorial and modified it to fit my data as follows:

from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data import GroupNormalizer

max_prediction_length = 6
max_encoder_length = 24
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="Load",
    group_ids=["grid"],
    min_encoder_length=max_encoder_length // 2,  # keep encoder length long (as it is in the validation set)
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,
    static_categoricals=["grid"],
    # static_reals=["avg_population_2017", "avg_yearly_household_income_2017"],
    time_varying_known_categoricals=["month"],
    # variable_groups={"special_days": special_days},  # group of categorical variables can be treated as one variable
    time_varying_known_reals=["time_idx", "Solar", "Wind"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[],
    target_normalizer=GroupNormalizer(
        groups=["grid"], transformation="softplus"
    ),  # use softplus and normalize by group
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
    allow_missings=True,  # permit gaps in time_idx
)

Actual behavior

Here is the error that I keep getting. I don’t have any lags in my time series data set, and I don’t see how the encoder lengths could be causing issues with their size.

Traceback (most recent call last):
  File "grid_Transformer.py", line 65, in <module>
    training = TimeSeriesDataSet(
  File "/home/prism/anaconda3/envs/transformer/lib/python3.8/site-packages/pytorch_forecasting/data/timeseries.py", line 435, in __init__
    self.index = self._construct_index(data, predict_mode=predict_mode)
  File "/home/prism/anaconda3/envs/transformer/lib/python3.8/site-packages/pytorch_forecasting/data/timeseries.py", line 1233, in _construct_index
    assert (
AssertionError: filters should not remove entries all entries - check encoder/decoder lengths and lags
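
For context, the assertion fires in _construct_index when filtering out windows shorter than min_encoder_length + min_prediction_length leaves no samples at all. A rough way to reproduce that filter by hand with plain pandas (a sketch only, assuming data, training_cutoff, and the length variables above are in scope, with "grid" as the sole group column) is:

# Sketch: approximate the minimum-length filter applied in _construct_index.
# Each sample needs at least min_encoder_length + min_prediction_length steps,
# i.e. 24 // 2 + 1 = 13 consecutive time_idx values here.
min_length = max_encoder_length // 2 + 1

filtered = data[data["time_idx"] <= training_cutoff]
spans = filtered.groupby("grid")["time_idx"].agg(["min", "max"])
spans["span"] = spans["max"] - spans["min"] + 1

# Groups whose entire span is shorter than min_length can never yield a sample;
# if this prints every group, the dataset index ends up empty.
print(spans[spans["span"] < min_length])

Given the summary statistics above (time_idx only runs from 0 to 14), the cutoff of 14 - 6 = 8 leaves at most 9 distinct steps per series, which is below the 13 required.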

I have attached the full code and the data file below. Thanks for any help or insights!

code_data.zip

  • One note on the full code: when I include more data points I don’t get the error above, but then it fills my RAM and my computer kills the process before the transformer can begin running.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

1 reaction
jdb78 commented, Apr 29, 2021

I would assume not all time series have a minimum length of 13 (min_prediction_length + min_encoder_length)
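
That diagnosis is consistent with the summary statistics in the report: the dataframe has 10,991 hourly rows, yet time_idx only runs from 0 to 14, so many rows share the same step. A hypothetical rebuild of time_idx from the Date column (not from the thread, and assuming the intent is one step per hour) would look like:

import pandas as pd

# Hypothetical fix: derive time_idx as whole hours since the first timestamp
# so that it increases by one per hourly row within each group.
data["Date"] = pd.to_datetime(data["Date"])
data["time_idx"] = ((data["Date"] - data["Date"].min()) // pd.Timedelta(hours=1)).astype(int)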

0 reactions
georgeblck commented, Oct 26, 2022

I have the same problem. A solution would be appreciated

Read more comments on GitHub >

Top Results From Across the Web

  • AssertionError: filters should not remove entries all entries
    AssertionError: filters should not remove entries all entries - check encoder/decoder lengths and lags.
  • Pytorch-forecasting:: Univariate AssertionError: filters should ...
    After some experiment, it seems that the training_df length (196) should be larger than or equal to (context_length + prediction_length).
  • Source code for pytorch_forecasting.data.timeseries
    Lags must be at not larger than the shortest time series as all time series will ... "filters should not remove entries all...
  • Encoder-Decoder Model for Multistep Time Series Forecasting ...
    Encoder-decoder models have provided state of the art results in sequence ... sales of all items, and the mean forecast to remove the...
  • Changelog — Python 3.11.1 documentation
    gh-99886: Fix a crash when an object which does not have a dictionary frees its instance values. gh-99891: Fix a bug in the...
