[ENH] Past predictions over train set

Is your feature request related to a problem? Please describe.

Why can’t we get past predictions in sktime, that is, predictions made by the model on the train set? One of the most fundamental ways to compare models is to compare their performance on the train set with their performance on the test set and check for over-fitting, i.e. whether performance on the train set is far better than on the test/validation set. But in sktime we can only (I guess) get future predictions, because the fh argument starts at 1 (the first prediction after the train set), which means we can only compare y_pred to y_test.

Describe the solution you’d like

I think this needs a refactoring of how forecasters treat the ForecastingHorizon. Either forecasters should accept negative values in their fh argument (which feels a bit odd), or (as in statsmodels) the new forecasts should start at len(y_train).

Additional context

This makes a big difference for models that can be over-fitted, i.e. models with many hyper-parameters such as Prophet, where adding more and more high-order seasonalities can make your future predictions worse but your past predictions better. In fact, all models that allow multiple seasonalities (BATS and TBATS) can over-fit easily, and with predictions over the train set you would be able to see it with your own eyes.
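To make the over-fitting argument concrete, here is a minimal sketch that does not use sktime, Prophet, or (T)BATS at all. It assumes a synthetic monthly series and a plain least-squares fit on a linear trend plus Fourier seasonal terms; the helper `fourier_design` and all data are hypothetical and only for illustration. Because the models are nested, the train error cannot go up as the seasonal order grows, while the test error carries no such guarantee.

```python
import numpy as np


def fourier_design(t, period, order):
    """Design matrix: intercept, linear trend, and `order` sine/cosine pairs."""
    cols = [np.ones_like(t), t]
    for k in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)


# Synthetic data (an assumption): trend + one yearly harmonic + noise.
rng = np.random.default_rng(42)
t = np.arange(96, dtype=float)
y = 0.1 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0.0, 0.5, t.size)

# Chronological split: first 72 points for training, last 24 for testing.
t_train, t_test = t[:72], t[72:]
y_train, y_test = y[:72], y[72:]

orders = (1, 3, 5)
train_mse, test_mse = [], []
for order in orders:
    X_train = fourier_design(t_train, 12, order)
    X_test = fourier_design(t_test, 12, order)
    # Ordinary least squares on the train set only.
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    train_mse.append(float(np.mean((X_train @ coef - y_train) ** 2)))
    test_mse.append(float(np.mean((X_test @ coef - y_test) ** 2)))

for order, tr, te in zip(orders, train_mse, test_mse):
    print(f"order={order}: train MSE={tr:.3f}, test MSE={te:.3f}")
```

A widening gap between the two columns as the order increases is the symptom described above, and it is only visible if the model can produce predictions over the train set.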

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (3 by maintainers)

Top GitHub Comments

1 reaction
aiwalter commented, Sep 3, 2021

Yes, then just do in-sample prediction.

1 reaction
fkiraly commented, Sep 3, 2021

@ilyasmoutawwakil, not sure whether in-sample forecasts are what you are looking for - those shouldn’t be used for backtesting!
