How to make the model ignore covid for 2021 projections?
I have daily sales projections for 2021. As you can see in the chart below, the model predicts a spike (#2) similar to the one in 2020 (#1).
Spike #1 was the first covid lockdown. This is supermarket data so people were panic buying.
It looks like the model predicts more panic buying next year.
How can I tell it to ignore this anomaly?
The lockdown dates were used in the model as holidays:
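import pandas as pd
from prophet import Prophet  # the package was named 'fbprophet' before release 1.0

# Every day of the lockdown is listed as a separate occurrence of the holiday
lockdown1 = pd.DataFrame({
    'holiday': 'lockdown1',
    'ds': pd.date_range(start="2020-03-11", end="2020-05-18").to_list(),
})
m = Prophet(holidays=lockdown1)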
However, it still seems to want to predict another spike in 2021, close to these dates.
Can I stop it from doing this?
Issue Analytics
- State:
- Created 3 years ago
- Comments:5 (2 by maintainers)
Top GitHub Comments
Sorry for the slow reply!
The reason it didn’t work when adding the lockdown dates as a holiday is that you added the entire set of lockdown dates as the dates for the holiday. This applies a single “lockdown1” holiday effect to every day in the range. Since the spike is only on the early days and things are mostly normal after that, the fitted holiday effect is going to be the average effect across all of the dates that have been labeled as the holiday, so it will fit a small holiday effect because there is no effect on most of the dates.
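To make the averaging argument concrete, here is a rough arithmetic sketch. The uplift numbers are invented for illustration and are not taken from the sales data in this issue, and it ignores regularization and the other model components: it only shows that a single coefficient shared by all 69 labelled days can at best match their mean.

```python
# Hypothetical daily uplift (made-up numbers, not from the issue's data):
# a spike on the first 7 lockdown days, then no effect for the remaining 62.
true_effect = [100.0] * 7 + [0.0] * 62  # 69 days: 2020-03-11 .. 2020-05-18

# One coefficient shared by all 69 labelled days: the least-squares best
# value for a single shared number is the mean of the daily effects.
shared_effect = sum(true_effect) / len(true_effect)

# Part of the first-day spike that the shared holiday term cannot explain.
unexplained = true_effect[0] - shared_effect

print(f"shared effect: {shared_effect:.1f}, unexplained spike: {unexplained:.1f}")
```

Under these assumed numbers most of the spike is left unexplained by the holiday term, which is the part another component, such as the yearly seasonality, can then pick up.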
What you want to do instead is use the upper_window field to specify that the holiday effect spans multiple days.
The difference is a bit subtle, but what you did says that there is a holiday lockdown1 that has an effect on 1 day, and there are 69 instances of that holiday (every day between 3/11 and 5/18). The upper_window formulation says that there is a single instance of the holiday (on 3/11), but it has an effect that spans an additional 68 days. Under the hood, each of those 68 days will be treated as having an independent effect size, so the effect on 3/11 will no longer be assumed equal to and averaged across the rest of the window. Since the effect is really limited just to the start of the lockdown, you could probably get away with using a smaller upper_window, which will save you some model complexity.
This will then be able to fit the spike as a holiday effect instead of grabbing it with the yearly seasonality, and so it will not be propagated into 2021. You might need to reduce the yearly seasonality prior scale a bit, but I think it will work either way.
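A minimal sketch of the upper_window formulation described above, reconstructed from the description (one instance on 3/11 plus 68 further days). The helper name build_model and the prior_scale value of 1.0 are assumptions for illustration, not part of the original reply, and the import assumes Prophet 1.0 or later (earlier releases used the package name fbprophet).

```python
import pandas as pd

# A single occurrence of the holiday on the first lockdown day.
# upper_window extends the effect over the following 68 days, and each
# of those days gets its own fitted effect instead of one shared value.
lockdown1 = pd.DataFrame({
    'holiday': 'lockdown1',
    'ds': pd.to_datetime(['2020-03-11']),
    'lower_window': 0,
    'upper_window': 68,  # 2020-03-11 + 68 days = 2020-05-18
})


def build_model(holidays, yearly_prior_scale=1.0):
    """Hypothetical helper: Prophet model with a tighter yearly seasonality prior."""
    from prophet import Prophet  # imported lazily; 'fbprophet' before 1.0

    # Disable the built-in yearly term and re-add it with an explicit
    # prior_scale (1.0 is an illustrative value; Prophet's default is 10).
    m = Prophet(holidays=holidays, yearly_seasonality=False)
    m.add_seasonality(name='yearly', period=365.25, fourier_order=10,
                      prior_scale=yearly_prior_scale)
    return m
```

Usage would then be m = build_model(lockdown1) followed by m.fit(df) on the daily sales frame. If the spike only lasts a week or two, a smaller upper_window keeps the number of extra coefficients down.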
Just want to also link to #1416 which has a lot of handling-COVID discussion.