How to make the model ignore covid for 2021 projections?


I have daily sales projections for 2021. As you can see in the chart below, the model predicts a spike (#2) similar to the one in 2020 (#1).

Spike #1 was the first covid lockdown. This is supermarket data so people were panic buying.

[Image: chart of daily sales showing spike #1 in 2020 and the predicted spike #2 in 2021]

It looks like the model predicts more panic buying next year.

How can I tell it to ignore this anomaly?

The lockdown dates were used in the model as holidays:


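import pandas as pd
from prophet import Prophet  # the package was named fbprophet before version 1.0

# One "lockdown1" holiday row for every day of the lockdown
lockdown1 = pd.DataFrame({
  'holiday': 'lockdown1',
  'ds': pd.date_range(start="2020-03-11", end="2020-05-18").to_list(),
})

m = Prophet(holidays=lockdown1)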

However, it still seems to want to predict another spike in 2021, close to these dates.

Can I stop it from doing this?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

bletham commented, Jan 7, 2021

Sorry for the slow reply!

It didn’t work when you added the lockdown dates as a holiday because you listed every day of the lockdown as a separate date for that holiday. This applies a single “lockdown1” holiday effect to every day in the range. Since the spike occurs only in the early days and things are mostly normal after that, the fitted holiday effect is the average effect across all of the dates labeled as the holiday. That average is small because there is no effect on most of the dates.
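To make the averaging argument concrete, here is a minimal sketch in plain Python. The uplift numbers are made up for illustration, and the single shared effect is approximated by a plain mean (the least-squares fit of one constant). Prophet itself fits holiday effects with priors, so treat this as intuition only, not as what the library computes.

```python
from statistics import mean

# Hypothetical daily uplift over normal sales (made-up numbers):
# a 5-day panic-buying spike followed by 64 ordinary lockdown days.
uplift = [100.0, 80.0, 60.0, 40.0, 20.0] + [0.0] * 64

# One shared effect for all 69 labelled days: the least-squares fit of a
# single constant is the mean of the values it has to explain.
shared_effect = mean(uplift)

# One effect per day: each day keeps its own value, so the spike survives.
per_day_effect = list(uplift)

print(f"days labelled as holiday: {len(uplift)}")
print(f"single shared effect: {shared_effect:.2f}")
print(f"first-day effect with per-day terms: {per_day_effect[0]:.2f}")
```

With a shared effect that small, most of the spike is left unexplained by the holiday and is free to be picked up by another component of the model.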

What you want to do instead is use the upper_window field to specify that the holiday effect spans multiple days:

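lockdown1 = pd.DataFrame({
  'holiday': 'lockdown1',
  'ds': pd.to_datetime(["2020-03-11"]),
  'lower_window': 0,   # Prophet expects lower_window and upper_window together
  'upper_window': 68,  # effect extends 68 days past 3/11, through 5/18
})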

The difference is a bit subtle, but what you did says that there is a holiday lockdown1 that has an effect on 1 day, and there are 69 instances of that holiday (every day between 3/11 and 5/18). The formulation above says that there is a single instance of the holiday (on 3/11), but it has an effect that spans an additional 68 days. Under the hood, each of those 68 days will be treated as having an independent effect size; so the effect on 3/11 will no longer be assumed equal to and averaged across the rest of the window. Since the effect is really limited just to the start of the lockdown you could probably get away with using a smaller upper_window which will save you some model complexity.
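The day counts quoted above can be checked with a short standard-library sketch. It only verifies the date arithmetic (69 per-day rows versus one row plus a 68-day window); the variable names are illustrative and nothing here calls Prophet.

```python
from datetime import date, timedelta

start = date(2020, 3, 11)
end = date(2020, 5, 18)

# Formulation from the question: one holiday row per day in the range.
per_day_rows = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Formulation from the answer: a single row on the start date whose
# effect covers that day plus `upper_window` following days.
upper_window = 68
window_days = [start + timedelta(days=offset) for offset in range(upper_window + 1)]

print(len(per_day_rows), len(window_days), window_days[-1])
```

Both formulations cover the same calendar days. What differs is whether those days share one effect or each get their own.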

This will then be able to fit the spike as a holiday effect instead of grabbing it with the yearly seasonality, and so it will not be propagated into 2021. You might need to reduce the yearly seasonality prior scale a bit but I think it will work either way.

bletham commented, Apr 3, 2021

Just want to also link to #1416 which has a lot of handling-COVID discussion.
