Strategies for positive predictions
See original GitHub issueThere has been a lot of discussion on strategies for forcing predictions to be positive, usually in the context of forecasting count data that naturally must be positive. I will illustrate some approaches with the following example dataset:
# Make a test problem
import numpy as np
import pandas as pd
from fbprophet import Prophet
ds = pd.date_range('2020-01-01', '2020-06-01')
t = np.arange(153)
seasonality = 0.5 * np.cos(t * 2 * np.pi / 7) # weekly seasonality
trend = (-t + 30 * t * np.exp(-t / 40)) / 20
y = np.round(np.clip(trend * (1 + seasonality) * (1 + 0.2 * np.random.randn(153)), a_min=0, a_max=None))
df = pd.DataFrame({'ds': ds, 'y': y})
Approach 1: Clip predictions of a regular model The idea here is just to fit a usual model, and then clamp negative predictions up to 0. Note that we’ll use multiplicative seasonality here - we’d probably always want to use multiplicative seasonality in settings with positive predictions.
# Fit a usual prophet model, and clip
m = Prophet(seasonality_mode='multiplicative').fit(df)
future = m.make_future_dataframe(90)
fcst = m.predict(future)
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
fcst[col] = fcst[col].clip(lower=0.0)
fig = m.plot(fcst)
This approach guarantees positive predictions, but the uncertainty estimates are pretty unsatisfactory: the forecast has 0 uncertainty in the future, even though we’ve seen trend changes in the past. Is it reasonable to forecast no chance of the forecast coming above 0? In most settings, probably not. The reason there is no trend uncertainty being captured in the forecast is because all of the trend uncertainty is happening below 0, as can be seen in the components plot:
fig = m.plot_components(fcst)
So all of the trend uncertainty is lost when it is clamped up to 0.
Approach 2: Logistic growth
The logistic growth trend has a floor at 0, so the trend will stay positive. It does require specifying a maximum saturation value as well, which could be set to whatever the expected maximum of the forecast is. It also doesn’t ensure the forecast will be positive - it only ensures the trend will be positive. The forecast yhat
can still be pushed negative by seasonality. So we must also clip to 0 as above.
# Fit a logistic growth model, and clip
df['cap'] = 1.2 * df['y'].max()
m = Prophet(
growth='logistic',
seasonality_mode='multiplicative',
changepoint_prior_scale=0.5,
).fit(df)
future = m.make_future_dataframe(90)
future['cap'] = 1.2 * df['y'].max()
fcst = m.predict(future)
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
fcst[col] = fcst[col].clip(lower=0.0)
fig = m.plot(fcst)
The uncertainty estimate here isn’t really any more satisfactory than that above. This is because the purpose of the logistic growth trend is to saturate; the documentation page is, afterall, called “Saturating Forecasts” (https://facebook.github.io/prophet/docs/saturating_forecasts.html). And so here it naturally saturates at 0, but we may not wish for 0 to be quite so sticky.
Approach 3: Log transform
If we log transform y
, make a forecast, and then take the exp
of the forecast, it is guaranteed to be positive. This also changes the nature of the seasonality: additive seasonality in the log transform space corresponds to multiplicative seasonality in the original space (see https://github.com/facebook/prophet/issues/647#issuecomment-413027578)
# Log-transform the data
df['y'] = np.log(1 + df['y'])
m = Prophet(seasonality_mode='additive').fit(df)
future = m.make_future_dataframe(90)
fcst = m.predict(future)
# Invert the transform
m.history['y'] = np.exp(m.history['y']) - 1
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
fcst[col] = np.exp(fcst[col]) - 1
fig = m.plot(fcst)
This is no better than the other approaches: the components plot shows that there is trend uncertainty, however it is all being squashed out by the exp
inverse transform. Also, the exp
inverse can produce serious numerical issues, which isn’t an issue in this particular forecast but I’ve run into it in others. It’s not a very generally applicable approach.
Approach 4: Negative binomial / Poisson likelihood There has been a lot of discussion around using a negative binomial or Poisson likelihood to handle count data, instead of the Gaussian likelihood currently used (#337 and #1500 have a lot of discussion). I prototyped this by implementing a negative binomial likelihood in #1544. Patching in that PR produces this:
# Negative binomial likelihood
df['cap'] = 1.2 * df['y'].max()
m = Prophet(
growth='logistic',
seasonality_mode='multiplicative',
likelihood='NegBinomial',
changepoint_prior_scale=0.5,
).fit(df)
future = m.make_future_dataframe(90)
future['cap'] = 1.2 * df['y'].max()
fcst = m.predict(future)
fig = m.plot(fcst)
This produces results similar to what is seen above. Generally, the negative binomial likelihood seems like it would be appropriate for this type of small-integer data, but I’m not sold on it based on my experience so far. There is a lot of discussion on this in #1500, but on the example problems there its performance was underwhelming. It also involves a exp()
transform in the likelihood (a hinge transform), which can produce numerical issues, and did in one of the datasets there. So I don’t think it’s going to be a robust and reliable strategy in its current state.
Approach 5: A positive trend model The default Prophet trend is a piecewise linear function. The problem we’re dealing with here is that there is nothing to prevent the trend from going negative, and simply clamping it to 0 wipes out all of the future trend uncertainty.
Trend uncertainty is estimated with Monte Carlo sampling, by sampling future trends with the following simulation:
- At each future time, sample whether or not there will be a trend change from a Poisson distribution (whose rate is estimated during model fitting).
- If there is a trend change, sample the magnitude of the trend change from a Laplace distribution (whose scale is estimated during model fitting). Update the trend with that change, and continue forward in time. This procedure is described in Section 3.1.4 of the paper (https://peerj.com/preprints/3190.pdf). Suppose we modify this generative model to disallow trend changes that take the trend negative. Basically, when we are simulating a future trend, when it hits 0 and starts to go negative, we will instead add a new trend change that keeps it at 0. Then, positive future trend changes will still be able to take the trend positive again (it isn’t stuck below 0 like above with clipping), and negative future trend changes will simply have no effect. This can be implemented by modifying the function that Prophet uses for calculating the piecewise linear trend. This class implements it:
class ProphetPos(Prophet):
@staticmethod
def piecewise_linear(t, deltas, k, m, changepoint_ts):
"""Evaluate the piecewise linear function, keeping the trend
positive.
Parameters
----------
t: np.array of times on which the function is evaluated.
deltas: np.array of rate changes at each changepoint.
k: Float initial rate.
m: Float initial offset.
changepoint_ts: np.array of changepoint times.
Returns
-------
Vector trend(t).
"""
# Intercept changes
gammas = -changepoint_ts * deltas
# Get cumulative slope and intercept at each t
k_t = k * np.ones_like(t)
m_t = m * np.ones_like(t)
for s, t_s in enumerate(changepoint_ts):
indx = t >= t_s
k_t[indx] += deltas[s]
m_t[indx] += gammas[s]
trend = k_t * t + m_t
if max(t) <= 1:
return trend
# Add additional deltas to force future trend to be positive
indx_future = np.argmax(t >= 1)
while min(trend[indx_future:]) < 0:
indx_neg = indx_future + np.argmax(trend[indx_future:] < 0)
k_t[indx_neg:] -= k_t[indx_neg]
m_t[indx_neg:] -= m_t[indx_neg]
trend = k_t * t + m_t
return trend
def predict(self, df=None):
fcst = super().predict(df=df)
for col in ['yhat', 'yhat_lower', 'yhat_upper']:
fcst[col] = fcst[col].clip(lower=0.0)
return fcst
It is used the same way as the usual Prophet
class:
# Fit the ProphetPos model
m = ProphetPos(seasonality_mode='multiplicative').fit(df)
future = m.make_future_dataframe(90)
fcst = m.predict(future)
fig = m.plot(fcst)
With this model, unlike all of the other approaches, the trend has the possibility to become positive again. It also avoids some of the downsides of other approaches: there is no requirement to specify a cap
, and there are no numerically-unstable transforms.
Summary
In settings where the trend saturates to 0 and we don’t expect it to come back up, the simplest approach of fitting a default model and clipping to 0 (approach 1) may be the best. Logistic growth, log transform, and NB likelihood all come with potential for issues and didn’t really do anything better than simple clipping here. If the trend may come back up, then only the ProphetPos
approach can capture that well. I’ve tried this approach on a number of time series and so far have found it to be the best for the time series I’ve looked at, relative to these other strategies. I’d be very interested to hear if anyone else is able to try out ProphetPos
and see how it works!
Issue Analytics
- State:
- Created 3 years ago
- Reactions:34
- Comments:8 (3 by maintainers)
Top GitHub Comments
@bletham is there any plan to make ProphetPos into separate library/package/repo then? And push it to pypi maybe. If not, may I or other take it (including adding MIT License to the repo)? Thank you.
Another approach I’ve been testing is using the inverse hyperbolic sine IHS transformation which allows for zeros in the training data per this post https://davegiles.blogspot.com/2019/03/forecasting-after-inverse-hyperbolic.html. Estimation of theta can be determined using mle or a quick estimation can be done by scaling the data by mean or std.