
On confidence intervals and uncertainty intervals

See original GitHub issue

Given that this is a Bayesian method, it is strange that the uncertainty is summarized using confidence intervals rather than Bayesian uncertainty/probability/credibility intervals (via, for instance, HDIs), and that the mean rather than the median is taken as the point estimate. @WillianFuks, I am curious to hear your thoughts on what is in compile_posterior_inferences:

https://github.com/WillianFuks/tfcausalimpact/blob/master/causalimpact/inferences.py#L52

Given samples of the target time series, it should be straightforward to summarize them pointwise via, say, hdi in arviz. Or am I missing something?
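
For instance, a pointwise median and HDI can be computed directly from an array of posterior samples. A minimal sketch, with simulated draws standing in for the model's samples arranged as a (draws, time) array:

import numpy as np
from arviz.stats import hdi

# Toy stand-in for posterior draws of the target series:
# 1000 draws for each of 50 time points.
rng = np.random.default_rng(0)
samples = rng.gamma(shape=2.0, scale=1.0, size=(1000, 50))

point = np.median(samples, axis=0)                        # pointwise median
bounds = np.array([hdi(col, 0.95) for col in samples.T])  # (time, 2) array of lower/upper
lower, upper = bounds[:, 0], bounds[:, 1]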

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 13 (6 by maintainers)

Top GitHub Comments

1 reaction
IvanUkhov commented, Mar 30, 2021

I ended up writing a custom function for summarizing inferences, as I wanted to have medians and HDIs instead of means and quantile intervals. I will leave it here in case it can be helpful some time in the future:

from typing import Tuple

import numpy as np
import pandas as pd
import tensorflow_probability as tp


def _summarize_original(
    data_before: pd.DataFrame,
    data_after: pd.DataFrame,
    posterior: tp.distributions.Distribution,
    predictive: tp.distributions.Distribution,
    standardization: Tuple[float, float],
    alpha: float = 0.05,
    draw_count: int = 1000,
    random_state: int = 42,
) -> pd.DataFrame:
    from causalimpact.inferences import build_cum_index
    from causalimpact.misc import maybe_unstandardize

    def _hdi(data: np.ndarray) -> np.ndarray:
        # Compute a (1 - alpha) HDI for each time point (column) separately.
        from arviz.stats import hdi
        return np.array([hdi(column, 1 - alpha) for column in data.T]).T

    # Draw samples for the pre- and post-intervention periods and map them
    # back to the original scale.
    y_before = predictive.sample(draw_count, seed=random_state)
    y_before = maybe_unstandardize(np.squeeze(y_before.numpy()), standardization)
    y_after = posterior.sample(draw_count, seed=random_state)
    y_after = maybe_unstandardize(np.squeeze(y_after.numpy()), standardization)

    # Pointwise medians and HDIs for the pre-intervention predictions. The
    # "*_means" names are kept for compatibility, but the values are medians.
    pre_preds_means = np.median(y_before, axis=0)
    pre_preds_lower, pre_preds_upper = _hdi(y_before)
    pre_preds_means = pd.Series(pre_preds_means, index=data_before.index)
    pre_preds_lower = pd.Series(pre_preds_lower, index=data_before.index)
    pre_preds_upper = pd.Series(pre_preds_upper, index=data_before.index)

    # Pointwise medians and HDIs for the post-intervention predictions.
    post_preds_means = np.median(y_after, axis=0)
    post_preds_lower, post_preds_upper = _hdi(y_after)
    post_preds_means = pd.Series(post_preds_means, index=data_after.index)
    post_preds_lower = pd.Series(post_preds_lower, index=data_after.index)
    post_preds_upper = pd.Series(post_preds_upper, index=data_after.index)

    complete_preds_means = pd.concat([pre_preds_means, post_preds_means])
    complete_preds_lower = pd.concat([pre_preds_lower, post_preds_lower])
    complete_preds_upper = pd.concat([pre_preds_upper, post_preds_upper])

    # Pointwise effects: observed minus predicted. The bounds swap, since
    # subtracting the upper prediction yields the lower effect and vice versa.
    data = pd.concat([data_before, data_after])
    point_effects_means = data.iloc[:, 0] - complete_preds_means
    point_effects_upper = data.iloc[:, 0] - complete_preds_lower
    point_effects_lower = data.iloc[:, 0] - complete_preds_upper

    # Cumulative effects over the post-intervention period, accumulated per
    # draw and then reduced to a median and an HDI, with a leading zero so the
    # series starts at the intervention point.
    z_after = np.cumsum(data_after.iloc[:, 0].values - y_after, axis=1)
    post_cum_effects_means = np.median(z_after, axis=0)
    post_cum_effects_lower, post_cum_effects_upper = _hdi(z_after)
    index = build_cum_index(data_before.index, data_after.index)
    post_cum_effects_lower = pd.Series(
        np.concatenate([[0], post_cum_effects_lower]),
        index=index,
    )
    post_cum_effects_means = pd.Series(
        np.concatenate([[0], post_cum_effects_means]),
        index=index,
    )
    post_cum_effects_upper = pd.Series(
        np.concatenate([[0], post_cum_effects_upper]),
        index=index,
    )

    data = dict(
        complete_preds_means=complete_preds_means,
        complete_preds_lower=complete_preds_lower,
        complete_preds_upper=complete_preds_upper,
        point_effects_means=point_effects_means,
        point_effects_lower=point_effects_lower,
        point_effects_upper=point_effects_upper,
        post_cum_effects_means=post_cum_effects_means,
        post_cum_effects_lower=post_cum_effects_lower,
        post_cum_effects_upper=post_cum_effects_upper,
    )
    return pd.DataFrame(data)
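
A hypothetical usage sketch with toy inputs, just to show the expected shapes; the data, index, and normal distributions below are made up and only stand in for what a fitted model would provide:

import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Toy data: 20 pre-intervention and 10 post-intervention points.
index = pd.date_range("2021-01-01", periods=30)
data_before = pd.DataFrame({"y": np.random.randn(20)}, index=index[:20])
data_after = pd.DataFrame({"y": np.random.randn(10) + 1.0}, index=index[20:])

# Stand-ins for the model's predictive and posterior distributions; sampling
# them yields arrays of shape (draws, time, 1), which the function squeezes.
predictive = tfd.Normal(loc=tf.zeros([20, 1]), scale=1.0)
posterior = tfd.Normal(loc=tf.zeros([10, 1]), scale=1.0)

summary = _summarize_original(
    data_before,
    data_after,
    posterior,
    predictive,
    standardization=(0.0, 1.0),  # identity transform for the toy data
)
print(summary.head())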

0 reactions
IvanUkhov commented, Mar 9, 2021

I will close this then. I think switching to quantile intervals was a good move. Next time around, one could consider HDIs, but it would probably not make much of a difference unless very skewed distributions are expected.
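
To illustrate the skewness point, here is a toy comparison of an equal-tailed quantile interval and an HDI on a right-skewed sample (a lognormal is used purely as an example):

import numpy as np
from arviz.stats import hdi

rng = np.random.default_rng(0)
samples = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # right-skewed toy sample

quantile_interval = np.quantile(samples, [0.025, 0.975])  # equal-tailed, roughly [0.14, 7.1]
hdi_interval = hdi(samples, 0.95)                         # narrowest 95% interval, shifted toward zero

print(quantile_interval, hdi_interval)

For roughly symmetric posteriors, the two intervals nearly coincide.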

Thank you, and sorry for all the noise here 🙂

