Multivariate Anomaly Detector (Error when running tutorial)


I get the error below when I run the tutorial kats_202_detection.ipynb (https://github.com/facebookresearch/Kats/blob/master/tutorials/kats_202_detection.ipynb). Any clue?

KeyError: Timestamp('2019-12-23 23:59:58.142906')

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-265-b73347c84449> in <module>
      3 d = MultivariateAnomalyDetector(multi_anomaly_ts, params, training_days=60)
      4 display(params)
----> 5 anomaly_score_df = d.detector()
      6 
      7 d.plot()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\kats\detectors\outlier.py in detector(self)
    300         while fcstTime < self.df.index.max():
    301             # forecast for fcstTime+ 1
--> 302             pred_df = self._generate_forecast(fcstTime)
    303             # calculate anomaly scores
    304             anomaly_scores_t = self._calc_anomaly_scores(pred_df)

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

3 reactions
pbosch commented, Aug 24, 2021

@loicduffar @Chima-21 Do you use Windows or Linux? I encountered the issue myself on Windows using Miniconda (4.10.3). Running under WSL or a native Ubuntu doesn’t produce the issue; there it works with both native Python and Miniconda.

The culprit is in calculating the granularity:

        if len(time_diff.unique()) == 1:  # check constant frequency
            freq = time_diff.unique()[0].astype("int")
            self.granularity_days = freq / (24 * 3600 * (10 ** 9))
        else:
            raise RuntimeError(
                "Frequency of metrics is not constant."
                "Please check for missing or duplicate values"
            )

In WSL/Ubuntu I get a straight 1.0 for the example data. On Windows I get -2.149413925925926e-05. More precisely, the issue is in the astype call. It seems to default to int32 on Windows, which causes an overflow: one day is 8.64 × 10^13 nanoseconds, far more than the int32 maximum of about 2.1 × 10^9. Using int64 instead of int solves the problem.
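
The overflow described in this comment can be reproduced on any platform by forcing the 32-bit cast explicitly. This is a minimal sketch, not the Kats code itself: the variable names are made up for illustration, and it assumes the tutorial data has a one-day step stored as a nanosecond timedelta.

```python
import numpy as np

# Assumption: the unique time step is one day at nanosecond resolution.
step = np.array([np.timedelta64(1, "D")]).astype("timedelta64[ns]")

# Explicit 64-bit cast: the nanosecond count fits.
freq64 = step.astype(np.int64)

# Explicit 32-bit cast, mimicking a platform where "int" means int32:
# the value silently wraps around.
freq32 = freq64.astype(np.int32)

ns_per_day = 24 * 3600 * (10 ** 9)

# Convert to Python floats before dividing so the result does not depend
# on NumPy's scalar promotion rules.
granularity_ok = float(freq64[0]) / ns_per_day
granularity_bad = float(freq32[0]) / ns_per_day

print(freq64[0])        # 86400000000000
print(freq32[0])        # -1857093632
print(granularity_ok)   # 1.0
print(granularity_bad)  # roughly -2.1494e-05, matching the value reported on Windows
```

The wrapped value divided by the nanoseconds per day gives the same small negative granularity reported above, which is consistent with the int32 explanation.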

I would recommend, in this instance and in general, using explicit types instead of assuming that the default type is correct. On Windows, int usually defaults to 32-bit while float usually defaults to 64-bit. Specifying int64 and float64 explicitly, whether with numpy or pandas, avoids the issue completely and makes the code more robust. A quick search for astype in the repository shows that it’s mostly implicit, so this kind of problem could occur in other places as well.
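
As an illustration of the explicit-type recommendation, here is a hedged sketch of a granularity check that always works in int64. The helper `granularity_days` and the constant `NS_PER_DAY` are hypothetical names and do not exist in Kats; the error message mirrors the snippet quoted above.

```python
import numpy as np
import pandas as pd

NS_PER_DAY = 24 * 3600 * 10**9  # nanoseconds in one day


def granularity_days(index: pd.DatetimeIndex) -> float:
    """Hypothetical helper: return the constant step of `index` in days."""
    # Timestamps as nanoseconds since the epoch, with an explicit 64-bit type.
    stamps = index.values.astype("datetime64[ns]").astype(np.int64)
    # Distinct differences between consecutive timestamps (still int64).
    steps = np.unique(np.diff(stamps))
    if len(steps) != 1:
        raise RuntimeError(
            "Frequency of metrics is not constant. "
            "Please check for missing or duplicate values"
        )
    # Divide as plain Python integers so the result is a Python float.
    return int(steps[0]) / NS_PER_DAY


daily = pd.date_range("2019-01-01", periods=10, freq="D")
print(granularity_days(daily))  # 1.0
```

Because the dtype is named explicitly, the result no longer depends on what the platform treats as the default integer.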

0 reactions
michaelbrundage commented, Sep 14, 2022

Renaming the issue to the root cause.

Most likely, we should mark Kats as requiring 64-bit. I think we don’t intend to support legacy 32-bit Python installations.
