Outlier detection

Is your feature request related to a problem? Please describe.

I would like to detect outliers in time series data.

Describe the solution you’d like

I have seen the Time Series Annotation enhancement proposal; however, we could simply have classes that inherit from _SeriesToSeriesTransformer and replace outliers with np.nan values. The outlier correction could then be done with the Imputer I recently added.
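As a rough illustration of the "replace outliers with np.nan, then impute" idea, here is a minimal sketch in plain pandas. The class name ZScoreOutlierToNaN and the global z-score rule are assumptions made for this example only; the class does not inherit from _SeriesToSeriesTransformer, and pandas interpolation stands in for the Imputer.

```python
import pandas as pd


class ZScoreOutlierToNaN:
    """Hypothetical transformer: replaces points with a large z-score by NaN."""

    def __init__(self, threshold=2.0):
        self.threshold = threshold

    def transform(self, y: pd.Series) -> pd.Series:
        # Global z-score of every point (a deliberately simple detection rule).
        z = (y - y.mean()) / y.std()
        # Series.mask sets entries to NaN wherever the condition is True.
        return y.mask(z.abs() > self.threshold)


y = pd.Series([1.0, 2.0, 3.0, 100.0, 5.0, 6.0, 7.0])

# Step 1: detection marks the outlier as missing.
y_nan = ZScoreOutlierToNaN(threshold=2.0).transform(y)

# Step 2: correction is an ordinary imputation step; linear interpolation
# stands in here for whatever imputer is used in practice.
y_filled = y_nan.interpolate()
```

Keeping detection and correction as two separate steps means any imputation strategy can be combined with any detector.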

I would like to add different outlier detection classes (maybe with a common parent class), which could be placed in sktime.transformations.series.outlier.

Describe alternatives you’ve considered

Implementing the annotation module from scratch, because for probabilistic outlier detection we would need to return an additional column (i.e. a pd.DataFrame) with the outlier probability for each time point. However, this could also be handled by the solution above, by adding a threshold argument that takes a probability value and decides whether a point is an outlier.
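The threshold idea could look roughly like the sketch below. The function mask_by_probability and the hard-coded probabilities are hypothetical; in practice the probabilities would come from whatever probabilistic detector is used.

```python
import pandas as pd


def mask_by_probability(y: pd.Series, proba: pd.Series, threshold: float = 0.5) -> pd.Series:
    """Hypothetical helper: set y to NaN where the outlier probability
    is at or above the threshold, so the output stays a single series."""
    return y.mask(proba >= threshold)


y = pd.Series([10.0, 11.0, 50.0, 12.0])
# Assumed per-time-point outlier probabilities from some detector.
proba = pd.Series([0.05, 0.10, 0.97, 0.20])

y_masked = mask_by_probability(y, proba, threshold=0.9)
```

With this design the transformer still returns one series of the same shape as the input, at the cost of discarding the probabilities themselves.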

Top GitHub Comments

1 reaction

aiwalter commented, Feb 22, 2021

I would propose to start simple: as a first step, I will implement a transformer (e.g. a HampelFilter). I had actually planned to wrap adtk, or at least part of it, but it seems its MPL-2.0 license is not compatible.
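For readers unfamiliar with the Hampel filter, the following is a minimal pandas sketch of the usual rule: a point is flagged when it deviates from the rolling median by more than n_sigmas times the scaled rolling median absolute deviation (MAD). The function name hampel_mask and its parameters are assumptions for illustration and not the implementation that was eventually written; the input is assumed to contain no missing values.

```python
import numpy as np
import pandas as pd


def hampel_mask(y: pd.Series, window: int = 5, n_sigmas: float = 3.0) -> pd.Series:
    """Hypothetical helper: boolean mask of points flagged by a Hampel rule."""
    rolling = y.rolling(window, center=True, min_periods=1)
    median = rolling.median()
    # Median absolute deviation within each window.
    mad = rolling.apply(lambda w: np.median(np.abs(w - np.median(w))), raw=True)
    # 1.4826 scales the MAD to estimate the standard deviation of Gaussian data.
    # Note: where the MAD is 0, any nonzero deviation is flagged.
    return (y - median).abs() > n_sigmas * 1.4826 * mad


y = pd.Series([1.0, 2.0, 3.0, 100.0, 5.0, 6.0, 7.0, 8.0, 9.0])
mask = hampel_mask(y, window=5, n_sigmas=3.0)

# Following the proposal in the issue: flagged points become NaN.
y_nan = y.mask(mask)
```

Because the median and MAD are robust statistics, a single extreme value barely shifts the window's reference level, which is why the rule works on short windows.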

1 reaction

fkiraly commented, Feb 21, 2021

Hm, the “obvious” way would be a graphical composition formalism as in ADTK, mlr, or MLJ, but that’s an entire module, and a complex one. It would be a great thing to have though (@aiwalter, any ambitions?).

A simpler (but slightly more clunky and less principled) way would be RemoverTrafo(forecaster, remove=myoutlierdetector)?

Here, RemoverTrafo.transform(y) computes myoutlierdetector.transform(y) to get the timestamps to be removed; its output is y with the rows at those timestamps removed.
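A minimal sketch of the removal step described above, in plain pandas. It assumes the detector's transform returns the index labels (timestamps) to drop, and it leaves out the forecaster argument entirely; ThresholdDetector is a toy detector invented for this example.

```python
import pandas as pd


class ThresholdDetector:
    """Toy detector (hypothetical): flags timestamps whose absolute value exceeds a limit."""

    def __init__(self, limit):
        self.limit = limit

    def transform(self, y: pd.Series) -> pd.Index:
        return y.index[y.abs() > self.limit]


class RemoverTrafo:
    """Sketch of the proposed wrapper, without the forecaster part."""

    def __init__(self, remove):
        self.remove = remove

    def transform(self, y: pd.Series) -> pd.Series:
        # Timestamps the detector wants removed.
        flagged = self.remove.transform(y)
        # Return y without the rows at those timestamps.
        return y.drop(index=flagged)


index = pd.date_range("2021-02-01", periods=5, freq="D")
y = pd.Series([1.0, 2.0, 500.0, 3.0, 4.0], index=index)

y_clean = RemoverTrafo(remove=ThresholdDetector(limit=100.0)).transform(y)
```

Unlike the NaN-filling approach, this changes the length of the series and can leave gaps in a regular time index, which downstream estimators would need to tolerate.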