[BUG] ColumnEnsembleClassifier fails fitting KNN classifiers

Describe the bug

I am trying to classify a multivariate dataset using a ColumnEnsembleClassifier. I use the same classifier for all dimensions. It fails to fit the dataset when I use KNN classifiers, but works when I use TimeSeriesForestClassifier.

I tried KNN with different metrics ('msm', 'dtw', and 'euclidean' (imported from scipy)); all of them fail. I tried it with the AtrialFibrillation and BasicMotions datasets, and it fails for both.

To Reproduce

from sktime.utils.data_io import load_from_arff_to_dataframe as load_arff
X_train, y_train = load_arff('path_to_dataset_TRAIN.arff')

from sktime.classification.compose import ColumnEnsembleClassifier
from sktime.classification.distance_based._time_series_neighbors import KNeighborsTimeSeriesClassifier
from sktime.classification.compose import TimeSeriesForestClassifier

########## This one works fine ##########
clf= ColumnEnsembleClassifier(estimators=[
                                       ('TSF_0',TimeSeriesForestClassifier(verbose=0,n_jobs=-1),[0]),
                                       ('TSF_1',TimeSeriesForestClassifier(verbose=0,n_jobs=-1),[1])
                                     ])
clf.fit(X_train, y_train)
ColumnEnsembleClassifier(estimators=[
                                       ('TSF_0',TimeSeriesForestClassifier(n_jobs=-1),[0]),
                                       ('TSF_1',TimeSeriesForestClassifier(n_jobs=-1),[1])
                                     ])


########## This one fails ########## 
clf= ColumnEnsembleClassifier(estimators=[
                                       ('1NN-MSM_0', KNeighborsTimeSeriesClassifier(metric='msm'), [0]),
                                       ('1NN-MSM_1', KNeighborsTimeSeriesClassifier(metric='msm'), [1])
                                     ])
clf.fit(X_train, y_train)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\Desktop\OVGU\DKE Subjects\Master Thesis\Code\run.py in <module>
----> 1 clf.fit(X_train, y_train)

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sktime\classification\compose\_column_ensemble.py in fit(self, X, y)
    157         for name, estimator, column in self._iter(replace_strings=True):
    158             estimator = clone(estimator)
--> 159             estimator.fit(_get_column(X, column), transformed_y)
    160             estimators_.append((name, estimator, column))
    161

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sktime\classification\distance_based\_time_series_neighbors.py in fit(self, X, y)
    243             check_array.__code__ = _check_array_ts.__code__
    244
--> 245         fx = self._fit(X)
    246
    247         if hasattr(check_array, "__wrapped__"):

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sklearn\neighbors\_base.py in _fit(self, X, y)
    362             if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
    363                 X, y = self._validate_data(X, y, accept_sparse="csr",
--> 364                                            multi_output=True)
    365
    366             if is_classifier(self):

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    413             if self._get_tags()['requires_y']:
    414                 raise ValueError(
--> 415                     f"This {self.__class__.__name__} estimator "
    416                     f"requires y to be passed, but the target y is None."
    417                 )

ValueError: This KNeighborsTimeSeriesClassifier estimator requires y to be passed, but the target y is None.

Expected behavior

ColumnEnsembleClassifier should fit the training data.

Versions

System:
    python: 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)]
    executable: C: …\…\.virtualenvs\Code-OH4xsw-D\Scripts\python.exe
    machine: Windows-10-10.0.18362-SP0

Python dependencies:
    pip: 20.3.1
    setuptools: 50.3.2
    sklearn: 0.24.0
    sktime: 0.5.0
    statsmodels: 0.12.1
    numpy: 1.19.4
    scipy: 1.5.4
    Cython: 0.29.21
    pandas: 1.1.5
    matplotlib: 3.3.3
    joblib: 1.0.0
    numba: 0.52.0
    pmdarima: None
    tsfresh: 0.17.0

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7

Top GitHub Comments

1 reaction
isma3ilsamir commented, Jan 5, 2021

Unfortunately it doesn’t, but instead of

fx = self._fit(X,y)

I now use

fx = self._fit(X,y.ravel())

and the warning no longer appears. Elastic Ensemble also works fine now.

1 reaction
isma3ilsamir commented, Jan 2, 2021

I found a very dirty solution that makes the code run. However, it still shows a warning when I use RandomizedSearchCV with KNN using the msm distance, a ColumnEnsembleClassifier of KNN classifiers using the msm distance, or ElasticEnsemble.

The Dirty Solution

Changed line 245 in file sktime\classification\distance_based\_time_series_neighbors.py

From: fx = self._fit(X)
To: fx = self._fit(X,y)

The Warning

I now get this warning:

site-packages\sktime\classification\distance_based\_time_series_neighbors.py:245: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
  fx = self._fit(X,y)

But the code runs through!
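
The traceback and the one-line patch point to the same cause: fit receives y but calls self._fit(X) without it, so the base method sees y as None and raises. Here is a minimal sketch of that pattern, assuming nothing beyond what the traceback shows. The three classes are hypothetical stand-ins, not sktime or scikit-learn code, and the sketch needs neither library installed.

```python
class NeighborsBaseStandIn:
    """Hypothetical stand-in for the base class whose _fit appears in the traceback."""

    def _fit(self, X, y=None):
        # Same kind of check as in the traceback: a classifier needs labels.
        if y is None:
            raise ValueError(
                f"This {self.__class__.__name__} estimator "
                f"requires y to be passed, but the target y is None."
            )
        self.X_, self.y_ = X, y
        return self


class BrokenClassifier(NeighborsBaseStandIn):
    def fit(self, X, y):
        # y is accepted but never forwarded, as in `fx = self._fit(X)`.
        return self._fit(X)


class PatchedClassifier(NeighborsBaseStandIn):
    def fit(self, X, y):
        # Forwarding y is the one-line change described in the comment.
        return self._fit(X, y)


# Toy data, made up for illustration only.
X = [[0.0, 1.0], [1.0, 0.0]]
y = ["a", "b"]

try:
    BrokenClassifier().fit(X, y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

patched = PatchedClassifier().fit(X, y)
print(error_message)
print(patched.y_)
```

The broken variant fails even though the caller supplied y, which matches the report: ColumnEnsembleClassifier passes the labels correctly, and they are dropped one level further down.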
