Incremental.predict failing for scipy.sparse input
From https://examples.dask.org/machine-learning/text-vectorization.html
```
In [14]: predictions = pipe.predict(df['text'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-2acfe3e16558> in <module>
----> 1 predictions = pipe.predict(df['text'])

/usr/local/Caskroom/miniconda/base/envs/dask-dev/lib/python3.8/site-packages/sklearn/utils/metaestimators.py in <lambda>(*args, **kwargs)
    111
    112             # lambda, but not partial, allows help() to work with update_wrapper
--> 113             out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)  # noqa
    114         else:
    115

/usr/local/Caskroom/miniconda/base/envs/dask-dev/lib/python3.8/site-packages/sklearn/pipeline.py in predict(self, X, **predict_params)
    468         for _, name, transform in self._iter(with_final=False):
    469             Xt = transform.transform(Xt)
--> 470         return self.steps[-1][1].predict(Xt, **predict_params)
    471
    472     @available_if(_final_estimator_has("fit_predict"))

~/gh/dask/dask-ml/dask_ml/wrappers.py in predict(self, X)
    315         if isinstance(X, da.Array):
    316             if meta is None:
--> 317                 meta = _get_output_dask_ar_meta_for_estimator(
    318                     _predict, self._postfit_estimator, X
    319                 )

~/gh/dask/dask-ml/dask_ml/wrappers.py in _get_output_dask_ar_meta_for_estimator(model_fn, estimator, input_dask_ar)
    663     # sklearn fails if input array has size size
    664     # It requires at least 1 sample to run successfully
--> 665     ar = np.zeros(
    666         shape=(1, input_dask_ar.shape[1]),
    667         dtype=input_dask_ar.dtype,

TypeError: The `like` argument must be an array-like that implements the `__array_function__` protocol.
```
Here are some values.
```
ipdb> pp input_dask_ar
dask.array<_transformer, shape=(nan, 1048576), dtype=float64, chunksize=(nan, 1048576), chunktype=scipy.csr_matrix>
ipdb> pp input_dask_ar._meta
<0x0 sparse matrix of type '<class 'numpy.float64'>'
        with 0 stored elements in Compressed Sparse Row format>
ipdb> pp type(input_dask_ar._meta)
<class 'scipy.sparse.csr.csr_matrix'>
```
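For reference, the failure can be reproduced without dask at all: `np.zeros(..., like=meta)` rejects any `like=` value that does not implement the `__array_function__` protocol, which scipy sparse matrices do not. A minimal sketch (the feature count `8` is arbitrary, just standing in for `input_dask_ar.shape[1]`):

```python
import numpy as np
import scipy.sparse as sp

# Mimic input_dask_ar._meta: an empty CSR matrix.
meta = sp.csr_matrix((0, 0), dtype=np.float64)

try:
    # Essentially the call made inside _get_output_dask_ar_meta_for_estimator.
    np.zeros(shape=(1, 8), dtype=meta.dtype, like=meta)
except TypeError as e:
    # csr_matrix does not implement __array_function__, so `like=` fails here.
    print(type(e).__name__, e)
```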
cc @VibhuJawa if you have a chance to look. Maybe we check if the array implements `__array_function__`. If it doesn’t then… I’m not sure what to do.
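One possible shape for that check (illustrative only, not the dask-ml implementation): attempt the `like=`-based allocation only when the meta object participates in the protocol, and otherwise fall back to a plain dense placeholder of the same dtype.

```python
import numpy as np
import scipy.sparse as sp

meta = sp.csr_matrix((0, 0), dtype=np.float64)  # stand-in for input_dask_ar._meta
n_features = 8                                  # stand-in for input_dask_ar.shape[1]

if hasattr(meta, "__array_function__"):
    # meta supports the protocol, so `like=` can dispatch to it
    ar = np.zeros(shape=(1, n_features), dtype=meta.dtype, like=meta)
else:
    # scipy sparse matrices land here: fall back to a dense placeholder
    ar = np.zeros(shape=(1, n_features), dtype=meta.dtype)
```

The fallback loses sparsity of the meta, but it at least lets prediction proceed instead of raising.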
Issue Analytics
- State:
- Created 2 years ago
- Reactions: 1
- Comments: 6 (2 by maintainers)
Top GitHub Comments
That might be an acceptable option for now. I wonder if we can catch the `TypeError` and reraise with a more informative message, indicating that the user can set `predict_meta`, etc. to handle that?

@VibhuJawa - Awaiting the solution for the same problem.