ValueError: Buffer dtype mismatch, expected 'int' but got 'long'

I’m trying to fit a logistic regression on a sparse matrix, but the fit fails with a ValueError:

from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
    C=1,
    solver='sag',
    random_state=0,
    tol=0.0001,
    max_iter=100,
    verbose=1,
    warm_start=True,
    n_jobs=64,
    penalty='l2',
    dual=False,
    multi_class='ovr',
)
model.fit(train_data.inputs, train_data.targets)
/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/base.py:253: UserWarning: Trying to unpickle estimator StandardScaler from version 0.20.0 when using version 0.20.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
[Parallel(n_jobs=64)]: Using backend ThreadingBackend with 64 concurrent workers.
Traceback (most recent call last):
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-5936f410e945>", line 58, in <module>
    model.fit(train_data_cadd.inputs, train_data_cadd.targets)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/comet_ml/monkey_patching.py", line 244, in wrapper
    return_value = original(*args, **kwargs)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/logistic.py", line 1363, in fit
    for class_, warm_start_coef_ in zip(classes_, warm_start_coef))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 930, in __call__
    self.retrieve()
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/logistic.py", line 792, in logistic_regression_path
    is_saga=(solver == 'saga'))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/sag.py", line 305, in sag_solver
    dataset, intercept_decay = make_dataset(X, y, sample_weight, random_state)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 84, in make_dataset
    seed=seed)
  File "sklearn/utils/seq_dataset.pyx", line 259, in sklearn.utils.seq_dataset.CSRDataset.__cinit__
ValueError: Buffer dtype mismatch, expected 'int' but got 'long'
>>> sklearn.__version__
Out[5]: '0.20.3'
>>> train_data.inputs
Out[6]: 
<28034374x904 sparse matrix of type '<class 'numpy.float32'>'
	with 2223406363 stored elements in Compressed Sparse Row format>

Is there some way I can still train my data?

_Originally posted by @Hoeze in https://github.com/scikit-learn/scikit-learn/issues/10758#issuecomment-476755852_
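
A likely explanation, based on the matrix summary in the report: the matrix holds 2223406363 stored elements, which is more than the largest 32-bit signed integer (2147483647). SciPy switches the CSR index arrays to 64-bit integers in that case, while the traceback shows `CSRDataset.__cinit__` expecting a buffer of C `int`. The sketch below only inspects a matrix and does not fix anything; `index_dtype_report` is a hypothetical helper name, and the small `demo` matrix stands in for the real data.

```python
import numpy as np
import scipy.sparse as sp

INT32_MAX = np.iinfo(np.int32).max  # 2147483647


def index_dtype_report(X):
    """Summarize the index arrays of a CSR matrix (hypothetical helper)."""
    X = sp.csr_matrix(X)
    return {
        "nnz": int(X.nnz),
        "indices_dtype": str(X.indices.dtype),
        "indptr_dtype": str(X.indptr.dtype),
        # Indices can only be stored as int32 if both the number of stored
        # elements and the matrix dimensions fit into a 32-bit signed integer.
        "fits_int32": bool(X.nnz <= INT32_MAX and max(X.shape) <= INT32_MAX),
    }


# Small stand-in matrix; the real data from the report is far larger.
demo = sp.random(100, 20, density=0.1, format="csr",
                 dtype=np.float32, random_state=0)
report = index_dtype_report(demo)

# Number of stored elements quoted in the report above.
reported_nnz = 2223406363
exceeds_int32 = reported_nnz > INT32_MAX
```

If `fits_int32` is false for the real matrix, casting `indices` and `indptr` to `int32` would not be safe, because the values would overflow.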

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

rth commented, Mar 26, 2019 (1 reaction)

Thanks for the report @Hoeze and the reproducible example!

It is a known issue reported in https://github.com/scikit-learn/scikit-learn/issues/11355

Closing this as a duplicate (to avoid splitting discussions), but a contribution to address this issue would be very welcome.
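
One possible workaround, which is not proposed in the thread and is only a sketch under stated assumptions: split the matrix into blocks of rows that are each small enough for 32-bit indices. The helper name `iter_int32_row_chunks` and the `max_nnz` parameter are hypothetical, and the demo matrix is a stand-in for the real data.

```python
import numpy as np
import scipy.sparse as sp

INT32_MAX = np.iinfo(np.int32).max


def iter_int32_row_chunks(X, max_nnz=INT32_MAX):
    """Yield (start, stop, chunk) row blocks of X with int32 index arrays.

    Hypothetical helper: each chunk holds at most `max_nnz` stored elements.
    """
    X = sp.csr_matrix(X)
    indptr = X.indptr
    n_rows = X.shape[0]
    start = 0
    while start < n_rows:
        # Use Python ints so the sum cannot overflow a 32-bit NumPy scalar.
        limit = int(indptr[start]) + int(max_nnz)
        # Largest row boundary whose cumulative count stays within the limit.
        stop = int(np.searchsorted(indptr, limit, side="right")) - 1
        stop = min(stop, n_rows)
        if stop <= start:
            raise ValueError(
                "row %d alone holds more than max_nnz elements" % start)
        lo, hi = int(indptr[start]), int(indptr[stop])
        # Re-base the row pointers at zero, then downcast both index arrays.
        chunk_indptr = (indptr[start:stop + 1] - lo).astype(np.int32)
        chunk_indices = X.indices[lo:hi].astype(np.int32)
        chunk = sp.csr_matrix(
            (X.data[lo:hi], chunk_indices, chunk_indptr),
            shape=(stop - start, X.shape[1]),
        )
        yield start, stop, chunk
        start = stop


# Small stand-in matrix with a deliberately tiny limit to force several chunks.
demo = sp.random(50, 10, density=0.3, format="csr", random_state=0)
chunks = [chunk for _, _, chunk in iter_int32_row_chunks(demo, max_nnz=25)]
```

Chunks like these could be passed, together with the matching slices of the targets, to an estimator that supports incremental learning, for example through `SGDClassifier.partial_fit`. That swaps the `sag` solver for stochastic gradient descent, so the fitted model will not be identical to what the original call would have produced.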

yujianll commented, May 19, 2021 (0 reactions)

@Hoeze Thanks!

Top Results From Across the Web

Cython: Buffer type mismatch, expected 'int' but got 'long'
You are using Cython's int type, which is just C int. I think on Mac (or most architectures) it is int 32-bit....

ValueError: Buffer dtype mismatch, expected 'long' but got ...
I am trying to run test diagnostics on a trained model. The model throws an issue ValueError: Buffer dtype mismatch, expected 'long' but...

ValueError: Buffer dtype mismatch, expected 'long_t' but got ...
This error comes from cnp.long_t being different than the C Cython type identifier long. (with cimport numpy as cnp). Strangely, this error does ......

DigitalSlideArchive/HistomicsTK - Gitter
I think this is caused by a mismatch in the 'int' size between numpy and the compiler on your system. We should go...

Cython: Buffer type mismatch, expected 'int' but got 'long'-numpy
I'm having trouble passing in this memoryview of integers into this (rather trivial) function. Python is giving me this error: ValueError: Buffer dtype...
