least_angle.py in lars_path - shapes not aligned on properly formatted data with specific alpha

Encountered a similar error to https://github.com/scikit-learn/scikit-learn/issues/5873 - when running the following code:

a_selection = RandomizedLasso(alpha=0.025, normalize=False, n_jobs=1, random_state=42)
a_selection.fit(X=x_sub, y=y_sub)

I can’t share the data as it’s from clinical trials, but what I have noticed is that the error disappears (for this particular fit) when I remove the alpha parameter. The code takes 2 days to complete so I am worried a different alpha will break a different dataset input. The data frame is properly formatted, the row number fits the labels and there are no NaNs. The error I get is:

~/miniconda3/lib/python3.6/site-packages/sklearn/linear_model/randomized_l1.py in fit(self, X, y)
    110             n_jobs=self.n_jobs, verbose=self.verbose,
    111             pre_dispatch=self.pre_dispatch, random_state=self.random_state,
--> 112             sample_fraction=self.sample_fraction, **params)
    113
    114         if scores_.ndim == 1:

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/memory.py in __call__(self, *args, **kwargs)
    281
    282     def __call__(self, *args, **kwargs):
--> 283         return self.func(*args, **kwargs)
    284
    285     def call_and_shelve(self, *args, **kwargs):

~/miniconda3/lib/python3.6/site-packages/sklearn/linear_model/randomized_l1.py in _resample_model(estimator_func, X, y, scaling, n_resampling, n_jobs, verbose, pre_dispatch, random_state, sample_fraction, **params)
     52             verbose=max(0, verbose - 1),
     53             **params)
---> 54         for _ in range(n_resampling)):
     55         scores_ += active_set
     56

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self, iterable)
    756             # was dispatched. In particular this covers the edge
    757             # case of Parallel used with an exhausted iterator.
--> 758             while self.dispatch_one_batch(iterator):
    759                 self._iterating = True
    760             else:

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in dispatch_one_batch(self, iterator)
    606                 return False
    607             else:
--> 608                 self._dispatch(tasks)
    609                 return True
    610

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in _dispatch(self, batch)
    569         dispatch_timestamp = time.time()
    570         cb = BatchCompletionCallBack(dispatch_timestamp, len(batch), self)
--> 571         job = self._backend.apply_async(batch, callback=cb)
    572         self._jobs.append(job)
    573

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py in apply_async(self, func, callback)
    107     def apply_async(self, func, callback=None):
    108         """Schedule a func to be run"""
--> 109         result = ImmediateResult(func)
    110         if callback:
    111             callback(result)

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py in __init__(self, batch)
    324         # Don't delay the application, to avoid keeping the input
    325         # arguments in memory
--> 326         self.results = batch()
    327
    328     def get(self):

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
    132
    133     def __len__(self):

~/miniconda3/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in <listcomp>(.0)
    129
    130     def __call__(self):
--> 131         return [func(*args, **kwargs) for func, args, kwargs in self.items]
    132
    133     def __len__(self):

~/miniconda3/lib/python3.6/site-packages/sklearn/linear_model/randomized_l1.py in _randomized_lasso(X, y, weights, mask, alpha, verbose, precompute, eps, max_iter)
    171                          copy_Gram=False, alpha_min=np.min(alpha),
    172                          method='lasso', verbose=verbose,
--> 173                          max_iter=max_iter, eps=eps)
    174
    175     if len(alpha) > 1:

~/miniconda3/lib/python3.6/site-packages/sklearn/linear_model/least_angle.py in lars_path(X, y, Xy, Gram, max_iter, alpha_min, method, copy_X, eps, copy_Gram, verbose, return_path, return_n_iter, positive)
    442
    443             # TODO: this could be updated
--> 444             residual = y - np.dot(X[:, :n_active], coef[active])
    445             temp = np.dot(X.T[n_active], residual)
    446

ValueError: shapes (49,17) and (16,) not aligned: 17 (dim 1) != 16 (dim 0)
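
The message itself comes from `np.dot`: in the last frame, `X[:, :n_active]` has 17 columns while `coef[active]` has only 16 entries, so `n_active` and the length of `active` differ by one at that point. A minimal sketch that triggers the same kind of failure on made-up random data (the helper name `dot_error` is hypothetical and not part of scikit-learn):

```python
import numpy as np

def dot_error(n_rows, n_cols, n_coefs):
    """Return the ValueError message raised by np.dot, or None if the shapes align."""
    rng = np.random.RandomState(0)
    X = rng.randn(n_rows, n_cols)   # stands in for X[:, :n_active]
    coef = rng.randn(n_coefs)       # stands in for coef[active]
    try:
        np.dot(X, coef)
    except ValueError as exc:
        return str(exc)
    return None

print(dot_error(49, 17, 16))  # mismatched: 17 columns vs 16 coefficients
print(dot_error(49, 17, 17))  # aligned shapes: prints None
```

This only reproduces the shape mismatch; it says nothing about why the two counts diverge inside `lars_path`.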

Versions:
Linux-4.10.0-32-generic-x86_64-with-debian-stretch-sid
Python 3.6.1 |Continuum Analytics, Inc.| (default, May 11 2017, 13:09:58) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
NumPy 1.12.1
SciPy 0.19.1
Scikit-Learn 0.18.2

Happy to provide any additional information if needed.
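
Since the data cannot be shared, the input checks mentioned in the report (row count matches the labels, no NaNs) can at least be made explicit. The following is a rough sketch only; `check_inputs` is a hypothetical helper, and the check for infinite values is an extra assumption that the report does not mention:

```python
import numpy as np

def check_inputs(X, y):
    """Return a list of human-readable problems found in X and y (empty if none)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    problems = []
    if X.ndim != 2:
        problems.append("X is not 2-dimensional")
    elif X.shape[0] != y.shape[0]:
        problems.append("row count of X does not match length of y")
    if np.isnan(X).any() or np.isnan(y).any():
        problems.append("NaN values present")
    # Not mentioned in the report; added here as an extra sanity check.
    if np.isinf(X).any() or np.isinf(y).any():
        problems.append("infinite values present")
    return problems

print(check_inputs(np.ones((49, 20)), np.ones(49)))  # []
```

An empty list here is consistent with the report: the inputs pass these checks and the mismatch still arises later, inside `lars_path`.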

Lastly, I get a message that RandomizedLasso will be deprecated. What will replace its functionality? Setting the n_jobs parameter to anything above 1 breaks the code, which has already been reported; I hope the replacement will fix that.
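
One possible workaround, sketched under stated assumptions rather than offered as an official replacement: the resampling idea behind RandomizedLasso (fit a Lasso on random subsamples with randomly rescaled features, then count how often each feature is selected) can be written by hand on top of `sklearn.linear_model.Lasso`. This uses coordinate descent instead of `lars_path`, so it does not go through the failing code. The function name `stability_scores`, its defaults, and the synthetic data are all hypothetical, and `alpha` here may not be directly comparable to the `alpha` passed to RandomizedLasso.

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_scores(X, y, alpha=0.025, n_resampling=200,
                     sample_fraction=0.75, scaling=0.5, random_state=42):
    """Fraction of resampled Lasso fits in which each feature gets a nonzero coefficient."""
    rng = np.random.RandomState(random_state)
    n_samples, n_features = X.shape
    n_sub = int(sample_fraction * n_samples)
    counts = np.zeros(n_features)
    for _ in range(n_resampling):
        # Draw a random subsample of rows without replacement.
        rows = rng.choice(n_samples, size=n_sub, replace=False)
        # Randomly shrink some features: each multiplier is 1 or (1 - scaling).
        weights = 1.0 - scaling * rng.randint(0, 2, size=n_features)
        model = Lasso(alpha=alpha, max_iter=10000)
        model.fit(X[rows] * weights, y[rows])
        counts += model.coef_ != 0
    return counts / n_resampling

# Synthetic example: only the first two of ten features drive y.
rng = np.random.RandomState(0)
X_demo = rng.randn(100, 10)
y_demo = 3 * X_demo[:, 0] - 2 * X_demo[:, 1] + 0.1 * rng.randn(100)
print(stability_scores(X_demo, y_demo, alpha=0.1, n_resampling=50))
```

The loop is sequential on purpose; parallelising it is left out because the n_jobs problem is a separate report.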

Many thanks!

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 28 (13 by maintainers)

Top GitHub Comments

3 reactions
j-schneider commented, Jun 17, 2019

Came across the same problem, and it was always with multiple drops. I changed https://github.com/scikit-learn/scikit-learn/blob/b7c41636907defd0ca210ed2e8e17fd4735567a0/sklearn/linear_model/least_angle.py#L701 to include “n_active -= 1” within the preceding loop, and put https://github.com/scikit-learn/scikit-learn/blob/b7c41636907defd0ca210ed2e8e17fd4735567a0/sklearn/linear_model/least_angle.py#L733-L734 into another loop:

for ii in drop_idx:
    temp = Cov_copy[ii] - np.dot(Gram_copy[ii], coef)
    Cov = np.r_[temp, Cov]

This let me use it again without errors (as part of Autofeat). Hope this helps someone, or that someone sees why it shouldn’t be done.
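
To illustrate why multiple drops could matter, here is a toy sketch of the bookkeeping only. It is not scikit-learn's code, and it assumes (as the comment above implies) that the counter was decremented once after the loop rather than once per dropped feature. The names `drop_features` and `decrement_per_drop` are hypothetical.

```python
def drop_features(active, drop_idx, decrement_per_drop):
    """Toy bookkeeping: remove entries from `active` and update a separate counter."""
    active = list(active)
    n_active = len(active)
    # Pop from the highest index down so earlier pops do not shift later ones.
    for ii in sorted(drop_idx, reverse=True):
        active.pop(ii)
        if decrement_per_drop:
            n_active -= 1      # counter updated inside the loop
    if not decrement_per_drop:
        n_active -= 1          # counter updated once, after the loop
    return active, n_active

active, n_active = drop_features(range(18), [3, 7], decrement_per_drop=False)
print(len(active), n_active)   # 16 17

active, n_active = drop_features(range(18), [3, 7], decrement_per_drop=True)
print(len(active), n_active)   # 16 16
```

With two drops and a single decrement, the counter ends up one larger than the list, which is the same 17 versus 16 gap seen in the reported ValueError.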

1 reaction
FelixNeutatz commented, Feb 18, 2019

@adrinjalali as you recommended, I created a minimal reproducible example: https://github.com/FelixNeutatz/LassoLarsCVBug
