NMF n_components are not properly getting reflected in output when using Grid Search CV

When I ran the example code from this link and inspected the grid search output (grid.cv_results_['params']), the n_components values from the grid were not reflected in the NMF estimator shown in the output.

Here is a small snippet of grid.cv_results_['params']:

{'classify__C': 1000,
  'reduce_dim': NMF(alpha=0.0, beta_loss='frobenius', init=None, l1_ratio=0.0, max_iter=200,
    n_components=None, random_state=None, shuffle=False, solver='cd',
    tol=0.0001, verbose=0),
  'reduce_dim__n_components': 2},
 {'classify__C': 1000,
  'reduce_dim': NMF(alpha=0.0, beta_loss='frobenius', init=None, l1_ratio=0.0, max_iter=200,
    n_components=None, random_state=None, shuffle=False, solver='cd',
    tol=0.0001, verbose=0),
  'reduce_dim__n_components': 4},
 {'classify__C': 1000,
  'reduce_dim': NMF(alpha=0.0, beta_loss='frobenius', init=None, l1_ratio=0.0, max_iter=200,
    n_components=None, random_state=None, shuffle=False, solver='cd',
    tol=0.0001, verbose=0),
  'reduce_dim__n_components': 8},

Here reduce_dim__n_components changes for each candidate, but the n_components shown on the NMF estimator itself stays None.

Thanks, Pat
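
One way to check which n_components a candidate would actually use is to apply its parameter dict to a fresh copy of the pipeline and read the value back from the configured step. The following is a minimal sketch, not the code from the linked example: the step names mirror the output above, LinearSVC as the classifier is an assumption, and it presumes a recent scikit-learn release in which set_params installs a replacement step before applying nested parameters.

```python
import copy

from sklearn.base import clone
from sklearn.decomposition import NMF
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Assumed pipeline layout; only the step names come from the output above.
pipe = Pipeline([("reduce_dim", NMF()), ("classify", LinearSVC())])

# One candidate shaped like an entry of grid.cv_results_['params'].
candidate = {
    "classify__C": 1000,
    "reduce_dim": NMF(),
    "reduce_dim__n_components": 4,
}

# Deep-copy the candidate so the NMF object stored in it is not modified,
# then apply it to a clone of the pipeline.
configured = clone(pipe).set_params(**copy.deepcopy(candidate))

# The step inside the configured pipeline carries the searched value.
effective = configured.named_steps["reduce_dim"].n_components
print("n_components used by the pipeline:", effective)

# The estimator object stored in the candidate keeps its default.
print("n_components on the stored estimator:", candidate["reduce_dim"].n_components)
```

Under these assumptions, the estimator printed under 'reduce_dim' is only the object that was listed in the grid, while the value under 'reduce_dim__n_components' is what ends up on the step inside the pipeline.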

Issue Analytics

Created 6 years ago
Comments: 25 (24 by maintainers)

Top GitHub Comments

amueller commented, Dec 19, 2017

Yes, but I’m concerned about the side-effects. Let’s say someone did

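from sklearn.ensemble import RandomForestClassifier
est = RandomForestClassifier(n_estimators=100)
# Grid values must be lists, so the shared estimator is wrapped in one.
grid1 = {'clf': [est], 'clf__max_depth': [1, 2, 3]}
grid2 = {'clf': [est], 'clf__max_leaf_nodes': [2, 3, 6]}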

and then ran a grid-search for each. If you run grid1 first, the estimator in grid2 would have max_depth=3. That’s very surprising to me.
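
The side effect described here can be sketched without running a grid search at all, by calling Pipeline.set_params directly with a shared estimator object. This is an illustration only, assuming a recent scikit-learn release; the one-step pipeline and the variable names are hypothetical, and it does not show what GridSearchCV itself does internally in any particular version.

```python
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

shared = RandomForestClassifier(n_estimators=100)
pipe = Pipeline([("clf", RandomForestClassifier())])

# Passing the shared object together with a nested parameter sets the
# nested value on that very object, so the change is visible outside
# the pipeline as well.
pipe.set_params(clf=shared, clf__max_depth=3)
leaked_depth = shared.max_depth

# Passing a clone instead leaves the original estimator untouched.
original = RandomForestClassifier(n_estimators=100)
pipe.set_params(clf=clone(original), clf__max_depth=3)
kept_depth = original.max_depth

print("shared estimator max_depth:", leaked_depth)
print("cloned-from estimator max_depth:", kept_depth)
```

Cloning the estimator before it is handed to set_params is one way to keep a later search from inheriting settings left behind by an earlier one.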

jnothman commented, Dec 22, 2017

Another thing we should consider is whether the bug, or its fix, will affect mutable parameters other than estimators.
