RandomForestRegressor doesn't accept max_samples=1.0

Describe the bug

This example from the doc works:

from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_features=4, n_informative=2,
                       random_state=0, shuffle=False)
regr = RandomForestRegressor(max_depth=2, random_state=0)
regr.fit(X, y)
print(regr.predict([[0, 0, 0, 0]]))

Just changing one line to this:

regr = RandomForestRegressor(max_depth=2, random_state=0, max_samples=1.0)

doesn’t work anymore:

ValueError: `max_samples` must be in range (0, 1) but got value 1.0

I believe max_samples=None (the default) and max_samples=1.0 should behave the same.
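
On an affected version, one possible workaround is to translate 1.0 into None before constructing the estimator. This is a minimal sketch, assuming (as the documentation describes) that max_samples=None draws as many samples as there are rows in X. The helper name normalize_max_samples is hypothetical and not part of scikit-learn.

```python
def normalize_max_samples(max_samples):
    """Hypothetical helper (not a scikit-learn API).

    Maps the float fraction 1.0 to None so that "use all samples" is
    accepted by versions that reject max_samples=1.0. Every other value,
    including the integer 1 (meaning one sample), is returned unchanged.
    """
    if isinstance(max_samples, float) and max_samples == 1.0:
        return None
    return max_samples


print(normalize_max_samples(1.0))  # None
print(normalize_max_samples(0.5))  # 0.5
print(normalize_max_samples(1))    # 1
```

The result can then be passed as the max_samples argument, e.g. RandomForestRegressor(max_depth=2, random_state=0, max_samples=normalize_max_samples(1.0)).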

Steps/Code to Reproduce

see above

Versions

System:
    python: 3.8.5 (default, Jan 27 2021, 15:41:15)  [GCC 9.3.0]
executable: /usr/bin/python3
   machine: Linux-5.8.0-53-generic-x86_64-with-glibc2.29

Python dependencies:
          pip: 20.0.2
   setuptools: 45.2.0
      sklearn: 0.24.2
        numpy: 1.17.4
        scipy: 1.6.3
       Cython: None
       pandas: None
   matplotlib: 3.4.2
       joblib: 1.0.1
threadpoolctl: 2.1.0
Built with OpenMP: True

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

1 reaction
UnixJunkie commented, May 28, 2021

max_samples=0.0 should crash (you cannot train a model without training samples), max_samples=1.0 should work just fine (you can train a model using all the available training samples)

On Fri, May 28, 2021 at 3:02 PM murata-yu wrote:

Yes, max_samples=1.0 and max_samples=None should behave the same. But max_samples is not expected to be greater than or equal to 1.0, so I think it is correct that an error occurred at max_samples=1.0.


1 reaction
UnixJunkie commented, May 28, 2021

Well spotted, please send a PR.
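
The behaviour argued for in this thread (max_samples=0.0 rejected, max_samples=1.0 accepted and equivalent to None) amounts to validating a float against the half-open interval (0, 1] instead of the open interval (0, 1). The sketch below illustrates that rule only. The function n_samples_bootstrap is a hypothetical stand-in, not scikit-learn's internal code, and the rounding choice is an assumption.

```python
def n_samples_bootstrap(n_samples, max_samples=None):
    """Illustrative sketch (hypothetical, not scikit-learn's implementation).

    Returns how many rows each tree would draw for a given max_samples.
    """
    if max_samples is None:
        # Default: draw as many samples as there are rows.
        return n_samples
    if isinstance(max_samples, int):
        # An int is an absolute count and must fit the data set.
        if not 1 <= max_samples <= n_samples:
            raise ValueError(
                f"`max_samples` must be in range 1 to {n_samples} "
                f"but got value {max_samples}"
            )
        return max_samples
    if isinstance(max_samples, float):
        # A float is a fraction: 0.0 is rejected, 1.0 is allowed.
        if not 0.0 < max_samples <= 1.0:
            raise ValueError(
                f"`max_samples` must be in range (0.0, 1.0] "
                f"but got value {max_samples}"
            )
        # Rounding to the nearest whole row is an assumption of this sketch.
        return round(n_samples * max_samples)
    raise TypeError(f"`max_samples` should be int or float, got {type(max_samples)}")


print(n_samples_bootstrap(100))        # 100
print(n_samples_bootstrap(100, 1.0))   # 100
print(n_samples_bootstrap(100, 0.5))   # 50
```

Under this rule n_samples_bootstrap(100, 0.0) raises ValueError, while 1.0 and None give the same count.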
