Default parameters for RANSACRegressor not being initialized
I'm testing the AutoML approach with Fedot on a dataset with 11 rows and 66 columns (never mind that there are many more columns than rows in this case). The default parameters for the strategy (ransac_non_lin_reg) aren't being initialized. For example, min_samples is None, and I'm getting an error:
Traceback (most recent call last):
  ....
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\api\main.py", line 206, in fit
    return self._obtain_model(is_composing_required)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\api\main.py", line 156, in _obtain_model
    self.current_model, self.best_models, self.history = compose_fedot_model(**execution_params)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\api\api_utils.py", line 190, in compose_fedot_model
    chain_gp_composed = gp_composer.compose_chain(data=train_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\gp_composer\gp_composer.py", line 140, in compose_chain
    best_chain = self.optimiser.optimise(metric_function_for_nodes,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\param_free_gp_optimiser.py", line 123, in optimise
    self._evaluate_individuals(new_population, objective_function, timer=t)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\gp_optimiser.py", line 383, in _evaluate_individuals
    evaluate_individuals(individuals_set=individuals_set, objective_function=objective_function, timer=timer,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\gp_operators.py", line 85, in evaluate_individuals
    ind.fitness = calculate_objective(ind.chain, objective_function, is_multi_objective)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\gp_operators.py", line 100, in calculate_objective
    calculated_fitness = objective_function(ind)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\cross_validation.py", line 20, in cross_validation
    chain.fit(train_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\chain.py", line 173, in fit
    train_predicted = self._fit(input_data=copied_input_data, use_cache=use_cache)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\chain.py", line 142, in _fit
    train_predicted = self.root_node.fit(input_data=input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 241, in fit
    secondary_input = self._input_from_parents(input_data=input_data, parent_operation='fit')
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 272, in _input_from_parents
    parent_results, target = _combine_parents(parent_nodes, input_data,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 303, in _combine_parents
    prediction = parent.fit(input_data=input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 241, in fit
    secondary_input = self._input_from_parents(input_data=input_data, parent_operation='fit')
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 272, in _input_from_parents
    parent_results, target = _combine_parents(parent_nodes, input_data,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 303, in _combine_parents
    prediction = parent.fit(input_data=input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 174, in fit
    return super().fit(input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 96, in fit
    self.fitted_operation, operation_predict = self.operation.fit(data=input_data,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\operation.py", line 86, in fit
    fitted_operation = self._eval_strategy.fit(train_data=data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\evaluation\regression.py", line 65, in fit
    operation_implementation.fit(train_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\evaluation\operation_implementations\data_operations\sklearn_filters.py", line 25, in fit
    self.operation.fit(input_data.features, input_data.target)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\sklearn\linear_model\_ransac.py", line 281, in fit
    raise ValueError("`min_samples` may not be larger than number "
ValueError: `min_samples` may not be larger than number of samples: n_samples = 11.
In this case, min_samples is much larger than the number of rows, and RANSACRegressor doesn't allow that. How do I make sure min_samples is properly initialized?
Top GitHub Comments
Dear Bruno, hi! Thank you very much for your message. Our team and I also discovered this error recently and have already corrected it. This bug should no longer be in the master branch.
Check this closed pull request for more information. As proof, you can look at the unit test test_ransac_with_invalid_params_fit_correctly, which checks for the occurrence of this error.
A brief explanation: FEDOT has a JSON file that stores default hyperparameter values for different operations. We added such parameters for the RANSAC algorithm, where the value of the hyperparameter min_samples is given as a relative number (between 0 and 1). For now, it's 0.4. So no matter how many features/columns and rows are in the dataset, FEDOT will now be able to initialize the RANSAC operation correctly.
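To illustrate why a relative value avoids the error, here is a minimal sketch of how a min_samples setting could be turned into an absolute sample count. It assumes the behaviour scikit-learn documents for RANSACRegressor: None falls back to the number of features plus one for a linear base estimator, and a value between 0 and 1 is treated as a fraction of the rows, rounded up. The function resolve_min_samples is a hypothetical helper written for this example; it is not part of FEDOT or scikit-learn.

```python
import math


def resolve_min_samples(min_samples, n_samples, n_features):
    """Hypothetical helper: turn a min_samples setting into an absolute count."""
    if min_samples is None:
        # Assumed default for a linear base estimator: features + 1.
        resolved = n_features + 1
    elif 0 < min_samples < 1:
        # Relative value: a fraction of the available rows, rounded up.
        resolved = math.ceil(min_samples * n_samples)
    else:
        # Absolute value: used as given.
        resolved = int(min_samples)
    if resolved > n_samples:
        raise ValueError(
            "`min_samples` may not be larger than number of samples: "
            f"n_samples = {n_samples}."
        )
    return resolved


# 11 rows and 66 columns, as in the report above.
print(resolve_min_samples(0.4, n_samples=11, n_features=66))
try:
    # None resolves to 66 + 1 = 67, which exceeds the 11 available rows.
    resolve_min_samples(None, n_samples=11, n_features=66)
except ValueError as err:
    print(err)
```

Because a fraction below 1 can never resolve to more rows than the dataset contains, a relative default stays valid whatever the shape of the data.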
We will prepare a new version of FEDOT soon, and you can use all the above changes by installing FEDOT via “pip install”. But if you want to try a quicker fix, use the framework version from the master branch.
So, now it works 😃
You can set verbose_level=1 in the 'Fedot' class constructor.