Default parameters for RANSACRegressor not being initialized

I’m testing the AutoML approach with Fedot on a dataset with 11 rows and 66 columns (never mind that there are far more columns than rows in this case). The default parameters for the strategy (ransac_non_lin_reg) aren’t being initialized: for example, min_samples is None. As a result, I’m getting this error:

Traceback (most recent call last):
  ....
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\api\main.py", line 206, in fit
    return self._obtain_model(is_composing_required)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\api\main.py", line 156, in _obtain_model
    self.current_model, self.best_models, self.history = compose_fedot_model(**execution_params)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\api\api_utils.py", line 190, in compose_fedot_model
    chain_gp_composed = gp_composer.compose_chain(data=train_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\gp_composer\gp_composer.py", line 140, in compose_chain
    best_chain = self.optimiser.optimise(metric_function_for_nodes,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\param_free_gp_optimiser.py", line 123, in optimise
    self._evaluate_individuals(new_population, objective_function, timer=t)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\gp_optimiser.py", line 383, in _evaluate_individuals
    evaluate_individuals(individuals_set=individuals_set, objective_function=objective_function, timer=timer,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\gp_operators.py", line 85, in evaluate_individuals
    ind.fitness = calculate_objective(ind.chain, objective_function, is_multi_objective)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\composer\optimisers\gp_comp\gp_operators.py", line 100, in calculate_objective
    calculated_fitness = objective_function(ind)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\cross_validation.py", line 20, in cross_validation
    chain.fit(train_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\chain.py", line 173, in fit
    train_predicted = self._fit(input_data=copied_input_data, use_cache=use_cache)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\chain.py", line 142, in _fit
    train_predicted = self.root_node.fit(input_data=input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 241, in fit
    secondary_input = self._input_from_parents(input_data=input_data, parent_operation='fit')
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 272, in _input_from_parents
    parent_results, target = _combine_parents(parent_nodes, input_data,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 303, in _combine_parents
    prediction = parent.fit(input_data=input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 241, in fit
    secondary_input = self._input_from_parents(input_data=input_data, parent_operation='fit')
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 272, in _input_from_parents
    parent_results, target = _combine_parents(parent_nodes, input_data,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 303, in _combine_parents
    prediction = parent.fit(input_data=input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 174, in fit
    return super().fit(input_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\chains\node.py", line 96, in fit
    self.fitted_operation, operation_predict = self.operation.fit(data=input_data,
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\operation.py", line 86, in fit
    fitted_operation = self._eval_strategy.fit(train_data=data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\evaluation\regression.py", line 65, in fit
    operation_implementation.fit(train_data)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\fedot\core\operations\evaluation\operation_implementations\data_operations\sklearn_filters.py", line 25, in fit
    self.operation.fit(input_data.features, input_data.target)
  File "C:\Users\username\Miniconda3\envs\myenv\lib\site-packages\sklearn\linear_model\_ransac.py", line 281, in fit
    raise ValueError("`min_samples` may not be larger than number "
ValueError: `min_samples` may not be larger than number of samples: n_samples = 11.

In this case, min_samples is much larger than the number of rows, and RANSACRegressor doesn’t allow that. How do I make sure min_samples is properly initialized?
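
To make the failure concrete, here is a minimal sketch of how scikit-learn's documentation describes `min_samples` for RANSACRegressor: `None` falls back to `X.shape[1] + 1` when the base estimator is LinearRegression, a value below 1 is treated as a fraction of the rows, and a value of 1 or more is an absolute count. The helper name `resolve_min_samples` is hypothetical and is not part of scikit-learn or FEDOT. The sketch also assumes all 66 columns reach the RANSAC node as features, which may not hold inside a FEDOT chain.

```python
import math
from typing import Optional, Union


def resolve_min_samples(min_samples: Optional[Union[int, float]],
                        n_samples: int, n_features: int) -> int:
    """Hypothetical helper mirroring the rule described in the scikit-learn docs."""
    if min_samples is None:
        # Documented default for a LinearRegression base estimator.
        resolved = n_features + 1
    elif 0 < min_samples < 1:
        # Relative value: a fraction of the rows, rounded up.
        resolved = math.ceil(min_samples * n_samples)
    elif min_samples >= 1:
        # Absolute number of rows.
        resolved = int(min_samples)
    else:
        raise ValueError("min_samples must be positive")
    if resolved > n_samples:
        raise ValueError(
            "`min_samples` may not be larger than number of samples: "
            f"n_samples = {n_samples}."
        )
    return resolved


n_samples, n_features = 11, 66  # shape reported in the question

try:
    resolve_min_samples(None, n_samples, n_features)
except ValueError as err:
    print("default (None):", err)

print("relative 0.4:", resolve_min_samples(0.4, n_samples, n_features))
```

Under these assumptions the `None` default asks for 67 rows out of 11, which is the condition the traceback reports, while a fractional value can never exceed the number of rows.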

Top GitHub Comments

Dreamlone commented, Aug 11, 2021

Dear Bruno, hi! Thank you very much for your message. Our team and I also discovered this error just recently and have already corrected it. This bug should no longer be present in the master branch.

Check this closed pull request for more information. As proof, you can look at the unit test test_ransac_with_invalid_params_fit_correctly, which checks that this error no longer occurs.

A brief explanation: FEDOT includes a JSON file that stores default hyperparameter values for different operations. We added such defaults for the RANSAC algorithm, where the hyperparameter min_samples is given as a relative value (a fraction between 0 and 1). For now, it’s 0.4. So no matter how many features/columns and rows the dataset has, FEDOT can now initialize the RANSAC operation correctly.
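
As an illustration of the relative setting described above, the following sketch calls scikit-learn directly rather than going through FEDOT. The data is random and only mimics the 11 x 66 shape from the question. The value 0.4 comes from the comment above. Everything else, including the seeds and the variable names, is an assumption made for the example.

```python
import numpy as np
from sklearn.linear_model import RANSACRegressor

# Synthetic stand-in for the dataset in the question: 11 rows, 66 columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(11, 66))
y = rng.normal(size=11)

# A value below 1 is interpreted as a fraction of the rows,
# so it cannot exceed the number of samples.
model = RANSACRegressor(min_samples=0.4, random_state=0)
model.fit(X, y)

print(int(model.inlier_mask_.sum()), "inliers out of", len(y))
```

With the default `min_samples=None` the same call would request `X.shape[1] + 1` rows and raise the ValueError shown in the traceback. Note that a fit on so few rows is not statistically meaningful; the sketch only shows that initialization succeeds.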

We will prepare a new version of FEDOT soon, and you can use all the above changes by installing FEDOT via “pip install”. But if you want to try a quicker fix, use the framework version from the master branch.

So, now it works 😃

nicl-nno commented, Aug 30, 2021

You can set verbose_level=1 in the ‘Fedot’ class constructor.
