[Enhancement] Add support for sample_weight in the fit function
The scikit-learn KMeans algorithm supports supplying a weight for each sample in the fit function. See the docs here.
Would it be possible to add this to the algorithm? That is, could the minimum and maximum bounds account for the sum of the sample weights instead of the count of samples? I haven’t read up on the MinCostFlow algorithm, so I don’t know how feasible this would be.
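To make the request concrete, here is a minimal sketch of the difference between count-based and weight-based bounds. It is not the library's implementation: the helpers `cluster_weight_totals` and `within_bounds` are hypothetical names made up for illustration, and `size_min` / `size_max` are assumed names for the minimum and maximum bounds.

```python
from collections import defaultdict


def cluster_weight_totals(labels, sample_weight=None):
    """Sum the weight of the samples assigned to each cluster.

    Hypothetical helper. With sample_weight=None every sample counts
    as 1.0, so the totals reduce to plain cluster sizes.
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(labels)
    if len(sample_weight) != len(labels):
        raise ValueError("labels and sample_weight must have the same length")
    totals = defaultdict(float)
    for label, weight in zip(labels, sample_weight):
        totals[label] += weight
    return dict(totals)


def within_bounds(totals, size_min, size_max):
    """Check that every cluster total lies in [size_min, size_max]."""
    return all(size_min <= total <= size_max for total in totals.values())


# Four samples: three light ones in cluster 0, one heavy one in cluster 1.
labels = [0, 0, 0, 1]
weights = [1.0, 1.0, 1.0, 5.0]

count_totals = cluster_weight_totals(labels)            # bounds on sample counts
weight_totals = cluster_weight_totals(labels, weights)  # bounds on weight sums
```

With these numbers, cluster 1 holds a single sample but carries a total weight of 5.0, so the same assignment can violate a count-based minimum while satisfying a weight-based one.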
Issue Analytics
- State:
- Created a year ago
- Comments:8 (4 by maintainers)
I’ve had a rethink…
I’ve had a look at how scikit-learn defines sample_weight, which I think is different from what I said and from how you described it.
All of the above is possible - it’s just about figuring out what to weight. Feel free to have a shot at it. I will also have a longer think about what is needed.
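One reading of "what to weight" is the centroid position rather than the cluster size. As far as I understand scikit-learn's KMeans, sample_weight makes heavier samples pull the centroid harder, so the centre becomes a weighted mean of its members. The sketch below only illustrates that idea; `weighted_centroid` is a hypothetical helper, not a function from scikit-learn or from this library.

```python
def weighted_centroid(points, sample_weight):
    """Return the weighted mean of a list of equal-length points.

    Hypothetical helper for illustration only.
    """
    if len(points) != len(sample_weight):
        raise ValueError("points and sample_weight must have the same length")
    total = sum(sample_weight)
    if total <= 0:
        raise ValueError("sample_weight must sum to a positive value")
    dim = len(points[0])
    return [
        sum(w * p[d] for p, w in zip(points, sample_weight)) / total
        for d in range(dim)
    ]


points = [[0.0, 0.0], [4.0, 0.0]]

# Equal weights give the ordinary midpoint.
centroid_equal = weighted_centroid(points, [1.0, 1.0])

# Tripling the first point's weight drags the centroid towards it.
centroid_skewed = weighted_centroid(points, [3.0, 1.0])
```

Weighting the centroid in this way leaves the number of points per cluster untouched, which is why it is a different question from weighting the size bounds.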
@joshlk I think I have a similar need. In the problem I’m trying to solve, size_max should bound the sum of the weights in a cluster instead of the size of the cluster. Each point of X is the centroid of a polygon, and the weight of that polygon (point of X) is the sum of its vertices. Do you think the algorithm can be easily modified to handle this?