Add MSLE (Mean Squared Logarithmic Error) to TreeRegressors
Describe the workflow you want to enable
from sklearn.tree import DecisionTreeRegressor
# Proposed usage: "msle" is the new criterion requested in this issue.
model = DecisionTreeRegressor(criterion="msle")
MSE and MSLE differ in two ways. First, MSLE depends only on the relative (percentage) difference between prediction and target, not on the absolute difference. Second, MSLE penalizes underestimates more heavily than overestimates of the same absolute size.
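Both properties can be checked by hand. The snippet below is a minimal sketch in plain Python, assuming the usual definition of the squared logarithmic error, (log(1 + y_true) - log(1 + y_pred))^2; the helper names and the delivery-time numbers are made up for illustration.

```python
import math


def squared_log_error(y_true, y_pred):
    """Squared logarithmic error for a single non-negative sample."""
    return (math.log1p(y_true) - math.log1p(y_pred)) ** 2


def squared_error(y_true, y_pred):
    """Ordinary squared error for a single sample."""
    return (y_true - y_pred) ** 2


# Same relative error (prediction is half the target) at two different scales.
small_scale = squared_log_error(100, 50)
large_scale = squared_log_error(10_000, 5_000)

# Same absolute error (30 minutes off a 60-minute delivery), in both directions.
under = squared_log_error(60, 30)  # underestimate
over = squared_log_error(60, 90)   # overestimate

print(f"relative error, two scales: {small_scale:.4f} vs {large_scale:.4f}")
print(f"underestimate: {under:.4f}, overestimate: {over:.4f}")
print("squared error is symmetric:", squared_error(60, 30) == squared_error(60, 90))
```

The two scales give nearly the same logarithmic error (they differ slightly because of the +1 inside the logarithm), while the underestimate is penalized clearly more than the overestimate.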
In business settings especially, I often run into cases where underestimation must be avoided. When forecasting delivery time, for example, data analysts want conservative estimates to avoid VOC. Sometimes they simply add 10 minutes to their model's output. I reminded them that they could convert the label y to log scale before fitting, so that the model's error is computed as a logarithmic error, but they didn't, simply because it seemed difficult.
I think that if basic models like trees offered one more criterion option, logarithmic error, analysts could handle this problem easily.
Describe your proposed solution
Add logarithmic error to sklearn/tree/_criterion.c
Add logarithmic error to sklearn/tree/_classes.py
CRITERIA_REG = {"mse": _criterion.MSE, "friedman_mse": _criterion.FriedmanMSE, "mae": _criterion.MAE, "msle": _criterion.MSLE}
Add documentation about MSLE or comparison between MSE and MSLE
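To make the proposal more concrete, here is a rough pure-Python sketch of what an MSLE criterion would compute at a node. This is not scikit-learn code: the function names `msle_node_impurity`, `msle_node_value`, and `best_split` are hypothetical, and the sketch assumes the criterion is defined as the variance of log(1 + y) within a node, by analogy with how MSE uses the variance of y.

```python
import math


def msle_node_impurity(y):
    """Mean squared deviation of log1p(y) around its mean within one node."""
    logs = [math.log1p(v) for v in y]
    mean_log = sum(logs) / len(logs)
    return sum((v - mean_log) ** 2 for v in logs) / len(logs)


def msle_node_value(y):
    """Constant prediction that minimizes MSLE within one node."""
    logs = [math.log1p(v) for v in y]
    return math.expm1(sum(logs) / len(logs))


def best_split(x, y):
    """Exhaustive search over one feature for the threshold that minimizes
    the sample-weighted MSLE impurity of the two children."""
    pairs = sorted(zip(x, y))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical feature values
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        score = (
            len(left) * msle_node_impurity(left)
            + len(right) * msle_node_impurity(right)
        ) / len(pairs)
        if score < best_score:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_threshold, best_score


# Toy data (made up): two clearly separated groups of targets.
threshold, score = best_split([1, 2, 3, 4], [10, 12, 100, 120])
print(threshold, score)
```

A real implementation would live in the Cython criterion code and update its statistics incrementally rather than recomputing them for every candidate split.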
Describe alternatives you’ve considered, if relevant
Just add documentation about MSLE and MSE, and show how to compute logarithmic error with the MSE criterion in trees.
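One way such documentation could show the workaround is with `TransformedTargetRegressor`, which fits the tree on log1p(y) and maps predictions back with expm1. The sketch below is only an illustration: the data is synthetic, and the tree depth and random seed are arbitrary choices. The tree uses its default squared-error criterion, so minimizing it on the transformed target corresponds to minimizing MSLE on the original scale.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.metrics import mean_squared_log_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic, strictly positive targets (made up for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.exp(0.3 * X[:, 0]) * rng.lognormal(sigma=0.2, size=200)

# The tree is trained on log1p(y); predictions are mapped back with expm1.
model = TransformedTargetRegressor(
    regressor=DecisionTreeRegressor(max_depth=3, random_state=0),
    func=np.log1p,
    inverse_func=np.expm1,
)
model.fit(X, y)
pred = model.predict(X)

print("training MSLE:", mean_squared_log_error(y, pred))
```

Because the inverse transform is applied automatically in `predict`, analysts get predictions on the original scale without handling the log conversion themselves.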
Additional context
The paper at https://orsociety.tandfonline.com/doi/full/10.1057/jors.2014.103 has more than 200 citations and was published over 5 years ago. MSLE is an improvement over MAPE.
It might be helpful to have MSLE in our metrics module before trying to make it available as a tree criterion.