Uniform handling of y_train in forecasting performance metrics
Is your feature request related to a problem? Please describe.
Some forecasting performance metrics require `y_train`, e.g. MASE. This adds some complication to higher-level functionality that expects a common interface for metrics, like the `evaluate` function or `ForecastingGridSearchCV`, and in unit testing (see #672).
Current problem
This currently fails because `y_train` is not passed internally when calling `scoring`:
```python
from sktime.forecasting.all import *
from sktime.forecasting.model_evaluation import evaluate

y = load_airline()
f = NaiveForecaster()
cv = SlidingWindowSplitter()
scoring = MASE()
out = evaluate(f, cv, y, scoring=scoring)
```
Possible solutions

- Change the interface for all performance metrics to optionally accept `y_train`, but only those that require it use it. This requires wrapping metrics from scikit-learn.
- Add case distinctions in higher-level functionality to separately handle those metrics that require `y_train` and those that do not. This requires adding a `requires_y_train` attribute to metric classes.
- Adapt the metrics interface at run time, making the case distinctions inside an adapter and exposing a uniform interface to higher-order functionality (suggested by @fkiraly). This also requires adding a `requires_y_train` attribute to metric classes.
Describe the solution you’d like
```python
from sktime.forecasting.all import *
from sktime.forecasting.model_evaluation import evaluate

y = load_airline()
fh = np.arange(1, 10)
y_train, y_test = temporal_train_test_split(y, fh=fh)
f = NaiveForecaster()
f.fit(y_train)
y_pred = f.predict(fh)

# uniform interface
scoring = MASE()
scoring.requires_y_train = True
scoring = check_scoring(scoring)
scoring(y_test, y_pred, y_train)
# >>> 3.577770878609128

scoring = sMAPE()
scoring.requires_y_train = False
scoring = check_scoring(scoring)
scoring(y_test, y_pred, y_train)
# >>> 0.1780237534499896
```
Here’s a rough implementation of the adapter-based solution:
```python
class _MetricAdapter:
    """
    Adapter for performance metrics to uniformly handle the
    y_train requirement of some metrics.
    """

    def __init__(self, metric):
        # wrap metric object
        self.metric = metric

    def __call__(self, y_true, y_pred, y_train, *args, **kwargs):
        """Compute metric, uniformly handling those metrics that
        require `y_train` and those that do not.
        """
        # if y_train is required, pass it on
        if self.metric.requires_y_train:
            return self.metric(y_true, y_pred, y_train, *args, **kwargs)
        # otherwise, ignore y_train
        else:
            return self.metric(y_true, y_pred, *args, **kwargs)

    def __getattr__(self, attr):
        # delegate attribute queries to the wrapped metric object
        return getattr(self.metric, attr)

    def __repr__(self):
        return repr(self.metric)


def _adapt_scoring(scoring):
    """Helper function to adapt scoring to uniformly handle the y_train requirement."""
    return _MetricAdapter(scoring)


def check_scoring(scoring):
    """
    Validate `scoring` object.

    Parameters
    ----------
    scoring : object
        Callable metric object.

    Returns
    -------
    scoring : object
        Validated `scoring` object, or sMAPE() if `scoring` is None.

    Raises
    ------
    TypeError
        If `scoring` is not a callable object.
    """
    from sktime.performance_metrics.forecasting import sMAPE
    from sktime.performance_metrics.forecasting._classes import MetricFunctionWrapper

    if scoring is None:
        return sMAPE()

    if not callable(scoring):
        raise TypeError("`scoring` must be a callable object")

    valid_base_class = MetricFunctionWrapper
    if not isinstance(scoring, valid_base_class):
        raise TypeError(f"`scoring` must inherit from `{valid_base_class.__name__}`")

    return _adapt_scoring(scoring)
```
@mloning and @fkiraly I’ve started a draft PR (#858). I still need to update the performance_metric tests, but wanted to start the PR to get some initial feedback.
@mloning No problem bringing it back up. I am with you in terms of not being as comfortable with the functions in `_functions.py` all requiring `y_train`.

Also, thanks for the clarification on kwargs; it helps to hear the background on these things so I can keep it in mind for future work! I definitely get that we want to keep hyperparameters/config args in the constructor rather than the `__call__` method. I had been thinking that ruled out kwargs in general.

If we are good to use kwargs for things that don’t fall into that hyperparameter/config-args bucket, then I’m with @fkiraly and think we use that to catch `y_train` when it is passed to the function. Higher-level functionality like `evaluate` would just always pass `y_train`.

I’ll start implementing and can make any tweaks once I get a draft PR.