[ENH] Warning when fh range does not start at 1 #8525

@mabuimo

Passing a range as the fh argument of a splitter is error-prone and not transparent to the user: the range is allowed to begin at 0, and no warning is raised. Because fh is relative to the end of the training window, a step of 0 refers to the last point of the training set, so that point also ends up in the test set. It is very easy to forget to declare 1 explicitly as the beginning of the range, since range(5) starts at 0 by default.

Example:

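# Import paths are assumed from recent sktime releases; adjust to the installed version.
from sktime.benchmarking.forecasting import ForecastingBenchmark
from sktime.forecasting.chronos import ChronosForecaster
from sktime.forecasting.moirai_forecaster import MOIRAIForecaster
from sktime.forecasting.ttm import TinyTimeMixerForecaster
from sktime.performance_metrics.forecasting import MeanSquaredError
from sktime.split import SingleWindowSplitter

# y_sample_sorted (the series) and local_output_path2 (the results path)
# are defined elsewhere and not shown here.
benchmark2 = ForecastingBenchmark()

scorers = [MeanSquaredError(square_root=True)]
# fh starts at 1 here: starting at 0 would include the last value of the
# in-sample data (training set) in the test set (!)
single_fold_splitter = SingleWindowSplitter(window_length=52, fh=range(1, 5))

benchmark2.add_task(
    y_sample_sorted,
    single_fold_splitter,
    scorers,
)

benchmark2.add_estimator(
    ChronosForecaster("amazon/chronos-bolt-tiny"), estimator_id="chronos_bolt"
)
benchmark2.add_estimator(
    ChronosForecaster("amazon/chronos-bolt-base"), estimator_id="chronos_base"
)
benchmark2.add_estimator(
    TinyTimeMixerForecaster()
)
benchmark2.add_estimator(
    MOIRAIForecaster(checkpoint_path="sktime/moirai-1.0-R-small"), estimator_id="moirai_small"
)
benchmark2.add_estimator(
    MOIRAIForecaster(checkpoint_path="sktime/moirai-1.0-R-large"), estimator_id="moirai_large"
)
results2_df = benchmark2.run(local_output_path2)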
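
To make the overlap concrete without running the benchmark, here is a minimal pure-Python sketch. It is not sktime's implementation: `single_window_indices` is a hypothetical helper, and it assumes the convention that each horizon step h maps to the position `cutoff + h`, where `cutoff` is the index of the last training point. The series length of 60 is also just an illustrative choice.

```python
def single_window_indices(n_timepoints, window_length, fh):
    """Hypothetical helper: one train/test split with a relative horizon.

    Assumes each step h in fh refers to position cutoff + h, where cutoff
    is the index of the last training observation.
    """
    steps = sorted(fh)
    cutoff = n_timepoints - 1 - max(steps)  # last training index
    train = list(range(cutoff - window_length + 1, cutoff + 1))
    test = [cutoff + h for h in steps]
    return train, test


# Horizon starting at 1: training and test positions are disjoint.
train_ok, test_ok = single_window_indices(60, 52, range(1, 5))
overlap_ok = sorted(set(train_ok) & set(test_ok))

# Horizon starting at 0: the last training position is also tested.
train_bad, test_bad = single_window_indices(60, 52, range(0, 5))
overlap_bad = sorted(set(train_bad) & set(test_bad))

print("fh=range(1, 5): test =", test_ok, "overlap =", overlap_ok)
print("fh=range(0, 5): test =", test_bad, "overlap =", overlap_bad)
```

Under these assumptions the second split evaluates the forecaster on a point it was fitted on, which is the leakage described above.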

The proposed enhancement is to display a warning as a minimum, or to implement a hard restriction so that a range starting at 0 is not allowed, in order to enforce best practices in cross-validation.
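
As a rough illustration of what such a check could look like, here is a sketch using only the standard library. The function name `check_fh_out_of_sample` and its `strict` flag are hypothetical and not part of sktime; the sketch also assumes that any step less than or equal to 0 counts as in-sample.

```python
import warnings


def check_fh_out_of_sample(fh, strict=False):
    """Hypothetical validator for a relative forecasting horizon.

    Warns (or raises, if strict=True) when fh contains steps <= 0,
    which are assumed here to point into the training window.
    """
    steps = list(fh)
    in_sample = [h for h in steps if h <= 0]
    if in_sample:
        msg = (
            f"fh contains in-sample steps {in_sample}; the test set would "
            "overlap the training set. Did you mean to start the range at 1?"
        )
        if strict:
            raise ValueError(msg)
        warnings.warn(msg, UserWarning, stacklevel=2)
    return steps


# Usage: record the warning instead of printing it to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    checked = check_fh_out_of_sample(range(0, 5))
    n_warnings = len(caught)
```

The `strict` switch mirrors the two options in the proposal: a warning by default, or a hard error for users who want the restriction enforced.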

This issue is related to:
#4716
sktime/tutorial_haicon_prologue_day#1

Metadata

Labels: enhancement (Adding new functionality), module:forecasting (forecasting module: forecasting, incl probabilistic and hierarchical forecasting)