[ENH] add unequal-length time series support for tsfresh-based methods #3187

jsquaredosquared · 2025-12-21T06:30:17Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

The TSFresh transformer and the estimators that use it do not currently support unequal-length time series, although according to the tsfresh FAQ, tsfresh itself is capable of handling them.

This pull request replaces the _from_3d_numpy_to_long function with a _from_collection_to_long function capable of handling both collection types (3D numpy arrays and lists of 2D numpy arrays).
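To illustrate why the long format suits unequal-length data, here is a minimal sketch. The helper `to_long_sketch` is hypothetical and loop-based for readability; it is not the implementation in this PR. Only the column names (`index`, `time_index`, `column`, `value`) are taken from the code posted later in this thread. Each case simply contributes as many rows as it has observations, so no padding is needed.

```python
import numpy as np
import pandas as pd


def to_long_sketch(collection):
    """Hypothetical helper: one row per (case, time point, channel) observation."""
    rows = []
    # Iterating a 3D array or a list both yield 2D (n_channels, n_timepoints) arrays.
    for case_idx, series in enumerate(collection):
        n_channels, n_timepoints = series.shape
        for ch in range(n_channels):
            for t in range(n_timepoints):
                rows.append((case_idx, t, f"dim_{ch}", series[ch, t]))
    return pd.DataFrame(rows, columns=["index", "time_index", "column", "value"])


# Equal-length collection: a 3D numpy array of shape (n_cases, n_channels, n_timepoints).
equal = np.zeros((2, 1, 4))
# Unequal-length collection: a list of 2D arrays with different numbers of time points.
unequal = [np.zeros((1, 3)), np.zeros((1, 5))]

long_equal = to_long_sketch(equal)
long_unequal = to_long_sketch(unequal)

print(long_equal.shape)  # (8, 4)
print(long_unequal.groupby("index").size().tolist())  # [3, 5]
```

Both collection types go through the same code path, which is the property the new conversion function relies on.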

Does your contribution introduce a new dependency? If yes, which one?

No new dependencies.

Any other comments?

PR checklist

For all contributions

I've added myself to the list of contributors. Alternatively, you can use the @all-contributors bot to do this for you after the PR has been merged.
The PR title starts with either [ENH], [MNT], [DOC], [BUG], [REF], [DEP] or [GOV] indicating whether the PR topic is related to enhancement, maintenance, documentation, bugs, refactoring, deprecation or governance.

aeon-actions-bot · 2025-12-21T06:30:38Z

Thank you for contributing to `aeon`

I have added the following labels to this PR based on the title: [ enhancement ].
This PR changes too many different packages (>3) for automatic addition of labels, please manually add package labels if relevant.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

Run pre-commit checks for all files
Run mypy typecheck tests
Run all pytest tests and configurations
Run all notebook example tests
Run numba-disabled codecov tests
Stop automatic pre-commit fixes (always disabled for drafts)
Disable numba cache loading
Regenerate expected results for testing
Push an empty commit to re-run CI checks

Adityakushwaha2006 · 2025-12-21T15:26:39Z

Hey @jsquaredosquared, I pulled this PR onto a local branch to try to identify its current issues. The work seems well implemented.
I'm somewhat confident that I've identified the cause of the CI checks failing.
Let me know if you need help figuring it out :D

jsquaredosquared · 2025-12-22T12:30:32Z

> Hey @jsquaredosquared, I pulled this PR onto a local branch to try to identify its current issues. The work seems well implemented. I'm somewhat confident that I've identified the cause of the CI checks failing. Let me know if you need help figuring it out :D

Thanks for the offer :D
Can you help me with the last failing check?

P.S., I forgot to add you as a co-author to the original commit 🥲. Is there another way I can credit you?

Adityakushwaha2006 · 2025-12-22T13:41:03Z

There's no need for credit; this entire PR has been your code :)
As for the CI failure, it seems that your branch is not up to date with the most recent main.
There should be a button to update your branch. You'll need to rebase it and then run the checks again.

jsquaredosquared · 2025-12-22T14:05:19Z

All right, but thanks again for your help 😄 👍

SebastianSchmidl

Nice work 👍🏼. Could you update the tests for the different tsfresh estimators to also include variable-length inputs? Please also run the tests with tsfresh installed locally.

MatthewMiddlehurst · 2026-01-02T17:23:12Z

Looks good, thanks. I agree it would be good to add an unequal-length test, for the transformer at least.

Could you check which other estimators use tsfresh, e.g. FreshPRINCE in classification? It may just be that one.

It would be good to check that this has not altered the output. Could you compare this against the old version for equal-length data and make sure the features are the same for a couple of datasets? Please post the code and output here. Alternatively, if there is a test that already compares against expected output, you can link it here.

jsquaredosquared · 2026-02-02T22:45:50Z

> Could you check which other estimators use tsfresh, e.g. FreshPRINCE in classification? It may just be that one.

I think I have updated all classifiers, clusterers, and regressors that use tsfresh. Please let me know if I have missed one.

> It would be good to check that this has not altered the output. Could you compare this against the old version for equal-length data and make sure the features are the same for a couple of datasets? Please post the code and output here. Alternatively, if there is a test that already compares against expected output, you can link it here.

import numpy as np
import pandas as pd

from aeon.testing.data_generation import (
    make_example_3d_numpy,
    make_example_3d_numpy_list,
)


# Old function
def _from_3d_numpy_to_long(arr):
    # Converting the numpy array to a long format DataFrame
    n_cases, n_channels, n_timepoints = arr.shape

    # Creating a DataFrame from the numpy array with multi-level index
    df = pd.DataFrame(arr.reshape(n_cases * n_channels, n_timepoints))
    df["case_index"] = np.repeat(np.arange(n_cases), n_channels)
    df["dimension"] = np.tile(np.arange(n_channels), n_cases)
    df = df.melt(
        id_vars=["case_index", "dimension"], var_name="time_index", value_name="value"
    )

    # Adjusting the column order and renaming columns
    df = df[["case_index", "time_index", "dimension", "value"]]
    df = df.rename(columns={"case_index": "index", "dimension": "column"})
    df["column"] = "dim_" + df["column"].astype(str)
    return df


# New function
def _from_collection_to_long(collection):
    n_cases = len(collection)
    n_channels = collection[0].shape[0]
    n_timepoints = np.array([arr.shape[1] for arr in collection])

    index = np.repeat(np.arange(n_cases), n_channels * n_timepoints)
    timepoints = [np.arange(timepoints_i) for timepoints_i in n_timepoints]
    time_index = np.concatenate([np.tile(arr, n_channels) for arr in timepoints])
    column = np.concatenate(
        [
            np.repeat(np.arange(n_channels), timepoints_i)
            for timepoints_i in n_timepoints
        ]
    )
    value = np.concatenate([arr.flatten() for arr in collection])

    df = pd.DataFrame(
        {"index": index, "time_index": time_index, "column": column, "value": value}
    )
    df["column"] = "dim_" + df["column"].astype(str)

    return df


X, y = make_example_3d_numpy()  # or replace with any equal-length dataset.

# The old function returns rows in a different order, so sort them into the new
# function's order (case, then channel, then time point) to allow direct comparison.
# This does not change the actual values.
Xt_old = (
    _from_3d_numpy_to_long(X)
    .sort_values(by=["index", "column", "time_index"])
    .reset_index(drop=True)
)
Xt_new = _from_collection_to_long(X)
# True if every cell matches between the old and new long-format frames.
print((Xt_old == Xt_new).all(axis=None))
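A similar check could be run on unequal-length input, which the reviewers also asked about. The following is only a sketch: it copies `_from_collection_to_long` as posted above, while the random test data, the lengths, and the pivot-based round trip are assumptions chosen for illustration, not part of the PR's test suite.

```python
import numpy as np
import pandas as pd


# Copied from the comment above so that the sketch is self-contained.
def _from_collection_to_long(collection):
    n_cases = len(collection)
    n_channels = collection[0].shape[0]
    n_timepoints = np.array([arr.shape[1] for arr in collection])

    index = np.repeat(np.arange(n_cases), n_channels * n_timepoints)
    timepoints = [np.arange(timepoints_i) for timepoints_i in n_timepoints]
    time_index = np.concatenate([np.tile(arr, n_channels) for arr in timepoints])
    column = np.concatenate(
        [
            np.repeat(np.arange(n_channels), timepoints_i)
            for timepoints_i in n_timepoints
        ]
    )
    value = np.concatenate([arr.flatten() for arr in collection])

    df = pd.DataFrame(
        {"index": index, "time_index": time_index, "column": column, "value": value}
    )
    df["column"] = "dim_" + df["column"].astype(str)
    return df


# Assumed example data: three cases, two channels, different lengths per case.
rng = np.random.default_rng(0)
lengths = [5, 8, 3]
n_channels = 2
X_unequal = [rng.normal(size=(n_channels, n)) for n in lengths]

long_df = _from_collection_to_long(X_unequal)

# Every observation should appear exactly once.
assert len(long_df) == n_channels * sum(lengths)

# Round trip: pivot each case back to (n_channels, n_timepoints) and compare.
for i, series in enumerate(X_unequal):
    case = long_df[long_df["index"] == i]
    wide = case.pivot(index="column", columns="time_index", values="value")
    np.testing.assert_array_equal(wide.to_numpy(), series)

print("round trip OK")
```

The round trip relies on `pivot` sorting the channel labels, so it is only meaningful as written for fewer than ten channels, where `dim_0`, `dim_1`, ... sort in numeric order.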

feat: add unequal-length time series support for tsfresh-based methods

07ecfc9

jsquaredosquared requested review from MatthewMiddlehurst, TonyBagnall, chrisholder and dguijo as code owners December 21, 2025 06:30

aeon-actions-bot bot added the "enhancement" label (New feature, improvement request or other non-bug code enhancement) Dec 21, 2025

docs: modify docstrings to be consistent

5463ccd

jsquaredosquared closed this Dec 21, 2025

update x_inner_type from numpy3D to np-list

ba70c76

jsquaredosquared reopened this Dec 21, 2025

jsquaredosquared marked this pull request as draft December 21, 2025 07:04

fix: add missing X_inner_type tag

44d1a24

Merge branch 'main' into tsfresh-unequal

b86ac6f

jsquaredosquared marked this pull request as ready for review December 22, 2025 14:03

SebastianSchmidl reviewed Dec 27, 2025

Merge branch 'main' into tsfresh-unequal

593cfa3

test: include unequal length time series in tsfresh extractor test

4401568

jsquaredosquared force-pushed the tsfresh-unequal branch from ecdb1c9 to 4401568 Compare February 2, 2026 22:23

Merge branch 'main' into tsfresh-unequal

e4829fd


4 participants
