Multi-Window Finder #1062

seanlaw · 2025-01-17T04:34:09Z

It was brought to my attention that Keogh's lab wrote a nice short paper about identifying the "right" window sizes (beyond Pan Matrix Profiles) and I think it should be pretty straightforward to implement:

Paper
Code (Jupyter Notebook) and Data Sets

It would be great to see a notebook reproducer of this.

NimaSarajpoor · 2025-01-21T18:23:47Z

A few things caught my attention after looking at the paper and code. Sharing them here for future readers:

(1) The paper proposes an algorithm which, at its core, uses a function that takes a time series T and a window size m as inputs and returns a real value. The authors call the returned value dist (distance), but a different term would be more accurate: a distance is non-negative by definition, whereas this value can be negative.

(2) The paper does not seem to mention z-normalization. It would be worth exploring whether the proposed algorithm still works when subsequences differ substantially in their mean but are similar after z-normalization (e.g. subsequences {0.1, 0.2, 0.3} and {100, 200, 300}).
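To make the example in (2) concrete, here is a minimal sketch showing that the two subsequences are far apart in raw Euclidean distance but coincide after z-normalization. The helper name `z_normalize` is hypothetical and is not taken from the paper or its notebook.

```python
import numpy as np

def z_normalize(x):
    """Return (x - mean) / std. Assumes x is not constant (std > 0)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

a = np.array([0.1, 0.2, 0.3])
b = np.array([100.0, 200.0, 300.0])

# Raw subsequences are far apart in Euclidean distance ...
raw_dist = np.linalg.norm(a - b)

# ... but match (up to floating-point error) after z-normalization.
znorm_dist = np.linalg.norm(z_normalize(a) - z_normalize(b))

print(raw_dist, znorm_dist)
```

Any method built on a non-normalized moving average sees these two subsequences as very different, which is why the question of whether the algorithm is meant to be used with or without z-normalization matters.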

(3) Algorithm 1, lines 3-4, shows the following pseudo-code:

MA = moving-avg(T, w) // Algorithm 2
moving-dist ← Sum(Log(abs(MA − mean(MA))))

However, the code shows the following line:

np.log(abs(moving_avg - (moving_avg).mean()).sum())

Note that sum and log are swapped. Maybe that's just a typo in the placement of parentheses, or there might be a specific reason behind the change. IMO, the paper's version makes sense: taking the log of each element before summing changes how much the extremely small or extremely large values in abs(MA − mean(MA)) contribute. The code's version, however, just takes the log of a single positive number (the sum). Since log is strictly increasing, this does not change how the window sizes are ranked, and so it should not affect the final outcome AFAIU.
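The difference between the two orderings can be checked numerically. The sketch below is not the authors' code: `moving_avg` is a plain sliding mean standing in for the paper's Algorithm 2, the function names are hypothetical, and the test signal (a noisy sine) is made up for illustration.

```python
import numpy as np

def moving_avg(T, w):
    """Plain sliding mean of length w (assumed stand-in for Algorithm 2)."""
    return np.convolve(T, np.ones(w) / w, mode="valid")

def abs_dev(T, w):
    """Absolute deviation of the moving average from its own mean."""
    ma = moving_avg(T, w)
    return np.abs(ma - ma.mean())

def score_paper(T, w):
    """Sum of logs, as written in the paper's pseudo-code."""
    return np.log(abs_dev(T, w)).sum()

def score_notebook(T, w):
    """Log of the sum, as written in the notebook."""
    return np.log(abs_dev(T, w).sum())

# Made-up test signal: noisy sine wave.
rng = np.random.default_rng(0)
t = np.arange(2000)
T = np.sin(2 * np.pi * t / 50) + 0.1 * rng.standard_normal(t.size)

window_sizes = np.arange(10, 200)
paper = np.array([score_paper(T, w) for w in window_sizes])
notebook = np.array([score_notebook(T, w) for w in window_sizes])
raw_sum = np.array([abs_dev(T, w).sum() for w in window_sizes])

# log is strictly increasing, so the notebook score and the un-logged sum
# pick the same best window size.
print(window_sizes[np.argmin(notebook)], window_sizes[np.argmin(raw_sum)])

# The paper's score sums many logs of values below 1, so it can be negative.
print(paper.min())
```

In the paper's form, a single deviation that is exactly zero sends the score to negative infinity, so the two variants can behave quite differently on real data. That is another reason to settle which one is intended before implementing it.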

(4) Algorithm 1, lines 8-11, shows:

for i in local-min do
    res ← ws[i]/(i + 1)
end for
w = mean(res)

The code shows:

for i in range(3):
    reswin.append(window_sizes[b[i]]/ (i+1))
reswin = np.array(reswin)
winTime = 0.8 * reswin[0] + 0.15 * reswin[1] + 0.05 * reswin[2]

Why 3, and why the weights 0.8, 0.15, and 0.05?
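To see how much the two combination rules can differ, here is a minimal sketch that puts them side by side. It assumes that i is the zero-based rank of each local minimum (as in the notebook's loop). The function names and the candidate window sizes are hypothetical.

```python
import numpy as np

def estimate_mean(candidates):
    """Paper-style rule: plain mean of candidate_k / k over all candidates."""
    c = np.asarray(candidates, dtype=float)
    ranks = np.arange(1, c.size + 1)
    return float((c / ranks).mean())

def estimate_weighted(candidates, weights=(0.8, 0.15, 0.05)):
    """Notebook-style rule: fixed weights on the first len(weights) candidates.

    Assumes at least len(weights) candidates are available.
    """
    c = np.asarray(candidates, dtype=float)[: len(weights)]
    ranks = np.arange(1, c.size + 1)
    return float(np.dot(weights, c / ranks))

# Hypothetical window sizes found at the first three local minima.
candidates = [50, 102, 147]

print(estimate_mean(candidates))      # (50 + 51 + 49) / 3
print(estimate_weighted(candidates))  # 0.8 * 50 + 0.15 * 51 + 0.05 * 49
```

With well-behaved candidates the two rules give nearly the same answer. They diverge when the later minima are noisy, which the notebook's heavier weight on the first minimum would dampen. That is one possible motivation for the weights, but the notebook does not state it.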

seanlaw · 2025-01-21T18:35:06Z

> Note that sum and log are swapped.

I noticed this too, and this inconsistency scares me. I think we need to take special care when trying this out and really understand/test everything before adding it.

seanlaw added the enhancement and notebook reproducer labels · Jan 17, 2025
