Missing bad captures (est. 2025-01-11) #17

Open
Mr0grog opened this issue Feb 14, 2025 · 1 comment

Mr0grog commented Feb 14, 2025

This is a running list of example bad captures that we are failing to flag/ignore.

Mr0grog commented Feb 14, 2025

Unfortunately there’s not much useful there to classify as different from a normal/legit 500 error. We may need to look contextually at versions before/after it to see if it’s temporary.

Ironically this is a particular case that the old method (which tried to skip over any errors for the baseline version) handled better. Maybe we could bring that back in a slightly modified form:

  • We currently look up to 3 versions/30 days back when the baseline version would otherwise be a bad/questionable capture.
  • We could additionally/alternatively look up to, say, 5 days back (something much shorter than 30) when the baseline would otherwise be an error response that still looks like a good/valid capture.
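A rough sketch of how those two lookback rules might combine, just to make the idea concrete. Everything here is hypothetical: the `Version` class, its `looks_bad` flag, and `pick_baseline` are made-up names for illustration, not the actual code. It also assumes the 3-version cap applies to the whole search and that an error baseline is kept when nothing better turns up within the shorter window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Version:
    """Hypothetical stand-in for a captured version."""
    capture_time: datetime
    status: int = 200
    looks_bad: bool = False  # assumed flag: bad/questionable capture

    @property
    def is_error(self):
        return self.status >= 400


def pick_baseline(versions, max_versions=3,
                  bad_window=timedelta(days=30),
                  error_window=timedelta(days=5)):
    """Pick a baseline from `versions` (newest first; versions[0] is the default)."""
    if not versions:
        return None
    default = versions[0]
    fallback = None  # first valid-looking error we skipped over
    for index, candidate in enumerate(versions):
        age = default.capture_time - candidate.capture_time
        if index > max_versions or age > bad_window:
            break
        if candidate.looks_bad:
            continue  # existing rule: skip bad/questionable captures
        if fallback is not None and age > error_window:
            break  # short error window exhausted; keep the error
        if candidate.is_error and age <= error_window:
            if fallback is None:
                fallback = candidate
            continue  # proposed rule: briefly look past errors
        return candidate
    return fallback if fallback is not None else default
```

With this shape, a 500 followed by a good capture two days earlier resolves to the good capture, while a 500 whose nearest good capture is ten days earlier stays as the baseline.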

But the contextuality thing is good, too. To iterate through versions and their surrounding context, `itertools` has a nice `pairwise` iterator, and we probably want something like that but with N items (3 or 5) instead of 2. Toolz has `sliding_window`, which is close to ideal for this, but we probably want to pad the start of the list with N/2 (rounded down) placeholder items so that every version gets a window of its own. For example, for the sequence (1, 2, 3, 4) with N = 3, the first window should be (None, 1, 2) instead of (1, 2, 3). See also the official itertools recipes in the Python docs for a nice example of a simple implementation.
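A minimal sketch of that padded window, using only the standard library and loosely modeled on the `sliding_window` recipe in the itertools docs. The name `centered_window` and the `fill` parameter are made up here. Padding the end as well as the start is an assumption on my part, so that the last versions also get a window.

```python
from collections import deque
from itertools import chain, repeat


def centered_window(items, size=3, fill=None):
    """Yield one `size`-length tuple per item, with that item in the middle.

    Positions before the start or after the end of `items` hold `fill`.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    pad = size // 2
    padded = chain(repeat(fill, pad), items, repeat(fill, pad))
    window = deque(maxlen=size)  # oldest entry drops off automatically
    for item in padded:
        window.append(item)
        if len(window) == size:
            yield tuple(window)


for before, current, after in centered_window([1, 2, 3, 4]):
    print(before, current, after)
# None 1 2
# 1 2 3
# 2 3 4
# 3 4 None
```

Each tuple's middle element is the version being judged, so a check could compare a lone error against its neighbors to decide whether it looks temporary.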

Mr0grog moved this to Inbox in Web Monitoring Feb 17, 2025
Mr0grog moved this from Inbox to Prioritized in Web Monitoring Feb 17, 2025