Fix aggregate_events leaking non-matching sub-events into buckets #2013
Open
jichaowang02-lang wants to merge 1 commit into
`event_strip()` iterated over `event[topfield]` while calling `.remove()` on the same list. Removing the current element advances the iterator past the next one, so a non-matching sub-event immediately following a removed one is never inspected and wrongly survives in the bucket. When a correlation rule's field is a sub-field (e.g. `child.data`), every bucket then retains foreign children (bucket `a` kept `["a", "c"]`), corrupting the threshold / outlier / match analyses that re-extract sub-fields from the bucketed events. Iterate over a snapshot copy (`event[topfield][:]`) so removal can't skip elements. Adds a regression test asserting each bucket keeps only its matching children.
Summary
`SpiderFootCorrelator.aggregate_events` corrupts its result buckets when a correlation rule's `field` is a sub-field (e.g. `child.data`, `source.data`, `entity.data`): each bucket retains sub-events that do not match the bucket value, which then skews every downstream analysis that re-extracts sub-fields (threshold counts, outlier counts, `match_all_to_first_collection`).

Root cause
`event_strip()` mutates the list it is iterating: it loops over `event[topfield]` while calling `.remove()` on that same list. Removing the current element shifts the remaining elements down one index while the iterator still advances, so the element immediately after each removed one is never inspected and survives.
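The skipping behaviour can be reproduced in isolation. The sketch below is a simplified stand-in, not the project's code: `strip_buggy` is a hypothetical name, and it assumes events are plain dicts whose sub-events are dicts with a `data` key.

```python
from copy import deepcopy


def strip_buggy(event, field, value):
    """Hypothetical stand-in: drop non-matching sub-events while iterating."""
    topfield, subfield = field.split(".")
    for sub in event[topfield]:
        if sub[subfield] != value:
            # Removal shifts later items left, but the iterator index still
            # advances, so the item right after a removed one is skipped.
            event[topfield].remove(sub)


# Assumed event shape: one event with four children a, b, c, d.
event = {"child": [{"data": v} for v in "abcd"]}

leaked = {}
for bucket in "abcd":
    e_copy = deepcopy(event)
    strip_buggy(e_copy, "child.data", bucket)
    leaked[bucket] = [c["data"] for c in e_copy["child"]]

for bucket, kept in leaked.items():
    print(bucket, kept)
# a ['a', 'c']
# b ['b', 'd']
# c ['b', 'c']
# d ['b', 'd']
```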
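For `field = "child.data"` and one event with children `a, b, c, d`:

| Bucket | Expected children | Actual children |
|---|---|---|
| `a` | `["a"]` | `["a", "c"]` |
| `b` | `["b"]` | `["b", "d"]` |
| `c` | `["c"]` | `["b", "c"]` |
| `d` | `["d"]` | `["b", "d"]` |

(Reproduced against the real `aggregate_events`.)

Fix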
Iterate over a snapshot copy (`event[topfield][:]`) so removal can't skip elements.
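A minimal sketch of the snapshot approach, under the same assumptions as the description (dict events, sub-events keyed by `data`). `strip_fixed` is a hypothetical name used for illustration and is not the function in the patch.

```python
def strip_fixed(event, field, value):
    """Hypothetical stand-in: drop non-matching sub-events safely."""
    topfield, subfield = field.split(".")
    # Iterate over a shallow copy; removals hit the original list only,
    # so the loop still visits every original element exactly once.
    for sub in event[topfield][:]:
        if sub[subfield] != value:
            event[topfield].remove(sub)


# Assumed event shape: one event with four children a, b, c, d.
event = {"child": [{"data": v} for v in "abcd"]}
strip_fixed(event, "child.data", "a")
remaining = [c["data"] for c in event["child"]]
print(remaining)  # ['a']
```

The slice copies only the list of references, not the sub-events themselves, so the extra cost is one short-lived list per call. Rebuilding the list with a comprehension would be an alternative with the same result.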
Testing
Adds `test_aggregate_events_does_not_leak_non_matching_subevents`, asserting each bucket keeps only its matching children. flake8-clean on the changed lines.
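For illustration, a check in the same spirit could look like the sketch below. It does not reproduce the test added in this PR: `aggregate_by_subfield` is a hypothetical stand-in for the bucketing step, and the event shape (dicts with a `child` list of `{"data": ...}` entries) is an assumption.

```python
from copy import deepcopy


def aggregate_by_subfield(events, field):
    """Hypothetical stand-in for the bucketing step; not SpiderFoot's code."""
    topfield, subfield = field.split(".")
    buckets = {}
    for event in events:
        # One bucket per distinct sub-field value found in this event.
        for value in {sub[subfield] for sub in event[topfield]}:
            e_copy = deepcopy(event)
            # Keep only the sub-events that match this bucket's value.
            e_copy[topfield] = [s for s in e_copy[topfield] if s[subfield] == value]
            buckets.setdefault(value, []).append(e_copy)
    return buckets


def test_buckets_keep_only_matching_children():
    events = [{"child": [{"data": v} for v in "abcd"]}]
    buckets = aggregate_by_subfield(events, "child.data")
    assert set(buckets) == {"a", "b", "c", "d"}
    for value, bucketed in buckets.items():
        for event in bucketed:
            # No foreign children may survive in any bucket.
            assert [c["data"] for c in event["child"]] == [value]


test_buckets_keep_only_matching_children()
```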