New LLM feature: Starred events to forensic report (stored as a Story) #3332
base: master
Conversation
Two notes:
Comment from the sideline: I believe keeping the limit at 1000 events is plenty. Even if we increase the possible number of events to use for LLM features in general, having some limit here could be just fine.
    this.isSummaryLoading = false
  })
},
generateForensicReport() {
Do you really want to call it a Forensic Report? The term could be misunderstood, so maybe something more specific to what this function is doing: keyEventReport?
Fair point. In that case I suggest we go for generateStarredEventsReport, and call the feature llm_starred_events_report etc. Agreed?
agreed
round 2
The new placement makes sense to me. However, please remove all mentions of any AI/LLM context (e.g. the default title), since this util should be usable independently.
Please rename variables as discussed. llm_starred_events_report etc
I think for the prod deployment we also need to add the prompt file to the "# Fetch default Timesketch config files" step in the contrib/deploy_timesketch.sh and .ps1 files.
this.warningSnackBar('This feature is currently limited to a 1000 starred events, try setting a timerange filter. ' +
  'This limit will be increased soon.', 10000);
What about making this limit configurable in timesketch.conf? This would allow admins to increase it themselves at any time, if they like.
In general: let's remove the "This limit will be increased soon." part from the message. It is out of the control of the TS operator or user, and we don't have a timeline for when we can increase it. So not having this is more reliable long-term than having it and never increasing the limit.
Will remove that last part of the message. I'm not a fan of making it configurable for an admin at the moment, because if they increase the limit to a size that won't fit inside the chosen LLM provider's context window, the feature will break. So better if we control this setting for now, until we can include such granular settings.
sketch: The Sketch object containing events to analyze.
**kwargs: Additional arguments including:
    - form: Form data containing query and filter information.
    - datastore: OpenSearch instance for querying.
On the backend, you actually don't need the datastore object to query. You only need the sketch_id and can then use the wrapper I mentioned above.
Which wrapper "mentioned above" are you referring to? I can't seem to find it
Your generate_prompt function takes a datastore object as an input argument. But it does not need that, since you can just do "from timesketch.lib.datastore.opensearch import OpenSearchDataStore as datastore" and execute datastore.search() on the backend from within the function, without it being handed over as an argument. There is only one datastore used in Timesketch, and that is not going to change anytime soon.
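A minimal, self-contained sketch of this suggestion (not Timesketch's actual code): construct the single datastore inside the function instead of threading it through the argument list. DataStore here is a stand-in for OpenSearchDataStore, and all names are illustrative.

```python
class DataStore:
    """Stand-in for timesketch.lib.datastore.opensearch.OpenSearchDataStore."""

    def search(self, sketch_id, query_string):
        # A real implementation would query OpenSearch; return a dummy hit here.
        return [{"sketch_id": sketch_id, "query": query_string}]


def generate_prompt(sketch_id, query_string):
    # The datastore is created (or imported) inside the function, so callers
    # only need to pass the sketch_id, not a datastore object.
    datastore = DataStore()
    events = datastore.search(sketch_id, query_string)
    return f"Analyzing {len(events)} events from sketch {sketch_id}"
```

The caller's surface shrinks to plain identifiers, which also makes the function easier to test.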
if not form:
    raise ValueError("Missing 'form' data in kwargs")
If form is a required argument, why is it in **kwargs and not in the normal args list?
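A hypothetical sketch of that point: promoting form from **kwargs to a required keyword-only argument makes a missing value fail loudly at call time instead of via a late ValueError. The signature below is illustrative, not the PR's actual code.

```python
def generate_prompt(sketch, *, form, **kwargs):
    """'form' is required; omitting it raises TypeError at the call site."""
    if not form:
        raise ValueError("'form' must not be empty")
    return f"prompt for {sketch} with query {form.get('query')}"


# Calling without form= now fails immediately with a clear TypeError:
try:
    generate_prompt("sketch-1")
except TypeError as exc:
    print(f"caught: {exc}")
```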
)

if events_df is None or events_df.empty:
    return "No events to analyze for forensic report."
So this will be the response handled by the LLM API endpoint. IIRC it will then still call the "process results" function? Something you want to address here?
This will be fixed by introducing a custom SkipLLMError exception. I have a commit that will address this issue for this feature as well as for the llm_summarize feature, which I'll send once the changes in #3379 have been merged.
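A rough sketch of the SkipLLMError idea under discussion: the feature raises the exception when there is nothing to send to the LLM, and the API layer returns the message directly without invoking the result-processing step. All names here (generate_report, handle_request) are illustrative, not Timesketch's actual endpoint code.

```python
class SkipLLMError(Exception):
    """Raised when an LLM feature has nothing to process; carries a user-facing message."""


def generate_report(events):
    if not events:
        raise SkipLLMError("No events to analyze for forensic report.")
    return f"report over {len(events)} events"


def handle_request(events):
    try:
        report = generate_report(events)
    except SkipLLMError as exc:
        # Skip the LLM call and any "process results" step; surface the message as-is.
        return {"response": str(exc)}
    return {"response": report}
```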
#3379 was merged.
    timeline_ids=timeline_ids,
)

total_events_count = len(events_df)
Unlikely, since the icon only shows when there are starred events, but maybe catch the case where there are no results.
This PR adds a new LLM-powered feature to automatically generate forensic reports (stored as a Story) from starred events.
New button in EventList.vue (only visible when viewing starred events)

Example LLM generated report from starred events
