Threat Hunting with Polars

Threat hunting with Polars and flaws.cloud AWS CloudTrail datasets. Check out the threat hunting notebook in nbviewer, or rerun the hunt yourself in JupyterLab.

Normalized datasets and alerts can be found as parquet files in the results directory. You can load these for further exploration using your OLAP database of choice.

Motivation

Polars is an OLAP query engine written in Rust. It's highly memory efficient, uses Apache Arrow as its memory model, and consistently tops database speed benchmarks against distributed OLAP engines such as PySpark and Snowflake.

At Tracecat, we use Polars as an alternative to jq or grep for quick-and-dirty threat hunting.

Why Polars for log analysis?

Ridiculously fast and efficient string operations
Piped query language
Highly parallelized window operations
Powerful aggregation functions to compute metrics
Small binary with zero dependencies (~70ms import time)

If your logs fit in memory and you use Python or Jupyter notebooks as part of your threat hunting process, Polars should be your go-to tool.

Note: for every 1GB of gzipped JSON logs on disk, expect Polars' in-memory data model to take up roughly 500MB of RAM.
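
To turn this rule of thumb into a quick check before loading anything, a sketch like the following may help. The helper names, the *.json.gz pattern, and treating 1GB as 10^9 bytes are assumptions made for illustration, and the 500MB ratio is only the approximation from the note above.

```python
from pathlib import Path

# Rule of thumb from the note above: ~500MB of RAM per 1GB of gzipped JSON.
RAM_MB_PER_GZIPPED_GB = 500


def estimate_ram_mb(gzipped_bytes: int) -> float:
    """Estimate in-memory size in MB for a given number of gzipped bytes on disk."""
    # Assumption: 1GB is taken as 10^9 bytes.
    return gzipped_bytes / 1e9 * RAM_MB_PER_GZIPPED_GB


def estimate_dir_ram_mb(log_dir: str, pattern: str = "*.json.gz") -> float:
    """Sum the sizes of matching files under log_dir and estimate the RAM needed."""
    total_bytes = sum(p.stat().st_size for p in Path(log_dir).rglob(pattern))
    return estimate_ram_mb(total_bytes)


# 2GB of gzipped logs on disk comes out at about 1000MB in memory.
print(estimate_ram_mb(2_000_000_000))
```

Pointing estimate_dir_ram_mb at a directory of gzipped logs gives a ballpark figure to compare against the RAM you have available.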

Getting Started

Prerequisites

Requires Python >3.9, pip, and Git LFS to be installed.

First, clone the repository and download the datasets from Git LFS (Large File Storage).

git clone git@github.com:TracecatHQ/hunts.git
cd hunts
git lfs fetch
git lfs pull

Optionally create a new Python environment (for example with venv or conda), then install the required dependencies via pip install -r requirements.txt.

Finally, start JupyterLab by running jupyter lab, then open the aws_flaws.ipynb and aws_flaws_2.ipynb notebooks inside the notebooks directory.

Contact Us

Interested in our work bringing low-cost but powerful data engineering tools to cybersecurity? We'd love to hear your thoughts over email [email protected] or in the Tracecat Discord community!

License

MIT License
