Guidance for AI coding agents (Claude Code, Cursor, Copilot, etc.) working in panther-analysis. Human contributors should also find this useful — it consolidates the rules, conventions, and gotchas that keep the repo healthy.
This is the canonical source. Tool-specific entrypoints (CLAUDE.md, .cursorrules, etc.) should reference this file rather than duplicate its contents.
These are non-negotiable. Violating them creates security, legal, or release-process problems.
- Never commit customer data, customer names, internal hostnames, real user emails, real IPs, real account IDs, API keys, tokens, or any other sensitive information.
- Sample logs in unit tests must be redacted/synthesized. Use placeholder values like `123456789012` for AWS account IDs, `user@example.com` for emails, `192.0.2.x` (TEST-NET-1) for IPs.
- If you're adapting a real-world detection from an incident, scrub identifiers and rephrase any specifics that could fingerprint the source.
- Internal Panther context (Jira tickets, Slack threads, customer specifics) belongs in PR descriptions or commits only if it does not leak protected information — when in doubt, leave it out.
- The default working branch is `develop`. `main` is the released branch and is updated by Panther's release process.
- When opening a PR with `gh pr create`, pass `--base develop` explicitly. Do not rely on the GitHub default.
- Do not push directly to `develop` or `main`. Always go through a PR.
- All external contributors must sign the Contributor License Agreement before a PR can merge. CI will block unsigned contributions.
- If `make lint` or `make test` fails, fix the underlying issue. Do not add `# pylint: disable=...`, delete failing tests, or use `--no-verify` on commits to bypass hooks.
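The redaction conventions above can be captured directly in test fixtures. A minimal sketch (the field names are CloudTrail-flavored but purely illustrative):

```python
# A unit-test sample log built from the placeholder conventions above
SAMPLE_LOG = {
    "recipientAccountId": "123456789012",              # placeholder AWS account ID
    "sourceIPAddress": "192.0.2.44",                   # TEST-NET-1 range
    "userIdentity": {"userName": "user@example.com"},  # placeholder email
}
```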
| Path | Purpose |
|---|---|
| `rules/` | Streaming detection rules (analyze logs in real time) |
| `policies/` | Cloud resource configuration / compliance checks |
| `queries/` | Scheduled queries and signals for threat hunting |
| `correlation_rules/` | Multi-step / multi-signal attack patterns |
| `data_models/` | Field normalization across log sources (UDM-style) |
| `global_helpers/` | Reusable Python utilities (per-platform: `panther_aws_helpers`, `panther_okta_helpers`, etc.) |
| `lookup_tables/` | Reference data (CIDR ranges, account allowlists, etc.) |
| `packs/` | YAML manifests grouping detections for deployment |
| `templates/` | Starter templates for new detections — copy from here |
| `style_guides/` | Detailed style guides (read these) |
| `indexes/` | Auto-generated indexes — do not hand-edit |
| `deprecated.txt` | Tracks deleted detection IDs for customer cleanup |
Every detection is two files with the same basename:
- `foo_bar.py` — Python detection logic
- `foo_bar.yml` — metadata, configuration, and unit tests

The `.yml` `Filename:` field must match the `.py` filename exactly. Both files must be committed together.
```shell
make install                   # pipenv sync --dev
make install-pre-commit-hooks  # one-time
pipenv shell                   # activate venv
```

Always run all three locally before opening a PR. CI runs the same checks.

```shell
make fmt   # isort + black (line length 100)
make lint  # pylint + bandit + isort check + black check
make test  # global helper unit tests + data model unit tests + pat test
```

`pat` is the CLI used to test, validate, and upload detections. Always invoke it through pipenv.
| Task | Command |
|---|---|
| Run all detection tests | `pipenv run panther_analysis_tool test` |
| Test one directory | `pipenv run panther_analysis_tool test --path rules/aws_cloudtrail_rules/` |
| Test one rule by ID | `pipenv run panther_analysis_tool test --filter RuleID=AWS.CloudTrail.Example` |
| Run a single named test case | `pipenv run pat test --filter RuleID=<id> --test-names "Specific test name"` |
| Filter by severity | `pipenv run pat test --filter Severity=CRITICAL` (comma-separate for multiple: `High,Critical`) |
| Filter by log type | `pipenv run pat test --filter LogTypes=AWS.GuardDuty` |
| Filter by analysis type | `pipenv run pat test --filter AnalysisType=rule` (or `policy`, or `rule,policy`) |
| Enforce minimum coverage | `pipenv run pat test --minimum-tests 2` (requires both a true and false case) |
| Debug a single test (print/breakpoints work) | `pipenv run pat debug <RuleID> "<unit test name>"` |
| Validate against a live instance (required for correlation rules) | `pipenv run pat validate --api-token ... --api-host ...` |
| Build zip of detections | `pipenv run pat zip` |
| Upload to a Panther instance | `pipenv run pat upload --api-token ... --api-host ...` |
Files cannot be passed as test arguments — only metadata attributes (`RuleID`, `LogTypes`, etc.).
Gotchas:
- `pat test` runs unit tests defined in the `.yml` `Tests:` block. It does not make network calls and does not validate against a live Panther instance.
- Correlation rules cannot be fully tested with `pat test`. Use `pat validate` against a Panther instance — see `style_guides/CORRELATION_RULES_STYLE_GUIDE.md`.
- When iterating on a single rule, always scope with `--path` or `--filter`. The full test suite is large.
- `make test` already wraps `pat test` plus the helper/data-model unit tests — prefer it as the final gate.
If local pipenv is broken, `make docker-build && make docker-test` runs everything in a container.
Don't write from scratch. Copy the appropriate file from `templates/`:

- `templates/example_rule.py` + `example_rule.yml`
- `templates/example_policy.py` + `example_policy.yml`
- `templates/example_scheduled_rule.py` + `example_scheduled_rule.yml`
| Field | Notes |
|---|---|
| `AnalysisType` | `rule`, `policy`, `scheduled_rule`, `scheduled_query`, or `correlation_rule` |
| `Filename` | Must exactly match the `.py` filename |
| `RuleID` / `PolicyID` | Format: `LogFamily.LogType.DetectionName` (e.g. `AWS.CloudTrail.IAMCompromisedKeyQuarantine`). Globally unique. |
| `DisplayName` | Human-readable, title case |
| `Enabled` | Boolean |
| `LogTypes` (rules) / `ResourceTypes` (policies) | List |
| `Severity` | `Info`, `Low`, `Medium`, `High`, or `Critical` — see Alert Severity Guidelines |
| `Description`, `Runbook`, `Reference` | Strongly recommended; `Reference` should link to threat research, not generic API docs |
| `Tests` | Always at the bottom of the file; include positive AND negative cases |
`RuleID`, `Filename`, and `DisplayName` must be recognizably the same detection. Litmus test: given the `RuleID`, a reader should be able to guess the filename, and vice versa.
```yaml
DisplayName: "AWS Compromised IAM Key Quarantine"
RuleID: "AWS.CloudTrail.IAMCompromisedKeyQuarantine"
Filename: aws_iam_compromised_key_quarantine.py
```

```yaml
Reports:
  MITRE ATT&CK:
    - TA0006:T1556 # Modify Authentication Process
    - TA0006:T1556.006 # subtechnique format: TA####:T####.###
Tags:
  - Modify Authentication Process
```

A comment with the technique name on the same line is required. `make lint-mitre` validates the mapping.
Use safe field access. Never use `event['field']`.

```python
# Good
event.get("field", "")
event.deep_get("nested", "field", default="")

# Bad — raises KeyError when fields are missing
event["field"]
event["nested"]["field"]
```

`deep_get` is built into Panther's normalized event class. Don't import it from `panther_base_helpers` — call it as a method on `event`.
Always specify a default in `get`/`deep_get` to defend against missing fields.
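For intuition, the semantics of safe nested access can be sketched with a hand-rolled stand-in. This is only an illustration for local experimentation — real detections must call the `deep_get` method on Panther's event object, not a helper like this:

```python
# Rough local stand-in for deep_get semantics, for intuition only
def deep_get(obj, *keys, default=None):
    for key in keys:
        if not isinstance(obj, dict):
            return default
        obj = obj.get(key, default)
        if obj is None:
            return default
    return obj

event = {"userIdentity": {"type": "Root"}}
```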
Reuse existing `alert_context` helpers — check `global_helpers/` for one matching your log type before writing a new one. Extend rather than replace:

```python
from panther_aws_helpers import aws_rule_context

def alert_context(event):
    return aws_rule_context(event) | {"another_field": event.get("another_field", "")}
```

Use dynamic functions (`title`, `severity`, `dedup`, `description`, `reference`, `runbook`, `destinations`) when the alert should adapt to event content. See `templates/example_rule.py` for the full surface.
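As a sketch of the dynamic-function pattern, here is a hedged example of `title` and `severity` (the field names `userName` and `mfaUsed` are hypothetical, not from a real log schema):

```python
def title(event):
    # Dedup-friendly dynamic title: stable per user, varies across users
    return f"Suspicious console login for [{event.get('userName', '<unknown>')}]"

def severity(event):
    # Escalate when MFA was not used; otherwise fall back to the YAML severity
    return "CRITICAL" if event.get("mfaUsed") is False else "DEFAULT"
```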
- Include both a positive case (`ExpectedResult: true`) and a negative case (`ExpectedResult: false`).
- Cover edge cases: missing fields, empty values, malformed input.
- Use realistic but fully redacted sample logs — see §1.1.
- Place the `Tests:` block at the very bottom of the `.yml` file.
- Add a `Mocks:` block when the detection calls helpers that hit external state.
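A minimal `Tests:` block following these conventions might look like this sketch (the log shape is illustrative, not a real schema):

```yaml
Tests:
  - Name: Root login from public IP
    ExpectedResult: true
    Log:
      userIdentity: {type: Root}
      sourceIPAddress: 203.0.113.10
  - Name: IAM user from private IP
    ExpectedResult: false
    Log:
      userIdentity: {type: IAMUser}
      sourceIPAddress: 10.0.0.5
```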
- Python 3.11.
- Black, line length 100.
- isort with `--profile=black`.
- Pylint and bandit must pass.
- Type hints encouraged but not required for trivial detections.
- Keep comments minimal — explain why, not what. Well-named functions and clear logic beat narrative comments.
Panther also supports a YAML-only "Simple Detection" paradigm. Instead of a Python `rule()`, the `.yml` file contains a `Detection:` block of declarative match expressions — no `.py` file at all. Use this when the logic is purely a set of field comparisons; reach for Python when you need branching, lookups, or stateful caching.
```yaml
AnalysisType: rule
RuleID: AWS.RootAccount.PublicIPUsage
Enabled: true
LogTypes: [AWS.CloudTrail]
Severity: High
Detection:
  - KeyPath: userIdentity.type
    Condition: Equals
    Value: Root
  - KeyPath: sourceIPAddress
    Condition: IsIPAddressPublic
  - KeyPath: errorCode
    Condition: IsNull
AlertTitle: "Root account [{userIdentity.accountId}] used from public IP [{sourceIPAddress}]"
GroupBy:
  - KeyPath: sourceIPAddress
DedupPeriodMinutes: 60
Tests:
  - Name: Public IP root usage
    ExpectedResult: true
    Log: { ... }
```

- `KeyPath: foo` — top-level field
- `KeyPath: foo.bar.baz` — dot notation for nested fields
- `KeyPath: foo[*].bar` — wildcard array access
- `KeyPath: foo.bar[2]` — specific index
- `DeepKey: [foo, bar, baz]` — list form (use `KeyPath` for consistency)
- Key/Value — `KeyPath` + `Condition` + `Value`
- Key/Values — `KeyPath` + `Condition: IsIn` + `Values: [...]`
- Multi-key — compare two fields: `Condition: IsGreaterThan` + `Values: [{KeyPath: a}, {KeyPath: b}]`
- List comprehension — `Condition: AnyElement/AllElements/OnlyOneElement/NoElement` + nested `Expressions:`
- Existence — `Condition: Exists/DoesNotExist/IsNull/IsNotNull/IsNullOrEmpty`
- Absolute — `Condition: AlwaysTrue/AlwaysFalse`
- Equality: `Equals`, `DoesNotEqual`, `IEquals` (case-insensitive variants prefix `I`)
- String: `StartsWith`, `EndsWith`, `Contains` (+ `DoesNot…` and `I…` variants)
- Numeric: `IsGreaterThan`, `IsGreaterThanOrEqual`, `IsLessThan`, `IsLessThanOrEqual`
- IP address: `IsIPAddress`, `IsIPv4Address`, `IsIPv6Address`, `IsIPAddressPublic`, `IsIPAddressPrivate`, `IsIPAddressInCIDR`
- List membership: `IsIn`, `IsNotIn` (with `Values:`)
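A hypothetical fragment combining a few of these conditions (the field names and values are illustrative only):

```yaml
Detection:
  - KeyPath: eventName
    Condition: IsIn
    Values: [CreateUser, DeleteUser]
  - KeyPath: sourceIPAddress
    Condition: IsIPAddressInCIDR
    Value: "192.0.2.0/24"
```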
Default is All (AND). Explicit forms:

```yaml
Detection:
  - Any: # OR
      - {KeyPath: eventName, Condition: StartsWith, Value: List}
      - {KeyPath: eventName, Condition: StartsWith, Value: Describe}
  - All: [...] # AND (explicit)
  - OnlyOne: [...] # XOR
  - None: [...] # NOT AND
```

The same match-expression grammar drives `InlineFilters:` on Python rules — use it to filter events out before the Python `rule()` runs. Inline filters support a slightly reduced condition set (no list comprehension; see Panther docs).
- `Detection:` — the match-expression block (replaces `rule()`)
- `AlertTitle:` — title template with `{field}` interpolation (replaces `title()`)
- `AlertContext:` — list of key/value pairs to attach (replaces `alert_context()`)
- `GroupBy:` — list of `KeyPath`s used for dedup (replaces `dedup()`)
- `DynamicSeverities:` — list of `{ChangeTo, Conditions}` blocks (replaces `severity()`)
A "signal" is a rule that labels matching events with its `RuleID` but does not generate an alert. Useful for security-relevant audit events that other rules or correlation rules consume.
```yaml
AnalysisType: rule
RuleID: Panther.LoginSignal
LogTypes: [Panther.Audit]
Severity: Info
CreateAlert: false # <-- the key flag
Enabled: true
```

Conventions:
- `CreateAlert: false`
- `Severity: Info`
- For Python signals: only define `rule()` — skip `title`, `dedup`, `alert_context`, etc.
- Skip alert-related metadata (`DedupPeriodMinutes`, `Threshold`, `Runbook`).
- Reuse existing signals before creating new ones (especially for correlation rule subrules).
Most "alert when N distinct X within a window" detections do not need a manual cache. Use the built-in unique-value thresholding feature first, and reach for the DynamoDB cache only when state must persist beyond a single dedup window or the logic isn't expressible as "count distinct values."
Add a `unique(event) -> str` function to a Python rule. Panther applies the YAML `Threshold:` to the estimated count of distinct values returned by `unique()` within each `DedupPeriodMinutes` window, instead of the raw event count. The unique counter resets at the end of every dedup window automatically — no TTLs, no DynamoDB calls, no mocks.
Use it when you'd otherwise be tempted to write "track seen values in a string set":
- "5+ unique source IPs hitting the same user" → `unique()` returns `sourceIPAddress`, `Threshold: 5`, `dedup()` returns the username
- "Same MFA phone enrolled by multiple users" → `unique()` returns `user_id`, `dedup()` returns the phone number
- "User accessing many distinct workspaces" → `unique()` returns the workspace ID, `dedup()` returns the actor
```python
# rules/auth0_rules/auth0_same_phone_mfa_multiple_users.py — abbreviated
def rule(event):
    return (
        event.deep_get("data", "type") == "gd_enrollment_complete"
        and event.deep_get("data", "description") == "Guardian - Enrollment complete (sms)"
        and bool(event.deep_get("data", "details", "authenticator", "phone_number"))
    )

def unique(event):
    return event.deep_get("data", "user_id", default="")

def dedup(event):
    return str(event.deep_get("data", "details", "authenticator", "phone_number", default=""))
```

With `Threshold: 2` in the YAML, this fires when ≥2 distinct `user_id`s enroll the same phone within `DedupPeriodMinutes`.
Rules of thumb:
- `unique()` returns the field whose cardinality you're thresholding. `dedup()` (or `GroupBy:`) returns the field that groups the alert — what stays constant.
- The standard `Threshold:` field now means "distinct unique-values," not raw event count.
- Counts are an estimate (HyperLogLog-style) — fine for "≥ N" detections, not for exact accounting.
- Minimum `DedupPeriodMinutes` is 5 (API/CLI) or 15 (Console UI); the unique counter resets each window.
- Existing examples in the repo: `auth0_same_phone_mfa_multiple_users.py`, `databricks_access_to_multiple_workspaces.py`, `k8s_secret_enumeration.py`, `snowflake_stream_password_spray.py`.
- `unique()` is Python-only — Simple Detections cannot use it.
- `unique()` works in `pat test` without mocks; no cache stub required.
Use the manual cache only for state that `unique()` cannot express:
- Persistence beyond a single dedup window (e.g. "first time we've ever seen this value").
- Counters with custom reset logic or arithmetic.
- Cross-rule shared state.
- Storing structured values (not just distinct-value counts).
Caching only works in the Panther Console — local `pat test` requires mocked cache calls (see §8.2). Helpers live in `panther_detection_helpers.caching`.
```python
from panther_detection_helpers.caching import add_to_string_set, get_string_set

def rule(event):
    if event.get("eventName") != "AssumeRole":
        return False
    role_arn = event.deep_get("requestParameters", "roleArn")
    if not role_arn:
        return False
    key = f"{role_arn}-UniqueSourceIPs"
    ip = event.get("sourceIPAddress", "")
    seen = get_string_set(key)
    if ip not in seen:
        # Record the IP so the same new IP doesn't re-alert on every event
        add_to_string_set(key, ip, epoch_seconds=event.event_time_epoch() + 7 * 24 * 3600)
    if not seen:
        # First sighting of this role establishes the baseline — no alert
        return False
    return ip not in seen
```

APIs: `get_string_set`, `put_string_set`, `add_to_string_set`, `remove_from_string_set`, `reset_string_set`.
Note: this exact "alert on a never-before-seen IP" pattern is not a `unique()` use case (`unique()` resets per window; here we want indefinite memory).
```python
from panther_detection_helpers.caching import increment_counter, reset_counter, set_key_expiration

def rule(event):
    if event.get("errorCode") != "AccessDenied":
        return False
    key = f"{event.deep_get('userIdentity', 'arn')}-AccessDeniedCounter"
    count = increment_counter(key)
    if count == 1:
        set_key_expiration(key, event.event_time_epoch() + 3600)
    if count >= 10:
        reset_counter(key)
        return True
    return False
```

APIs: `get_counter`, `increment_counter`, `reset_counter`, `set_key_expiration`.
Note: a plain "≥10 access-denied events from the same ARN per hour" rule should usually use `Threshold: 10` + `DedupPeriodMinutes: 60` + `GroupBy: userIdentity.arn` — no cache needed. Reach for `increment_counter` only when you need custom reset behavior.
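As a sketch, the declarative alternative from this note is just a metadata fragment (values mirror the counter example; all other YAML fields omitted):

```yaml
# Declarative threshold alternative to the counter rule above (fragment only)
Threshold: 10
DedupPeriodMinutes: 60
```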
- Default TTL is 90 days — don't rely on it; set explicit expirations.
- Use `event.event_time_epoch()` for TTL math, not `time.time()` — replayed or delayed events would otherwise expire incorrectly.
- Don't put timestamps in cache keys — they break reproducibility.
- Reaching for the cache when `Threshold` + `unique()` would do the job.
- Calling cache APIs before checking event relevance — wastes latency.
- Adding to a string set before checking for the value already being present — corrupts the dedup logic.
- Forgetting TTL — leads to unbounded cache growth.
- Forgetting to mock cache calls in unit tests (see §8.2).
When `--minimum-tests 2` (or higher) is enforced, every detection must have:
- ≥ N tests
- ≥ 1 test that returns `true`
- ≥ 1 test that returns `false`
Use the `Mocks:` block to stub out cache calls or external helpers in unit tests:

```yaml
Tests:
  - Name: Hits the threshold
    ExpectedResult: true
    Mocks:
      - objectName: get_counter
        returnValue: 10
      - objectName: increment_counter
        returnValue: 11
    Log: { ... }
```

Use real-looking field shapes from actual log schemas — but fully redact identifiers (§1.1). Use `123456789012` for AWS account IDs, `192.0.2.x` for IPs, `user@example.com` for emails.
The Python detection runtime ships with a small set of third-party libraries pre-installed:
- `jsonpath-ng` — JSONPath queries
- `policyuniverse` — AWS ARN and IAM policy parsing
- `requests` — HTTP
Don't add new runtime dependencies casually — they require platform changes.
- `event.deep_walk("a", "b", "c")` — walks through arrays of dicts, returning a flattened list of values at the leaf path. Useful when intermediate nodes are lists.
- `event.event_time_epoch()` — the event's normalized timestamp in epoch seconds. Use this for cache TTLs.
- `p_*` fields — Panther-added metadata (e.g. `p_log_type`, `p_event_time`). `p_any_*` fields contain extracted indicators (IPs, domains, etc.).
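For intuition, the fan-out behavior of `deep_walk` can be sketched with a rough local stand-in. This is an illustration only — the real method lives on Panther's event object and may differ in details:

```python
# Rough illustration of deep_walk semantics: descend through dicts,
# fan out over lists, and collect the leaf values into a flat list.
def deep_walk(obj, *keys, default=None):
    if not keys:
        return obj
    if isinstance(obj, list):
        results = []
        for item in obj:
            value = deep_walk(item, *keys, default=default)
            if isinstance(value, list):
                results.extend(value)
            elif value is not None:
                results.append(value)
        return results
    if isinstance(obj, dict):
        return deep_walk(obj.get(keys[0]), *keys[1:], default=default)
    return default

event = {"records": [{"user": {"name": "a"}}, {"user": {"name": "b"}}]}
```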
- Order conditions in `rule()` by selectivity (most restrictive first) to leverage Python short-circuiting.
- Return early — exit `rule()` as soon as a precondition fails.
- Don't implement thresholds in Python; use the `Threshold:` YAML field. Panther aggregates.
- Don't make `title()` so unique that it fragments alerts — dedup depends on it.
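The first two points above can be sketched together. A hedged example using CloudTrail-style fields (treat the exact field names as illustrative):

```python
def rule(event):
    # Cheapest, most selective check first: most events exit here
    if event.get("eventName") != "ConsoleLogin":
        return False
    # Return early on each remaining precondition
    if event.get("responseElements", {}).get("ConsoleLogin") != "Failure":
        return False
    return True
```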
| Window | When to use |
|---|---|
| 15 min | High-frequency events (login failures, API errors) |
| 60 min | Standard security events (privilege changes, data access) — default |
| 180 min | Compliance-style events |
| 720 min | Low-frequency events (account creation) |
| 1440 min | Rare events (root account usage) |
Read `style_guides/CORRELATION_RULES_STYLE_GUIDE.md`. Highlights:
- Files live in `correlation_rules/`. Subrules and signals go in the appropriate logtype directory (e.g. `rules/aws_cloudtrail_rules/`).
- Test with `pat validate` against a live Panther instance — `pat test` cannot fully exercise correlation logic.
- Reuse existing signal rules (e.g. `AWS Console Login`); don't duplicate.
- Sequence/group/transition IDs should be meaningful (`GHASChange`, not `TR.1`).
- When writing transition descriptions inside an ID, capitalize the verb: `"GitHub Advanced Security Change NOT FOLLOWED BY repo archived"`.
- Strip boilerplate UI-template comments (e.g. `# Create a list of rules to correlate`) before committing.
- `MinMatchCount: 1` is the default — omit it.
- `LookbackWindowMinutes` should be ≥ 1.5× `RateMinutes`.
- `Match On` fields must be scalar; for cross-LogType matches, project transformed values into `p_alert_context` from each subrule.
The Panther MCP server (docs) exposes a Panther deployment as an MCP toolset to AI clients (Claude Code, Cursor, Claude Desktop, Goose). When configured, agents can query the data lake, inspect schemas, list/get detections, look up alerts, and read global helpers from a real Panther instance — turning detection authoring from "write blindly against docs" into "ground every choice in actual data."
If your client already has Panther MCP tools available (look for tool names prefixed with `mcp__panther-…__`), prefer them over guessing field names or fabricating sample logs.
| Situation | MCP tool(s) |
|---|---|
| "What fields exist on log type X?" — before writing `event.deep_get(...)` | `get_log_type_schema_details`, `list_log_type_schemas` |
| "Show me a real event for this log type" — for realistic test cases (then redact!) | `query_data_lake` |
| "What table/column do I query in the data lake?" | `list_databases`, `list_database_tables`, `get_table_schema` |
| Check whether a similar detection already exists | `list_detections`, `get_detection` |
| Reuse an existing helper instead of writing a new one | `list_global_helpers`, `get_global_helper` |
| Validate a scheduled-query rule against real data | `query_data_lake`, `get_scheduled_query` |
| Confirm a data model normalization | `list_data_models`, `get_data_model` |
| Investigate an alert that motivated the detection | `get_alert`, `get_alert_events`, `summarize_alert_events` |
| Triage existing alert volume / FP rate before tuning severity | `list_alerts`, `get_rule_alert_metrics`, `get_severity_alert_metrics` |
A good MCP-augmented authoring flow:
- Schema first. `get_log_type_schema_details` for the target log type — confirm exact field paths and casing before writing any `event.deep_get(...)`.
- Sample real events. `query_data_lake` with a tight `LIMIT` to pull a handful of representative rows. Use them to shape `rule()` logic and as the basis for unit tests.
- Check for prior art. `list_detections --filter LogTypes=...` and `list_global_helpers` so you don't duplicate existing rules or re-implement an existing helper.
- Author the detection in `panther-analysis` (Python + YAML) using the verified field names.
- Sanity-check the alert volume. Once deployed, `get_rule_alert_metrics` shows whether the rule is firing too often / not at all.
- Public repo discipline still applies. Anything pulled from `query_data_lake` is real customer/tenant data. Never paste raw query results into a unit test, commit, PR description, or comment. Redact account IDs, emails, IPs, hostnames, and identifiers before they leave the MCP tool result — see §1.1.
- Read-only by default. For authoring work, scope your API token to read-only (`Query Data Lake`, `View Rules`, `Read Alerts`). Don't grant `Manage Rules` or write scopes unless you specifically need them, and never to a token used by an autonomous agent.
- Production vs. demo instances. If you have multiple Panther environments wired up (e.g. `mcp__panther-tr__*` and `mcp__panther-aod__*`), be deliberate about which one you query — production data is more sensitive and rate limits matter.
- `pat` is still the source of truth for tests. MCP can fetch a live detection via `get_detection`, but the canonical version lives in this repo. Don't "edit on the live instance" — change the file, run `make test`, and let the upload pipeline propagate.
- Data-lake queries cost money and time. Add `LIMIT` clauses, narrow the time range with `p_event_time` predicates, and avoid `SELECT *` on wide tables.
If the MCP server isn't already configured:
- Mint an API token in your Panther instance (Settings → API Tokens) with the minimum scopes you need.
- Install via Docker or `uvx` per the README. Required env vars: `PANTHER_INSTANCE_URL`, `PANTHER_API_TOKEN`.
- Register the server in your MCP client (e.g. `~/.claude.json` for Claude Code, `.cursor/mcp.json` for Cursor).
- Verify with a low-impact call like `list_log_type_schemas` before granting broader permissions.
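For Claude Code, a registration entry might look roughly like this sketch (the command and args are assumptions; follow the `mcp-panther` README for the exact invocation):

```json
{
  "mcpServers": {
    "panther": {
      "command": "uvx",
      "args": ["mcp-panther"],
      "env": {
        "PANTHER_INSTANCE_URL": "https://example.runpanther.net",
        "PANTHER_API_TOKEN": "<redacted>"
      }
    }
  }
}
```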
- Branch from `develop`. Name branches descriptively (e.g. `aws-rds-instance-public-access`).
- Commit `.py` and `.yml` together — never split a detection across PRs.
- Run `make fmt && make lint && make test` locally.
- Open the PR against `develop` (`gh pr create --base develop ...`).
- Use the PR template: Background, Changes, Testing.
- Wait for CODEOWNERS review. If you have merge perms, merge after approval; otherwise comment requesting a code owner merge.
Don't:
- Don't open PRs against `main`.
- Don't bundle unrelated detections into one PR.
- Don't include the giant `panther-analysis-*.zip` artifacts at the repo root in your diff (they are build outputs).
- Don't hand-edit files under `indexes/` — they're generated.
- Add the deleted detection's `RuleID`/`PolicyID` to `deprecated.txt` so customers can run `make remove-deprecated` to drop it from their instances.
- Run `make check-deprecated` (Panther internal) to validate the file.
- Tag retained-but-discouraged detections with the `Deprecated` tag rather than deleting them outright when downstream users may still rely on them.
- `event['field']` will crash on missing fields. Always use `event.get`/`event.deep_get` with a default.
- Importing `deep_get` from `panther_base_helpers` is unnecessary and discouraged — it's a method on `event`.
- `Filename:` mismatches between `.yml` and the actual `.py` filename will fail `pat test` with a confusing error.
- Severity casing in `pat test` `--filter` arguments is uppercase (`Severity=CRITICAL`) but in YAML metadata it's title case (`Severity: Critical`). Both are accepted by `pat`.
- Real PII in test logs is the most common review blocker for community PRs. Redact before pushing.
- Forgetting `--base develop` silently retargets the PR at `main`. Check the base branch on the PR page after creation.
- Large refactors of `global_helpers/` can break dozens of detections. Run the full `make test` suite, not a filtered subset.
- Correlation rule tests pass `pat test` but fail at upload because real validation requires a live instance — use `pat validate`.
- Adding new helper modules requires adding corresponding `*_test.py` files; `make global-helpers-unit-test` enforces this.
- `style_guides/STYLE_GUIDE.md` — full Python/metadata style guide
- `style_guides/CORRELATION_RULES_STYLE_GUIDE.md` — correlation-rule specifics
- `style_guides/RUNBOOK.md` — runbook authoring guidance
- `CONTRIBUTING.md` — the human contributor flow (CLA, PR process)
- `templates/` — starter detections
- Panther Detections docs
- Panther MCP server docs and the `mcp-panther` repo — live-instance tooling for AI clients
- Anatomy of a High Quality SIEM Rule