Commit c72030c

Merge pull request #62 from timescale/feat/lynxdb-backend
feat(convert): add LynxDB backend
2 parents 19a204d + b7f9755 commit c72030c

25 files changed

Lines changed: 1437 additions & 7 deletions

README.md

Lines changed: 13 additions & 3 deletions
@@ -173,6 +173,15 @@ rsigma convert rules/ -t postgres -O table=okta_events -O json_field=data -O tim
 # Sliding window correlation format (per-row detection using window functions)
 rsigma convert rules/ -t postgres -f sliding_window

+# Convert to LynxDB search queries
+rsigma convert rules/ -t lynxdb
+
+# Convert to LynxDB with a pipeline (custom index)
+rsigma convert rules/ -t lynxdb -p pipeline.yml
+
+# LynxDB minimal format (search expression only, for the API q parameter)
+rsigma convert rules/ -t lynxdb -f minimal
+
 # List available conversion backends
 rsigma list-targets

@@ -260,7 +269,7 @@ From there, the AST can go in three directions depending on what you need:

 - **Evaluation:** `rsigma-eval` compiles rules into optimized matchers (`compiler.rs`), runs stateless detection through `Engine`, and tracks stateful correlation (`correlation.rs`: sliding windows, group-by, chaining, suppression) across events. Processing pipelines handle field mapping, transformations, conditions, and finalizers before compilation. Events are accessed through a trait with implementations for JSON, key-value, and plain text.

-- **Conversion:** `rsigma-convert` transforms rules into backend-native query strings through a pluggable `Backend` trait. A condition walker traverses the AST and delegates to the backend for each node. `TextQueryConfig` exposes ~90 configuration fields for text-based backends. The PostgreSQL/TimescaleDB backend is the primary concrete implementation, generating SQL for historical threat hunting.
+- **Conversion:** `rsigma-convert` transforms rules into backend-native query strings through a pluggable `Backend` trait. A condition walker traverses the AST and delegates to the backend for each node. `TextQueryConfig` exposes ~90 configuration fields for text-based backends. Concrete implementations include PostgreSQL/TimescaleDB (SQL for historical threat hunting) and LynxDB (SPL2-compatible search queries for log analytics).

 - **Editor support:** `rsigma-lsp` provides an LSP server over stdio (via `tower-lsp`) with real-time diagnostics (lint + parse + compile errors), completions, hover documentation, document symbols, and code actions. Works with VSCode, Neovim, Helix, Zed, and any LSP-capable editor.

@@ -320,8 +329,9 @@ Feature-gated items are marked with \* in the diagram.
 │ │ │ backends/ ──> │ │ Helix, Zed, ... │
 │ correlation.rs ──> │ │ TextQueryTest, │ └────────────────────┘
 │ sliding windows, │ │ PostgreSQL/ │
-│ group-by, chaining, │ │ TimescaleDB │
-│ suppression, events │ └─────────────────────┘
+│ group-by, chaining, │ │ TimescaleDB, │
+│ suppression, events │ │ LynxDB │
+│ │ └─────────────────────┘
 │ │
 │ rsigma.* custom │
 │ attributes │

crates/rsigma-cli/src/commands/convert.rs

Lines changed: 3 additions & 1 deletion
@@ -11,13 +11,14 @@ fn get_backend(
         "postgres" | "postgresql" | "pg" => {
             Box::new(rsigma_convert::backends::postgres::PostgresBackend::from_options(options))
         }
+        "lynxdb" => Box::new(rsigma_convert::backends::lynxdb::LynxDbBackend::new()),
         "test" => Box::new(rsigma_convert::backends::test::TextQueryTestBackend::new()),
         "test_mandatory_pipeline" => {
             Box::new(rsigma_convert::backends::test::MandatoryPipelineTestBackend::new())
         }
         _ => {
             eprintln!("Unknown target: {target}");
-            eprintln!("Available targets: postgres, test");
+            eprintln!("Available targets: postgres, lynxdb, test");
             process::exit(1);
         }
     }
@@ -100,6 +101,7 @@ pub(crate) fn cmd_convert(
 pub(crate) fn cmd_list_targets() {
     println!("Available conversion targets:");
     println!(" postgres - PostgreSQL/TimescaleDB (aliases: postgresql, pg)");
+    println!(" lynxdb - LynxDB log analytics engine");
     println!(" test - Backend-neutral test backend");
 }

crates/rsigma-cli/tests/cli_convert.rs

Lines changed: 3 additions & 2 deletions
@@ -217,7 +217,7 @@ fn convert_invalid_target() {
     assert!(!output.status.success());
     assert_snapshot!(String::from_utf8_lossy(&output.stderr), @"
     Unknown target: nonexistent_backend
-    Available targets: postgres, test
+    Available targets: postgres, lynxdb, test
     ");
 }

@@ -294,6 +294,7 @@ fn list_targets() {
     assert_snapshot!(String::from_utf8_lossy(&output.stdout), @"
     Available conversion targets:
     postgres - PostgreSQL/TimescaleDB (aliases: postgresql, pg)
+    lynxdb - LynxDB log analytics engine
     test - Backend-neutral test backend
     ");
 }
@@ -324,6 +325,6 @@ fn list_formats_invalid_target() {
     assert!(!output.status.success());
     assert_snapshot!(String::from_utf8_lossy(&output.stderr), @"
     Unknown target: nonexistent
-    Available targets: postgres, test
+    Available targets: postgres, lynxdb, test
     ");
 }

crates/rsigma-convert/README.md

Lines changed: 99 additions & 1 deletion
@@ -17,13 +17,15 @@ The crate provides a generic conversion framework that any backend can plug into
 - **Deferred expressions** through the `DeferredExpression` trait and `DeferredTextExpression` for backends that need post-query appendages (e.g. Splunk `| regex`, `| where`).
 - **Test backend** with `TextQueryTestBackend` and `MandatoryPipelineTestBackend` for backend-neutral foundation testing.
 - **PostgreSQL/TimescaleDB backend** with native `ILIKE`, regex (`~*`), CIDR (`inet`/`cidr`), full-text search (`tsvector`/`tsquery`), JSONB field access, correlation via CTEs and window functions, and TimescaleDB-specific output formats (continuous aggregates, `time_bucket` queries, view generation).
+- **LynxDB backend** generating SPL2-compatible `FROM <index> | search ...` queries with glob wildcards, deferred `| where` clauses for regex and CIDR, `CASE()` case-sensitive matching, and correct parenthesization for LynxDB's non-standard boolean precedence (`NOT > OR > AND`).

 ## Backends

 | Backend | Target names | Description |
 |---------|-------------|-------------|
 | Test | `test` | Backend-neutral text queries for foundation testing |
 | PostgreSQL | `postgres`, `postgresql`, `pg` | Native PostgreSQL SQL with TimescaleDB support |
+| LynxDB | `lynxdb` | SPL2-compatible search queries for LynxDB log analytics engine |

 ## Usage

@@ -89,6 +91,61 @@ for result in &output.queries {
 }
 ```

+### LynxDB backend
+
+```rust
+use rsigma_parser::parse_sigma_yaml;
+use rsigma_convert::{convert_collection, Backend};
+use rsigma_convert::backends::lynxdb::LynxDbBackend;
+
+let yaml = r#"
+title: Detect Whoami
+logsource:
+    category: process_creation
+    product: windows
+detection:
+    selection:
+        CommandLine|contains: 'whoami'
+    condition: selection
+level: medium
+"#;
+
+let collection = parse_sigma_yaml(yaml).unwrap();
+let backend = LynxDbBackend::new();
+
+let output = convert_collection(&backend, &collection, &[], "default").unwrap();
+for result in &output.queries {
+    for query in &result.queries {
+        println!("{query}");
+        // Output: FROM main | search CommandLine=*"whoami"*
+    }
+}
+```
+
+### LynxDB output formats
+
+| Format | Description |
+|--------|-------------|
+| `default` | Full query with index prefix: `FROM main \| search ...` |
+| `minimal` | Search expression only (no `FROM` prefix), useful for the LynxDB API `q` parameter |
+
+### LynxDB index selection
+
+The target index defaults to `main`. Set it via pipeline state:
+
+```yaml
+# In a pipeline YAML
+transformations:
+  - type: set_state
+    key: index
+    value: security_logs
+```
+
+```bash
+rsigma convert -r rules/ -t lynxdb -p pipeline.yml
+# Output: FROM security_logs | search ...
+```
+
 ### PostgreSQL output formats

 | Format | Description |
@@ -231,7 +288,7 @@ For text-based query backends (the vast majority), create a `TextQueryConfig` wi
 3. Override specific methods for backend-specific behavior (e.g. deferred regex for Splunk, SQL-specific CIDR handling for PostgreSQL).
 4. Register your backend in the CLI's `get_backend()` registry.

-See `backends/test.rs` for a complete reference implementation and `backends/postgres.rs` for a production backend with SQL-specific overrides.
+See `backends/test.rs` for a complete reference implementation, `backends/postgres.rs` for a production backend with SQL-specific overrides, and `backends/lynxdb/` for a `TextQueryConfig`-based backend with deferred expressions and custom precedence handling.

 ## PostgreSQL Backend Details

@@ -308,6 +365,47 @@ This matches the nested traversal behavior of the evaluation engine (`rsigma-eva
 rsigma convert -r rules/ -t postgres -O table=okta_events -O json_field=data -O timestamp_field=time
 ```

+## LynxDB Backend Details
+
+The LynxDB backend (`LynxDbBackend`) generates SPL2/Lynx Flow queries for the [LynxDB](https://github.com/proximax-storage/lynxdb) log analytics engine. It produces `FROM <index> | search <predicates>` queries with deferred `| where` clauses for operations that LynxDB's search syntax does not natively support.
+
+| Sigma Modifier | LynxDB Query |
+|----------------|-------------|
+| `contains` | `field=*"value"*` |
+| `startswith` | `field="value"*` |
+| `endswith` | `field=*"value"` |
+| `re` | `\| where field=~"pattern"` (deferred) |
+| `cidr` | `\| where cidrmatch("cidr", field)` (deferred) |
+| `cased` (exact) | `field=CASE("value")` |
+| wildcards (`*`) | `field="va*lue"` (glob) |
+| wildcards (`?`) | `\| where field=~"va.lue"` (deferred, converted to regex) |
+| `exists` | `field=*` |
+| `null` | `NOT field=*` |
+| keywords | `"value"` (unbound search) |
+
+### Boolean precedence
+
+LynxDB's parser uses non-standard boolean operator precedence: `NOT > OR > AND`. This differs from most query languages where AND binds tighter than OR. The backend explicitly parenthesizes AND groups to preserve Sigma's intended logic:
+
+```
+Sigma: A AND B OR C (intended: (A AND B) OR C)
+Query: (A AND B) OR C (explicit parens prevent misparse as A AND (B OR C))
+```
+
+### Deferred expressions
+
+Regex patterns, CIDR matches, and single-character wildcard (`?`) patterns cannot be expressed in LynxDB's `search` syntax and are instead emitted as `| where` pipeline stages appended after the search clause:
+
+```
+FROM main | search status=500 | where Path=~"/api/.*"
+```
+
+When a detection contains only deferred expressions, the search clause uses `*` (match all) followed by the deferred stages:
+
+```
+FROM main | search * | where SourceIP=~"^10\.0\." | where cidrmatch("192.168.1.0/24", DestIP)
+```
+
 ## License

 MIT License.
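
Review note: the deferred-expression behaviour described in the README hunk above (a `| where` stage for each deferred clause, and `search *` when nothing remains for the search clause) can be sketched as a small assembly step. `assemble_query` is a hypothetical helper written for illustration; it is not the backend's actual finalization code, which this diff does not show.

```rust
/// Assemble a default-format query string from its parts.
/// `search` is the already-rendered search expression, if any;
/// `deferred` holds rendered conditions that must run as `| where` stages.
fn assemble_query(index: &str, search: Option<&str>, deferred: &[String]) -> String {
    // Fall back to match-all when every condition was deferred.
    let mut query = format!("FROM {index} | search {}", search.unwrap_or("*"));
    for stage in deferred {
        query.push_str(" | where ");
        query.push_str(stage);
    }
    query
}
```

Passing `None` for `search` together with two deferred clauses reproduces the shape of the second README example, with one `| where` stage per deferred clause in the order given.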
