Description
A Lucene field query whose only wildcard is a trailing * semantically represents a prefix match. For example:
app:frontend.*
This should match values where app starts with the string frontend. (trailing dot included).
When HyperDX renders this kind of Lucene query to ClickHouse SQL, it appears to compile to a contains-style predicate instead of a prefix predicate, e.g.:
app ILIKE '%frontend.%'
For ClickHouse-backed log tables, this is much slower than preserving prefix semantics with something like:
startsWith(app, 'frontend.')
-- or
app LIKE 'frontend.%'
Why this matters
On large ClickHouse log tables ordered by a low-cardinality service/application field plus timestamp, prefix predicates can significantly reduce the scan range. Compiling a suffix-only Lucene wildcard into a contains predicate prevents effective primary key / ordering pruning and can make common saved searches much slower.
Example
Lucene query:
app:frontend.* AND environment:prod
Expected ClickHouse rendering:
startsWith(app, 'frontend.') AND environment = 'prod'
or:
app LIKE 'frontend.%' AND environment = 'prod'
Actual behavior appears to be closer to:
app ILIKE '%frontend.%' AND environment = 'prod'
Suggested behavior
When a Lucene field term has only a trailing wildcard, e.g. field:value*, render it as a ClickHouse prefix predicate instead of a contains predicate.
Leading wildcard queries such as field:*value or contains wildcard queries such as field:*value* can continue to use contains-style predicates.
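A minimal sketch of the suggested rule, in TypeScript. The function names below are hypothetical and are not taken from the HyperDX codebase. The sketch assumes the field has already been resolved to a safe column identifier, and it deliberately returns null for anything that is not a plain trailing-wildcard term (leading wildcards, inner wildcards, ? wildcards, Lucene backslash escapes) so the caller can keep its existing rendering for those cases.

```typescript
// Hypothetical helpers for illustration only; not HyperDX's actual API.

// Escape a value for use inside a single-quoted ClickHouse string literal.
function escapeSqlString(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

// Returns a prefix predicate for terms like `value*`, or null when the term
// is not a trailing-wildcard-only term and the caller should fall back.
function renderTrailingWildcard(field: string, term: string): string | null {
  // Needs at least one literal character followed by a trailing `*`.
  if (term.length < 2 || !term.endsWith("*")) return null;

  const prefix = term.slice(0, -1);

  // Any other wildcard or a backslash escape in the remainder is out of
  // scope for this sketch, so leave it to the existing code path.
  if (/[*?\\]/.test(prefix)) return null;

  return `startsWith(${field}, '${escapeSqlString(prefix)}')`;
}

// Example: renderTrailingWildcard("app", "frontend.*")
// gives the predicate startsWith(app, 'frontend.')
```

Using startsWith rather than LIKE avoids having to escape % and _ in the user-supplied prefix. One behavioral difference to weigh: startsWith is case-sensitive, whereas the ILIKE rendering shown above is case-insensitive.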
Notes
This is not about implicit full-text search. It is specifically about explicit field queries with a suffix-only wildcard.