[DO NOT MERGE] ESQL: PoC for name qualifiers#126531
[DO NOT MERGE] ESQL: PoC for name qualifiers#126531alex-spies wants to merge 32 commits intoelastic:mainfrom
Conversation
| basic | ||
| required_capability: name_qualifiers | ||
|
|
||
| FROM sample_data main |
There was a problem hiding this comment.
We may want to require the AS keyword, at least in the beginning - if we allow qualifiers in the FROM clause at all:
FROM sample_data AS main
| required_capability: name_qualifiers | ||
|
|
||
| ROW main x = 2 | ||
| | EVAL main y = main x * 3, other x = main y - 5 |
There was a problem hiding this comment.
Should users be allowed to assign qualifiers in EVAL, RENAME or ROW? If so, should attributes with qualifiers still shadow existing attributes on name conflict, or should the query just fail?
| | SORT event_duration | ||
| | LIMIT 1 | ||
| | RENAME message AS main message | ||
| | KEEP main message, event_duration |
There was a problem hiding this comment.
Should referring to qualified names use a whitespace as separator, like here?
I.e. should it be KEEP main message, ... or rather KEEP main->message with some kind of qualifier separator like ->?
We already know that using . as separator is highly ambiguous, even though KEEP main.message looks the cleanest due to similarity to SQL. I don't think using . is realistic due to a high amount of ambiguous edge cases.
| // TODO: Should ON without predicate even be allowed with qualifiers? | ||
| | LOOKUP JOIN languages_lookup right ON language_code |
There was a problem hiding this comment.
Should the LOOKUP JOIN ... ON some_attribute syntax be allowed together with qualifiers? It seems like it can be, but we have to be clear if the right hand side join key gets included into the output or not. In this PoC, it's not included.
| FROM employees main | ||
| | KEEP main emp_no, main languages | ||
| | WHERE main emp_no == 10001 | ||
| | LOOKUP JOIN languages_lookup right ON main languages == right language_code |
There was a problem hiding this comment.
Enabling this syntax is one of the aims of this PR.
|
|
||
| FROM employees main | ||
| | KEEP main emp_no, main languages | ||
| | WHERE main emp_no == 10001 |
There was a problem hiding this comment.
Should it be allowed to omit the qualifier when it's not ambiguous?
This should be possible in principle; we need to carefully examine and decide how to deal in case of ambiguities, e.g. when there is both an attribute called field (no qualifier) and another one called qualified field (qualifier present).
| FROM employees | ||
| | KEEP emp_no, languages | ||
| | WHERE emp_no == 10001 | ||
| | LOOKUP JOIN languages_lookup right ON languages == right language_code |
There was a problem hiding this comment.
This could be a common situation: fields from the main index (-pattern) have no qualifier, but we consider this fine.
| FROM employees main | ||
| | KEEP main emp_no, main languages | ||
| | WHERE main emp_no == 10001 | ||
| | LOOKUP JOIN languages_lookup ON main languages == language_code |
There was a problem hiding this comment.
Should the ON ... == ... syntax be allowed when we don't assign a qualifier to the lookup fields?
| // TODO: Ambiguity in case that the qualifier is METADATA? Should we use the AS keyword? | ||
| indexPatternAndMetadataFields: | ||
| indexPattern (COMMA indexPattern)* metadata? | ||
| indexPattern (COMMA indexPattern)* qualifier=indexString? metadata? |
There was a problem hiding this comment.
The main grammar changes: index patterns can have qualifiers (assigned to the attributes from the index); attribute names can have qualifiers when referring to them or defining them in EVAL.
| shadowing of previous attributes is not allowed: | ||
| - do: | ||
| catch: /qualified attribute \[main data\] already exists/ | ||
| esql.query: | ||
| body: | ||
| query: 'from test main | eval main data = 10' | ||
| --- | ||
| internal shadowing is not allowed: | ||
| - do: | ||
| catch: /qualified attribute \[qualified y\] already exists/ | ||
| esql.query: | ||
| body: | ||
| query: 'row x = 0 | eval qualified y = 1, qualified y = 2 * qualified y' | ||
| --- | ||
| shadowing in rename is not allowed: | ||
| - do: | ||
| catch: /qualified attribute \[qualified y\] already exists/ | ||
| esql.query: | ||
| body: | ||
| query: 'row x = 0, qualified y | rename x as qualified y' |
There was a problem hiding this comment.
This is not implemented yet, but this could be the behavior if we disallow shadowing of attributes when they have qualifiers.
|
To get this to production-readiness, we'd need to split this up into smaller portions. Then there's a bit more work to do:
|
| required_capability: name_qualifiers | ||
|
|
||
| FROM sample_data main | ||
| | SORT mian event_duration |
There was a problem hiding this comment.
@alex-spies This seems like a typo, why is this not a syntax error? SORT main event_duration
mian => main
There was a problem hiding this comment.
Correct! Test is disabled right now (see the -Ignore suffix) and was intended as an exploration of how KEEP with qualifiers + wildcards may look like.
This PoC aims to demonstrate how qualifiers could work in ES|QL. In particular it enables the following syntax and makes it work in a csv test:
The added csv tests are probably the most important thing to look at; otherwise, this PR is quite noisy due to the grammar change + required plumbing.
Except for 3 ignored tests, the newly added csv tests pass in single node scenarios; multi-node scenarios fail, but only because I didn't update some serialization code, yet. There are 3 yaml tests that demonstrate how name conflicts with qualifiers would make queries invalid, but the validation is not yet in place.
I'm adding a couple of remarks by highlighting the relevant parts of the added tests.