Skip to content

fix: Implement AggregateUDFImpl::reverse_expr for StringAgg #17165

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nuno-faria
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

The SortExec is missing from the physical plan of string_agg functions. In the current branch it works since the AggregateExec ends up sorting the results, but on 49.0.0 its broken. This PR helps avoid future regressions by implementing AggregateUDFImpl::reverse_expr for StringAgg, which brings back the SortExec.

What changes are included in this PR?

  • Implemented AggregateUDFImpl::reverse_expr for StringAgg.
  • Added sqllogictests to check the plans.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Aug 13, 2025
03)----DataSourceExec: partitions=1, partition_sizes=[1]

query TT
explain select string_agg(k, ',' order by v desc) from t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is single aggregation enough to exercise the reverse_expr API?
Or maybe we need two functions, one using reverse ordering of the other?

I checked that the test does not pass without changes in string_agg.rs and passes with them.
This is likely related to

if let Some(reverse_aggr_expr) = aggr_expr.reverse_expr() {
call site, but it looks it should never matter when there is only one function.

What am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without reverse_expr the requirement in that function will stay None, which means get_finer_aggregate_exprs_requirement returns an empty list. Maybe @berkaysynnada can clarify the behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we hitting this loop exit path as expected?

// If the common requirement is finer than the current expression's,
// we can skip this expression. If the latter is finer than the former,
// adopt it if it is satisfied by the equivalence properties. Otherwise,
// defer the analysis to the reverse expression.
let forward_finer = determine_finer(&requirement, &aggr_req);
if let Some(finer) = forward_finer {
if !finer {
continue;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like it, determine_finer returns true. Should it exit there? If it did I think we would have the same problem, i.e., requirement would remain None.

@findepi findepi requested a review from berkaysynnada August 13, 2025 10:08
@findepi
Copy link
Member

findepi commented Aug 13, 2025

This LGTM but i clearly don't understand fully how this works.
In my understanding the reverse_expr should by no means be needed, nor used, if there is only one aggregate function in a query. Either I am wrong, or there is some other problem remaining.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants