fix(aggregate): show aliased expr in explain by kumarUjjawal · Pull Request #21739 · apache/datafusion

kumarUjjawal · 2026-04-20T05:12:01Z

Which issue does this PR close?

Closes #19685 (Aliased aggregation expressions not visible in physical explain output).

Rationale for this change

Physical explain output only showed the alias for aliased aggregates. That made it hard to understand the plan, especially when the aggregate had a filter, explicit RESPECT NULLS, or a custom UDAF display.

What changes are included in this PR?

Show the full aggregate expression in physical explain for user-written aggregate aliases.
Keep internal aliases like count(*) compact in physical explain.
Replace the old hidden metadata approach with an explicit is_internal flag on Alias.
Preserve that flag through planner rewrites, tree rewrites, and proto round-trip.
Add tests for aliased aggregate explain output, including:
- normal aliased aggregates
- quoted aliases
- explicit RESPECT NULLS
- custom human display
- count(*)
- nested internal alias display
Add an upgrade note for the public Alias API change.

Are these changes tested?

Yes

Are there any user-facing changes?

Yes.

Physical explain output is clearer for aliased aggregate expressions.
Alias now has a new is_internal field.

This is a public API change for users who build or pattern match Alias directly. The upgrade guide has been updated with the needed changes.

kumarUjjawal · 2026-04-20T05:18:31Z

cc @pepijnve

kumarUjjawal · 2026-04-20T07:59:22Z

It has many more cases than I realized initially 😢

kumarUjjawal · 2026-04-20T10:44:43Z

~~I am working on another approach which is looking better than this~~

I updated the code with new changes along with pr body

kumarUjjawal · 2026-04-20T12:40:46Z

I tried two approaches for this.

The first approach hid the bit in alias metadata. It fixed the display problem, but it also mixed planner-only state with user metadata. That made the behavior harder to reason about and forced extra handling in places like equality, hashing, serialization, and rewrite logic. In practice, a simple display fix started affecting alias identity and metadata flow in too many places.

The current approach adds an explicit internal flag to alias. This is a small API break, but it makes the model much clearer: whether an alias is user written or planner generated is now part of the type itself, not hidden in metadata. That keeps the display logic direct, avoids hidden state leaking into unrelated code paths, and makes future maintenance safer because the intent is visible and compiler checked.

Would love to hear your thoughts @pepijnve

pepijnve · 2026-04-20T13:37:32Z

@kumarUjjawal could you give an example of where/when the internal flag is necessary?

pepijnve · 2026-04-20T13:49:44Z

I think I might have found it in your second commit, where this change was reverted.

Wouldn't we want the left version though? The physical plan is actually executing count(1), but you don't see that at all in the physical plan even though the logical plan does show it.

pepijnve · 2026-04-20T13:31:49Z

-            .map(|expr| expr.human_display())
+            .map(|expr| {
+                let human_display = expr.human_display();
+                if human_display.is_empty() {


Should human_display be Option<String>?

Yeah I could do that.
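
A minimal sketch of the difference being discussed, using a hypothetical `AggDisplay` struct rather than DataFusion's real aggregate expression type. With `Option<String>`, "no custom display" is a distinct state, so callers do not need an `is_empty()` check on a sentinel string.

```rust
/// Hypothetical stand-in for an aggregate expression; not DataFusion's type.
struct AggDisplay {
    name: String,
    human_display: Option<String>,
}

impl AggDisplay {
    /// Returns the custom display text, if one was set.
    fn human_display(&self) -> Option<&str> {
        self.human_display.as_deref()
    }

    /// Text to show in explain output: the custom display when present,
    /// otherwise the plain name.
    fn explain_text(&self) -> &str {
        self.human_display().unwrap_or(self.name.as_str())
    }
}
```

The fallback lives in one place (`explain_text`, a name assumed for this illustration) instead of being repeated at every call site.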

pepijnve · 2026-04-20T13:34:40Z

    }

+    #[tokio::test]
+    async fn test_aggregate_explain_shows_aliased_expression() -> Result<()> {


These might be more concise as SLTs

kumarUjjawal · 2026-04-20T16:22:23Z

> Wouldn't we want the left version though? The physical plan is actually executing count(1), but you don't see that at all in the physical plan even though the logical plan does show it.

I think the current behavior is intentional.

My thinking:

user-written aliases should inline the aggregate expression
internal planner aliases should stay compact

count(*) is in the second group, so keeping aggr=[count(*)] is expected. The bug here is about user aliases like sum(...) as agg disappearing in physical explain, not about exposing every internal rewrite.

So while the physical plan does execute the lowered count(1) form, showing count(1) as count(*) in explain would expose planner internals and would regress the compact count(*) output we already preserve elsewhere. If we want physical explain to show lowered forms more generally, I think that should be doable but will require some changes.

kumarUjjawal · 2026-04-20T16:27:22Z

> @kumarUjjawal could you give an example of where/when the internal flag is necessary?

A good example could be:

Internally, COUNT(*) is lowered to count(1) so it can use the normal aggregate path. But the user did not write count(1), they wrote count(*). So the planner needs a way to preserve that user-facing name without treating it like a real user alias.

That is what the internal flag is for: it marks aliases that exist only because the planner rewrote the expression.

Without that bit, physical explain cannot tell these two cases apart:

user alias: sum(a) AS total
planner-generated alias: lowered count(1) wrapped as count(*)

Those should display differently. For example, with:

SELECT COUNT(*) AS total_rows FROM t

there are really two alias layers:

internal: count(1) -> count(*)
user: count(*) -> total_rows

The internal flag lets explain show count(*) as total_rows, instead of either exposing the lowered form count(1) as count(*) as total_rows or collapsing everything to just total_rows.
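
The two alias layers described in this comment can be sketched with a hypothetical, simplified expression tree. The `Expr` enum and `explain_display` function below are assumptions made for illustration and are not DataFusion's types; only the `is_internal` flag name comes from the PR.

```rust
/// Hypothetical, simplified expression tree; not DataFusion's `Expr`.
enum Expr {
    /// A leaf such as `count(1)` or `sum(a)`.
    Raw(String),
    /// An alias wrapping another expression.
    Alias {
        inner: Box<Expr>,
        name: String,
        is_internal: bool,
    },
}

/// Renders an expression as the comment describes: a planner-generated
/// alias replaces what it wraps, while a user alias is appended as
/// `<inner> as <name>`.
fn explain_display(expr: &Expr) -> String {
    match expr {
        Expr::Raw(text) => text.clone(),
        Expr::Alias { name, is_internal: true, .. } => name.clone(),
        Expr::Alias { inner, name, is_internal: false } => {
            format!("{} as {}", explain_display(inner), name)
        }
    }
}
```

Under these assumptions, wrapping `count(1)` in an internal `count(*)` alias and then in a user alias `total_rows` renders as `count(*) as total_rows`.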

pepijnve · 2026-04-21T07:56:07Z

This might just be personal preference speaking, but in a physical explain plan I'm looking for what the engine is actually doing (how is it being executed), not what I wrote as query. It doesn't make sense for me that the logical plan (which is more declarative in nature) shows the lowered version, while the physical plan (which is more imperative) does not. If anything it should be the other way around that logical hides this detail, and physical shows it.

So if I got to choose I would prefer the left side plan of the diff image above rather than the right one.

kumarUjjawal · 2026-04-21T10:53:54Z

> This might just be personal preference speaking, but in a physical explain plan I'm looking for what the engine is actually doing (how is it being executed), not what I wrote as query. It doesn't make sense for me that the logical plan (which is more declarative in nature) shows the lowered version, while the physical plan (which is more imperative) does not. If anything it should be the other way around that logical hides this detail, and physical shows it.
>
> So if I got to choose I would prefer the left side plan of the diff image above rather than the right one.

I think I like this approach too, showing lowered expressions in physical explain can be more useful. Let me see how the code looks for this.

github-actions · 2026-04-30T10:10:58Z

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details

     Cloning apache/main
    Building datafusion v53.1.0 (current)
       Built [  82.566s] (current)
     Parsing datafusion v53.1.0 (current)
      Parsed [   0.034s] (current)
    Building datafusion v53.1.0 (baseline)
       Built [  82.205s] (baseline)
     Parsing datafusion v53.1.0 (baseline)
      Parsed [   0.036s] (baseline)
    Checking datafusion v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.631s] 222 checks: 221 pass, 1 fail, 0 warn, 30 skip

--- failure function_marked_deprecated: function #[deprecated] added ---

Description:
A function is now #[deprecated]. Downstream crates will get a compiler warning when using this function.
        ref: https://doc.rust-lang.org/reference/attributes/diagnostics.html#the-deprecated-attribute
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.47.0/src/lints/function_marked_deprecated.ron

Failed in:
  function datafusion::physical_planner::create_aggregate_expr_and_maybe_filter in /home/runner/work/datafusion/datafusion/datafusion/core/src/physical_planner.rs:2523
  function datafusion::physical_planner::create_aggregate_expr_with_name_and_maybe_filter in /home/runner/work/datafusion/datafusion/datafusion/core/src/physical_planner.rs:2498

     Summary semver requires new minor version: 0 major and 1 minor checks failed
    Finished [ 167.367s] datafusion
    Building datafusion-expr v53.1.0 (current)
       Built [  25.699s] (current)
     Parsing datafusion-expr v53.1.0 (current)
      Parsed [   0.070s] (current)
    Building datafusion-expr v53.1.0 (baseline)
       Built [  26.030s] (baseline)
     Parsing datafusion-expr v53.1.0 (baseline)
      Parsed [   0.077s] (baseline)
    Checking datafusion-expr v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   1.066s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  54.211s] datafusion-expr
    Building datafusion-optimizer v53.1.0 (current)
       Built [  26.446s] (current)
     Parsing datafusion-optimizer v53.1.0 (current)
      Parsed [   0.026s] (current)
    Building datafusion-optimizer v53.1.0 (baseline)
       Built [  26.495s] (baseline)
     Parsing datafusion-optimizer v53.1.0 (baseline)
      Parsed [   0.028s] (baseline)
    Checking datafusion-optimizer v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.157s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  54.226s] datafusion-optimizer
    Building datafusion-physical-expr v53.1.0 (current)
       Built [  25.359s] (current)
     Parsing datafusion-physical-expr v53.1.0 (current)
      Parsed [   0.043s] (current)
    Building datafusion-physical-expr v53.1.0 (baseline)
       Built [  25.380s] (baseline)
     Parsing datafusion-physical-expr v53.1.0 (baseline)
      Parsed [   0.043s] (baseline)
    Checking datafusion-physical-expr v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.310s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  52.350s] datafusion-physical-expr
    Building datafusion-physical-plan v53.1.0 (current)
       Built [  32.224s] (current)
     Parsing datafusion-physical-plan v53.1.0 (current)
      Parsed [   0.123s] (current)
    Building datafusion-physical-plan v53.1.0 (baseline)
       Built [  32.636s] (baseline)
     Parsing datafusion-physical-plan v53.1.0 (baseline)
      Parsed [   0.129s] (baseline)
    Checking datafusion-physical-plan v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.597s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  67.131s] datafusion-physical-plan
    Building datafusion-proto v53.1.0 (current)
       Built [  54.635s] (current)
     Parsing datafusion-proto v53.1.0 (current)
      Parsed [   0.132s] (current)
    Building datafusion-proto v53.1.0 (baseline)
       Built [  55.841s] (baseline)
     Parsing datafusion-proto v53.1.0 (baseline)
      Parsed [   0.136s] (baseline)
    Checking datafusion-proto v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   1.548s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 114.394s] datafusion-proto
    Building datafusion-sqllogictest v53.1.0 (current)
       Built [ 134.932s] (current)
     Parsing datafusion-sqllogictest v53.1.0 (current)
      Parsed [   0.021s] (current)
    Building datafusion-sqllogictest v53.1.0 (baseline)
       Built [ 138.174s] (baseline)
     Parsing datafusion-sqllogictest v53.1.0 (baseline)
      Parsed [   0.022s] (baseline)
    Checking datafusion-sqllogictest v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.089s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 276.384s] datafusion-sqllogictest

pepijnve · 2026-04-30T11:18:12Z

                            relation: None,
                            name: field.name().to_string(),
                            metadata: None,
+                            is_internal: false,


Do we still need the is_internal flag? I may have overlooked it, but it doesn't seem to actually be used anymore. It is passed on everywhere, but in the end setting it to true vs false doesn't seem to have an impact on the explain output.

No, we don't need it now. I will remove it. Thank you!

pepijnve · 2026-04-30T11:21:03Z

 03)----CoalescePartitionsExec
 04)------AggregateExec: mode=Partial, gby=[], aggr=[sum(alias1), sum(alias2)]
-05)--------AggregateExec: mode=FinalPartitioned, gby=[alias1@0 as alias1], aggr=[alias2]
+05)--------AggregateExec: mode=FinalPartitioned, gby=[alias1@0 as alias1], aggr=[sum(__common_expr_1) as alias2]


Perfect! This is an example of exactly the kind of issue I was struggling with in the explain plan output. The left hand side here which only contains aggr=[alias2] makes the plan unusable.

pepijnve · 2026-04-30T11:22:26Z

+01)AggregateExec: mode=Final, gby=[], aggr=[max(aggregate_test_100.c2) as percentile_cont(aggregate_test_100.c2,Float64(1))]
 02)--CoalescePartitionsExec
-03)----AggregateExec: mode=Partial, gby=[], aggr=[percentile_cont(aggregate_test_100.c2,Float64(1))]
+03)----AggregateExec: mode=Partial, gby=[], aggr=[max(aggregate_test_100.c2) as percentile_cont(aggregate_test_100.c2,Float64(1))]


This is a nice improvement since it makes the optimisation that happened visible in the plan.

pepijnve

If possible, I think it would help to revert the 'internal' portions of this PR. That would reduce the number of changes and improve the signal/noise ratio.

pepijnve · 2026-04-30T12:28:33Z

+- internal aggregate aliases may now show the underlying expression instead of
+  only the alias name
+
+If you have tooling that parses the `aggr=[...]` text from physical `EXPLAIN`,


Are people actually doing this? I guess it might be necessary sometimes, but that seems extremely brittle. The more general question is whether the explain output is considered part of the stable API of DataFusion or not.

pepijnve · 2026-04-30T12:34:12Z

+
+    #[doc(hidden)]
+    pub fn has_aliased_human_display(&self) -> bool {
+        self.human_display_is_aliased


Could this be a derived property of human_display or is it necessary for reliability to make it explicit? Just wondering if it's worth memoizing this eagerly.

I kept it explicit because I was thinking of it as display metadata.

We could derive it by checking whether human_display ends with as <name>, but that puts us back into parsing display strings. The explicit flag lets the builder validate the invariant once, and the formatting/reverse code can use it without guessing.

What do you think?

The downside is that I can now create a conflicting situation where the display string is aliased but the boolean indicates it isn't and vice versa. strip_alias_suffix is already kind of parsing the display string as well, so maybe the fact that we need to inspect the string value isn't that big of a problem?

The main reason I asked is because the addition of this field triggers a protobuf change as well and I was wondering if that was desirable.

One solution to this could be to make human_display (String, Option<String>) (or an equivalent explicit type). In other words make the alias an explicit and optional part. That way you wouldn't have to strip it either.

Let me try this and see how it turns out.
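
A rough sketch of the "explicit type" idea suggested above, assuming a hypothetical `HumanDisplay` struct (the name and its methods are not from the PR). Storing the expression text and the optional alias separately makes "is aliased" a derived property and removes the need to strip a suffix from a formatted string.

```rust
use std::fmt;

/// Hypothetical explicit type: expression text plus an optional alias.
struct HumanDisplay {
    expr: String,
    alias: Option<String>,
}

impl HumanDisplay {
    /// Derived from the data, so no separate boolean can drift out of sync.
    fn is_aliased(&self) -> bool {
        self.alias.is_some()
    }

    /// The expression text without the alias; no string parsing involved.
    fn without_alias(&self) -> &str {
        &self.expr
    }
}

impl fmt::Display for HumanDisplay {
    /// Formats as `<expr> as <alias>` when an alias is present.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.alias {
            Some(alias) => write!(f, "{} as {}", self.expr, alias),
            None => write!(f, "{}", self.expr),
        }
    }
}
```

With this shape the conflicting state mentioned above (an aliased string with a `false` flag, or the reverse) cannot be constructed.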

kumarUjjawal · 2026-05-03T18:19:41Z

There are a few more cleanups left.

pepijnve

For me these plan format changes are exactly what I was hoping for and I think we were able to eliminate some rough edges in the code along the way. At this point we probably need input from a maintainer.

adriangb · 2026-05-05T14:01:20Z

 02)--SortExec: expr=[l_returnflag@0 ASC NULLS LAST, l_linestatus@1 ASC NULLS LAST], preserve_partitioning=[true]
 03)----ProjectionExec: expr=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus, sum(lineitem.l_quantity)@2 as sum_qty, sum(lineitem.l_extendedprice)@3 as sum_base_price, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)@4 as sum_disc_price, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax)@5 as sum_charge, avg(lineitem.l_quantity)@6 as avg_qty, avg(lineitem.l_extendedprice)@7 as avg_price, avg(lineitem.l_discount)@8 as avg_disc, count(Int64(1))@9 as count_order]
-04)------AggregateExec: mode=FinalPartitioned, gby=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus], aggr=[sum(lineitem.l_quantity), sum(lineitem.l_extendedprice), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax), avg(lineitem.l_quantity), avg(lineitem.l_extendedprice), avg(lineitem.l_discount), count(Int64(1))]
+04)------AggregateExec: mode=FinalPartitioned, gby=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus], aggr=[sum(lineitem.l_quantity), sum(lineitem.l_extendedprice), sum(__common_expr_1) as sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount), sum(__common_expr_1 * Some(1),20,0 + lineitem.l_tax) as sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax), avg(lineitem.l_quantity), avg(lineitem.l_extendedprice), avg(lineitem.l_discount), count(Int64(1))]


Is this intentional? The __common_expr_ part seems like unfortunate noise. Same with Some(1). I think what we'd want is sum(lineitem.l_extendedprice * 1 - lineitem.l_discount)

In the current state of this MR the __common_expr_1 part is certainly intentional. My argumentation above was that this exposes the fact that line 4 is not actually executing sum(lineitem.l_extendedprice * 1 - lineitem.l_discount), that's the logical view of things. Instead sum(__common_expr_1) is being computed and __common_expr_1 is calculated in line 7. For the physical plan I don't think you would want to obscure this.

The Some(1) bit is unfortunate indeed, but I think that's a separate issue. The physical explain output doesn't format literals as nicely as the logical plan output does.

> The Some(1) bit is unfortunate indeed, but I think that's a separate issue. The physical explain output doesn't format literals as nicely as the logical plan output does.

Yeah, I noticed this but left it as is for now. @adriangb what are your overall thoughts on the current approach?

Based on @adriangb's comment I think this change may be controversial. Do we want more input on this?

Fwiw: I don't think it's a big deal either way. There's always going to be some tension between verbosity and readability. Maybe we need a flag for this (#21768 ?) or maybe if no one complains it's fine. Either way, I don't think there's a strong contract we promise with this, we can always revert the change.

adriangb · 2026-05-06T17:49:39Z

    name: Option<String>,
-    human_display: String,
+    human_display: Option<String>,
+    human_display_alias: Option<String>,


I explored this change a bit with Claude and came up with what I think is a good proposal. Here's the transcript / summary.

One concern about the public API surface here: this changes the signature of create_aggregate_expr_with_name_and_maybe_filter (a pub fn) by adding a new human_display_alias: Option<String> parameter and changing human_display from String to Option<String>. That's a silent break for any downstream code that builds aggregate physical expressions through this entry point — and it isn't called out in docs/source/library-user-guide/upgrading/54.0.0.md.

Stepping back: I think the existence of create_aggregate_expr_with_name_and_maybe_filter and create_aggregate_expr_and_maybe_filter as parallel public free functions is already a smell. They're really one thing — "lower a logical aggregate Expr into a physical AggregateFunctionExpr plus its filter and order-by" — split awkwardly across two signatures. We already have AggregateExprBuilder for the pure physical-construction half. What's missing is a builder for the logical→physical lowering half.

Could we kill two birds with one stone in this PR by introducing a LoweredAggregateBuilder?

    // datafusion-physical-expr/src/aggregate.rs
    pub struct LoweredAggregateBuilder<'a> {
        /* expr + schemas + execution_props + overrides */
    }

    pub struct LoweredAggregate {
        pub aggregate: Arc<AggregateFunctionExpr>,
        pub filter: Option<Arc<dyn PhysicalExpr>>,
        pub order_bys: Vec<PhysicalSortExpr>,
    }

    impl<'a> LoweredAggregateBuilder<'a> {
        pub fn new(
            expr: &'a Expr,
            logical_schema: &'a DFSchema,
            physical_schema: &'a Schema,
            execution_props: &'a ExecutionProps,
        ) -> Self;
        pub fn with_name(mut self, name: impl Into<String>) -> Self;
        pub fn build(self) -> Result<LoweredAggregate>;
    }

Internally, build() does what these two functions do today: unwrap aliases, derive name / human_display / human_display_alias, lower args / filter / order-by via the existing create_physical_* helpers, then hand off to AggregateExprBuilder. All the new aliased-display logic from this PR lives here.

The proposed migration shape:

Add LoweredAggregateBuilder next to AggregateExprBuilder in datafusion-physical-expr/src/aggregate.rs (the create_physical_* helpers it needs already live in that crate, so no new dependencies).

Revert the signature change to create_aggregate_expr_with_name_and_maybe_filter. Keep both free functions on their pre-PR signatures.

Mark them #[deprecated(note = "use LoweredAggregateBuilder")]. They become thin delegations — the deprecated _with_name_ variant passes its single human_display straight through with no alias, matching pre-PR behavior. The new aliased-display path is only reachable via LoweredAggregateBuilder.

Migrate the planner's own call site (the only in-tree caller) over to LoweredAggregateBuilder in this same PR, so deprecation warnings don't fire on internal code.

Why I'm suggesting this in this PR rather than a follow-up: this PR is already breaking the public surface. If we land it as-is and clean up later, downstream users pay the migration cost twice — once now for the new parameter, once again when we deprecate the function. Doing it in one shot means:

Zero public signature breaks (AggregateFunctionExpr::human_display() -> Option<&str> is still a break and still needs the upgrade-guide entry, but the free-function break disappears).

The upgrade guide gets a "prefer LoweredAggregateBuilder" pointer instead of a parameter-list diff.

Future additions to aggregate construction stop touching public function signatures.

I realize this enlarges the PR and is partly scope creep on what was supposed to be an EXPLAIN fix — happy to help with the refactor (or pair with a committer who can) if it would be useful. What do you think?

If this makes sense, it might be nice to break it out as its own PR (at least introducing LoweredAggregateBuilder).
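
The migration shape proposed above (a builder plus a deprecated free function that delegates to it) can be sketched with heavily simplified, hypothetical types. `Lowered`, `LoweredBuilder`, `with_alias`, and `lower_with_name` are stand-ins invented for this illustration; they do not match the real DataFusion signatures, which work on logical expressions and schemas.

```rust
/// Hypothetical, simplified result type for this sketch.
#[derive(Debug, PartialEq)]
struct Lowered {
    name: String,
    human_display: Option<String>,
}

/// Hypothetical builder: optional inputs are methods, so adding another
/// one later does not change any existing signature.
struct LoweredBuilder<'a> {
    expr: &'a str,
    name: Option<String>,
    alias: Option<String>,
}

impl<'a> LoweredBuilder<'a> {
    fn new(expr: &'a str) -> Self {
        Self { expr, name: None, alias: None }
    }

    /// Overrides the output name.
    fn with_name(mut self, name: impl Into<String>) -> Self {
        self.name = Some(name.into());
        self
    }

    /// Assumed for this sketch: opts in to the aliased display.
    fn with_alias(mut self, alias: impl Into<String>) -> Self {
        self.alias = Some(alias.into());
        self
    }

    fn build(self) -> Lowered {
        let LoweredBuilder { expr, name, alias } = self;
        Lowered {
            name: name.unwrap_or_else(|| expr.to_string()),
            human_display: alias.map(|a| format!("{expr} as {a}")),
        }
    }
}

/// The old entry point keeps its signature and becomes a thin delegation
/// with no alias, so its behavior stays as it was.
#[deprecated(note = "use LoweredBuilder")]
fn lower_with_name(expr: &str, name: String) -> Lowered {
    LoweredBuilder::new(expr).with_name(name).build()
}
```

Downstream callers of the old function keep compiling and only see a deprecation warning, while the aliased display is reachable solely through the builder.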

adriangb

I don't have strong opinions on the new display vs. old display. There's always going to be a tradeoff between pretty vs. has all of the info / reflects the ugly reality. More importantly we don't have a strict API contract with that, we can change it and later revert or tweak if there are complaints.

I flagged the one public API change I saw, I think we should discuss that before merging.

kumarUjjawal · 2026-05-08T03:30:27Z

I'm going to try to do this and see how it looks:

Add a small LoweredAggregateBuilder.
Move only the new logical-to-physical aggregate display logic into it.
Restore the old public free-function signatures.
Mark the old free functions deprecated.
Migrate the planner’s internal call to the builder.
Keep the upgrade guide focused on:
- human_display() now returns Option<&str>
- prefer LoweredAggregateBuilder for aggregate lowering

adriangb

Some minor nits but overall LGTM!

adriangb · 2026-05-08T15:14:25Z

+        self
+    }
+
+    #[doc(hidden)]


Why hidden?

adriangb · 2026-05-08T15:15:13Z

+        }
+    }
+
+    pub fn with_name(mut self, name: impl Into<String>) -> Self {


We should add docstrings to all of these. Ideally I can read the docstrings on these functions / structs with general DataFusion knowledge but without fully understanding how planning works and get some idea of what these do.

adriangb · 2026-05-08T15:16:05Z

 }
+
+#[cfg(test)]
+mod tests {


I feel like we can probably port other unit tests to test the new builder directly? It seems like a good unit of code to unit test. Can be a followup / when we remove the old methods, but especially any tests testing the now deprecated methods are good candidates.

kumarUjjawal · 2026-05-11T06:10:27Z

@pepijnve would you like to take another look?

pepijnve · 2026-05-11T20:35:24Z

        self.expr.hash(state);
        self.relation.hash(state);
        self.name.hash(state);
+        self.metadata.hash(state);


Does including metadata in equality and hash calculations risk any unintended side effects? If it's not required for the plan change itself, it might be safer to not include this in this PR.
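
One concrete side effect of the kind asked about here, sketched with two hypothetical alias types (neither is DataFusion's `Alias`): once metadata takes part in `Eq` and `Hash`, two aliases that differ only in metadata stop deduplicating in hash-based collections.

```rust
use std::collections::{BTreeMap, HashSet};
use std::hash::{Hash, Hasher};

/// Hypothetical alias whose derived `Eq`/`Hash` cover every field,
/// including metadata.
#[derive(PartialEq, Eq, Hash)]
struct AliasWithMetadata {
    name: String,
    metadata: BTreeMap<String, String>,
}

/// Hypothetical alias whose identity ignores metadata.
struct AliasIgnoringMetadata {
    name: String,
    metadata: BTreeMap<String, String>,
}

impl PartialEq for AliasIgnoringMetadata {
    fn eq(&self, other: &Self) -> bool {
        self.name == other.name
    }
}

impl Eq for AliasIgnoringMetadata {}

impl Hash for AliasIgnoringMetadata {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Only the name is hashed, matching `eq` above.
        self.name.hash(state);
    }
}

/// Counts distinct items according to their `Eq`/`Hash` implementations.
fn distinct_count<T: Hash + Eq>(items: Vec<T>) -> usize {
    items.into_iter().collect::<HashSet<T>>().len()
}
```

Any code that relies on alias equality, for example to deduplicate or look up expressions, would see the same change in behavior.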

pepijnve · 2026-05-11T20:43:35Z

+
+fn encode_human_display_alias(human_display: &str, alias: &str) -> String {
+    format!(
+        "{HUMAN_DISPLAY_ALIAS_PREFIX}{}:{alias}{human_display}",


An alternative here that may be kinder to other consumers would be to encode using {human_display} as {alias} kind of like the code was doing before. The split logic is less robust, but maybe good enough?
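
The trade-off between the two encodings can be sketched as follows. This assumes the unnamed `{}` in the diff is the alias length, and uses a made-up `PREFIX` value because the real constant is not shown; all four function names are hypothetical.

```rust
/// Hypothetical marker; the real constant's value is not shown in the diff.
const PREFIX: &str = "\u{1}alias:";

/// Length-prefixed form: `<PREFIX><alias byte length>:<alias><display>`.
fn encode_prefixed(human_display: &str, alias: &str) -> String {
    format!("{PREFIX}{}:{alias}{human_display}", alias.len())
}

/// Recovers `(human_display, alias)`; returns `None` for malformed input.
fn decode_prefixed(encoded: &str) -> Option<(&str, &str)> {
    let rest = encoded.strip_prefix(PREFIX)?;
    let (len, rest) = rest.split_once(':')?;
    let len: usize = len.parse().ok()?;
    if len > rest.len() || !rest.is_char_boundary(len) {
        return None;
    }
    let (alias, human_display) = rest.split_at(len);
    Some((human_display, alias))
}

/// Readable form: `<display> as <alias>`.
fn encode_readable(human_display: &str, alias: &str) -> String {
    format!("{human_display} as {alias}")
}

/// Splits at the last " as "; ambiguous if the alias itself contains " as ".
fn decode_readable(encoded: &str) -> Option<(&str, &str)> {
    encoded.rsplit_once(" as ")
}
```

The readable form stays legible to other consumers of the string, but it round-trips only when the alias does not contain ` as `, which a quoted alias could. The prefixed form round-trips any alias at the cost of being opaque.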

pepijnve · 2026-05-11T20:44:47Z

 02)--SortExec: expr=[l_returnflag@0 ASC NULLS LAST, l_linestatus@1 ASC NULLS LAST], preserve_partitioning=[true]
 03)----ProjectionExec: expr=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus, sum(lineitem.l_quantity)@2 as sum_qty, sum(lineitem.l_extendedprice)@3 as sum_base_price, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)@4 as sum_disc_price, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax)@5 as sum_charge, avg(lineitem.l_quantity)@6 as avg_qty, avg(lineitem.l_extendedprice)@7 as avg_price, avg(lineitem.l_discount)@8 as avg_disc, count(Int64(1))@9 as count_order]
-04)------AggregateExec: mode=FinalPartitioned, gby=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus], aggr=[sum(lineitem.l_quantity), sum(lineitem.l_extendedprice), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax), avg(lineitem.l_quantity), avg(lineitem.l_extendedprice), avg(lineitem.l_discount), count(Int64(1))]
+04)------AggregateExec: mode=FinalPartitioned, gby=[l_returnflag@0 as l_returnflag, l_linestatus@1 as l_linestatus], aggr=[sum(lineitem.l_quantity), sum(lineitem.l_extendedprice), sum(__common_expr_1) as sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount), sum(__common_expr_1 * Some(1),20,0 + lineitem.l_tax) as sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax), avg(lineitem.l_quantity), avg(lineitem.l_extendedprice), avg(lineitem.l_discount), count(Int64(1))]


Based on @adriangb's comment I think this change may be controversial. Do we want more input on this?

adriangb · 2026-05-11T20:49:29Z

@kumarUjjawal if you wanted to reduce the diff it might be nice to split #21739 (comment) into a refactor only PR since it's mostly addressing an existing code smell.

github-actions Bot added the physical-expr, core, and physical-plan labels Apr 20, 2026

kumarUjjawal marked this pull request as draft April 20, 2026 07:56

github-actions Bot added the documentation, sql, logical-expr, optimizer, proto, and functions labels Apr 20, 2026

kumarUjjawal marked this pull request as ready for review April 20, 2026 12:34

pepijnve reviewed Apr 20, 2026

github-actions Bot added the sqllogictest label Apr 21, 2026

kumarUjjawal force-pushed the fix/aliased_expr_explain branch from 21c1690 to 2580bc5 Compare April 21, 2026 14:31

kumarUjjawal requested a review from pepijnve April 30, 2026 09:52

pepijnve reviewed Apr 30, 2026

github-actions Bot removed the functions label Apr 30, 2026

pepijnve reviewed Apr 30, 2026

kumarUjjawal added 3 commits May 3, 2026 20:39

remove the is_internal part and clean up

0a7b6e8

reword the update guide

2d86dbe

Changed the aggregate display state

1275f19

kumarUjjawal force-pushed the fix/aliased_expr_explain branch from f25fb1a to 1275f19 Compare May 3, 2026 17:49

github-actions Bot removed the sqllogictest label May 3, 2026

update snapshot

36cc313

kumarUjjawal requested a review from pepijnve May 3, 2026 18:19

update slt

06b2aba

github-actions Bot added the sqllogictest label May 4, 2026

update tpch

0ec6a10

github-actions Bot added the auto detected api change label May 4, 2026

pepijnve reviewed May 5, 2026

kumarUjjawal requested a review from adriangb May 5, 2026 10:58

adriangb reviewed May 5, 2026

adriangb reviewed May 6, 2026

adriangb approved these changes May 6, 2026

kumarUjjawal added 2 commits May 8, 2026 14:10

introduce LoweredAggregateBuilder

3cfe545

fix clippy issue

7d97f04

kumarUjjawal requested review from adriangb and pepijnve May 8, 2026 14:14

adriangb approved these changes May 8, 2026

update docs and test

0116c1f

pepijnve reviewed May 11, 2026
