branch-4.0: [fix](simplify agg) SimplifyAggGroupBy should verify injectivity #64335 #65108

Open
github-actions[bot] wants to merge 1 commit into branch-4.0 from auto-pick-64335-branch-4.0

@github-actions github-actions Bot commented Jul 1, 2026

Contributor

Cherry-picked from #64335

## Problem

`SimplifyAggGroupBy` simplified `GROUP BY f(x)` to `GROUP BY x` without
verifying that `f(x)` is injective (one-to-one). This caused wrong
results:

| Expression | Why wrong |
|---|---|
| `a * 0` / `0 * a` | always evaluates to 0, so all rows fall into one group |
| `0 / a` | always evaluates to 0 |
| `a / 0` | division by zero |
| `a + NULL` / `a * NULL` / ... | always evaluates to NULL |
| `a * 0.1` with float/double | precision loss may map different inputs to the same result |
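
The effect of a non-injective grouping key can be reproduced outside the optimizer. The following is a minimal sketch, not Doris code: `InjectivityDemo` and `groupCount` are hypothetical names, and a plain in-memory list stands in for the table column `a`. It counts the groups produced by each key expression.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class InjectivityDemo {
    // Counts the groups that GROUP BY key(row) would produce over the rows.
    static <T, K> int groupCount(List<T> rows, Function<T, K> key) {
        return rows.stream().collect(Collectors.groupingBy(key)).size();
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2, 3, 3);
        System.out.println(groupCount(a, x -> x));     // GROUP BY a     -> 3
        System.out.println(groupCount(a, x -> x + 1)); // GROUP BY a + 1 -> 3 (injective)
        System.out.println(groupCount(a, x -> x * 0)); // GROUP BY a * 0 -> 1 (not injective)

        // Float case: both products underflow to 0.0f, so two inputs share one key.
        List<Float> f = List.of(0.0f, Float.MIN_VALUE);
        System.out.println(groupCount(f, x -> x));        // -> 2
        System.out.println(groupCount(f, x -> x * 0.1f)); // -> 1
    }
}
```

Rewriting `GROUP BY a * 0` to `GROUP BY a` would therefore change the number of output rows from 1 to 3 in this sketch, which is the kind of wrong result the table describes.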

## Fix

1. **`isBinaryArithmeticSlot`**: restructured to separate the slot expression from the literal, then validate each independently. The float/double check runs early, before slot extraction.

2. **New `checkLiteral(expr, literal)`**: rejects a NULL literal and Multiply/Divide by zero.

3. **New `canExtractSlot(expr)`**: replaces the old unconditional `extractSlotOrCastOnSlot`. It accepts only a bare `Slot` or an implicit lossless widening cast (integral→integral, float→double, integral→decimal, decimal→decimal). Range and scale are compared directly for correctness.
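
The two new checks can be illustrated with a simplified model. This is a sketch under stated assumptions, not the code in `SimplifyAggGroupBy.java`: the class `GroupByKeyChecks`, the `Op`, `Kind`, and `Type` types, and both method names are hypothetical stand-ins, a Java `null` models a NULL literal, and types are reduced to the number of digits before and after the decimal point. The early float/double rejection in `isBinaryArithmeticSlot` is not modeled here.

```java
import java.math.BigDecimal;

class GroupByKeyChecks {
    enum Op { ADD, SUBTRACT, MULTIPLY, DIVIDE }
    enum Kind { INTEGRAL, FLOAT, DOUBLE, DECIMAL }

    // Hypothetical type model: digits before and after the decimal point.
    static final class Type {
        final Kind kind;
        final int intDigits;
        final int scale;

        Type(Kind kind, int intDigits, int scale) {
            this.kind = kind;
            this.intDigits = intDigits;
            this.scale = scale;
        }

        static Type integral(int maxDigits) { return new Type(Kind.INTEGRAL, maxDigits, 0); }
        static Type decimal(int precision, int scale) {
            return new Type(Kind.DECIMAL, precision - scale, scale);
        }
        static Type of(Kind kind) { return new Type(kind, 0, 0); }
    }

    // Stand-in for the literal check: reject NULL, and zero for multiply/divide.
    static boolean literalKeepsInjectivity(Op op, BigDecimal literal) {
        if (literal == null) {
            return false; // a op NULL is always NULL
        }
        boolean isZero = literal.signum() == 0;
        return !(isZero && (op == Op.MULTIPLY || op == Op.DIVIDE));
    }

    // Stand-in for the cast check: accept only widenings that cannot lose information.
    static boolean isLosslessWidening(Type from, Type to) {
        switch (from.kind) {
            case INTEGRAL: // integral -> integral, integral -> decimal
                return (to.kind == Kind.INTEGRAL || to.kind == Kind.DECIMAL)
                        && to.intDigits >= from.intDigits;
            case FLOAT:    // float -> double
                return to.kind == Kind.DOUBLE;
            case DECIMAL:  // decimal -> decimal: range and scale must both fit
                return to.kind == Kind.DECIMAL
                        && to.intDigits >= from.intDigits
                        && to.scale >= from.scale;
            default:
                return false;
        }
    }
}
```

In this model, a 10-digit integer cast to `decimal(12, 2)` is accepted because the target keeps 10 digits before the decimal point, while a cast to `decimal(10, 2)` is rejected because only 8 remain.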

## Changes

- `SimplifyAggGroupBy.java`: +80 lines, rewritten core logic
- `ExpressionUtils.java`: -35 lines, removed unused `isSlotOrCastOnSlot` / `extractSlotOrCastOnSlot`
- `SimplifyAggGroupByTest.java`: +216 lines, 25 tests covering all new paths

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions github-actions Bot requested a review from morningman as a code owner July 1, 2026 09:56
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen

Contributor

run buildall

@hello-stephen

Contributor

FE UT Coverage Report

Increment line coverage 100.00% (27/27) 🎉
Increment coverage report
Complete coverage report

@yujun777

yujun777 commented Jul 2, 2026

Contributor

run p0

@yujun777

yujun777 commented Jul 2, 2026

Contributor

run nonConcurrent
