Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: making is_nullable ansi aware for sum_decimal and avg_decimal #981

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

vaibhawvipul
Copy link
Contributor

Which issue does this PR close?

Closes #961 .

Rationale for this change

making is_nullable ansi aware for sum_decimal and avg_decimal

What changes are included in this PR?

Updated planner.rs, sum_decimal.rs and avg_decimal.rs

How are these changes tested?

CI tests pass

@viirya
Copy link
Member

viirya commented Sep 30, 2024

Triggered CI.

Copy link
Contributor

@parthchandra parthchandra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the error thrown when ansi_mode is enabled and an overflow occurs?

@vaibhawvipul
Copy link
Contributor Author

vaibhawvipul commented Oct 1, 2024

Where is the error thrown when ansi_mode is enabled and an overflow occurs?

As per issue-ticket's description - SumDecimal currently hardcodes nullable=true, and this is correct when ANSI mode is not enabled, because overflows cause null values. However, in ANSI mode, overflows cause exceptions. If the input is non-nullable then SumDecimal should probably also be non-nullable. The same is true for AvgDecimal.

I thought the scope of this work is only to make nullable either true or false based on ANSI mode. @parthchandra

@andygrove
Copy link
Member

Thanks for working on this @vaibhawvipul.

I checked Spark's source code in the master branch, and it also hard codes nullable as true. This makes sense because if we perform a sum on a column that only contains nulls, then we would expect a null output.

It would be good to add a test in CometAggregateSuite to confirm that we have the same behavior as Spark in this case.

Parth raises a good point that we should check that we have the behavior for the ANSI overflow case, but that can be a separate issue/PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SumDecimal always returns nullable=true
4 participants