chore: Make shuffle compression level configurable #632

Draft
wants to merge 8 commits into
base: main
from

Conversation

andygrove
Member

Which issue does this PR close?

Closes #.

Rationale for this change

I noticed the comment "// TODO: make compression level configurable" and made this change so that we can test the performance impact of different compression levels.

What changes are included in this PR?

How are these changes tested?

andygrove changed the title from "Make compression level configurable" to "chore: Make shuffle compression level configurable" on Jul 5, 2024
val COMET_EXEC_SHUFFLE_COMPRESSION_LEVEL: ConfigEntry[Int] =
  conf(s"$COMET_EXEC_CONFIG_PREFIX.shuffle.compressionLevel")
    .doc("Zstd compression level used in shuffle.")
    .intConf
Contributor

Perhaps we could have a check here for valid values?
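
One way to picture the check suggested here is a small validation helper. This is only a sketch on the native side: the function name and the accepted range of 1 to 22 (zstd's standard levels) are assumptions made for illustration, not something the PR defines. zstd also accepts 0 as "use the default" and negative "fast" levels, so the actual range would need to be decided by the maintainers.

```rust
/// Assumed bounds for this sketch: zstd's standard compression levels.
const MIN_ZSTD_LEVEL: i32 = 1;
const MAX_ZSTD_LEVEL: i32 = 22;

/// Hypothetical helper: returns the level unchanged when it is within the
/// assumed bounds, otherwise an error message describing the valid range.
fn validate_compression_level(level: i32) -> Result<i32, String> {
    if (MIN_ZSTD_LEVEL..=MAX_ZSTD_LEVEL).contains(&level) {
        Ok(level)
    } else {
        Err(format!(
            "invalid zstd compression level {level}: expected a value between {MIN_ZSTD_LEVEL} and {MAX_ZSTD_LEVEL}"
        ))
    }
}
```

Rejecting a bad value when the configuration is read gives a clearer error than letting it reach the compressor during a shuffle write.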

@@ -128,27 +128,32 @@ pub struct PhysicalPlanner {
exec_context_id: i64,
execution_props: ExecutionProps,
session_ctx: Arc<SessionContext>,
compression_level: i32,
Contributor

It seems that we are passing the compression_level parameter around. That will get ugly if we add more configurable entries later.

Could we leverage the SessionConf in session_ctx.state to avoid that?
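
The idea in this comment can be sketched without any DataFusion types. Everything below is hypothetical: SessionOptions stands in for whatever session-level configuration object is used, and the key name and default value are placeholders chosen for the example. The point is only that the planner keeps one shared handle and looks values up when needed, so adding a new option does not add a new field or constructor argument.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a session-level configuration object.
#[derive(Debug, Default)]
struct SessionOptions {
    entries: HashMap<String, String>,
}

impl SessionOptions {
    /// Stores a string-valued option under `key`.
    fn set(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), value.to_string());
    }

    /// Looks up an integer option, falling back to `default` when the key
    /// is absent or its value does not parse as an i32.
    fn get_i32(&self, key: &str, default: i32) -> i32 {
        self.entries
            .get(key)
            .and_then(|v| v.parse::<i32>().ok())
            .unwrap_or(default)
    }
}

/// Placeholder key and default, assumed for this sketch only.
const SHUFFLE_COMPRESSION_LEVEL: &str = "shuffle.compressionLevel";
const DEFAULT_SHUFFLE_COMPRESSION_LEVEL: i32 = 1;

/// The planner holds only the shared options, not one field per setting.
struct Planner {
    options: Arc<SessionOptions>,
}

impl Planner {
    /// Reads the compression level at the point of use.
    fn shuffle_compression_level(&self) -> i32 {
        self.options
            .get_i32(SHUFFLE_COMPRESSION_LEVEL, DEFAULT_SHUFFLE_COMPRESSION_LEVEL)
    }
}
```

The trade-off is that lookups are by string key and parsed at runtime, so a typo in a key silently yields the default unless the lookup is wrapped in typed accessors like the one above.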

@@ -80,6 +79,8 @@ pub struct ShuffleWriterExec {
/// Metrics
metrics: ExecutionPlanMetricsSet,
cache: PlanProperties,
/// zstd compression level
compression_level: i32,
Contributor

If compression_level is stored in SessionConf, could we retrieve it at execution time instead of defining a new field?

@@ -455,6 +466,7 @@ pub unsafe extern "system" fn Java_org_apache_comet_Native_writeSortedFileNative
checksum_enabled: jboolean,
checksum_algo: jint,
current_checksum: jlong,
compression_level: jlong,
Contributor

I believe you also need to update the JNI code on the JVM side: org.apache.comet.Native#writeSortedFileNative
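
A related detail in this hunk: the native function receives compression_level as a jlong, while the struct fields in the earlier hunks are i32. Whatever type the JVM-side declaration ends up using, a 64-bit value would need a checked narrowing before it is stored. The sketch below is illustrative only. JLong is a local alias standing in for the jni crate's jlong (a signed 64-bit integer), and the helper name is hypothetical.

```rust
/// Local stand-in for the `jlong` type (a signed 64-bit integer).
type JLong = i64;

/// Hypothetical helper: narrows a value received over JNI to i32,
/// returning an error instead of silently truncating.
fn narrow_compression_level(level: JLong) -> Result<i32, String> {
    i32::try_from(level)
        .map_err(|_| format!("compression level {level} does not fit in an i32"))
}
```

Using a checked conversion rather than an `as` cast avoids wrapping an out-of-range value into an unrelated level.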

andygrove marked this pull request as draft on July 11, 2024, 01:34
4 participants