[exporter] Flip on queue batcher #11637

Open
wants to merge 2 commits into
base: main

@sfc-gh-sili (Contributor) commented Nov 9, 2024

Description

This PR solves #10368.

Previously we used a push model between the queue and the batcher. As a result, the batch size was constrained by sending_queue.num_consumers: a batch could not accumulate more requests than the sending_queue.num_consumers blocked goroutines provide.

This PR changes it to a pull model. We read from the queue until the size threshold is met or the timeout expires, then allocate a worker to send the request asynchronously.
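
To illustrate the pull model described above, here is a minimal, self-contained sketch. It is not the code in this PR: `pullBatches`, `resetTimer`, `runDemo`, and all parameter names are hypothetical, the queue is modeled as a plain channel of strings, and the timeout is measured from the last flush. The point it shows is that batch size depends only on the threshold and the timeout, not on the number of workers.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// resetTimer stops t, drains a stale tick if one is pending, and re-arms it.
func resetTimer(t *time.Timer, d time.Duration) {
	if !t.Stop() {
		select {
		case <-t.C:
		default:
		}
	}
	t.Reset(d)
}

// pullBatches (hypothetical) pulls items from queue until maxSize items are
// collected or timeout elapses since the last flush, then hands the batch to
// a worker goroutine. It returns once queue is closed and all sends are done.
func pullBatches(queue <-chan string, maxSize, maxWorkers int, timeout time.Duration, send func([]string)) {
	var wg sync.WaitGroup
	workers := make(chan struct{}, maxWorkers) // bounds concurrent sends
	var batch []string
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	flush := func() {
		if len(batch) == 0 {
			return
		}
		toSend := batch
		batch = nil
		workers <- struct{}{} // blocks until a worker slot is free
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-workers }()
			send(toSend)
		}()
	}

	for {
		select {
		case item, ok := <-queue:
			if !ok {
				flush() // flush the partial batch on shutdown
				wg.Wait()
				return
			}
			batch = append(batch, item)
			if len(batch) >= maxSize {
				flush()
				resetTimer(timer, timeout)
			}
		case <-timer.C:
			flush()
			timer.Reset(timeout)
		}
	}
}

// runDemo feeds five items through a single worker with a batch size of two
// and returns the sorted sizes of the batches that were sent.
func runDemo() []int {
	queue := make(chan string, 5)
	for _, item := range []string{"a", "b", "c", "d", "e"} {
		queue <- item
	}
	close(queue)

	var mu sync.Mutex
	var sizes []int
	pullBatches(queue, 2, 1, time.Second, func(b []string) {
		mu.Lock()
		defer mu.Unlock()
		sizes = append(sizes, len(b))
	})
	sort.Ints(sizes)
	return sizes
}

func main() {
	fmt.Println(runDemo())
}
```

With a single worker the demo still produces two full batches of two items plus one partial batch flushed when the queue closes, so the worker count limits send concurrency rather than batch size.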

Link to tracking issue

Fixes #10368
#8122

Testing

This PR swaps out batch_sender directly and still passes all the existing tests.

Documentation

@sfc-gh-sili sfc-gh-sili force-pushed the sili-flip-on branch 7 times, most recently from 2900101 to 55aae5c Compare November 11, 2024 08:30

codecov bot commented Nov 11, 2024

Codecov Report

Attention: Patch coverage is 54.76190% with 19 lines in your changes missing coverage. Please review.

Project coverage is 91.35%. Comparing base (9d2685f) to head (b3b9a3d).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| exporter/exporterhelper/internal/queue_sender.go | 48.38% | 15 Missing and 1 partial ⚠️ |
| exporter/internal/queue/default_batcher.go | 25.00% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11637      +/-   ##
==========================================
- Coverage   91.62%   91.35%   -0.27%     
==========================================
  Files         442      442              
  Lines       23776    23804      +28     
==========================================
- Hits        21785    21747      -38     
- Misses       1619     1682      +63     
- Partials      372      375       +3     

@sfc-gh-sili sfc-gh-sili marked this pull request as ready for review November 11, 2024 08:58
@sfc-gh-sili sfc-gh-sili requested a review from a team as a code owner November 11, 2024 08:58
@songy23 songy23 requested review from bogdandrutu and dmitryax and removed request for songy23 November 11, 2024 15:08
Comment on lines 98 to 102
// Shutdown ensures that queue and all Batcher are stopped.
func (qb *BaseBatcher) Shutdown(_ context.Context) error {
	qb.stopWG.Wait()
	return nil
}
Member

Why this change?

Contributor Author

See the other comment.

Comment on lines 127 to 138
qb.currentBatchMu.Lock()
if qb.currentBatch == nil || qb.currentBatch.req == nil {
	qb.currentBatchMu.Unlock()
	continue
}
batchToFlush := *qb.currentBatch
qb.currentBatch = nil
qb.currentBatchMu.Unlock()

// flushAsync() blocks until it has successfully started a goroutine for flushing.
qb.flushAsync(batchToFlush)
qb.resetTimer()
Member

Not sure I understand this, can we do this in a separate PR?

Contributor Author

Thanks! Here it is: #11666
batch_sender_test helped me detect that the original implementation is missing a flush on shutdown.

@sfc-gh-sili sfc-gh-sili changed the title [exporter] Flip on queue batcher [PAUSED] [exporter] Flip on queue batcher Nov 12, 2024
@dmitryax
Member

Given the impact of this change (it affects every collector user with sending_queue enabled, which is the default), I suggest we introduce it behind a feature gate, e.g. exporter.batchingQueue.

@sfc-gh-sili sfc-gh-sili force-pushed the sili-flip-on branch 8 times, most recently from 555baa2 to 6bb9b7f Compare November 15, 2024 01:29
@sfc-gh-sili sfc-gh-sili changed the title [PAUSED] [exporter] Flip on queue batcher [exporter] Flip on queue batcher Nov 15, 2024
@sfc-gh-sili
Contributor Author

@dmitryax Hi Dimitrii, do you know of a better way to make sure the existing tests pass with the feature gate both on and off? Manually enabling and then disabling it in every single exporter test could work, but I wonder if there is another option.
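
One possible shape for this, sketched with the standard library only: a small helper that runs the same body once per gate state and restores the original value afterwards. Everything here is an assumption for illustration. The `gate` type is a hypothetical stand-in and not the collector's `featuregate` API, and `forEachGateState`, `pickBatcher`, and `runGateDemo` are made-up names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// gate is a hypothetical stand-in for a feature gate.
// It is NOT the collector's featuregate.Gate type.
type gate struct {
	enabled atomic.Bool
}

// IsEnabled reports the current state of the gate.
func (g *gate) IsEnabled() bool { return g.enabled.Load() }

// forEachGateState runs fn once with the gate disabled and once with it
// enabled, then restores whatever state the gate had before the call.
func forEachGateState(g *gate, fn func(enabled bool)) {
	orig := g.enabled.Load()
	defer g.enabled.Store(orig)
	for _, state := range []bool{false, true} {
		g.enabled.Store(state)
		fn(state)
	}
}

// pickBatcher (hypothetical) mimics code under test that branches on the gate.
func pickBatcher(g *gate) string {
	if g.IsEnabled() {
		return "pull"
	}
	return "push"
}

// runGateDemo records which implementation was selected in each gate state.
func runGateDemo() []string {
	g := &gate{}
	var seen []string
	forEachGateState(g, func(bool) {
		seen = append(seen, pickBatcher(g))
	})
	return seen
}

func main() {
	fmt.Println(runGateDemo())
}
```

In a real test file the body passed to the helper would typically be wrapped in a `t.Run` subtest named after the gate state, so a failure shows which state broke.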

@sfc-gh-sili sfc-gh-sili force-pushed the sili-flip-on branch 2 times, most recently from 3bf9d0b to 844cfc6 Compare November 15, 2024 05:33
	"go.opentelemetry.io/collector/pipeline"
)

var usePullingBasedExporterQueueBatcher = featuregate.GlobalRegistry().MustRegister(
	"telemetry.UsePullingBasedExporterQueueBatcher",
	featuregate.StageBeta,
Member

Why start with Beta? That sounds too aggressive. Let's start with Alpha.

Development

Successfully merging this pull request may close these issues.

[exporterhelper] Awkwardness due to API between queue sender and batch sender
3 participants