Explainer for PA per-participant metrics #1272

Open · wants to merge 9 commits into main

Conversation

morlovich (Collaborator):

Please note that most of this isn't landed yet (which does mean that feedback is more actionable).

would have been run.
* `average-code-fetch-time`: average time of all code fetches (including JavaScript and WASM) used
for all invocations of a particular function type for a given (sub)auction.
* `percent-scripts-timeout`: percentage of script executions of this kind that hit the per-script
timeout.
Contributor:

To verify, for a seller this would be the percentage of scoreAd executions that hit sellerTimeout || 50 and reportResult executions that hit reportingTimeout || 50.

And for a buyer the percentage of generateBid executions that hit a perBuyerTimeout || 50 and reportWin executions that hit reportingTimeout || 50.

morlovich (Collaborator, author):

I am not sure what exactly you mean by || 50, but it's not combining scoreAd and reportResult, or generateBid and reportWin.

scoreAd would get its own timeouts, and reportResult its own (which is always 0 or 100). So you would want to report it as "reserved.once" to some range of buckets for scoreAd and as "reserved.always" to another range of buckets for reportResult. (And if you start seeing some timeouts, you might decide that you need a bigger sellerTimeout or a bigger reportingTimeout, respectively.)

The big table below is supposed to explain this...
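
For concreteness, a minimal sketch of that split, assuming the `privateAggregation.contributeToHistogramOnEvent()` shape from the extended Private Aggregation reporting explainer (the bucket constants here are made-up examples, not anything prescribed):

```js
// Illustrative bucket choices only; any non-overlapping buckets work.
const SCORE_AD_TIMEOUT_BUCKET = 0x100n;
const REPORT_RESULT_TIMEOUT_BUCKET = 0x200n;

function scoreAd(adMetadata, bid, auctionConfig, trustedScoringSignals,
                 browserSignals) {
  // Reported at most once per (sub)auction, aggregated over all scoreAd() runs:
  // how often scoreAd() hit sellerTimeout (or the 50 ms default).
  privateAggregation.contributeToHistogramOnEvent('reserved.once', {
    bucket: SCORE_AD_TIMEOUT_BUCKET,
    value: { baseValue: 'percent-scripts-timeout' },
  });
  // ... actual scoring logic ...
  return 1;  // placeholder desirability score
}

function reportResult(auctionConfig, browserSignals) {
  // In a reporting function this is effectively 0 or 100: did this
  // reportResult() run hit reportingTimeout (or the 50 ms default)?
  privateAggregation.contributeToHistogramOnEvent('reserved.always', {
    bucket: REPORT_RESULT_TIMEOUT_BUCKET,
    value: { baseValue: 'percent-scripts-timeout' },
  });
  // ... actual reporting logic ...
}
```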

Contributor:

> I am not sure what exactly you mean by || 50

As I read the explainer/impl guide, 50 ms is the script-level default absent a relevant timeout having been configured.

Contributor:

> (And if you start seeing some timeouts, you might decide that you need a bigger sellerTimeout or a bigger reportingTimeout, respectively.)

Makes sense for a seller observing their script timeouts as they control the sellerTimeout knob.

But a buyer seeing timeouts cannot take action since the seller(s) set reportingTimeout.

morlovich (Collaborator, author):

Oh, yeah, 50 being default is true. But it's separate measurements between scoreAd and reportResult (and likewise for buyer) --- exactly so people can check whether reportingTimeout and sellerTimeout/perBuyerTimeout are set correctly independently.

(`participating-ig-count` is the quotient here).
* `cumulative-buyer-time`: total time spent on the buyer's computation, in milliseconds; this is what
would normally be compared against the per-buyer cumulative timeout. If the timeout is not hit,
the value is capped at the per-buyer cumulative timeout; if it is hit, the value will be the
timeout + 1.
dmdabbs (Contributor), Sep 6, 2024:

> If the timeout is not hit, the value is capped at the per-buyer cumulative timeout; if it is hit, the value will be the timeout + 1.

The seller has configured a perBuyerCumulativeTimeout of 500. Will this value be either 500 or 501?
And if no value was configured?

morlovich (Collaborator, author):

Good point, I should clarify that (thanks!). To answer in the meantime:
If no value was configured you'll just get a 0.
If the value was set to 500, you are not limited to just 500 or 501 --- if it took 400 ms you'll get roughly 400 ms.

The "capped" bit is that there is some sloppiness in measurements, so it's possible that nothing got timed out but it actually got measured as 501 ms. Similarly, the timeout may end up measured at 503 ms or whatever, so this is trying to normalize these so you can tell them apart. (Though maybe I should use something more than +1 for easier bucketing?)

morlovich (Collaborator, author):

Rephrased a bit

### Per-participant base values.

The newly added base values are as follows:
* `participating-ig-count`: number of interest groups that got a chance to participate in this
Contributor:

Noting that `participating-ig-count` is slightly different from the `interestGroupCount` per-buyer measurement available to buyers.

usage and whole-auction metrics rather than those specific to a particular function execution
or the winning bid.

First, the new `reserved.once` event-type is a special value that, for each sub-auction, selects a
Contributor:

Glad the contributions limit is increasing to 100.
https://chromestatus.com/feature/5114676393017344

* `average-code-fetch-time`: average time of all code fetches (including JavaScript and WASM) used
for all invocations of a particular function type for a given (sub)auction.
* `percent-scripts-timeout`: percentage of script executions of this kind that hit the per-script
timeout.
Contributor:

> percentage of script executions of this kind that hit the per-script timeout

This of this kind reading would be accessible in reporting functions via the suggested use of reserved.always.

morlovich (Collaborator, author):

Not sure what you mean here?

Contributor:

Poor wording as I wrapped my head around values not strictly associated with reserved.once.

> percent-scripts-timeout: percentage of script executions of this kind that hit the per-script timeout.

My comment was recognizing that one must use reserved.always in a reporting function. Since this baseValue actually measures invocations of the function in which the histogram contribution is reported, might the wording be "of this function"?

> while percent-scripts-timeout in reportWin() is either 0 or 100 dependent on whether the reporting function's execution timed out or not (assuming reporting is done early enough to happen if it did)

This applies to scoreAd() also, right? Maybe "in reporting functions"?


> average-code-fetch-time: average time of all code fetches (including JavaScript and WASM) used for all invocations of a particular function type for a given (sub)auction.

I read this as a different measurement per function type (bid/score versus reporting), but until Chrome introduces a separate reporting logic URL (#679) there's no differentiation, right? Because the fetches are part of the executor setup.

To verify: whether using .once or .always (in reportResult), sellers will receive the same value, their scoringLogicURL's code fetch time.

Buyers would also receive the same value regardless of where the contribution is reported.
A buyer with IGs using only biddingLogicURLs would receive the average code fetch time across IGs that made it that far.
A buyer with IGs using WASM as well would receive the average code fetch time irrespective of type.

morlovich (Collaborator, author), Sep 9, 2024:

So reporting code fetch time being the same as for generateBid/scoreAd is not necessarily true, since we have a process limit, and so the process may actually need to be started fresh for reporting (and fetch fresh... hopefully from cache). That's actually the common case for reportWin (basically unless a concurrent auction keeps the process alive).

Contributor:

Oh, yes, right. You don't hold onto that limited resource.

> There are separate Chrome-wide limits on the number of buyer processes (10) and seller processes (3). These numbers are not optimized, and subject to change. On mobile, Chrome will generally reuse the single renderer process with one thread for bidders and one for sellers, instead of using more secure, isolated processes, due to resource constraints.

Verifying that "on mobile" equates to `Sec-CH-UA-Mobile: ?1`.

morlovich (Collaborator, author):

No, it equates to "Android" (so telling DevTools to emulate an iPhone on your laptop won't change that... nor would "request desktop site" on a phone). I suppose it should probably just say that.

morlovich (Collaborator, author) left a comment:

Thanks for the feedback!

@@ -255,6 +256,85 @@ The `reserved.always` event-type is a special value that indicates a report should be sent no
matter whether a bid wins the auction or not. The browser will always trigger reporting for the
`always` contributions.

## Per-participant metrics

More recent versions of Chrome offer some additional features, focused on measuring resource
usage and whole-auction metrics rather than those specific to a particular function execution
or the winning bid.
Contributor:

M130 intends to enable Bidding & Auction Services by default. Do the new per-participant capabilities for now apply only to "classic" FLEDGE on-device auctions?

morlovich (Collaborator, author):

Practically, yeah, there is no support for B&A. It would need to be added to the B&A server (when it sends over PrivateAggregation stuff it just sends over the computed bucket and value numbers, no baseValues at this point).

... But I think it won't need any Chrome-side changes.

aarongable pushed a commit to chromium/chromium that referenced this pull request Sep 24, 2024
See  WICG/turtledove#1272 for more context.

Bug: 361262468
Change-Id: I439706b8147aa8d75140ad07931bb24aea78248f
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5782999
Reviewed-by: Caleb Raitto <[email protected]>
Commit-Queue: Maks Orlovich <[email protected]>
Reviewed-by: Joe Mason <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1359340}
aarongable pushed a commit to chromium/chromium that referenced this pull request Sep 25, 2024
In particular:
  regular-igs-count,
  percent-regular-ig-count-quota-used,
  negative-igs-count,
  percent-negative-ig-count-quota-used,
  ig-storage-used,
  percent-ig-storage-quota-used


See WICG/turtledove#1272 for more context.

Bug: 361262468
Change-Id: I4098696d1d6eca9f573ae500c754fc8c8f13a92b
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5789141
Commit-Queue: Maks Orlovich <[email protected]>
Reviewed-by: Ken Buchanan <[email protected]>
Reviewed-by: Caleb Raitto <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1360186}
fhoering (Contributor) commented Nov 5, 2024:

@morlovich @alexmturner
Thanks for those new metrics; they would allow better debugging.

Having more metrics raises the importance of ticket #1084, because currently for timings a large key space must be allocated and overlaps of buckets are not handled.
The issue talks about timings, but potentially the same issue will arise for the percentage metrics added here, depending on the precision with which they are reported (int, or float with n digits).

morlovich (Collaborator, author):

A lot of these are bounded (...though some by configuration which may be under a different party's control). cumulative-buyer-time is bounded by the configured limit plus 1000 (if there is no limit configured, it's 0).
The percentages are bounded by... 110.

Hmm, I guess network times are technically not bounded, and maybe I should add that. The fetch itself does have a 30 second timeout, but there is some sloppiness in measurement. Similarly for script-run-time --- the actual execution has a timeout, but the measured time may come out slightly over it.

You do also have 128 bits of address space, however, so you can just give 2^64 metrics 2^64 space each. The values are initially measured as floats, and then, after application of your scale and offset, truncated.

fhoering (Contributor) commented Nov 5, 2024:

> The percentages are bounded by... 110.

What is the unit and precision of the percentages (int vs. float)? Technically, if values are close, 99.5% and 99.9% can make a difference.

> The fetch itself does have a 30 second timeout,

Which network fetch exactly? It looks low to me in general. Do you have the Chromium source code for that?

> Hmm, I guess network times are technically not bounded, and maybe I should add that.

The client should be able to decide how to bound and bucketize them based on the use case. It should not be hardcoded.

> You do also have 128 bits of address space, however, so you can just give 2^64 metrics 2^64 space each.

My understanding is that this is not how it works. For timers, if I allocate 2^64, I cannot use this for something else. But I'll let @alexmturner confirm or explain why it would work.

morlovich (Collaborator, author):

> > The percentages are bounded by... 110.
>
> What is the unit and precision of the percentages (int vs. float)? Technically, if values are close, 99.5% and 99.9% can make a difference.

They start as a double. Then scale and offset are applied, and the result is converted to an int.
So if you want more digits of precision, you can use a scale. E.g., without applying any you just end up with 99 and 99, but if you apply a scale of 10 you will end up with... 995 and either 999 or 998. Not sure exactly how the decimal ends up in binary.
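
So keeping one decimal digit of a percentage could look like this (sketch; the bucket is illustrative):

```js
// With scale: 10, 99.5% and 99.9% land at 995 and 999 (or 998, depending on
// how the double truncates) instead of both collapsing to 99.
privateAggregation.contributeToHistogramOnEvent('reserved.once', {
  bucket: 0x400n,  // illustrative bucket
  value: { baseValue: 'percent-scripts-timeout', scale: 10 },
});
```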

> > The fetch itself does have a 30 second timeout,
>
> Which network fetch exactly? It looks low to me in general. Do you have the Chromium source code for that?

Scripts/WASM, and trusted signals (might not be the case for those with the V2 protocol, not sure).

https://source.chromium.org/chromium/chromium/src/+/main:content/services/auction_worklet/public/cpp/auction_downloader.cc;drc=be1dfd15d36d914a9feb33677a0c836c7922c689;l=253

> > You do also have 128 bits of address space, however, so you can just give 2^64 metrics 2^64 space each.
>
> My understanding is that this is not how it works. For timers, if I allocate 2^64, I cannot use this for something else. But I'll let @alexmturner confirm or explain why it would work.

Well, I mean you make the first histogram have bucket offset 0, the second offset 0x100000000, the third 0x200000000, etc. (however one specifies hex for BigInts in JS, anyway).
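
In JS those would be BigInt literals with an `n` suffix; a sketch of carving up the 128-bit key space that way (assuming the `{baseValue, scale, offset}` bucket form from the extended reporting explainer; the metric-to-offset assignment is arbitrary):

```js
// Disjoint regions of the 128-bit bucket space, one per metric.
const SCRIPT_TIME_OFFSET = 0x0n;
const FETCH_TIME_OFFSET = 0x100000000n;   // 2^32
const TIMEOUT_PCT_OFFSET = 0x200000000n;
// ...or give each metric a full 2^64-wide region:
const CUMULATIVE_TIME_OFFSET = 0x10000000000000000n;  // 2^64

// Encode the measured value into the bucket within its region; count 1 per report.
privateAggregation.contributeToHistogramOnEvent('reserved.once', {
  bucket: { baseValue: 'script-run-time', scale: 1, offset: SCRIPT_TIME_OFFSET },
  value: 1,
});
```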
