Explainer for PA per-participant metrics #1272

Open · wants to merge 9 commits into main
90 changes: 90 additions & 0 deletions FLEDGE_extended_PA_reporting.md
@@ -214,6 +214,7 @@ Where `signalBucket` and `signalValue` are dictionaries which consist of:
* generateBid() hitting timeout
* The auction was aborted (i.e. calling endAdAuction())
* an auction that never rendered the ad
* Some additional values described [separately](#per-participant-base-values).
* optional `offset` and `scale` that allow the reporter to manipulate the browser signal to customize the buckets and values they will receive in reports:
* `scale` will be multiplied by the browser provided value. Scale is applied before `offset` is added. Default value is 1.0.
* `offset` will be added to the browser provided value (allows you to shift buckets, etc). Default value is 0.
@@ -255,6 +256,95 @@ The `reserved.always` event-type is a special value that indicates a report should be sent no
matter whether a bid wins the auction or not. The browser will always trigger reporting for the
`always` contributions.

## Per-participant metrics

More recent versions of Chrome offer some additional features, focused on measuring resource
usage and whole-auction metrics rather than those specific to a particular function execution
or the winning bid.

> **Contributor:** M130 intends to enable Bidding & Auction Services by default. Do the new
> per-participant capabilities for now apply only to "classic" FLEDGE on-device auctions?
>
> **Collaborator (author):** Practically, yeah, there is no support for B&A. It would need to be
> added to the B&A server (when it sends over Private Aggregation contributions it just sends over
> the computed bucket and value numbers, no baseValues at this point).
>
> ... But I think it won't need any Chrome-side changes.

First, the new `reserved.once` event-type is a special value that, for each (sub)auction, selects a
random invocation of `generateBid()` and of `scoreAd()` independently, and reports private
aggregation contributions with that event only from those executions. (In the case of an auction
with component auctions, the top-level auction will have a single `scoreAd()` invocation selected
as well.)

The event may not be used in `reportWin()` or `reportResult()`; since those already run at most
once per auction level, `reserved.always` may be used instead.

This feature is intended for reporting overall per-participant metrics once per (sub)auction rather
than once for every participating interest group. A number of new `baseValues` representing such
metrics are available and described below, but `reserved.once` can also be useful for sampling
per-IG metrics that are not expected to vary much, such as `signals-fetch-time`.

While usage of `reserved.once` will be silently ignored by older versions of Chrome, newly added
`baseValues` will not be (they cause an exception), so calls to `contributeToHistogramOnEvent()`
should be individually wrapped in `try/catch`. That is also encouraged in general, since
`contributeToHistogramOnEvent()` is specified to throw on permission policy violations.
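
A minimal sketch of such a guarded call, e.g. inside `generateBid()` (the bucket number here is an
arbitrary placeholder, not part of the API):

```javascript
// Contribute a whole-auction metric once per (sub)auction. The try/catch keeps
// a browser that does not recognize the baseValue (or a permission policy
// violation, which also throws) from breaking the rest of the script.
try {
  privateAggregation.contributeToHistogramOnEvent('reserved.once', {
    bucket: 1000n,  // arbitrary example bucket for this metric
    value: { baseValue: 'participating-ig-count' },
  });
} catch (e) {
  // Older versions throw on baseValues they don't recognize; ignore and continue.
}
```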

Users of this feature are strongly encouraged to report their metrics at the beginning of their
scripts: if the script hits a per-script timeout before asking to report them, nothing will get
sent, which can result in inaccuracy, especially for `percent-scripts-timeout`.
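
A sketch of that ordering inside `generateBid()` (the function body and bucket value are
placeholders; only the placement of the contribution before the expensive work matters):

```javascript
function generateBid(interestGroup, auctionSignals, perBuyerSignals,
                     trustedBiddingSignals, browserSignals) {
  // Queue whole-auction metrics first, so they are reported even if the
  // bidding logic below later hits the per-script timeout.
  try {
    privateAggregation.contributeToHistogramOnEvent('reserved.once', {
      bucket: 2000n,  // arbitrary example bucket
      value: { baseValue: 'percent-scripts-timeout' },
    });
  } catch (e) {
    // Ignore, as discussed above.
  }

  // ... potentially expensive bid computation only happens after the metrics
  // are queued ...
  return { ad: {}, bid: 1, render: interestGroup.ads[0].renderURL };
}
```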

### Per-participant base values.

The newly added base values are as follows:
* `participating-ig-count`: number of interest groups that got a chance to participate in this
  (sub)auction, i.e. they had registered ads, did not have unsatisfied capabilities, and were not
  filtered based on priority. Interest groups included in this might not actually get to bid if the
  cumulative timeout expires, or the script fails to load, etc., but they would have if nothing
  went wrong.

  > **Contributor:** Noting that `participating-ig-count` is slightly different from the
  > `interestGroupCount` per-buyer measurement available to buyers.
* `average-code-fetch-time`: average time of all code fetches (including JavaScript and WASM) used
  for all invocations of a particular function type for a given (sub)auction.
* `percent-scripts-timeout`: percentage of script executions of this kind that hit the per-script
  timeout.

  > **Contributor:** To verify, for a seller this would be the percentage of `scoreAd()` executions
  > that hit `sellerTimeout || 50` and `reportResult()` executions that hit `reportingTimeout || 50`.
  > And for a buyer, the percentage of `generateBid()` executions that hit a `perBuyerTimeout || 50`
  > and `reportWin()` executions that hit `reportingTimeout || 50`.
  >
  > **Collaborator (author):** I am not sure what exactly you mean by `|| 50`, but it's not
  > combining `scoreAd()` and `reportResult()`, nor `generateBid()` and `reportWin()`.
  >
  > `scoreAd()` would get its own timeouts, and `reportResult()` its own (which is always 0 or 100).
  > So you would want to report it as `reserved.once` to some range of buckets for `scoreAd()` and
  > as `reserved.always` to another range of buckets for `reportResult()`. (And if you start seeing
  > some timeouts, you might decide that you need a bigger `sellerTimeout` or a bigger
  > `reportingTimeout`, respectively.)
  >
  > The big table below is supposed to be explaining this...
  >
  > **Contributor:** As I read the explainer/impl guide, 50 is the script-level default absent a
  > relevant timeout having been configured.
  >
  > **Contributor:** Makes sense for a seller observing their script timeouts, as they control the
  > `sellerTimeout` knob. But a buyer seeing timeouts cannot take action, since the seller(s) set
  > `reportingTimeout`.
  >
  > **Collaborator (author):** Oh, yeah, 50 being the default is true. But these are separate
  > measurements for `scoreAd()` and `reportResult()` (and likewise for the buyer), exactly so
  > people can check whether `reportingTimeout` and `sellerTimeout`/`perBuyerTimeout` are set
  > correctly independently.

  > **Contributor:**
  > > percentage of script executions of this kind that hit the per-script timeout
  >
  > This "of this kind" reading would be accessible in reporting functions via the suggested use of
  > `reserved.always`.
  >
  > **Collaborator (author):** Not sure what you mean here?
  >
  > **Contributor:** Poor wording as I wrapped my head around values not strictly associated with
  > `reserved.once`. My comment was recognizing that one must use `reserved.always` in a reporting
  > function. Since this baseValue actually measures invocations of the function in which the
  > histogram contribution is reported, might the wording be "of this function"?
  > > while `percent-scripts-timeout` in `reportWin()` is either 0 or 100 dependent on whether the
  > > reporting function's execution timed out or not (assuming reporting is done early enough to
  > > happen if it did)
  >
  > This applies to `scoreAd()` also, right? Maybe "in reporting functions"?
  > > `average-code-fetch-time`: average time of all code fetches (including JavaScript and WASM)
  > > used for all invocations of a particular function type for a given (sub)auction.
  >
  > I read this as a different measurement by function type (bid/score versus reporting), but until
  > Chrome introduces a separate reporting logic URL (#679) there's no differentiation, right?
  > Because the fetches are part of the executor setup.
  >
  > To verify, whether using `.once` or `.always` (in `reportResult()`), sellers will receive the
  > same value: their `scoringLogicURL`'s code fetch time. Buyers would also receive the same value
  > regardless of where the contribution is reported. A buyer with IGs using only `biddingLogicURL`s
  > would receive the average code fetch across IGs that made it that far. A buyer with IGs using
  > WASM as well would receive the average code fetch irrespective of type.
  >
  > **Collaborator (author, @morlovich, Sep 9, 2024):** Reporting code fetch time being the same as
  > for `generateBid()`/`scoreAd()` is not necessarily true, since we have a process limit, and so
  > the process may actually need to be started fresh for reporting (and fetch fresh... hopefully
  > from cache). That's actually the common case for `reportWin()` (basically unless a concurrent
  > auction keeps the process alive).
  >
  > **Contributor:** Oh, yes, right. You don't hold onto that limited resource.
  > > There are separate Chrome-wide limits on the number of buyer processes (10) and seller
  > > processes (3). These numbers are not optimized, and subject to change. On mobile, Chrome will
  > > generally reuse the single renderer process with one thread for bidders and one for sellers,
  > > instead of using more secure, isolated processes, due to resource constraints.
  >
  > Verifying that "on mobile" equates to `Sec-CH-UA-Mobile: ?1`.
  >
  > **Collaborator (author):** No, it equates to "Android" (so telling DevTools to emulate an iPhone
  > on your laptop won't change that... nor would "request desktop site" on a phone). I suppose it
  > should probably just say that.

* `percent-igs-cumulative-timeout`: percentage of interest groups from this buyer that did not get
  to participate in this (sub)auction due to the per-buyer cumulative timeout
  (`participating-ig-count` is the denominator here).
* `cumulative-buyer-time`: total time spent on the buyer's computation, in milliseconds; this is
  what would normally be compared against the per-buyer cumulative timeout (which must be set for
  this to be non-zero). If the timeout is not hit, the value will be how long the buyer actually
  took, capped by the per-buyer cumulative timeout; if the timeout is hit, the reported value will
  be the timeout + 1000.
* `percent-regular-ig-count-quota-used`, `percent-negative-ig-count-quota-used`,
  `percent-ig-storage-quota-used`: percentage of the database quota used by the buyer for
  regular interest group count, negative targeting interest group count, and overall byte usage,
  respectively. This is capped at 110 since the quotas may not be enforced immediately (and actual
  usage in that case may be bigger than 110%).
* `regular-igs-count`, `negative-igs-count`, `ig-storage-used`: the raw values for the buyer's
  number of regular interest groups, number of negative targeting interest groups, and overall byte
  usage, respectively. These are also capped at 1.1x the current quota, but please do keep in mind
  that the quota might increase in the future, so if you use these metrics rather than the
  percentage-based ones, you may wish to reserve some extra margin around the bucket space (perhaps
  something like 15x) to avoid confusion in the future. A sketch of reporting several of these
  metrics follows this list.
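
For illustration, a sketch of how a buyer might pack several of these metrics into distinct bucket
ranges from `generateBid()`. The bucket numbers and the scale factor are arbitrary choices for the
example, not part of the API:

```javascript
// Each metric gets its own bucket so the aggregated histogram can be decoded
// later. Values are scaled where needed so they fit the expected range.
const perAuctionMetrics = [
  { bucket: 101n, baseValue: 'percent-igs-cumulative-timeout' },
  { bucket: 102n, baseValue: 'percent-regular-ig-count-quota-used' },
  { bucket: 103n, baseValue: 'percent-ig-storage-quota-used' },
  // cumulative-buyer-time is in milliseconds; scale it down so that a hit
  // timeout (reported as timeout + 1000) still fits the expected value range.
  { bucket: 104n, baseValue: 'cumulative-buyer-time', scale: 0.1 },
];

for (const metric of perAuctionMetrics) {
  try {
    privateAggregation.contributeToHistogramOnEvent('reserved.once', {
      bucket: metric.bucket,
      value: { baseValue: metric.baseValue, scale: metric.scale ?? 1.0 },
    });
  } catch (e) {
    // Older versions throw on baseValues they don't recognize; skip and continue.
  }
}
```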

Note that these metrics are measured only for some kinds of worklet executions — some are
only relevant for bidders, and get 0 in the seller functions. In case of reporting functions,
they sometimes repeat what was available in the corresponding `generateBid()` or `scoreAd()`,
and sometimes get their own measurement. This is shown below:

| `baseValue` name | In `generateBid()` | In `reportWin()` | In `scoreAd()` | In `reportResult()` |
| ---------------- | ------------------- | ---------------- | -------------- | ------------------- |
| `average-code-fetch-time` | Measured | Measured | Measured | Measured |
| `percent-scripts-timeout` | Measured | Measured | Measured | Measured |
| `participating-ig-count` | Measured | From `generateBid()` | 0 | 0 |
| `percent-igs-cumulative-timeout` | Measured | From `generateBid()` | 0 | 0 |
| `cumulative-buyer-time` | Measured | From `generateBid()` | 0 | 0 |
| `percent-regular-ig-count-quota-used` | Measured | From `generateBid()` | 0 | 0 |
| `percent-negative-ig-count-quota-used` | Measured | From `generateBid()` | 0 | 0 |
| `percent-ig-storage-quota-used` | Measured | From `generateBid()` | 0 | 0 |
| `regular-igs-count` | Measured | From `generateBid()` | 0 | 0 |
| `negative-igs-count` | Measured | From `generateBid()` | 0 | 0 |
| `ig-storage-used` | Measured | From `generateBid()` | 0 | 0 |

For example, `percent-scripts-timeout` in `generateBid()` is the portion of executions of
`generateBid()` in that (sub)auction that timed out, while `percent-scripts-timeout` in
`reportWin()` is either 0 or 100, depending on whether the reporting function's execution timed out
or not (assuming reporting is done early enough to happen if it did); `percent-igs-cumulative-timeout`,
in contrast, will be the same value in both.

Similarly, `percent-scripts-timeout` makes sense for seller functions like `scoreAd()`, but
`percent-igs-cumulative-timeout` doesn't, so it just evaluates to 0.
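
As a sketch of how a seller might keep the `scoreAd()` and `reportResult()` timeout measurements in
separate bucket ranges, following the pattern discussed in the review thread above (the bucket
numbers are arbitrary placeholders):

```javascript
// In scoreAd(): one sampled contribution per (sub)auction, covering all
// scoreAd() executions in it.
try {
  privateAggregation.contributeToHistogramOnEvent('reserved.once', {
    bucket: 200n,  // arbitrary bucket range for scoreAd() timeouts
    value: { baseValue: 'percent-scripts-timeout' },
  });
} catch (e) {}

// In reportResult(): reserved.once is not allowed in reporting functions, so
// use reserved.always; the value will be 0 or 100 for this single execution.
try {
  privateAggregation.contributeToHistogramOnEvent('reserved.always', {
    bucket: 300n,  // separate bucket range for reportResult() timeouts
    value: { baseValue: 'percent-scripts-timeout' },
  });
} catch (e) {}
```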

## Reporting Per-Buyer Latency and Statistics to the Seller

The seller may want to collect aggregate statistics on latency and bids for their auctions.