Further reducing aggregatable report delays #974

Closed
csharrison opened this issue Aug 28, 2023 · 2 comments · Fixed by #1113 or #1114

csharrison commented Aug 28, 2023

Recent feedback (meeting notes) indicates that even the new, modest report delays for aggregatable reports are not ideal, and that we should explore the instant-style approaches outlined in https://github.com/WICG/attribution-reporting-api/blob/main/report_verification.md#could-we-just-tag-reports-with-a-trigger_id-instead-of-using-anonymous-tokens. Here is what a privacy-neutral enhancement to the API that enables this use case could look like:

Trigger registration could come along with an optional top-level parameter that ad techs can set: trigger_event_id [1]. If this parameter is set, a few things happen:

  • Aggregatable reports will be generated for this registration unconditionally, even if it was not preceded by a source event. In that case, any generated report will act like the existing null aggregatable reports, i.e. it will not affect any aggregate computation.
  • Aggregatable reports associated with this registration will be delivered immediately after the browser receives the registration.
  • Aggregatable reports associated with this registration will carry a new piece of cleartext information in their bodies: the trigger_event_id.

By default, this will result in a single report per registration. However, if aggregatable_source_registration_time is “include”, it will require generating 30 reports for every trigger registration. Given that 30 reports per registration is not ideal from a system health perspective, we may want to consider disallowing this combination and only support instant reports if aggregatable_source_registration_time is “exclude” (feedback welcome though!). The purpose of these “unconditional” reports is to ensure that the output of the trigger registration does not leak any cross-site information (i.e. whether the user had previously seen an ad or not).
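For illustration, here is a minimal Python sketch of how a server might build a trigger registration under this proposal. It assumes the registration is sent as JSON in the Attribution-Reporting-Register-Trigger response header; trigger_event_id is only the name proposed in this issue, and the key name campaignCounts, the sample values, and both helper functions are hypothetical.

```python
import json


def build_trigger_registration(trigger_event_id=None,
                               source_registration_time="exclude"):
    """Build the JSON body of a trigger registration (illustrative values)."""
    registration = {
        "aggregatable_trigger_data": [
            {"key_piece": "0x400", "source_keys": ["campaignCounts"]}
        ],
        "aggregatable_values": {"campaignCounts": 32768},
        "aggregatable_source_registration_time": source_registration_time,
    }
    if trigger_event_id is not None:
        # Proposed parameter from this issue; not an existing API field here.
        registration["trigger_event_id"] = trigger_event_id
    return registration


def instant_reports_per_registration(registration):
    """Reports sent unconditionally when trigger_event_id is set,
    following the counts given in the proposal text (1 or 30)."""
    if "trigger_event_id" not in registration:
        raise ValueError("not an instant-report registration")
    if registration["aggregatable_source_registration_time"] == "include":
        return 30
    return 1


registration = build_trigger_registration(trigger_event_id="abc123")
header_value = json.dumps(registration)  # value for the response header
print(header_value)
print(instant_reports_per_registration(registration))
```

If the “include” combination ends up disallowed, as floated above, the second helper would reject it instead of returning 30.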

cc @thegreatfatzby , @bmayd, and @dmdabbs who voiced support for reducing delays all the way to 0 in our Aug 21 call, especially in relation to CPA-based billing.

Note: this proposal may greatly increase (up to 20x [2]) the number of reports you receive and need to query the aggregation service with, which can impact aggregation query latency. Here is a derivation to find out how it will impact you:

  • Let true_rate be the fraction of registrations that lead to a true report (i.e. have a previous matching source)
  • With no noise, the number of reports is true_rate * num_triggers
  • Currently, with the 5% flip probability the number of reports is ((1 - true_rate) * .05 + true_rate) * num_triggers = (.95 * true_rate + .05) * num_triggers
  • With this proposal, the number of reports is num_triggers. Thus, adopting this proposal would result in (1 / (.95 * true_rate + .05)) times as many total reports as the status quo (note that this smoothly interpolates between 20x at true_rate = 0 and 1x at true_rate = 1).

Note that with true_rate = .2, this gives you ~4x the total reports of the status quo.
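The derivation above can be checked with a few lines of Python. This is a sketch under the same assumptions as the text (a 5% flip probability and aggregatable_source_registration_time set to “exclude”); the function name report_multiplier is made up for this example.

```python
def report_multiplier(true_rate, flip_probability=0.05):
    """Ratio of reports under this proposal to reports under the status quo.

    Status quo: every attributed trigger sends a report, and each
    unattributed trigger sends a null report with flip_probability.
    Proposal: every trigger registration sends exactly one report.
    """
    if not 0.0 <= true_rate <= 1.0:
        raise ValueError("true_rate must be between 0 and 1")
    status_quo_per_trigger = (1 - true_rate) * flip_probability + true_rate
    proposal_per_trigger = 1.0
    return proposal_per_trigger / status_quo_per_trigger


for rate in (0.0, 0.2, 1.0):
    print(f"true_rate={rate:.1f}: {report_multiplier(rate):.2f}x")
# true_rate=0.0: 20.00x
# true_rate=0.2: 4.17x
# true_rate=1.0: 1.00x
```

Plugging in your own observed true_rate gives a rough estimate of the extra report volume to plan for.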

Footnotes

  1. Keen observers will note the similarities between this and the context_id param in https://github.com/patcg-individual-drafts/private-aggregation-api/blob/main/report_verification.md#shared-storage

  2. Assuming aggregatable_source_registration_time is “exclude”.

@chandan-giri

Google Ads is supportive of the proposal given it removes ~3% transmission loss due to the reporting delays of 0-10 min. To overcome the increased processing cost from additional null reports, ads can selectively register the trigger for important conversions like purchase and checkout.

bmayd commented Nov 12, 2023

I have a standing conflict with this meeting these days, but wanted to respond. If I'm understanding the proposal, the suggestion is to trade an increased processing burden for more timely reporting. I think that's fine as long as participants have control over the trade-off, which it sounds like they do, and as long as there's no loss of signal, just an increased effort to recover it, which it also sounds like is the case.
