‘Extra Report Delay’ for Aggregate API #724

Open
renanfel opened this issue Mar 16, 2023 · 0 comments
Labels
possible-future-enhancement Feature request with no current decision on adoption

Comments

@renanfel
Contributor

I would like to propose adding an ‘extra report delay’ to the aggregatable report’s shared info and to the definition of the shared ID for the Aggregate API, to partially address the loss of reports that arrive late.

Context

Aggregatable reports today are scheduled to be sent with a random delay between 10 minutes and 1 hour. However, reports may be delayed further for a variety of reasons, for example when the user was offline at the time the report was scheduled to be sent.

Currently, the definition of the shared ID prevents the ad tech from processing a “delayed aggregatable report” if aggregatable reports with the same shared ID were already processed. For example, assume an ad tech employs a batching and processing strategy of starting to process reports 2 hours after all were scheduled to arrive. In this case, when the ad tech tries to process a batch of reports that arrived with a longer delay (i.e. after processing has started), the aggregation service will reject it. See, for example, item #1 in #716.
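To make the failure mode concrete, here is a toy model of the rejection described above. It is only a sketch: `ToyAggregationService` and its `process` method are hypothetical names, shared IDs are modelled as opaque strings, and the real aggregation service's checks are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAggregationService:
    """Hypothetical stand-in that remembers which shared IDs were processed."""

    processed_ids: set = field(default_factory=set)

    def process(self, batch_shared_ids):
        """Accept a batch only if none of its shared IDs was processed before."""
        ids = set(batch_shared_ids)
        if ids & self.processed_ids:
            # A late report reuses a shared ID from an earlier batch.
            raise ValueError("batch contains an already-processed shared ID")
        self.processed_ids |= ids
        return len(batch_shared_ids)


service = ToyAggregationService()
service.process(["id-A", "id-A", "id-B"])  # on-time reports, accepted

try:
    service.process(["id-A"])  # delayed report with the same shared ID
except ValueError as err:
    print("rejected:", err)
```

In this model the delayed report can never be processed, because its shared ID is indistinguishable from the one used by the on-time batch.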

Proposal

With this change, the browser will include in the aggregatable report a new field, extra_report_delay, which reflects how long the report was delayed in being delivered to the ad tech endpoint, beyond the intended random delay. In other words, it’s the difference between the delivery time and scheduled report time.
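As an illustration of that definition, the value could be computed roughly as follows. This is a sketch, not browser code: the function name is hypothetical, and clamping negative values to zero is my own assumption rather than part of the proposal.

```python
from datetime import datetime, timedelta, timezone


def extra_report_delay(scheduled_report_time: datetime,
                       delivery_time: datetime) -> timedelta:
    """Difference between the delivery time and the scheduled report time.

    Assumption: a report delivered at or before its scheduled time is
    treated as having zero extra delay.
    """
    delay = delivery_time - scheduled_report_time
    return max(delay, timedelta(0))


scheduled = datetime(2023, 3, 16, 12, 30, tzinfo=timezone.utc)
delivered = datetime(2023, 3, 16, 17, 30, tzinfo=timezone.utc)
print(extra_report_delay(scheduled, delivered))  # 5:00:00
```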

To minimize performance impact on the Aggregation Service, we expect to bucket the extra_report_delay field, for example: no/little delay (<= 2 hrs), some delay (2 hrs to 24 hrs), long delay (> 24 hrs).
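A minimal sketch of such bucketing, using the example thresholds above. The bucket labels and the function name are hypothetical, and treating exactly 24 hours as "some delay" is an assumption, since the example only says that long delay is more than 24 hours.

```python
from datetime import timedelta


def bucket_extra_delay(extra_delay: timedelta) -> str:
    """Map an extra delay onto one of three coarse example buckets."""
    if extra_delay <= timedelta(hours=2):
        return "no_or_little_delay"  # <= 2 hrs
    if extra_delay <= timedelta(hours=24):
        return "some_delay"          # more than 2 hrs, up to 24 hrs
    return "long_delay"              # > 24 hrs


print(bucket_extra_delay(timedelta(minutes=30)))  # no_or_little_delay
print(bucket_extra_delay(timedelta(hours=5)))     # some_delay
print(bucket_extra_delay(timedelta(days=3)))      # long_delay
```

Keeping the number of buckets small limits how many distinct shared IDs a single scheduled report time can fan out into.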

By expanding the definition of the shared ID, ad techs could generate summary reports using the aggregatable reports that arrive with little or no delay, and later process delayed reports. Deciding how to batch the reports with the extra_report_delay field will be based on balancing utility and privacy. Two illustrative examples:

  • Ad tech continues to batch and process reports regardless of the extra delay value. The summary reports will include the same level of noise as today, but the ad tech may not be able to process delayed reports.
  • Ad tech batches and processes reports separately by extra_report_delay. Assuming the first value of the field is no/little delay (<= 2 hrs), the ad tech can process all reports with the extra_report_delay value of "no/little delay" as early as two hours after the scheduled report time and generate a summary report (with noise drawn). Later, when the ad tech processes the longer-delayed aggregatable reports, another summary report will be generated (with noise drawn again).
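The second strategy could look roughly like this on the ad tech side. This is a sketch under assumptions: reports are modelled as plain dicts, the `extra_report_delay` key holds an already-bucketed label, and the `id` key and bucket labels are hypothetical. Each resulting batch would be sent for aggregation separately, so each summary report gets its own noise.

```python
from collections import defaultdict


def batch_by_extra_delay(reports):
    """Group reports into one batch per extra_report_delay bucket."""
    batches = defaultdict(list)
    for report in reports:
        batches[report["extra_report_delay"]].append(report)
    return dict(batches)


reports = [
    {"id": "r1", "extra_report_delay": "no_or_little_delay"},
    {"id": "r2", "extra_report_delay": "long_delay"},
    {"id": "r3", "extra_report_delay": "no_or_little_delay"},
]

batches = batch_by_extra_delay(reports)
for bucket, batch in batches.items():
    print(bucket, [r["id"] for r in batch])
```

Here the "no_or_little_delay" batch can be processed early, while the "long_delay" batch is held until those reports have arrived.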

We are looking for feedback on this proposal, especially on the following:

  1. Despite the additional noise generated when delayed reports are processed, would it be more useful than the current state of not being able to process such delayed reports?
  2. To inform the definition of delay buckets -- what’s the typical processing strategy of ad techs?
@csharrison csharrison added the possible-future-enhancement Feature request with no current decision on adoption label Jun 26, 2023