
Reducing delays for aggregatable reports #738

Closed
csharrison opened this issue Mar 28, 2023 · 2 comments · Fixed by #749 or #750

Comments

@csharrison
Collaborator

Currently, aggregatable reports in the API come with a randomized delay of 10 minutes to 1 hour, though the delay can often be much larger due to offline users or inactive browsers. This delay serves to protect whether any particular user had an attributed conversion, which is cross-site data. However, if aggregatable reports leaked less cross-site data, they could be delivered with reduced delay.

In this issue I propose reducing the delay in the API to something like ~0-10 minutes instead of ~10-60 minutes.

We can do this by taking ideas from issue #439 and introducing null reports for some fraction of trigger registrations, to reduce the total amount of cross-site information embedded in a report. Because a lot of the cross-site information is embedded in the existing source_registration_time field, we are also thinking of making this field optional as part of this change. E.g. in the trigger registration JSON:

{
  "aggregatable_source_registration_time": "omit"  // or "include"
  ...
}

If the source_registration_time field is present, the null report rate will need to be higher. Currently we are thinking of something like ~0.05 null reports (in expectation) for triggers which don’t specify the source registration time, and ~0.25 if they do.

Note: ideally this level of configuration would be done globally at the reporting-origin level rather than the trigger level, but that would require introducing global configuration to the API across users. This is something we might explore in the future, so any feedback on where we’d want to vary this across ad-techs / reporting origins would be useful.

We also think that, in the future, embedding the source_registration_time inside the encrypted payload would improve the situation here and reduce the number of null reports. For this reason we think defaulting to omitting this field makes sense.

@csharrison
Collaborator Author

Let me clarify the algorithm I have in mind for generating null reports:

  • Upon a successful trigger registration (whether it generates any real reports or not):
    • If source_registration_time is omitted:
      • Sample n = GeometricDistribution(p_1), and schedule n null reports to be sent
    • Otherwise:
      • For each possible value of source_registration_time:
        • Sample n = GeometricDistribution(p_2), and schedule n null reports to be sent with the given source_registration_time

Where p_1 ≈ 0.952 and p_2 ≈ 0.992, to achieve the expected null report counts above.
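For concreteness, here is a quick sanity check of how these parameters map to the expected counts above. It assumes GeometricDistribution(p) counts the number of failures before the first success (mean (1 - p) / p), and that source_registration_time is rounded to a whole day with a maximum 30-day source expiry, giving ~31 possible values; both details are assumptions for illustration.

# Sanity check: expected null report counts implied by p_1 and p_2.
# Assumes Geometric(p) = failures before first success, mean (1 - p) / p,
# and ~31 possible source_registration_time values (days 0..30).
p_1 = 0.952
p_2 = 0.992
num_times = 31

print((1 - p_1) / p_1)              # ~0.05 expected null reports
print(num_times * (1 - p_2) / p_2)  # ~0.25 expected null reports in total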

To generate a null report, use a contribution where the key is a random 128-bit number and the value is 0. Everything else should be specified by the trigger (except source_registration_time, which is handled above). Null reports should not check or affect rate limits.
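A minimal Python sketch of the above, assuming GeometricDistribution(p) counts failures before the first success (sampled via inverse transform). The report shape, the schedule callback, and the SOURCE_REGISTRATION_TIMES constant are hypothetical names for illustration, not the actual implementation.

import math
import random
import secrets

# Assumption: source_registration_time is rounded to whole days with a
# maximum 30-day source expiry, giving 31 possible values.
SOURCE_REGISTRATION_TIMES = range(31)

P_1 = 0.952  # parameter when source_registration_time is omitted
P_2 = 0.992  # parameter per possible source_registration_time value

def sample_geometric(p):
    """Number of failures before the first success, with success probability p."""
    u = random.random()
    return math.floor(math.log(1.0 - u) / math.log(1.0 - p))

def make_null_report(source_registration_time=None):
    # A null contribution: random 128-bit key, value 0. Everything else
    # would be copied from the trigger as described above.
    report = {"contributions": [{"key": secrets.randbits(128), "value": 0}]}
    if source_registration_time is not None:
        report["source_registration_time"] = source_registration_time
    return report

def schedule_null_reports(include_source_registration_time, schedule):
    """Run on every successful trigger registration, whether or not it
    produced real reports. Null reports bypass rate limit checks entirely."""
    if not include_source_registration_time:
        for _ in range(sample_geometric(P_1)):
            schedule(make_null_report())
    else:
        for t in SOURCE_REGISTRATION_TIMES:
            for _ in range(sample_geometric(P_2)):
                schedule(make_null_report(source_registration_time=t))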

@csharrison
Collaborator Author

A simpler alternative privacy mechanism to geometric noise is a "one-way flipping" proposal. Here is the algorithm:

  • Upon a successful trigger registration:
    • If source_registration_time is omitted:
      • If no real report is present, emit a null report with probability p_1
    • Otherwise, for each possible value of source_registration_time:
      • If no real report is present with this value of source_registration_time, emit a null report with probability p_2
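A minimal self-contained sketch of this mechanism, with the same caveats as before: the report shape and SOURCE_REGISTRATION_TIMES are hypothetical names, and p_1 / p_2 here are separate flip probabilities, not the geometric parameters from the previous comment.

import random
import secrets

SOURCE_REGISTRATION_TIMES = range(31)  # assumption: whole days, 30-day expiry

def make_null_report(source_registration_time=None):
    # A null contribution: random 128-bit key, value 0.
    report = {"contributions": [{"key": secrets.randbits(128), "value": 0}]}
    if source_registration_time is not None:
        report["source_registration_time"] = source_registration_time
    return report

def one_way_flip(real_reports, include_source_registration_time, p_1, p_2, schedule):
    """Run on every successful trigger registration."""
    if not include_source_registration_time:
        # At most one fake report: flip only when no real report is present.
        if not real_reports and random.random() < p_1:
            schedule(make_null_report())
    else:
        present = {r.get("source_registration_time") for r in real_reports}
        for t in SOURCE_REGISTRATION_TIMES:
            # At most one fake report per possible source_registration_time.
            if t not in present and random.random() < p_2:
                schedule(make_null_report(source_registration_time=t))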

This mechanism lacks some interesting properties of the geometric distribution (e.g. privacy amplification from shuffling reports among many users), but at the same time I'm not sure we could achieve those with the report verification proposal. So in general, this one might be preferred because:

  1. It is simpler
  2. It emits a bounded number of fake reports (vs. the unbounded geometric distribution)

cc @linnan-github
