How to handle k-anonymity fallback #592

Closed
fhoering opened this issue May 24, 2023 · 7 comments

Comments

@fhoering
Contributor

The documentation gives the following strategy to handle k-anonymity:
(1)

Since a single interest group can carry multiple possible ads that it might show, the group will have an opportunity to re-bid another one of its ads to act as a "fallback ad" any time its most-preferred choice is below threshold. This means that a small, specialized interest group that is still below the updateUrl threshold could still choose to participate in auctions, bidding with a more-generic ad until the group becomes large enough.

(2)

If generateBid() picks an ad whose rendering URL is not yet above the browser-enforced microtargeting prevention threshold, then the function will be called a second time, this time with a modified interestGroup argument that includes only the subset of the group's ads that are over threshold. (The under-threshold ad will, however, be counted towards the microtargeting thresholding for future auctions for this and other users.)

Let's say I implement the logic explained in (1): I always choose the most specific URL first if it is present in the ads field, and if none is present I choose the less specific one.

If, as described in (2), all URLs under the threshold are removed the second time generateBid() is called, then we can never actually display the less specific one while it is itself still under the threshold, because both URLs will have been removed.
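
The dead end described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the `metadata.specificity` field, the example URLs, and the `runWithFallback` helper (a stand-in for the browser's two-round behavior quoted in (2)) are hypothetical and not part of the API.

```javascript
// Bidding logic from (1): always prefer the most specific ad that is present.
// `metadata.specificity` is a hypothetical field (higher = more targeted).
function generateBid(interestGroup) {
  const ads = interestGroup.ads ?? [];
  if (ads.length === 0) return undefined; // nothing left to bid with
  const best = ads.reduce((a, b) =>
    b.metadata.specificity > a.metadata.specificity ? b : a);
  return { bid: 1, render: best.renderUrl };
}

// Toy stand-in for the browser: if the first choice is under threshold,
// call generateBid() again with only the over-threshold ads.
function runWithFallback(interestGroup, overThreshold) {
  const first = generateBid(interestGroup);
  if (!first || overThreshold.has(first.render)) return first;
  const filtered = {
    ...interestGroup,
    ads: interestGroup.ads.filter((ad) => overThreshold.has(ad.renderUrl)),
  };
  return generateBid(filtered);
}

const demoGroup = { ads: [
  { renderUrl: "https://ads.example/generic", metadata: { specificity: 0 } },
  { renderUrl: "https://ads.example/specific", metadata: { specificity: 1 } },
]};

// Neither ad is over threshold: the second round has no ads, so no bid.
const bidWhenNoneOver = runWithFallback(demoGroup, new Set());
// Only the generic ad is over threshold: the second round falls back to it.
const bidWhenGenericOver = runWithFallback(
  demoGroup, new Set(["https://ads.example/generic"]));
```

In the first case only the specific URL is ever proposed, so under the quoted rules only its count would grow and the generic ad would never get a chance to cross the threshold.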

One strategy could be to always bid with the more generic URL first until it passes the k-anon threshold. But in this case I would need to track this per user, something that only the k-anon server could do.
What would happen if I want to handle more than 2 URLs, from more generic to more specific? As I have only 2 possible tries for generateBid(), I would need to choose two out of many.

Ideally we would like to have one of the following behaviors:

  1. generateBid() is called again with the original renderUrl removed from the interest group. It is called repeatedly until either generateBid() returns a renderUrl that passes the k-anon threshold or it returns no renderUrl. All renderUrls that were returned in this chain of calls get their k-anon count incremented.
  2. generateBid() returns an ordered list of renderUrls and the worklet chooses the first that passes k-anon while incrementing the counts of all other renderUrls that got rejected.
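
As a rough illustration of proposal 2, the selection step could look like the sketch below. This models a proposed behavior, not something the browser does today; `pickFromOrderedList`, the `overThresholdUrls` set, and the `kAnonCounts` map are hypothetical names used only for this example.

```javascript
// Toy model of proposal 2: walk the ordered list, return the first URL that
// passes k-anon, and count every URL that was rejected along the way.
function pickFromOrderedList(orderedRenderUrls, overThresholdUrls, kAnonCounts) {
  for (const url of orderedRenderUrls) {
    if (overThresholdUrls.has(url)) return url;
    kAnonCounts.set(url, (kAnonCounts.get(url) ?? 0) + 1);
  }
  return null; // nothing in the list passes k-anon
}

const kAnonCounts = new Map();
const pickedUrl = pickFromOrderedList(
  ["https://ads.example/a", "https://ads.example/b", "https://ads.example/c"],
  new Set(["https://ads.example/b"]),
  kAnonCounts,
);
```

Here only the URLs ranked ahead of the winner are counted; URLs ranked after it are never examined.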
@michaelkleber
Collaborator

Hi Fabian, you make an excellent point. I was indeed thinking of the "fallback ad" as one that was already being shown as part of a broader ad campaign, so that you knew a priori that it would be over the k-anon threshold. But your use case is interesting also.

Unfortunately, allowing a single interest group and single auction to contribute to the k-anonymity count for a large number of ads has poor privacy properties. An IG's contribution to the count for ad X really needs to be about ad X: it means either "I showed ad X to the user" or "I would have shown ad X to the user if I could have." It can't mean "I would have shown ad X to the user, if I could have and also could not have shown them any of ads Y, Z, W, Q, etc." Contributing for many different ads would also hurt the differential privacy guarantees that the k-anonymity server needs to make, which would in turn cause much more variability in how much noise the k-anon threshold is subject to.

Have you considered something simple like using the Key-Value server to serve the bit "Fallback ad is over-threshold"? Then while the fallback ad is under threshold, you could have everyone with that ad bid to show it, and it would get over threshold quickly. Once you observed (through post-auction reporting) that the fallback ad was being rendered, you could flip the bit in the K/V server, and switch to the intended behavior of bidding with the more-specific ad first, and with the fallback ad on the second round.
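
A minimal sketch of that suggestion, assuming the buyer publishes a key named `fallbackOverThreshold` on its Key-Value server and tags ads with a `metadata.role` field; both names are hypothetical and chosen only for this example.

```javascript
// While the K/V bit is unset, everyone bids with the fallback ad so that it
// crosses the threshold quickly. Once the bit is set, bid with the specific
// ad first; if the browser removed it as under-threshold in the second
// round, the fallback ad is what remains.
function generateBidWithKvBit(interestGroup, trustedBiddingSignals) {
  const ads = interestGroup.ads ?? [];
  const fallbackAd = ads.find((ad) => ad.metadata.role === "fallback");
  const specificAd = ads.find((ad) => ad.metadata.role === "specific");
  const fallbackReady = Boolean(trustedBiddingSignals?.fallbackOverThreshold);
  const chosen = fallbackReady ? (specificAd ?? fallbackAd) : fallbackAd;
  if (!chosen) return undefined; // no usable ad, so no bid
  return { bid: 1, render: chosen.renderUrl };
}

const kvDemoGroup = { ads: [
  { renderUrl: "https://ads.example/generic", metadata: { role: "fallback" } },
  { renderUrl: "https://ads.example/specific", metadata: { role: "specific" } },
]};
const bidBeforeFlip = generateBidWithKvBit(kvDemoGroup, { fallbackOverThreshold: false });
const bidAfterFlip = generateBidWithKvBit(kvDemoGroup, { fallbackOverThreshold: true });
// Second round after the flip: the specific ad was filtered out by the browser.
const bidSecondRound = generateBidWithKvBit(
  { ads: [kvDemoGroup.ads[0]] }, { fallbackOverThreshold: true });
```

The real `generateBid()` receives more arguments than shown here; the sketch keeps only the two that matter for the decision.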

@fhoering
Contributor Author

Thanks for the explanations @michaelkleber.

It could be an option to start by serving the generic ad until we see it in the reporting logs, and then always propose it as the fallback. The problem is the rolling window of 7 days: how do we know that the k-anonymity status hasn't expired yet? So it seems like the only reliable way to solve this would be to always send a subsample of traffic to the generic URL in order to stay over the threshold.
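
The subsampling idea could be sketched as follows. The `genericShare` parameter and the injectable `random` function are assumptions made for this example (including the assumption that a source of randomness is usable from the bidding logic); the right share would depend on traffic volume and the actual threshold.

```javascript
// Route a small, fixed share of bids to the generic URL so that its
// k-anonymity count keeps being refreshed within the rolling window.
function chooseRenderUrl(specificUrl, genericUrl, genericShare, random = Math.random) {
  return random() < genericShare ? genericUrl : specificUrl;
}

// Deterministic stand-ins for the random source, for illustration only.
const sampledGeneric = chooseRenderUrl(
  "https://ads.example/specific", "https://ads.example/generic", 0.05, () => 0.01);
const sampledSpecific = chooseRenderUrl(
  "https://ads.example/specific", "https://ads.example/generic", 0.05, () => 0.5);
```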

Do you think it could be an option to query the k-anonymity server for an ad directly from inside generateBid()? It would solve this problem without even having to execute a fallback auction, which could be good for performance.

Chrome already needs to call the k-anonymity server anyway to be able to execute the fallback auction and remove ads under the threshold. And as everything happens on device, nothing can leave the auction beyond the signals that are already exposed from generateBid() to event-level reporting (12-bit modeling signals, bid, ad, ...).
Potentially, with the current implementation, it could already be possible to detect that we are in the fallback auction by checking the URLs that are proposed, for example by always adding a fake URL that never gets chosen and then checking whether it is still there.
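
A minimal sketch of that detection trick, assuming a hypothetical sentinel URL that is registered in the interest group's ads but never bid on, so it never crosses the threshold and is always filtered out of the second call.

```javascript
// Hypothetical sentinel: present in the group's ads, never returned as a bid.
const SENTINEL_URL = "https://ads.example/never-shown";

// True when the sentinel is missing, i.e. the browser has already removed
// under-threshold ads and this is the fallback round.
function isFallbackRound(interestGroup) {
  return !(interestGroup.ads ?? []).some((ad) => ad.renderUrl === SENTINEL_URL);
}

const firstRoundGroup = { ads: [
  { renderUrl: "https://ads.example/generic" },
  { renderUrl: SENTINEL_URL },
]};
const secondRoundGroup = { ads: [{ renderUrl: "https://ads.example/generic" }] };
const detectedFirst = isFallbackRound(firstRoundGroup);
const detectedSecond = isFallbackRound(secondRoundGroup);
```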

@michaelkleber
Collaborator

Hmm. I feel like making the k-anonymity bit available inside generateBid() invites abuse: the risk is bidding logic that says "If I was going to bid with a k-anonymous ad, then I might as well first bid the same amount for some random non-k-anon ad that I have no intention of showing to this user, just to safely bump up its count, and then make my real bid in the second round."

@fhoering
Contributor Author

"If I was going to bid with a k-anonymous ad, then I might as well first bid the same amount for some random non-k-anon ad that I have no intention of showing to this user, just to safely bump up its count, and then make my real bid in the second round"

But if I can detect the 2nd round (by checking whether fake URLs that I have never shown before have been removed), I can do that anyway. If I have some more specific ads in this IG, then I would always like to try them first anyway, in case they go through, because they would perform better than a more generic ad.

@michaelkleber
Collaborator

There's nothing wrong with knowing it's the first or second round; the abuse threat is about bidding with an ad because you know there is no chance of it actually winning.

@fhoering
Contributor Author

fhoering commented May 29, 2023

OK. I agree that being able to increment the k-anon count of n (potentially 100s or 1000s of) ads for 1 user is not ideal. It would allow many more URLs to pass the k-anon threshold, whereas today I can only do this for 1 user and 1 opportunity at a time.

However, I don't see much of a difference between querying the k-anonymity server from the bidding worklet and then bidding directly with the right ad, and having a fallback auction plus server-side tracking of whether an ad has passed the k-anon threshold.
It would still limit the k-anon counter to 1 ad at a time for 1 opportunity and user. And this would be more efficient than server-side tracking of when an ad has actually been seen in order to stop proposing it.

I suggest adding this topic to the next FLEDGE call for discussion.

@fhoering
Contributor Author

Closing this. Superseded by #867 to be able to test different strategies.
