Consider allowing SharedStorage read access within Fledge bidding worklets #208

Closed
jkarlin opened this issue Jul 20, 2021 · 10 comments

Comments

@jkarlin

jkarlin commented Jul 20, 2021

I'm not well versed in FLEDGE (apologies), but it seems like it might be advantageous (see the request in WICG/shared-storage#7) to allow reading from Shared Storage within FLEDGE worklets. This would enable A/B experiments, frequency capping, blocking of ads the user does not want, and more, without having to explicitly support these mechanisms in FLEDGE.

@jkarlin
Author

jkarlin commented Jul 20, 2021

Ah, this does seem problematic. Effectively you'd be allowing cross-site data (via shared storage) to choose from an arbitrary number of ads. Which means that you could leak an arbitrary amount of cross-site data into a fenced frame, and ultimately to the landing page of the ad if the user clicks on it.

This is different from Shared Storage's current output gate which allows for selection from a small number of urls (e.g., 3-5).

@vincent-grosbois

Ah, this does seem problematic. Effectively you'd be allowing cross-site data (via shared storage) to choose from an arbitrary number of ads.

I'm not sure exactly what you mean by that.
In Fledge, the bidding function has to return an ad that is limited to the ads stored inside the interest group, defined at the time of JoinAdInterestGroup(), so there wouldn't be an "arbitrary number of ads" to choose from, unless you'd also want to change this Fledge limitation

I think the more general issue is that I'm not totally clear on what should be the compatibility between SharedStorage and Fledge, and what we would need exactly. Indeed this is quite a complex topic.

It seems from your issue title that you suggest that in Fledge bidding worklet, it should be possible to call sharedStorage.get(key) for instance. I think that from a privacy pov this is still OK, even though it'd probably need confirmation. But I'm not sure doing this would cover all advertising use cases if only this type of call was allowed.

@jkarlin
Author

jkarlin commented Jul 20, 2021

I'm not sure exactly what you mean by that.
In Fledge, the bidding function has to return an ad that is limited to the ads stored inside the interest group, defined at the time of JoinAdInterestGroup(), so there wouldn't be an "arbitrary number of ads" to choose from, unless you'd also want to change this Fledge limitation

You may be right. I guess it depends on if each interest group that an origin has is analyzed in isolation or if there is state shared between them.

It seems from your issue title that you suggest that in Fledge bidding worklet, it should be possible to call sharedStorage.get(key) for instance. I think that from a privacy pov this is still OK, even though it'd probably need confirmation. But I'm not sure doing this would cover all advertising use cases if only this type of call was allowed.

That is indeed what I'm proposing, but this is only okay if the number of total ads to select from is in the 3-5 range. What do you mean if only this type of call? Do you mean you'd also need to write to shared storage?

@vincent-grosbois

vincent-grosbois commented Jul 20, 2021

That is indeed what I'm proposing, but this is only okay if the number of total ads to select from is in the 3-5 range.

Why is that? From my understanding the number of ads to select from (for a given interest group) is not limited, and the only requirements are that the ad was already present in the interest group's 'ads' field, and that the ad passes the "micro targeting protection" threshold (to be displayed).

I don't understand why adding new SharedStorage capabilities in the worklets would put additional requirements on Fledge.
To me, as long as we can "prove" that calling sharedStorage.get(key) inside a worklet is OK from a privacy point of view, it shouldn't have impacts on the number of allowed ads.

What do you mean if only this type of call? Do you mean you'd also need to write to shared storage?

Not sure actually :)
What I meant is that, we'd need to carefully analyse if allowing sharedStorage.get(key) inside a Fledge worklet is enough to enable feature X. Just from the description it seems quite hard to be 100% sure. I think it's probably a topic we will work on in the coming weeks anyway

@jkarlin
Author

jkarlin commented Jul 20, 2021

Why is that? From my understanding the number of ads to select from (for a given interest group) is not limited, and the only requirements are that the ad was already present in the interest group's 'ads' field, and that the ad passes the "micro targeting protection" threshold (to be displayed).

If you can use shared storage to select from 1024 ads, then the selected ad represents up to log2(1024)=10 bits of information. And that's 10 bits of cross-site information (e.g., sites you've visited).
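
As a rough illustration of that arithmetic (a sketch only; `maxLeakedBits` is a name made up for this example and is not part of FLEDGE or Shared Storage), the upper bound on what a single selection can convey grows with the logarithm of the number of candidates:

```javascript
// Upper bound on the bits of information conveyed by picking one option
// out of `numChoices` equally distinguishable options.
// `maxLeakedBits` is a hypothetical helper, not part of any browser API.
function maxLeakedBits(numChoices) {
  if (!Number.isInteger(numChoices) || numChoices < 1) {
    throw new RangeError("numChoices must be a positive integer");
  }
  return Math.log2(numChoices);
}

console.log(maxLeakedBits(1024));           // 10
console.log(maxLeakedBits(5).toFixed(2));   // "2.32"
```

Under this bound, an output gate of 3-5 URLs stays below about 2.4 bits per selection, while 1024 ads allow up to 10.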

@vincent-grosbois

vincent-grosbois commented Jul 20, 2021

Why is that? From my understanding the number of ads to select from (for a given interest group) is not limited, and the only requirements are that the ad was already present in the interest group's 'ads' field, and that the ad passes the "micro targeting protection" threshold (to be displayed).

If you can use shared storage to select from 1024 ads, then the selected ad represents up to log2(1024)=10 bits of information. And that's 10 bits of cross-site information (e.g., sites you've visited).

I don't really get this.
My understanding of what you're suggesting is :

  • Fledge bidding worklet does whatever it wants, as per the Fledge spec
  • additionally, it can call sharedStorage.get(key) to retrieve any kind of value stored via SharedStorage key/value API (not limited to an ad url, not limited to 1024 possible values)
  • fledge bidding logic uses the result of shared storage get function in its code to return a Fledge ad url, as per the initial Fledge spec, where the only limitations on ads come from Fledge spec

In what I wrote above we are in regular fledge worklet, so there is no specific "shared storage" limitation related to ads.

It seems what you have in mind is not what I described above, but something that mixes both Fledge bidding worklet and SharedStorage runURLSelectionOperation worklets... in which case I don't understand your initial proposal

@appascoe
Collaborator

@jkarlin Is this only a concern in the case of a click in the fenced frame? I can't see how the information would leak otherwise.

If I can take a crack at explaining, @vincent-grosbois:

The user goes to a.example, which creates an interest group with 1024 ads (or whatever), with URLs:

https://a.example/ad/1
https://a.example/ad/2
...
https://a.example/ad/1024

The interest group owner has a relationship with 1024 other sites:

https://b1.example
https://b2.example
...
https://b1024.example

Each of these sites writes into shared storage window.sharedStorage.set("cross-site", 39); (or whatever number is assigned to one of these "b" domains). The bidding function reads this value from shared storage, and selects https://a.example/ad/39 to bid on and render. If the user clicks on this ad, a.example learns that this user (for whom they previously had a first-party cookie) has been to site b39.example.

These ads may have passed their k-anonymity checks, but that doesn't prevent a.example knowing some cross-site visits for this user specifically, and that's the issue. This is why the Shared Storage API limits the URL selection to five URLs; effectively it's a tradeoff on usability for A/B experimentation and data leakage.
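
The scenario above can be sketched end to end as follows. This is only a simulation under stated assumptions: `fakeSharedStorage` is a plain `Map` standing in for Shared Storage, and `visitSite`, `chooseAd`, and `inferVisitedSite` are hypothetical names, not FLEDGE or Shared Storage APIs.

```javascript
// Hypothetical stand-in for Shared Storage: a plain Map, NOT the real API.
const fakeSharedStorage = new Map();

// What a page on b<N>.example would do: record its assigned number.
function visitSite(siteNumber) {
  fakeSharedStorage.set("cross-site", siteNumber);
}

// Ads registered in the interest group when the user visited a.example.
const interestGroupAds = Array.from(
  { length: 1024 },
  (_, i) => `https://a.example/ad/${i + 1}`
);

// Hypothetical bidding logic with read access to the stand-in storage:
// the stored number directly selects which ad to render.
function chooseAd(ads, storage) {
  const siteNumber = storage.get("cross-site");
  return ads[siteNumber - 1];
}

// What a.example can infer on its landing page after a click.
function inferVisitedSite(adUrl) {
  const siteNumber = Number(adUrl.split("/").pop());
  return `https://b${siteNumber}.example`;
}

visitSite(39);
const renderedAd = chooseAd(interestGroupAds, fakeSharedStorage);
console.log(renderedAd);                   // https://a.example/ad/39
console.log(inferVisitedSite(renderedAd)); // https://b39.example
```

Every ad URL here could be shown to many users and still pass a k-anonymity check; the leak comes from which of the 1024 URLs was chosen for this particular user.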

@vincent-grosbois
Copy link

Thanks @appascoe, I get it now :) It does seem that allowing the use of SharedStorage would change the behavior of Fledge with respect to the kind of information the advertising website can learn.

@jkarlin
Author

jkarlin commented Jul 21, 2021

Yep. So, I'm going to close this for now assuming that we don't want to introduce this significant a change to FLEDGE.

Going back to WICG/shared-storage#7, which requested support for A/B testing in FLEDGE via Shared Storage, perhaps A/B testing could be done via the interest group itself? E.g., put a user in interest group 100 for the A case and 101 for the B case. Not sure if that would suffice for everyone, but that followup should likely happen in a new FLEDGE issue.
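
A minimal sketch of that idea, assuming the experiment arm is decided once on the advertiser's own site and encoded in the interest group name. The group names "100" and "101" come from the comment above; the owner URL, the 50/50 split, and the helper names are assumptions for illustration only.

```javascript
// Decide the experiment arm. `randomValue` is injectable so the sketch is
// testable; by default it falls back to Math.random().
function pickExperimentGroup(randomValue = Math.random()) {
  return randomValue < 0.5 ? "100" : "101"; // "100" = A case, "101" = B case
}

// Build an interest group description whose name encodes the arm.
// In a browser, an object like this would be handed to the FLEDGE join call;
// only the data shape is shown here, and `ads` is left empty.
function buildInterestGroup(owner, randomValue) {
  return { owner, name: pickExperimentGroup(randomValue), ads: [] };
}

console.log(buildInterestGroup("https://dsp.example", 0.1).name); // 100
console.log(buildInterestGroup("https://dsp.example", 0.9).name); // 101
```

Because the arm lives in the interest group itself, the bidding worklet would not need to read any cross-site state to tell A from B.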

@jkarlin jkarlin closed this as completed Jul 21, 2021
@jkarlin jkarlin changed the title Consider allowing SharedStorage read access within Fledge worklets Consider allowing SharedStorage read access within Fledge bidding worklets Aug 16, 2021
@jkarlin
Author

jkarlin commented Sep 6, 2022

Bringing this back from the dead to ponder on some more. I still think that we want to provide read/write access to Shared Storage in FLEDGE for the use cases mentioned above plus negative filtering #319.

So here is another proposal:

Create a filtering function somewhere in the DSP flow (e.g., post bid on each of the IGs) that allows the top ~8 bids to be filtered (e.g., dropped if they're no good) via a worklet that can read from SharedStorage before sending the urls off to scoreAd. This puts a cap of up to 3 bits of SharedStorage data to flow into the output URL which seems acceptable.

This seems useful for negative filtering, frequency capping, and lift measurement.
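
To make the proposed filtering step concrete, here is a rough sketch under stated assumptions: the storage is a plain `Map` rather than real Shared Storage, the bid objects use a made-up `{ renderUrl, bid }` shape, and `filterTopBids` and `underFrequencyCap` are hypothetical names. Frequency capping is used as the example filter.

```javascript
// "top ~8" from the proposal above.
const MAX_FILTERABLE_BIDS = 8;

// Keep only the highest bids, then let a storage-aware predicate drop
// the ones that should not proceed to scoring.
function filterTopBids(bids, storage, shouldKeep) {
  const top = [...bids]
    .sort((a, b) => b.bid - a.bid)
    .slice(0, MAX_FILTERABLE_BIDS);
  return top.filter((bid) => shouldKeep(bid, storage));
}

// Example predicate: drop an ad already shown 3 or more times.
// The "shown:<url>" key format is an assumption for this sketch.
const underFrequencyCap = (bid, store) =>
  (store.get(`shown:${bid.renderUrl}`) ?? 0) < 3;

// Stand-in storage: ad/2 has already been shown 3 times.
const impressionCounts = new Map([["shown:https://a.example/ad/2", 3]]);

// Ten bids, ad/1 bidding highest and ad/10 lowest.
const candidateBids = Array.from({ length: 10 }, (_, i) => ({
  renderUrl: `https://a.example/ad/${i + 1}`,
  bid: 10 - i,
}));

const survivors = filterTopBids(candidateBids, impressionCounts, underFrequencyCap);
console.log(survivors.length);                    // 7
console.log(survivors.map((b) => b.renderUrl)[1]); // https://a.example/ad/3
```

With at most 8 candidates reaching the filter, the eventual winning URL can reflect at most log2(8) = 3 bits of SharedStorage data, which matches the cap stated in the proposal.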

I'm not so sure if it'd work for general a/b very well, because the a and b may not show up in the top 8. That might require filtering at the IG level before bidding.

Thoughts?


3 participants