relaxing the same-origin policy to allow for subdomains #813

Open
rdgordon-index opened this issue Sep 19, 2023 · 39 comments
Comments

@rdgordon-index
Contributor

rdgordon-index commented Sep 19, 2023

As per https://github.com/WICG/turtledove/blob/main/FLEDGE.md#21-initiating-an-on-device-auction:

All fields that specify URLs for loading scripts or JSON (decisionLogicURL and trustedScoringSignalsURL) must be same-origin with seller...

And, as expected, you get the following error from the API if you violate this requirement:

TypeError: Failed to execute 'runAdAuction' on 'Navigator': decisionLogicURL 'https://subdomain.example.com/seller.js' for AuctionAdConfig with seller 'https://example.com' must match seller origin.
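
As a rough illustration of the rule behind this error, the comparison can be sketched with the standard URL API. This is only an approximation under stated assumptions, not Chrome's actual implementation: the helper name findCrossOriginFields and the '/signals' path are hypothetical, and only the two fields named in the quoted requirement are checked.

```javascript
// Sketch of the same-origin rule quoted above; not the browser's real code.
// Returns the names of URL fields whose origin differs from the seller's.
function findCrossOriginFields(auctionConfig) {
  const sellerOrigin = new URL(auctionConfig.seller).origin;
  const urlFields = ['decisionLogicURL', 'trustedScoringSignalsURL'];
  return urlFields.filter((field) => {
    const value = auctionConfig[field];
    // Fields that are absent are simply skipped.
    return value !== undefined && new URL(value).origin !== sellerOrigin;
  });
}

const config = {
  seller: 'https://example.com',
  decisionLogicURL: 'https://subdomain.example.com/seller.js',
  // Hypothetical path, used only for illustration.
  trustedScoringSignalsURL: 'https://example.com/signals',
};

// Only decisionLogicURL is reported, since its origin is a subdomain.
console.log(findCrossOriginFields(config));
```

An origin is the (scheme, host, port) triple, so a subdomain of the seller counts as a different origin even though it is the same site.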

As noted in #421, similar considerations apply to SSPs (aka sellers): the decisionLogicURL is CDN-friendly, effectively a static asset, while the trustedScoringSignalsURL needs to generate a dynamic response from the seller, and is not CDN-friendly.

https://github.com/privacysandbox/attestation/blob/main/how-to-enroll.md allows ad techs to register their TLD+1 domains -- is there a possibility that the same-origin requirement can be similarly relaxed? In other words, so that an auctionConfig could contain:

  'seller': 'https://www.example-ssp.com',
  'decisionLogicURL': 'https://cdn.example-ssp.com/.....',
  'trustedScoringSignalsURL': 'https://tss.example-ssp.com/....',
@pm-nitin-nimbalkar
Contributor

This is a legitimate issue. Most of the ad tech companies host different applications on different sub domains.

@MattMenke2
Contributor

The same-origin constraint is for security, rather than tracking prevention. That may sound scary, but security issues are solved by one origin giving the other origin permissions, so I think if we just enable CORS for these requests, we can probably do this for sellers (and probably bidders as well).

Someone better versed in web security than I should doubtless verify I'm correct about this.

@MattMenke2
Contributor

Actually, there's a potential issue I hadn't considered: These URLs come from the publisher page, not the seller, so we may need to use the publisher page as the requesting origin. This matters especially for the decisionLogicURL and the directFromSellerSignals logic; the request for the latter isn't even made by FLEDGE, but rather by the source page.

So CORS may not be a solution here.

@rdgordon-index
Contributor Author

These URLs come from the publisher page, not the seller

In practice, these come directly from the seller's component auction auctionConfig -- for example, in the PBJS integration, the publisher doesn't do anything other than indicate that slots are eligible for PA auctions -- no further configuration is necessary.

@MattMenke2
Contributor

Unfortunately, the security model of the web provides us no way to verify that the auctionConfig is authoritatively from the seller's origin. There's nothing to prevent a malicious publisher or 3rd party with script access to the frame from mucking with the auctionConfig.

@bmayd

bmayd commented Oct 25, 2023

Unfortunately, the security model of the web provides us no way to verify that the auctionConfig is authoritatively from the seller's origin.

Sounds like the type of problem that is addressed by employing cryptographic signatures.

@michaelkleber
Collaborator

We discussed this request in the 2023-10-18 meeting, notes here. There are some important web security model constraints that we must adhere to, but all the multiple-domain things you want are indeed possible — we just need to make sure that no domain is being fooled into its data going from/to places it didn't intend.

The work to do here falls under the heading of CORS (Cross-Origin Resource Sharing) and its relatives. We really need someone with appropriate expertise to figure out what sets of opt-in HTTP response headers we need from the fetch of the script and the fetch from the KV server, so that the script can safely run on values from the KV.

@k-o-ta
Contributor

k-o-ta commented Nov 8, 2023

As a DSP, I am also interested in relaxing this policy.
Please let me share my use cases and concerns to help improve the spec.

When using PrivAggrAPI with PA API, the report is sent to the origin of the Interest Group owner.
If PrivAggrAPI and ARA use the same Aggregation Service, the reporting origin of PrivAggrAPI and ARA must be the same.
If we use B&A service or K/V service, they must also have the same origin.

In summary, biddingLogicUrl, trustedBiddingSignalsUrl and several servers (B&A, K/V, reporting servers (PrivAggrAPI, ARA)) need to have the same origin.

Since I am using AWS, I would set up per-path routing in ApplicationLoadBalancer to implement this.
This could cause the following problems:

  • Using different terraforms for B&A, K/V, and our reporting servers, which may conflict in the ALB configuration.
  • When configuring ALB's listener rule, we may confuse similar paths such as .well-known/~.

By the way, since buyers call joinAdIG in their iframe, I don't think malicious third parties can rewrite InterestGroup information, whereas they can rewrite the seller's auctionConfig.
If this is true, the origin of biddingLogicUrl and trustedBiddingSignalsUrl can be different from the InterestGroup origin.

This may not be the suitable thread to discuss my topic. If you know a better place, please let me know.

@dmdabbs
Contributor

dmdabbs commented Nov 8, 2023

Two GH issues, numerous commenters describing their use cases. How do we elevate this to a feature request and have collaborators weight it?

@thegreatfatzby
Contributor

@michaelkleber have you guys had any luck finding an appropriate security expert? Is this something Edge could help out with maybe?

@thegreatfatzby
Contributor

Also, both Xandr and MSAN have issues similar to those above: the CDN is mixed with other APIs, and having to go through the same load balancer and add rewrite rules doesn't align well with how we currently do routing.

@dmdabbs
Contributor

dmdabbs commented Feb 5, 2024

Yes, this gives SysEng indigestion.

@thegreatfatzby
Contributor

We also have another issue related to this that made me bang my head when it was pointed out to me. We often will use sub-domains to cordon off "environments" for particularly important clients.

For instance, we have a client, call them Important Site (when we onboarded them, it was quite a thing, I promise), which we give a dedicated collection of nodes so that

  1. We release to them more carefully
  2. They have dedicated hardware so they don't get caught up in the antics of the hoi polloi
  3. We have custom configurations for them.

I haven't thought through this quite as thoroughly as other things, but decent first cut is that we'd want to be able to do something like this for their Interest Groups:

owner: igs.adnxs.com (basic, probably preferred to keep this constant)
trustedBiddingSignalsUrl: ImportantSite-kv.adnxs.com (dedicated resources/release-cycle/etc)
biddingLogicUrl: ImportantSite-cdn.adnxs.com (dedicated CDN bucket)

For the biddingLogicUrl we would take a hit on K-ness in theory, although in practice we generally wouldn't since their creatives are presumably distinct (this could help with any edge cases on that since I thiiiiink our reporting logic would be the same and we might not want it to be distinct for k-reasoning-reasons).

@thegreatfatzby
Contributor

@michaelkleber is there any chance an initial "relaxation" could be to allow a single redirect within the same TLD+1, only for the static files, i.e. bidding logic and score ads? In our setup we typically deploy static files to a CDN that isn't behind our load balancers, to keep those requests away from our main infrastructure. To be able to not have to go through our main LBs for static files would be particularly valuable.

@thegreatfatzby
Contributor

Another small relaxation could be to whitelist some set of CDNs who are known, for static file usage only.

@thegreatfatzby
Contributor

thegreatfatzby commented Feb 26, 2024

Another interesting wrinkle here is that, as currently planned, the trustedBiddingSignals/trustedDecisionSignals will need to go through a TEE based KV server, which at the moment can only be hosted in Google Cloud or AWS, not in on prem data center or in other public clouds, including Azure.

For an IG owner this means that, absent a relaxation to same origin in this case, once the BYOS KV feature goes away any Ad Tech who does not use GCS/AWS will need to have their TEE KV service be same origin as their updateUrl and bidding logic. In turn this means that either:

  1. Even for on-device the Ad Tech needs to move all their infrastructure to GCS/AWS.
  2. The Ad Tech has to have its load balancers proxy at least one high volume, non cacheable service to a DC it doesn't live in: either it lives in the same DC as the updateUrl and proxies KV requests to the cloud, or it lives in the cloud and proxies updateUrl requests to its "normal" DC. This will present quite a fun choice, as the KV server is likely higher load (so maybe I put the LB there), but updateUrl is likely more complex from an "application" perspective, integrating with client objects and various configurations, so connection issues going from a cloud LB to the updateUrl would be bad, and we'd likely end up with higher costs of the financial and non-financial kind.
    2a. Edit: I previously stated updateUrl is hot path; that is not quite right. It is currently a high volume service that we'd consider part of the "Real Time Platform" that gets requests from the open internet and that we put lots of constraints around, but it would be off the hot path today...however, we have discussed the updateUrl being callable for "immediate" update, which depending on how we do it would be hot path.

Relaxing same origin would let us route static files to a CDN, updateUrls to the on prem DC, and TEE KV calls to a cloud. (Although as I'm sure is clear but I'll state anyway, I don't mean this to endorse TEEs only being allowed in public clouds, just that given that constraint this issue would allow an operational improvement).

@thegreatfatzby
Contributor

thegreatfatzby commented Mar 14, 2024

Hey @JensenPaul realized I have some follow up questions from the discussion we had last week in the call RE the CORS based solution it sounds like we're heading towards.

  1. Can I assume this will still require same site, i.e. subdomains of example.com are OK but not a different domain? I suppose this isn't technically impossible if both sites are attested, but from a privacy model perspective I'd think that's disallowed.
  2. Will the coalescing of Trusted Bidding Signal requests to the trustedBiddingSignalsUrl still be based on the owner alone? I.e., we won't now have the coalescing happen per (owner, bidding url, update url) set of subdomains?
  3. Will redirects still not be supported?

@MattMenke2
Contributor

I'm not Paul, but I'll give it a shot.

  1. Can I assume this will still require same site, i.e. subdomains of example.com are OK but not a different domain? I suppose this isn't technically impossible if both sites are attested, but from a privacy model perspective I'd think that's disallowed.

I'm not sure about this one.

  2. Will the coalescing of Trusted Bidding Signal requests to the trustedBiddingSignalsUrl still be based on the owner alone? I.e., we won't now have the coalescing happen per (owner, bidding url, update url) set of subdomains?

I don't think we can do this, for a number of reasons. We will use CORS to fetch these, which only has a notion of one requesting origin, which will be the bidder's origin.

Also, merging interest group names and keys from two different origins into a single request seems pretty concerning to me. The TEE trusted server model may give us a way to avoid the latter (There are multiple discrete blobs grouped by joining origin already, I believe, though that's not my area. Could imagine adding separate ones for different IG owners as well, though). Of course, that removes some of the advantages from coalescing in the first place.

  3. Will redirects still not be supported?

We don't support redirects for a number of reasons - questions over network partitioning and cross origin tracking being one of them. I think we should probably figure out what we want to do about network partitioning (both with the current BYOS model and the future TEE model) before we can decide if we want to handle cross-origin redirects. I think we're probably completely safe just allowing same-origin redirects without any thinking, but there's currently no secure API in Chrome to allow one but not the other (we don't trust the process we run Javascript in).

@thegreatfatzby
Contributor

Hey @MattMenke2 thanks for the quick response, for (2) I think I may have miscommunicated. I'm not at all asking for KV Call Coalescing across origins of owners, I definitely see why that can't happen. What I'm asking is if we now had something like this:

IG_One= {owner: www.example.com, trustedUrl: www.example.com/kv, biddingUrl: cdn.example.com, updateUrl: updates.example.com/update}

IG_Two= {owner: www.example.com, trustedUrl: www.example.com/kv, biddingUrl: cdn2.example.com, updateUrl: updates2.example.com/update}

Would those two still be coalesced into the same call to www.example.com/kv, or would the different subdomains for biddingUrl and/or updateUrl cause them to be in separate KV calls?

@MattMenke2
Contributor

Ah, so those actually aren't coalesced for another reason - our request coalescing code is currently relatively simple, and only scoped to a single "worklet" object (which is currently distinguished basically by the owner, all URLs, and all parameters that go into the bidding signals fetch, except key and IG name). It's not actually a policy decision. You could imagine IGs with the same trusted signals URL having different script URLs, and IGs with the same script URL having different signals URL. Having a general request manager that handles all those cases gets rather complicated.

That having been said, we are significantly reworking how signals fetching works as we work on support for TEE trusted signals servers. It's possible in the process we'll end up adding this capability, though it's far from a sure thing.

@thegreatfatzby
Contributor

OK...so to iterate a touch more, in this new extended setup:

IG_Example_cdn_One= {name: IGOne, owner: www.example.com, trustedUrl: www.example.com/kv, biddingUrl: cdn.example.com, updateUrl: updates.example.com/update}

IG_Example_cdn_two= {name: IGTwo, owner: www.example.com, trustedUrl: www.example.com/kv, biddingUrl: cdn.example.com, updateUrl: updates.example.com/update}

IG_Example_cdn2_three= {name: IGThree, owner: www.example.com, trustedUrl: www.example.com/kv, biddingUrl: cdn2.example.com, updateUrl: updates.example.com/update}

IG_Example_cdn2_four= {name: IGFour, owner: www.example.com, trustedUrl: www.example.com/kv, biddingUrl: cdn2.example.com, updateUrl: updates.example.com/update}

In this case there would be two calls to the KV endpoint, one with IGOne and IGTwo, one with IGThree and IGFour?

@MattMenke2
Contributor

Yes, with the current code, there would be two calls to the KV server.
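
To make the two-call outcome concrete, here is a minimal sketch of the grouping, assuming a simplified key of (owner, bidding script URL, trusted signals URL). The function name groupForSignalsFetch and the key shape are illustrative assumptions, not the browser's actual data structures; the four interest groups are the ones from the question above.

```javascript
// Sketch only: groups interest groups by a simplified, assumed key.
// IGs that share a key could share one trusted signals fetch.
function groupForSignalsFetch(interestGroups) {
  const groups = new Map();
  for (const ig of interestGroups) {
    // updateUrl and name are deliberately left out of the key.
    const key = JSON.stringify([ig.owner, ig.biddingUrl, ig.trustedUrl]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(ig.name);
  }
  return [...groups.values()];
}

const owner = 'www.example.com';
const trustedUrl = 'www.example.com/kv';
const updateUrl = 'updates.example.com/update';
const igs = [
  { name: 'IGOne', owner, trustedUrl, biddingUrl: 'cdn.example.com', updateUrl },
  { name: 'IGTwo', owner, trustedUrl, biddingUrl: 'cdn.example.com', updateUrl },
  { name: 'IGThree', owner, trustedUrl, biddingUrl: 'cdn2.example.com', updateUrl },
  { name: 'IGFour', owner, trustedUrl, biddingUrl: 'cdn2.example.com', updateUrl },
];

// Two groups: IGOne with IGTwo, and IGThree with IGFour.
console.log(groupForSignalsFetch(igs));
```

Because the bidding script URL is part of the key, two interest groups with different script URLs land in different groups even when they share a KV endpoint.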

Given the question, I assume it's safe to say that's a case you care about? We've been operating under the assumption that multiple bidding scripts would be uncommon.

@thegreatfatzby
Contributor

Ohhhh interesting...here I was asking about domains, but even if the script is same origin but different path we will have the separate calls?

The domains thing was more of a confirmation...the path part is more impactful as it means that deployment of new scripts can impact how those calls are made, so if we try to progressively roll out new bidding logic it would impact how those calls are made. I think in general that would be a net negative, and it would be better to control that off of owner and KV endpoint, decoupling it from at least the path of the bidding script.

Actually, thinking about it more, will separate updateUrls also result in uncoupling the KV calls? Hoping the answer on this one is no as this one is kind of designed to be different per IG...right?

@MattMenke2
Contributor

No, not updateUrls.

We group two InterestGroups if we can reuse the Javascript, the WASM (if any), and the trusted bidding signals fetch between them. Everything that doesn't prevent that can vary. The exact same logic also allows sharing between auctions that run at once, though they also have to be from the same frame.

Looking at the code, our key for both buyers and sellers consists of:

WorkletType type;
GURL script_url;
std::optional<GURL> wasm_url;
std::optional<GURL> signals_url;

// `needs_cors_for_additional_bid` is set for buyer reporting for additional
// bids; those need to perform a CORS check others don't.
bool needs_cors_for_additional_bid;

std::optional<uint16_t> experiment_group_id;
std::string trusted_bidding_signals_slot_size_param;

@JensenPaul
Collaborator

I’ve been thinking about a path forward here and consulting with some security experts. I think the major concerns we want to make sure we’re addressing as we extend this ability to fetch Protected Audience resources from different origins fall mainly into two categories:

  1. Making sure the origin being fetched from allows their response to be shared with the origin that will receive it.
  2. Making sure the origin receiving the fetched response understands which origin it originates from, thus giving them the ability to only allow responses from certain origins.

I think we can support this extension given a few additional restrictions:

  • Use CORS for fetching trusted signals, specifying an Origin header with the origin of the bidding/scoring script that will receive the signals. This helps to address concern # 1.
  • When the trusted scoring signals origin doesn't match the scoring script origin, require the scoring script fetch to return a response header indicating allowed trusted scoring signals origins. This helps to address concern # 2.
  • When trusted signals origin doesn't match bidding/scoring script origin, change how the signals are presented to the script so they are not confused for a same-origin response, for example add a new parameter to generateBid()/scoreAd() that is a map from the origin returning the data to the data returned. This helps to address concern # 2.
  • To avoid unnecessary CORS preflight checks we should consider whether we should safelist any existing headers.
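
The restrictions above could be combined roughly as follows. This is a sketch under stated assumptions, not the shipped design: the function name presentTrustedSignals, its parameter names, and the returned shape are all hypothetical, and the list of allowed signals origins stands in for whatever response header the script fetch ends up using (see the explainer changes for the real names).

```javascript
// Sketch only: names and return shape are illustrative assumptions.
// Decides how fetched trusted signals would be exposed to the script.
function presentTrustedSignals({ scriptOrigin, signalsOrigin, acaoHeader, allowedSignalsOrigins, data }) {
  if (signalsOrigin === scriptOrigin) {
    // Same-origin signals keep their existing presentation.
    return { sameOrigin: data, crossOrigin: null };
  }
  // Concern 1: the signals server must opt in to sharing its response
  // with the script's origin (Access-Control-Allow-Origin).
  const corsOk = acaoHeader === '*' || acaoHeader === scriptOrigin;
  // Concern 2: the script's response must list the signals origin as allowed.
  const scriptAllows = allowedSignalsOrigins.includes(signalsOrigin);
  if (!corsOk || !scriptAllows) {
    return { sameOrigin: null, crossOrigin: null };
  }
  // Cross-origin data is keyed by the origin it came from, so the script
  // cannot mistake it for a same-origin response.
  return { sameOrigin: null, crossOrigin: { [signalsOrigin]: data } };
}

console.log(presentTrustedSignals({
  scriptOrigin: 'https://cdn.example-ssp.com',
  signalsOrigin: 'https://tss.example-ssp.com',
  acaoHeader: 'https://cdn.example-ssp.com',
  allowedSignalsOrigins: ['https://tss.example-ssp.com'],
  data: { renderURL: {} },
}));
```

Keeping the cross-origin data in a separate, origin-keyed slot means a script that was written before this change never sees cross-origin values in the place it expects same-origin ones.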

(This was broadly described in our 3/6/2024 meeting)

@JensenPaul
Collaborator

Can I assume this will still require same site, i.e. subdomains of example.com are OK but not a different domain? I suppose this isn't technically impossible if both sites are attested, but from a privacy model perspective I'd think that's disallowed.

I don't think CORS treats cross-site or cross-domain any differently than it treats cross-origin, so I'm not sure the requirements I proposed make any same-site or same-domain requirements.

@rdgordon-index
Contributor Author

Use CORS for fetching trusted signals, specifying an Origin header with the origin of the bidding/scoring script that will receive the signals. This helps to address concern # 1.

Makes sense, no reason that the TSS/TBS origin can't return an ACAO response header to grant cross-origin access.

When the trusted scoring signals origin doesn't match the scoring script origin, require the scoring script fetch to return a response header indicating allowed trusted scoring signals origins. This helps to address concern # 2.

Ditto -- PAAPI already has a number of such headers in place today.

When trusted signals origin doesn't match bidding/scoring script origin, change how the signals are presented to the script so they are not confused for a same-origin response, for example add a new parameter to generateBid()/scoreAd() that is a map from the origin returning the data to the data returned. This helps to address concern # 2.

To clarify, this is basically the Origin/ACAO request/response pair?

To avoid unnecessary CORS preflight checks we should consider whether we should safelist any existing headers.

Aligned.

Seems like a viable approach -- any ETA on when this would be available?

@rdgordon-index
Contributor Author

any ETA on when this would be available?

#1156

@dmdabbs
Contributor

dmdabbs commented Apr 30, 2024

FWIW, implementation issue:
https://issues.chromium.org/issues/332913415

@thegreatfatzby
Contributor

It looks like all of the PRs attached to the Chromium ticket are merged :) Maybe someone who knows can comment, even w/o exact dates, on what that would mean for next steps on the deployment side?

@MattMenke2
Contributor

MattMenke2 commented May 8, 2024

To enable this on canary, the incantation is --enable-features=FledgePermitCrossOriginTrustedSignals (The string I pasted initially in the meeting was wrong). It allows cross-origin trusted bidding and scoring signals. See explainer changes at #1156, for explanation of additional requirements for cross-origin signals to work.

The string to use with navigator.protectedAudience.queryFeatureSupport() is "permitCrossOriginTrustedSignals".
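
A feature check along these lines might look like the sketch below. The helper name supportsCrossOriginTrustedSignals is hypothetical, and the sketch assumes that queryFeatureSupport() returns true for a supported feature; it takes the navigator object as a parameter so it can be exercised outside a browser.

```javascript
// Sketch only: assumes queryFeatureSupport() returns true when supported.
function supportsCrossOriginTrustedSignals(nav) {
  const pa = nav?.protectedAudience;
  if (!pa || typeof pa.queryFeatureSupport !== 'function') {
    // API surface missing: treat as unsupported.
    return false;
  }
  return pa.queryFeatureSupport('permitCrossOriginTrustedSignals') === true;
}

// Stand-in object for illustration; in a page, pass the real `navigator`.
const fakeNavigator = {
  protectedAudience: {
    queryFeatureSupport: (name) => name === 'permitCrossOriginTrustedSignals',
  },
};
console.log(supportsCrossOriginTrustedSignals(fakeNavigator));
```

Callers can use the result to decide whether to put cross-origin trusted signals URLs into their configuration or fall back to same-origin ones.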

@ajvelasquez-privacy-sandbox
Collaborator

Hi everyone, we plan to start testing this very soon in canary/dev. One thing we wanted to update folks on: we are requiring the domains of the trusted signals origins to be checked for, and to pass, enrollment and attestation as per https://github.com/privacysandbox/attestation.

@JensenPaul
Collaborator

This new behavior should be available for testing in 50% of Chrome Canary and Dev channel traffic today. Feature detectable like so.

@morlovich
Collaborator

(Not quite dev yet since there is no version of dev with the attestation check released yet, but if the next dev release is cut in the ordinary way and not something-special-is-happening way it ought to have it).

@omriariav
Contributor

Do we have an estimate for moving this to beta?
thanks

@morlovich
Collaborator

Just landed the config for going to 50% beta. As usual, it may take some time to actually climb up to that.

@morlovich
Collaborator

Going to 1% in stable. As usual, it may take some time to actually climb up to that.

@morlovich
Collaborator

Config for stable launch landed. It will take some time for it to actually apply...
