cache partitioning impacts perf optimizations that FPS might help recover #35

Open
erik-anderson opened this issue Feb 17, 2021 · 14 comments

@erik-anderson

After Edge and Chrome started to deploy HTTP cache partitioning, the SharePoint team contacted the Edge team about a perf optimization of theirs that cache partitioning broke.

The scenario is roughly this:

Imagine a folder viewer for a cloud storage drive. It contains Word, Excel, and PowerPoint documents which can be previewed directly from the same page. To do that, they load the viewer for those apps in an iframe.

To speed up the viewer load, they kick off a download of a JS file that they know the iframe will turn around and load (basically think of it as a prefetch). The time between kicking off that download and loading up the iframe is very small.

The domains involved here would be a top-level one like https://microsoft.sharepoint.com and an iframe one like https://excel.officeapps.live.com. The resource itself would be https://some.cdn.office.net/path/to/script.js.

Since the cache is now double-keyed, the top-level page fetching the script provides no benefit for the iframe, which has a different partitioning key.

How FPS might help:

If they could use First-Party Sets to declare that the SharePoint and Excel origins are part of the same FPS and share the same partitioning key, they would recover this optimization.
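As a rough sketch of that idea (not an existing browser mechanism), a set could map each member site to the set's owner before the partition key is built. The names `FirstPartySet`, `partitionSite`, and `fpsCacheKey` are hypothetical, as is the example set, and the site strings are assumptions chosen for illustration.

```typescript
// Hypothetical: replace each site with its First-Party Set owner
// before building a (top-level site, frame site, URL) cache key.
interface FirstPartySet {
  owner: string;
  members: string[];
}

function partitionSite(site: string, sets: FirstPartySet[]): string {
  for (const candidate of sets) {
    if (candidate.owner === site || candidate.members.indexOf(site) !== -1) {
      return candidate.owner;
    }
  }
  return site; // not in any set: the site keeps its own partition
}

function fpsCacheKey(
  topLevelSite: string,
  frameSite: string,
  url: string,
  sets: FirstPartySet[]
): string {
  return [partitionSite(topLevelSite, sets), partitionSite(frameSite, sets), url].join(" | ");
}

// Assumed set for illustration only.
const officeSets: FirstPartySet[] = [
  { owner: "https://sharepoint.com", members: ["https://live.com"] },
];
const prefetchedScript = "https://some.cdn.office.net/path/to/script.js";

// Key used when the top-level page prefetches the script.
const keyFromPage = fpsCacheKey("https://sharepoint.com", "https://sharepoint.com", prefetchedScript, officeSets);
// Key used when the embedded viewer iframe loads the same script.
const keyFromViewer = fpsCacheKey("https://sharepoint.com", "https://live.com", prefetchedScript, officeSets);
// Key used by a frame from a site outside the set.
const keyFromOutsider = fpsCacheKey("https://sharepoint.com", "https://other.example", prefetchedScript, officeSets);
```

Under these assumptions `keyFromPage` and `keyFromViewer` are identical, so the prefetch would be reusable, while a frame from outside the set still gets a separate partition.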

There might be an interesting challenge where the registrable domain restriction might be too specific. All of those are owned by Microsoft, but perhaps an individual owner like "the Office team" would want to configure it for sharepoint.com+officeapps.live.com while some other part of Microsoft might want to have the rest of live.com in some other FPS. Perhaps the fix there is to have them change the domains their endpoints are on so there's clearer sub-org ownership. Or maybe there's some model where subdomains of registrable domains can be part of a set that we should explore.

Resource Hints are one potential way for them to recover some of their optimization, but they still wouldn't recover the additional round trip.

I'm opening this issue in the hope of discussing further how FPS might help with this scenario.

@othermaciej

I don't think First Party Sets is necessary for this. The behavior described sounds like triple-keying rather than double-keying; otherwise, the page and a frame it embeds would both be getting resources from the same cache partition.

@erik-anderson
Author

erik-anderson commented Feb 24, 2021

The way Chromium has implemented cache partitioning is described in https://developers.google.com/web/updates/2020/10/http-cache-partitioning.

It's keying based on top-level origin + iframe origin.

The relevant example from that article:

Cache Key: { https://a.example, https://a.example, https://x.example/doge.png }
Now the user comes back to https://a.example but this time the image (https://x.example/doge.png) is embedded in an iframe. In this case, the key is a tuple containing https://a.example, https://a.example, and https://x.example/doge.png and a cache hit occurs. (Note that when the top-level site and the iframe are the same site, the resource cached with the top-level frame can be used.)

Cache Key: { https://a.example, https://c.example, https://x.example/doge.png }
The user is back at https://a.example but this time the image is hosted in an iframe from https://c.example.

In this case, the image is downloaded from the network because there is no resource in the cache that matches the key consisting of https://a.example, https://c.example, and https://x.example/doge.png.
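The article's examples can be replayed against a small in-memory model. This is only a sketch and not Chromium's implementation: `CacheKey`, `keyString`, and `PartitionedCache` are hypothetical names, and the sites are passed in as plain strings rather than derived from URLs.

```typescript
// Hypothetical model of a cache keyed on (top-level site, frame site, URL).
interface CacheKey {
  topLevelSite: string;
  frameSite: string;
  resourceUrl: string;
}

function keyString(key: CacheKey): string {
  return [key.topLevelSite, key.frameSite, key.resourceUrl].join(" | ");
}

class PartitionedCache {
  private entries: Record<string, string> = {};

  // Returns true on a cache hit. On a miss, stores the body and returns false.
  load(key: CacheKey, bodyFromNetwork: string): boolean {
    const k = keyString(key);
    if (Object.prototype.hasOwnProperty.call(this.entries, k)) {
      return true;
    }
    this.entries[k] = bodyFromNetwork;
    return false;
  }
}

const demoCache = new PartitionedCache();
const dogeUrl = "https://x.example/doge.png";

// Top-level page on a.example loads the image: nothing cached yet.
const firstLoadHit = demoCache.load(
  { topLevelSite: "https://a.example", frameSite: "https://a.example", resourceUrl: dogeUrl },
  "png-bytes"
);

// Same-site iframe on a.example: same key as the top-level load.
const sameSiteFrameHit = demoCache.load(
  { topLevelSite: "https://a.example", frameSite: "https://a.example", resourceUrl: dogeUrl },
  "png-bytes"
);

// Iframe from c.example: the frame site differs, so the key differs.
const crossSiteFrameHit = demoCache.load(
  { topLevelSite: "https://a.example", frameSite: "https://c.example", resourceUrl: dogeUrl },
  "png-bytes"
);
```

In this model the same-site iframe reuses the entry stored by the top-level load, while the c.example iframe misses and goes to the network, matching the two cases quoted above.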

I realize that Safari uses only top-level eTLD+1. Perhaps there's a mismatch in what folks mean when they say double-keying.

I agree that, with Safari's current approach, this perf issue would not exist.
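To make the difference concrete, here is a minimal sketch contrasting the two keying schemes for the SharePoint scenario. The function names are hypothetical, and the site strings are assumptions for illustration (no public suffix lookup is performed).

```typescript
// Hypothetical key functions for the two schemes discussed in this thread.
// Keying on the top-level site only: the frame site is ignored.
function topLevelOnlyKey(topLevelSite: string, _frameSite: string, url: string): string {
  return topLevelSite + " | " + url;
}

// Keying on the top-level site plus the frame site.
function topLevelPlusFrameKey(topLevelSite: string, frameSite: string, url: string): string {
  return topLevelSite + " | " + frameSite + " | " + url;
}

const viewerScript = "https://some.cdn.office.net/path/to/script.js";
const folderPageSite = "https://sharepoint.com"; // assumed site of the top-level page
const viewerFrameSite = "https://live.com"; // assumed site of the viewer iframe

// Does the prefetch issued by the top-level page land in the same
// partition as the later load from inside the iframe?
const sharedWithTopLevelOnly =
  topLevelOnlyKey(folderPageSite, folderPageSite, viewerScript) ===
  topLevelOnlyKey(folderPageSite, viewerFrameSite, viewerScript);

const sharedWithFrameKeying =
  topLevelPlusFrameKey(folderPageSite, folderPageSite, viewerScript) ===
  topLevelPlusFrameKey(folderPageSite, viewerFrameSite, viewerScript);
```

With top-level-only keying the two requests share a partition, so the prefetch pays off. With the additional frame key they do not.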

@annevk

annevk commented Mar 10, 2021

This does not seem advisable, as partitioning is a security boundary and helps to defeat attacks outlined at https://xsleaks.dev/. (I still don't think FPS should exist, but thought I should note this since it was agenda+'d.)

@krgovind
Collaborator

@annevk Apologies for being a bit pedantic, but I'm trying to tease apart any nuances that I may be missing: The fact that the partition key uses top-frame and current-frame "site" and not "origin" makes me think that FPS could be a reasonable replacement? (Since the main premise of FPS is that "site" currently relies on "registrable domain", which is an outdated definition based on the DNS.)

Isn't "origin" the agreed upon "security boundary"?

@annevk

annevk commented Mar 11, 2021

I think the idea that FPS could replace the current meaning of site is misguided. Site as it is defined today is very much a security boundary in its own right (it's what we key agent clusters on, see HTML) and serves as a process boundary in Chrome and soon Firefox. Concretely, you would not want an XSS in youtube.com to be able to (side channel) read accounts.google.com.

@davidben
Contributor

FPS should not blanket replace every use of "site" in the platform. @annevk is right that sites are security boundaries elsewhere in the platform. E.g. the process allocation business is a consequence of how document.domain behaves. FPS should leave that alone.

@bslassey

I think the better way to think of FPS is as retaining existing properties of 3rd parties (e.g. cross-domain cookie access and shared caching of resources) within a set of domains as we reduce the capabilities of 3rd parties in general (e.g. blocking third-party cookies and partitioning the various caches by first party).

@erik-anderson
Author

erik-anderson commented Mar 11, 2021

The discussion this issue was intended to focus on is specifically whether FPS should impact cache partitioning.

It wasn't intended to suggest modifying any existing security boundaries (or semi-boundaries) that existed before browsers started shipping keyed cache partitions.

For the specific customer that prompted this issue, it's specific to double keyed cache partitioning (top-level origin + frame origin), though it's worth discussing the more general top-level-origin keying case as well given the broader "common resources living on a CDN" use case.

@annevk

annevk commented Mar 11, 2021

Right, I'm saying that cache partitioning is a security boundary.

@erik-anderson
Author

We discussed this issue during today's Privacy CG call.

Information leaks, accidental or otherwise, are a primary concern driving partitioning. Much of it is around privacy, but some of it is security as well.

As the discussion today covered, there are many other APIs with similar concerns beyond cache that aren't effectively isolated within iframes, so it's not a current priority for Mozilla to add the additional level of keying that's in Chromium, though it's likely to be desirable.

The primary thing I would like to understand better w.r.t. this specific issue is how strongly the concerns apply when both the top-level window and the embedded frame are explicitly a part of the same FPS. The two sites could presumably choose to explicitly pass context across via whatever cross-site messaging flows get unblocked by FPS (e.g. cookies with a specific attribute or something else) in addition to using existing postMessage flows available to embedding scenarios. Is the outstanding concern, then, that having it leverage a shared cache key can have subtle implications via the side-channel aspect that site developers are unlikely to sufficiently understand?

If FPS (or, perhaps something outside of FPS, e.g. some CORS-like solution) offered a more explicit pattern where the sites could proactively agree to some URL patterns to share cache between them (with the possible scoping of it to "share by using the top-level site's key" rather than "fully share across the two top-level sites"), would that significantly address the concerns?

To map that high-level thought to my original example, if the FPS definition for the sites could include something that says "we desire that all URLs under https://some.cdn.office.net (which may or may not be part of the FPS) get shared", then when a URL is fetched within the subframe context of https://excel.officeapps.live.com, the browser might choose to alter the cache partitioning key from https://microsoft.sharepoint.com+https://excel.officeapps.live.com+https://some.cdn.office.net/resource.js to https://microsoft.sharepoint.com+https://some.cdn.office.net/resource.js (when the top-level URL is on https://microsoft.sharepoint.com).
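A minimal sketch of that key alteration, assuming a hypothetical `SharingPolicy` that lists the opted-in origins and the URL prefixes they agree to share. None of these names or fields come from an actual FPS proposal.

```typescript
// Hypothetical policy: origins that opted in, and URL prefixes they share.
interface SharingPolicy {
  memberOrigins: string[];
  sharedUrlPrefixes: string[];
}

function hasPrefix(value: string, prefix: string): boolean {
  return value.slice(0, prefix.length) === prefix;
}

// Drops the frame origin from the key only when both origins are members
// and the resource URL falls under a shared prefix.
function cacheKeyWithPolicy(
  topLevelOrigin: string,
  frameOrigin: string,
  url: string,
  policy: SharingPolicy
): string {
  const bothMembers =
    policy.memberOrigins.indexOf(topLevelOrigin) !== -1 &&
    policy.memberOrigins.indexOf(frameOrigin) !== -1;
  let urlIsShared = false;
  for (const prefix of policy.sharedUrlPrefixes) {
    if (hasPrefix(url, prefix)) {
      urlIsShared = true;
    }
  }
  if (bothMembers && urlIsShared) {
    return topLevelOrigin + "+" + url; // scoped to the top-level origin's key
  }
  return topLevelOrigin + "+" + frameOrigin + "+" + url;
}

const officePolicy: SharingPolicy = {
  memberOrigins: ["https://microsoft.sharepoint.com", "https://excel.officeapps.live.com"],
  // The trailing slash keeps look-alike hosts such as
  // https://some.cdn.office.net.evil.example from matching.
  sharedUrlPrefixes: ["https://some.cdn.office.net/"],
};

const sharedResourceKey = cacheKeyWithPolicy(
  "https://microsoft.sharepoint.com",
  "https://excel.officeapps.live.com",
  "https://some.cdn.office.net/resource.js",
  officePolicy
);

const unsharedResourceKey = cacheKeyWithPolicy(
  "https://microsoft.sharepoint.com",
  "https://excel.officeapps.live.com",
  "https://elsewhere.example/resource.js",
  officePolicy
);
```

Only resources under the listed prefix lose the frame component of the key. Everything else stays partitioned as it is today.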

I'm not sure the complexity of such an approach would actually be warranted, but I would like to get some clarity on what folks consider to be a reasonable solution space in an environment where we're worried about side channel data leaks between various frames loaded under the context of a single site.

@krgovind
Collaborator

Answering @erik-anderson's question about what could be a reasonable solution space (not speaking to the solution proposed above): My understanding is that an explicit pattern would indeed alleviate the security concerns around allowing same-party, cross-domain sharing of the cache.

@chrisn

chrisn commented Mar 12, 2021

We would be interested in potential solutions for the "common resources living on a CDN" use case.

@domenic

domenic commented Mar 12, 2021

I hope it is OK to discuss non-FPS-related solutions in this thread. But the OP's situation sounds like exactly the sort of thing we're trying to accommodate over in https://github.com/jeremyroman/alternate-loading-modes : providing a privacy- and partition-preserving mechanism for prefetching (and prerendering) content.

We've spent most of our time thinking about whole documents and their subresources, which might not make as much sense in the subresource-focused situation described here. But I suspect many of the mechanisms and underlying spec could be used even for subresources.

Some more details

To give a high-level flavor of our current thinking for whole documents: the prerender or prefetch would be done without any credentials, and would put all its resources into a separate "speculative" HTTP cache partition, e.g. with key { https://excel.officeapps.live.com, https://some.cdn.office.net/, speculative = true }.

Upon activation, i.e. upon transitioning of the prerendered page on https://excel.officeapps.live.com to being a user-visible top-level browsing context, the { https://excel.officeapps.live.com, https://some.cdn.office.net/, speculative = true } partition would get "merged" into the usual { https://excel.officeapps.live.com, https://some.cdn.office.net/, speculative = false } partition. "Merged" here is under active discussion, and might look more like a fallback or memory cache or something, but the basic idea is to allow use of the speculative resources upon activation, and throw them out otherwise.
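One way to picture this is a toy model in which "merge" is treated as a simple move of entries, which is only one of the options mentioned and not a settled design. `SpeculativeCache` and its method names are hypothetical.

```typescript
// Hypothetical model: speculative entries are invisible to regular lookups
// until the partition is activated, and are dropped if it is discarded.
class SpeculativeCache {
  private regular: Record<string, string> = {};
  private speculative: Record<string, string> = {};

  // Store a prefetched resource in the speculative partition only.
  prefetch(partition: string, url: string, body: string): void {
    this.speculative[partition + " | " + url] = body;
  }

  // Regular lookups never see speculative entries.
  lookup(partition: string, url: string): string | undefined {
    return this.regular[partition + " | " + url];
  }

  // On activation, move the partition's speculative entries to the regular cache.
  activate(partition: string): void {
    this.drain(partition, true);
  }

  // Without activation, throw the speculative entries away.
  discard(partition: string): void {
    this.drain(partition, false);
  }

  private drain(partition: string, keep: boolean): void {
    const prefix = partition + " | ";
    for (const k of Object.keys(this.speculative)) {
      if (k.slice(0, prefix.length) !== prefix) {
        continue;
      }
      const body = this.speculative[k];
      if (keep && body !== undefined) {
        this.regular[k] = body;
      }
      delete this.speculative[k];
    }
  }
}

const specCache = new SpeculativeCache();
const viewerPartition = "https://excel.officeapps.live.com";
const cdnScript = "https://some.cdn.office.net/path/to/script.js";

specCache.prefetch(viewerPartition, cdnScript, "script-body");
const beforeActivation = specCache.lookup(viewerPartition, cdnScript);
specCache.activate(viewerPartition);
const afterActivation = specCache.lookup(viewerPartition, cdnScript);

// A prefetch that is never activated leaves nothing behind.
specCache.prefetch("https://unused.example", cdnScript, "script-body");
specCache.discard("https://unused.example");
specCache.activate("https://unused.example");
const afterDiscard = specCache.lookup("https://unused.example", cdnScript);
```

The prefetched script becomes usable only after activation, and a discarded partition cannot be revived by a later activation.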

Since this is the FPS repo, probably we shouldn't dig too deep into the details of a prerender/prefetch mechanism here, but please feel free to open an issue at https://github.com/jeremyroman/alternate-loading-modes if you think this might be worth exploring...

johannhof added a commit to johannhof/first-party-sets that referenced this issue Feb 23, 2022
This is with regards to proposals such as WICG#35 which might be acceptable
from a privacy perspective (assuming collaboration between sites in a
set), but could be challenging from a security perspective (assuming a
compromised site in the set).
