Exposing back/forward cache blocking reasons to sites #7094
Exposing the information to JS sounds reasonable (for example, as part of the pagehide event). I wonder what the API could look like, so that it could capture various reasons. We also need to be careful not to expose cross-origin information. |
Yes @clelland pointed out that we could leak the fact that some subframe is using an unload handler or GPS or whatever. You wouldn't know which frame, just that some frame has it but I wonder A few other ideas
|
If a user navigates a child frame, you might learn things about the user's navigation habits on other origins. I don't think we can expose anything that goes across the origin boundary. |
Yeah, fair enough. So I think the most info we could possibly offer would be to have per-frame a struct with
and then make this struct available to the top-level frame for a history navigation that is not cached. |
Created an explainer here. |
Do you think this proposal clears the bar of not exposing cross-origin information? I wanted to make sure. |
@rubberyuzu I agree with the general sentiment of that document (thanks for writing it!), but what happens in these scenarios:
|
That's a good point. I think the most conservative way would be to...
So, in the following scenario:
Report reasons of A1, and mask B's subtree (only report if B's subtree is blocking BFCache or not)
Report reasons of A1 and A2.
Report reasons of A1, and treat the subframe as cross-origin. For cross-origin iframes, we only report "src" instead of the current URL (report B instead of C as src URL).
Please let me know what you think! |
I wonder if we should separate "src" and "location" into separate fields where "location" ends up blank for everything that is cross-origin. That might be more useful for developers and I think it would end up exposing the same amount of information. Otherwise that looks reasonable to me. Are we using "same origin" or "same origin-domain" by the way? I assume the former given the plan to remove |
The explainer needs a bit of an update. Instead of using a dictionary, it should be presenting as a tree of JS objects. The fields would be the same for cross- and same-origin but some null or empty for cross-. So My assumption was that if the parent frame can access the child frame and script it, then it should be allowed to know what blocked it. I'm not sure what is the best way to express that. If document.domain is being removed then not including it in this meaning would be fine if that's how new features are doing it. |
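The masking rules discussed above (full reasons for same-origin frames; only `src` plus a blocked/not-blocked flag for a cross-origin frame, with its subtree hidden) could be sketched roughly as follows. This is an illustration under assumptions, not the explainer's API: the `FrameState` and `ReportedFrame` shapes, the `blocked` flag, and the function names are all hypothetical.

```typescript
// Hypothetical shapes for illustration only; field names are assumptions.
interface FrameState {
  origin: string;       // origin of the document currently in the frame
  src: string | null;   // the iframe's src attribute (null for the top-level frame)
  url: string;          // current URL of the document in the frame
  reasons: string[];    // reasons this frame itself blocked BFCache
  children: FrameState[];
}

interface ReportedFrame {
  src: string | null;
  url: string | null;        // null for cross-origin frames
  reasons: string[] | null;  // null for cross-origin frames
  blocked: boolean;          // whether anything in this subtree blocked BFCache
  children: ReportedFrame[]; // always empty below a cross-origin frame
}

// True if this frame or any descendant has at least one blocking reason.
function subtreeBlocks(frame: FrameState): boolean {
  return frame.reasons.length > 0 || frame.children.some((c) => subtreeBlocks(c));
}

// Builds the tree visible to the top-level document with origin `topOrigin`.
function buildReport(frame: FrameState, topOrigin: string): ReportedFrame {
  const blocked = subtreeBlocks(frame);
  if (frame.origin !== topOrigin) {
    // Cross-origin: expose only src and the subtree-level flag; hide everything below.
    return { src: frame.src, url: null, reasons: null, blocked, children: [] };
  }
  return {
    src: frame.src,
    url: frame.url,
    reasons: frame.reasons.slice(),
    blocked,
    children: frame.children.map((child) => buildReport(child, topOrigin)),
  };
}
```

With this sketch, a same-origin frame nested inside a cross-origin frame is never reported individually, which matches the "mask B's subtree" option above.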
That would be "same origin-domain", but in general we try not to use it for new features. Note that it's also stricter than that due to excluding the subtree of cross-origin documents (which are currently visible to some extent and especially if anything in that subtree is same origin). To be clear, I think that is the correct decision. |
We looked at the proposal in Chrome Security, and we were wondering if any kind of reporting for cross-origin iframes is not an XSite leak, even if we do not send a reason. For example, consider an iframe which has an unload handler if the user is signed in, and doesn't have it if they are not. Just knowing that the iframe blocked the page from going into bfcache might be enough to know whether the user was signed-in in the cross-origin iframe or not. To take a similar example, in the COOP reporting API, where iframe actions would be of interest to the top-level page, we have chosen not to report any information from the cross-origin iframe. So we are wondering whether the safe choice here is not to report anything at all for cross-origin iframes. |
I think this information is already available via various channels, albeit in a noisy way. For example:
I'm not sure how this impacts the overall security analysis. |
@camillelamy I think it's possible to extract exactly the same signal right now. Assume a.com is trying to find out if b.com blocks BFCache.
The only difference is that the new API makes it possible to extract the information while including more subframes from other origins and/or using BFCache-blocking features, but I'm not sure that is a material difference: an attacker could create a simple a.com/attack, quickly navigate away and back, collect the information, and then present a more complex page on a.com/attack with the information in hand. |
I see. Yeah that makes sense. And to confirm, the only cross-origin URL the page is going to see is the one it initially asked to load in the iframe (ie pre server redirects and pre subsequent navigations)? |
@camillelamy Thanks. It will see the value of the iframe's |
Ok that sounds good. |
I tagged this for the upcoming triage meeting, but will not be attending. I will leave the agenda+ label because I was mostly going to say "we're starting to get serious about implementing this". It'd be great if other implementers took a look at the explainer at https://github.com/rubberyuzu/bfcache-not-retored-reason/blob/main/NotRestoredReason.md. In particular I am curious for people's thoughts on WICG/bfcache-not-restored-reason#2. |
The proposed API doesn't seem to work too well in cases where a page is evicted from bfcache because of its use of some API. For example, Firefox lets one have an open BroadcastChannel and still bfcache the page, but if one uses the channel, then the page is evicted. Chrome seems to have similar cases, for example with service workers' claim() (at least based on the proposed tests in web-platform-tests/wpt#31082 (comment)). |
What's the problem with the API in that case? If the user returns to the page that eviction reason will be listed in the reasons. To be clear, the API does not tell you what is preventing the current page from being cached. It only tells you after a history navigation why the previous page was not cached. |
How would reporting API work in that case? |
We haven't put a lot of work into the RAPI case. I guess we have a choice there
While 2. gives more information, it's unclear that it's useful information as it could lead devs to focus on pages which are not cached but also not navigated back to. Do you see a problem with 1.?
Fair point. We are basically exposing Chrome's internal telemetry which covers blocking reasons and also evicting-later reasons, so I agree "blocked" is not the best choice. "not-restored-reasons" is a bit of a handful. Suggestions welcome and then we can update the doc. |
It is unclear to me what the goal is with the reporting API use here. Should (a) the server know all the cases where a page is blocked from entering bfcache or evicted from bfcache because of use of some other API, or (b) just be told that the page couldn't be restored when the user tried to get back to it? In (1), since implementations evict pages from bfcache because of memory pressure or timeout or whatever internal heuristics, should such cases be reported? "blocked-or-evicted" as the term might work, assuming "block" and "evict" end up in other specs too. |
I noticed something that could be a privacy risk in this API. I wrote it up in detail here. Basically the concern is that we could expose 1) that extensions are installed and active on the page and 2) possibly which extensions. One solution is to hide all this information, masking all extension-related reasons as "internal error". |
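As a rough sketch of the masking idea, assuming reasons are plain strings and that the browser can tell which ones are extension-related: the `isExtensionReason` predicate and the reason names in the usage note are hypothetical, and the deduplication step is an added assumption rather than something proposed in the thread.

```typescript
// Placeholder string taken from the comment above.
const MASKED_REASON = "internal error";

// Replaces every extension-related reason with a single generic placeholder.
function maskExtensionReasons(
  reasons: readonly string[],
  isExtensionReason: (reason: string) => boolean, // hypothetical classifier
): string[] {
  const masked = reasons.map((r) => (isExtensionReason(r) ? MASKED_REASON : r));
  // Assumption: deduplicate so the count of masked entries does not hint
  // at how many extension-related reasons were present.
  return masked.filter((r, i) => masked.indexOf(r) === i);
}
```

For example, `maskExtensionReasons(["unload-handler", "extension-messaging"], (r) => r.indexOf("extension-") === 0)` would keep `"unload-handler"` and replace the second entry with `"internal error"`.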
Sorry I missed this! As for the terminology: |
Update on the API: we spotted a potential privacy leak in the API here: We decided to fix this by randomly selecting a cross-origin iframe to report. |
We discussed the proposal some more (at Mozilla) and we're not happy to expose any browser-internal reasons. Only reasons that the page author can somehow affect (like use of some API etc.) should be exposed. Also, the explainer has "But as per WICG discussion, Performance Navigation Timing API was more preferred, and we are not going to implement this as Pageshow API." Could you expand on that reasoning a bit? Which WICG? And perhaps a link to the meeting notes? |
Thanks for the comment. There were two discussions where we talked about which API we should extend to include NotRestoredReasons : We decided on the NavigationTiming API instead of Pageshow because the NavigationTiming API already provides information about the navigation, such as the navigation type being "back_forward", and adding NotRestoredReasons seemed to be a natural extension of that. |
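A minimal sketch of how a page might consume this, assuming the proposed `notRestoredReasons` field hangs off the navigation timing entry. The `NavigationEntryLike` interface is a local stand-in so the snippet runs outside a browser, and treating the field as a flat list of strings is an assumption made only for illustration.

```typescript
// Local stand-in for the relevant part of a PerformanceNavigationTiming entry.
interface NavigationEntryLike {
  type: string; // NavigationTimingType: "navigate" | "reload" | "back_forward" | "prerender"
  notRestoredReasons?: string[] | null; // proposed field; shape is an assumption
}

// Returns the reported reasons only for history navigations; otherwise an empty list.
function reasonsForHistoryNavigation(entry: NavigationEntryLike): string[] {
  if (entry.type !== "back_forward") return [];
  return entry.notRestoredReasons ?? [];
}
```

In a browser the entry would come from `performance.getEntriesByType("navigation")[0]`; whether and how the field is populated there depends on the implementation.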
From mozilla/standards-positions#766
|
I would like to ask your opinions about the naming of this API. Currently the explainer calls the API "NotRestoredReasons" and PerformanceNavigationTiming's field "notRestoredReasons". Should we call this NotReactivatedReasons? Or do you have some other suggestions? |
I think since we refer to the "fully active" concept a lot in BFCache-related spec and documentation (e.g. https://w3ctag.github.io/bfcache-guide/), "Reactivated" seems like a more consistent choice. |
I'm mostly concerned with web developer API consistency; we don't necessarily need that to be consistent with the spec language. (Although it's always nice when they can match.) From that perspective I'm most worried about |
Currently developers can tell whether BFCache is being used or not in the wild but they cannot tell what reasons are blocking it from being used and what actions to take to improve their hit-rate.
We (Chromium) would like to make it possible for sites to collect information on why back/forward cache was not used on a history navigation.
One possibility would be to implement a reporting mechanism in Reporting API that sends the items that blocked back/forward cache.
Another possibility would be to make it available through a JavaScript API, e.g. when pageshow is not persisted but is a history navigation, it could contain information about why it was not persisted. Or it could be available from some other API. This would explicitly expose the fact that this was a history navigation.
For both of these we would probably want to standardise some of the reasons (where they are common and part of the spec) but also allow vendor-specific reasons for cases where blocking was not required by spec but happened anyway.
One side benefit we could potentially get from this is that we would be able to write web platform tests for why back/forward cache is blocked.
cc @clelland @annevk @smaug---- @mystor @cdumez @beidson @hober @altimin @xharaken @fergald @domenic