[Privacy] Address privacy requirement around "network changes" #30

mingyc · 2022-09-07T11:00:30Z

The discussion from #28 (comment) suggests that one of the privacy requirements around network change, as stated below, is too ambiguous:

Beacons are only sent over the same network that was active when the beacon was registered (e.g. > if the user goes offline and moves to a new network, discard pending beacons).

The foundamental privacy issue here is to prevent beacon from sending to fulfill the expectation that users expect a network cannot know about their previous navigation history when
(1) they have already left a page (either by closing a tab or by navigating away) AND
(2) they have already left the origin network

The following example is the same from the linked comment:

User is on network1 (e.g. mobile network)
User navigates to site1.com
site1.com creates and queues beacon1 on the page.
User navigates away from site1.com
A. where site1.com is put into bfcache, and beacon1 is still pending.
B. browser checks whether network has changed or not => No
User switches to network2 (e.g. office wifi), and thinks that network2 wouldn't know about the navigation history of site1.com
beacon1 is sent by browser via network2 (when either page evicted from bfcache or timeout/backgroundTimeout timer ends) => an unexpected behavior

Note that Step 5 can always happen after 4-B as the device's network change takes some time to propagate from OS to browser.

Proposed solution

The current proposal is to force the browser to send out all beacons when a page is no longer open, i.e.

the user has navigated away from the current page
- => handles forced sending before pagehide event is dispatched)
the user has closed the current page
- => already handled in the cases of tab closure/browser closure)

and replaces 4-B with sending out all beacons.

Cons

There can still be leak to network2 if a network change happens very quickly in between B & D
A. navigating away
B. browser forces sending out all beacon1 from network1
C. (network changes to network2)
D. The browser's network API actually sends out request constructed from beacon1.
This potentially makes backgroundTimeout unusable for "navigating away" use case, as the the time between this timers starts (i.e. page enters hidden state) and forced sending (i.e. page enters bfcache) can be very short. But it can still work when the page visibility change but navigation doesn't happen, e.g. switching to another tab.

The text was updated successfully, but these errors were encountered:

mingyc · 2022-09-12T11:22:22Z

A followup question: whether to force sending out on different-page navigation or different-site navigation.

Note that a beacon is attached to a document (page).

Update Privacy section to clarify why beacon needs to be stopped when network changes (#27 #28) Also added links to related privacy discussions: #3, #27, #30, #34 Updated beacon TTL to 30min per suggestion in #16 (comment)

…away to a different document. This is to mitigate potential privacy issue that when network changes after users think they have left a page, beacons queued in that page still exist and get sent through the new network, which leaks navigation history to the new network. See [1] for more details. This CL adds the following changes: 1) In browser: Add a call to `PendingBeaconHost::SendAllOnPagehide()` from `RenderFrameHostManager::UnloadOldFrame()` to let old RFH send out all its pending beacons before it is put into BFCache or being unloaded. 2) In renderer: Update `PendingBeaconDispatcher` to be called on to update all its pending beacons' states before a `pagehide` event is dispatched to event listeners. This change makes `backgroundTimeout` property useless in the scenario that users navigate away to a different page. But it can still work in other cases like when tab is minimized or when switching tabs. [1]: WICG/pending-beacon#30 Bug: 1293679 Change-Id: I5a7ddf4284717e1af482286dd6f56344d4ef4b26 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3881967 Reviewed-by: Rakina Zata Amni <rakina@chromium.org> Reviewed-by: Daniel Cheng <dcheng@chromium.org> Commit-Queue: Ming-Ying Chung <mych@chromium.org> Cr-Commit-Position: refs/heads/main@{#1047323}

fergald · 2022-09-22T02:55:36Z

If the user still has another instance of site1.com open or opens one later on or there is a service worker for site1.com, then it should be fine to deliver the beacon, we're not doing anything the site itself couldn't do (e.g. if it wrote data to IDB and managed delivery itself).

I think this would leave us a small risk of dropping data due to network change.

We still need to decide if it's OK to wait while in BFCache or not (since there is still a race with the network change event).

…away to a different document. This is to mitigate potential privacy issue that when network changes after users think they have left a page, beacons queued in that page still exist and get sent through the new network, which leaks navigation history to the new network. See [1] for more details. This CL adds the following changes: 1) In browser: Add a call to `PendingBeaconHost::SendAllOnPagehide()` from `RenderFrameHostManager::UnloadOldFrame()` to let old RFH send out all its pending beacons before it is put into BFCache or being unloaded. 2) In renderer: Update `PendingBeaconDispatcher` to be called on to update all its pending beacons' states before a `pagehide` event is dispatched to event listeners. This change makes `backgroundTimeout` property useless in the scenario that users navigate away to a different page. But it can still work in other cases like when tab is minimized or when switching tabs. [1]: WICG/pending-beacon#30 Bug: 1293679 Change-Id: I5a7ddf4284717e1af482286dd6f56344d4ef4b26 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3881967 Reviewed-by: Rakina Zata Amni <rakina@chromium.org> Reviewed-by: Daniel Cheng <dcheng@chromium.org> Commit-Queue: Ming-Ying Chung <mych@chromium.org> Cr-Commit-Position: refs/heads/main@{#1047323} NOKEYCHECK=True GitOrigin-RevId: 2be095e89fa5007ea5bc1bfe4419165471be3454

tungnh28 · 2022-11-07T21:12:37Z

Hmm, there're quite a few APIs having the same privacy concerns here (such as report API and background sync). However, I think this leak is quite limited by the fact that pending beacon requests must be HTTPS only, so the request contents are not revealed (only IP will be leaked), and only under a certain situation of racing between network change event and beacon request setup (because we would discard entirely pending requests after getting network change).

Ideally, beacons must only be sent persistently over only one network (tagged at the time the beacon was registered), and discard the future sending if we find the network is different than the old one. There was an immature exploration of Chrome regarding the idea (unfortunately I actually don't know the state or underneath tech difficulties, still asking around). I guess it's time to push it forward.

mingyc · 2022-11-08T05:20:34Z

... only under a certain situation of racing between network change event and beacon request setup (because we would discard entirely pending requests after getting network change).

@tungnh28 Do you suggest that it's ok for us to try with best-effort, i.e. dropping pending beacons on network changes, after the tab is closed, in the browser process?

Ideally, beacons must only be sent persistently over only one network

We would argue that a beacon should be sendable over different network as long as the tab that creates the beacon is still open when network changes, as developers can achieve this same behavior by other existing APIs.

tungnh28 · 2022-11-08T17:14:57Z

Do you suggest that it's ok for us to try with best-effort, i.e. dropping pending beacons on network changes, after the tab is closed, in the browser process?

Given the current risks are pretty low, the best-effort approach sounds good to me (I think we are doing the same with report API).

We would argue that a beacon should be sendable over different network as long as the tab that creates the beacon is still open when network changes, as developers can achieve this same behavior by other existing APIs.

Yeah, indeed, and with extending in the comment, (with double keyed because we don't have to concern about user's friction).

mingyc · 2022-11-15T06:20:55Z

@tungnh28 Could you please confirm if beacons can only be sent over the same network can be bypassed if background sync is enabled given this comment?

(I also asked similar question here)

These are discussions from WICG#3 and WICG#30

tungnh28 · 2022-11-15T10:35:33Z

I don't think so, we had the same privacy concern/limitation in background sync as well (that's one of the motivation of background-sync preference setting, for users who knew the risks and want to disable background sync).

These are discussions from WICG#3 and WICG#30

mingyc · 2022-11-15T11:05:12Z

OK. Then background-sync permission just grant use to bypass checking other live site instance?

I've updated this change. @tungnh28 Could you also please take a look? Thanks!

fergald · 2022-11-16T07:21:29Z

@tungnh28 can you explain why this sending wouldn't just be governed by the background-sync permission? As you say, it's the same concerns, so it seems like it should be the same permission.

In particular, a site that wants to send data in the background can do so from a service worker if it has that permission. This seems like another case where we could be pushing devs to hand-roll something instead of using this API.

The workaround for this are either to

always send on pagehide and lose some batching opportunities
pay attention to network changes and sometimes lose data

1 is probably OK but I feel inconsistent for background sync not to apply to this network activity.

tungnh28 · 2022-11-17T13:03:04Z

can you explain why this sending wouldn't just be governed by the background-sync permission?

As the primary privacy concern here, we are trying to prevent beacon from revealing the previous navigation history (host) in some certain situations, apparently gating behind background-sync does not help us resolving it. However, it might be the right thing to do since we are expecting to provide an alternative API which should not be less capable than the existing one.

The privacy risks seem to be quite reduced and I don't think that concern should block us moving forward. I'd like to friendly tag someone who knows the old privacy story of background-sync for better ideas. @beverloo, maybe?

tungnh28 · 2022-11-30T08:35:28Z

Friendly tag @Sauski and I am looking forward to the feedback of privacy team.

Sauski · 2022-12-01T14:42:34Z

I realise that, from a privacy perspective, we don't have a consistent stance on APIs which want to communicate after the page has been unloaded. After chatting with the Chrome privacy folks, we've come up with a stance that applies to any API which wants to send data at some point after the page has been "unloaded" (with the exclusion of periodic background sync, which is limited to PWAs and is a different beast).

Starting at time T0, defined as when the eTLD+1 for the page which wants to communicate (e.g. send a beacon / report / sync message) is _no longer _ "fully active" in any tab - e.g. considered as "non-existent” from the user’s perspective. Note that entering the BFCache does explicitly not extend this.

At time T0, the user agent could send any beacons / reports / sync messages, regardless of the background sync user permission. Acknowledging that unload handlers + fetch get polyfill this functionality % reliability.

Past time T0 (assuming the eTLD+1 remains "non-existent"), any outbound communication requires the background sync user permission.

Additionally, past time T0, any network change must result in any pending communications being dropped (regardless of why they are pending, e.g. including because they are in BFCache). The time between T0 and a network change is not relevant - a user may have closed tabs just before swapping networks, precisely because they don't want those sites communicating on the new network.

We also consider some "timeout" time past T0, where regardless of network change, it would be unreasonable to send the communication. Elsewhere it was discussed that Firefox has a 40 minute timeout, and this would definitely be a hard upper bound. Below that, we don't have strong feelings - obviously shorter is more private, but below this hard limit, it seems OK for the API to decide.

Acknowledge that this stance isn't respected by some APIs (e.g. background sync doesn't consider network change) - from a privacy perspective, we'd consider this a privacy deficiency, and will be working to rectify that. Going forward, from a privacy perspective, we'd be looking to uphold the bar defined above for all APIs.

mingyc · 2022-12-02T05:59:17Z

Thanks for the updates!

@Sauski I have the following questions

defined as when the eTLD+1 for the page which wants to communicate (e.g. send a beacon / report / sync message) is _no longer _ "fully active" in any tab

Will it still be eTLD+1 after 3P Storage Partitioning Key is launched everywhere, which changes the key to (Root Frame eTLD+1, current frame Origin)?

Past time T0 (assuming the eTLD+1 remains "non-existent"), any outbound communication requires the background sync user permission.

Does this imply that opening another eTLD+1 tab after previous T0 will lead to another T0?

Sauski · 2022-12-05T20:40:33Z

Will it still be eTLD+1 after 3P Storage Partitioning Key is launched everywhere, which changes the key to (Root Frame eTLD+1, current frame Origin)?

I could see this going either way, from a raw privacy perspective, the embedded origin could post message up to the top frame and have it act on its behalf, recovering the raw eTLD+1 scope. On the other hand, if we don't have a good reason to break away from the strong model, I don't see why we should.

Does this imply that opening another eTLD+1 tab after previous T0 will lead to another T0?

It's my intuition is that it would (@tungnh28 for thoughts). E.g. if you had something pending send, you could refresh the pending time when the user re-opened the site (or the site came out of bfcache etc).

I worry however, that this would lead to some potentially perverse interpretations (e.g. holding onto some data beyond the lifetime described above on the hope that another "T0" occurs and everything becomes valid again)

tungnh28 · 2022-12-06T15:21:06Z

Thank you very much for the insight.

It's my intuition is that it would (@tungnh28 for thoughts). E.g. if you had something pending send, you could refresh the pending time when the user re-opened the site (or the site came out of bfcache etc).

It sounds reasonable to me, although I am still a bit concerned about breaking the gap between embedded iframe and 1P frame. Given the case the pending beacon request is coming from iframe (please correct me if it's not the case), and there's exactly an active instance of the same top frame + different subframe (or we are reopening the top level page), it seems not right to me.

Edit: I was saying the case of reopening the same tld, but different subframes.

mingyc · 2022-12-08T04:03:20Z

@Sauski

the embedded origin could post message up to the top frame and have it act on its behalf, recovering the raw eTLD+1 scope.

But if we only focuses on behaviors after T0 (defined as when the eTLD+1 for the page which wants to communicate is no longer "fully active" in any tab), the page should not be able to make any kinds of communication to recover the scope?

--
Some follow up questions:

Do you think crash recovry from disk storage will be banned by this privacy decision?
- The timing when a browser crashes can be considered as T0
- Background sync permission for a site may or may not have changed after reopening the browser.
If a dedicated worker is running for a page, it should also follow this rule that pending beacons (created by the worker) should not be sent after T0?

@tungnh28 Could you please explain more with examples? I don't quite get your concern.

Pending beacons can bet sent by any type of documents (may not be iframe).

tungnh28 · 2022-12-08T08:25:33Z

Is it the case of reopening top level frame, causing different subframes?
I was saying due to some web apps (for example google app script) or user interaction, the subframe A would be injected (or result of navigation) and trigger pending beacon requests. However reopening the top level page is going back to the original state (where A is not available)

Sauski · 2022-12-19T14:20:13Z

Some more thoughts:

Do you think #34 will be banned by this privacy decision?
The timing when a browser crashes can be considered as T0
Background sync permission for a site may or may not have changed after reopening the browser.

I don't think banned would be the interpretation - difficult to have meet the proposed privacy requirements might be better. Can you help me better understand why browser level crashes are important to include provisions for? Crashes in the renderer, sure, but then we can continue to track network changes at the browser level.

From a security perspective, I'm not sure we'd even want to reveal that a browser level crash occurred.

If a dedicated worker is running for a page, it should also follow this rule that pending beacons (created by the worker) should not be sent after T0?

It's not that they shouldn't be sent, but I can't see any reason that they shouldn't follow the same requirements as above (e.g. they'd at least need the background sync capability).

Also refer to #3, #30 and whatwg/fetch#1647

#3 #27 #30

mingyc · 2024-01-12T07:21:09Z

Re-surfacing this discussion.

Per latest discussion from Chrome Privacy reviewers, it is expected that any new API implemented in Chrome should not perform any background communication if none of the original document that initiates such communication is alive, even if such behavior can already be achieved by using service worker. This is because a stricter requirement is enforced on new APIs. (Also see previous conclusion for PendingBeacon API in #30 (comment))

Hence, a simplified version is currently implemented in Chrome for upcoming Origin Trial, that any pending fetchLater requests for a document will be flushed if that document enters BFCache, without checking if any of other same-origin documents is still open. In the future, a less stricter version should be considere by checking for the existence of other same-origin documents before flushing.

@noamr PTAL. We probably have to update the spec to include this behavior.

mingyc added discussion api Issue with API specs labels Sep 7, 2022

mingyc mentioned this issue Sep 7, 2022

pending: if a beacon is already discarded by browser, should it become non-pending #28

Closed

This was referenced Sep 12, 2022

When the user agent sends data should be clarified #7

Closed

Crash recovery #34

Open

Update Privacy section to clarify network change #29

Merged

mingyc mentioned this issue Oct 20, 2022

[Privacy] Time limit on retaining beacons for bfcache'd pages #3

Open

mingyc changed the title ~~Address privacy requirement around "network changes"~~ [Privacy] Address privacy requirement around "network changes" Nov 10, 2022

mingyc added a commit to mingyc/pending-beacon that referenced this issue Nov 15, 2022

Add more privacy details around when beacons can be sent

f9de15c

These are discussions from WICG#3 and WICG#30

mingyc added a commit to mingyc/pending-beacon that referenced this issue Nov 15, 2022

Add more privacy details around when beacons can be sent

df3a6a3

These are discussions from WICG#3 and WICG#30

mingyc mentioned this issue Nov 15, 2022

Add more privacy details around when beacons can be sent #55

Closed

mingyc added a commit to mingyc/pending-beacon that referenced this issue Nov 15, 2022

Add more privacy details around when beacons can be sent

543d80d

These are discussions from WICG#3 and WICG#30

mingyc added a commit to mingyc/pending-beacon that referenced this issue Nov 15, 2022

Add more privacy details around when beacons can be sent

feb3cf9

These are discussions from WICG#3 and WICG#30

mingyc removed the api Issue with API specs label Nov 25, 2022

mingyc mentioned this issue Dec 14, 2022

Migrated: Support for postMessage? #58

Closed

mingyc mentioned this issue Mar 20, 2023

Fetch-based design #70

Open

mingyc mentioned this issue May 16, 2023

Deferred fetching whatwg/fetch#1647

Open

4 tasks

mingyc mentioned this issue Jul 26, 2023

[explainer] Add alternative approch discussions to fetchLater #81

Merged

mingyc added a commit that referenced this issue Jul 26, 2023

[explainer] Add alternative approch discussions to fetchLater (#81)

f4f4b31

Also refer to #3, #30 and whatwg/fetch#1647

mingyc added a commit that referenced this issue Nov 13, 2023

[fetch-later] Add P&S considerations into explainer (#86)

567ddc8

#3 #27 #30

mingyc mentioned this issue Jan 19, 2024

Consider to support retry mechanism #40

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Privacy] Address privacy requirement around "network changes" #30

[Privacy] Address privacy requirement around "network changes" #30

mingyc commented Sep 7, 2022 •

edited

mingyc commented Sep 12, 2022

fergald commented Sep 22, 2022

tungnh28 commented Nov 7, 2022 •

edited

mingyc commented Nov 8, 2022 •

edited

tungnh28 commented Nov 8, 2022

mingyc commented Nov 15, 2022 •

edited

tungnh28 commented Nov 15, 2022

mingyc commented Nov 15, 2022 •

edited

fergald commented Nov 16, 2022 •

edited

tungnh28 commented Nov 17, 2022 •

edited

tungnh28 commented Nov 30, 2022 •

edited

Sauski commented Dec 1, 2022

mingyc commented Dec 2, 2022

Sauski commented Dec 5, 2022

tungnh28 commented Dec 6, 2022 •

edited

mingyc commented Dec 8, 2022 •

edited

tungnh28 commented Dec 8, 2022 •

edited

Sauski commented Dec 19, 2022

mingyc commented Jan 12, 2024 •

edited

[Privacy] Address privacy requirement around "network changes" #30

[Privacy] Address privacy requirement around "network changes" #30

Comments

mingyc commented Sep 7, 2022 • edited

Proposed solution

Cons

mingyc commented Sep 12, 2022

fergald commented Sep 22, 2022

tungnh28 commented Nov 7, 2022 • edited

mingyc commented Nov 8, 2022 • edited

tungnh28 commented Nov 8, 2022

mingyc commented Nov 15, 2022 • edited

tungnh28 commented Nov 15, 2022

mingyc commented Nov 15, 2022 • edited

fergald commented Nov 16, 2022 • edited

tungnh28 commented Nov 17, 2022 • edited

tungnh28 commented Nov 30, 2022 • edited

Sauski commented Dec 1, 2022

mingyc commented Dec 2, 2022

Sauski commented Dec 5, 2022

tungnh28 commented Dec 6, 2022 • edited

mingyc commented Dec 8, 2022 • edited

tungnh28 commented Dec 8, 2022 • edited

Sauski commented Dec 19, 2022

mingyc commented Jan 12, 2024 • edited

mingyc commented Sep 7, 2022 •

edited

tungnh28 commented Nov 7, 2022 •

edited

mingyc commented Nov 8, 2022 •

edited

mingyc commented Nov 15, 2022 •

edited

mingyc commented Nov 15, 2022 •

edited

fergald commented Nov 16, 2022 •

edited

tungnh28 commented Nov 17, 2022 •

edited

tungnh28 commented Nov 30, 2022 •

edited

tungnh28 commented Dec 6, 2022 •

edited

mingyc commented Dec 8, 2022 •

edited

tungnh28 commented Dec 8, 2022 •

edited

mingyc commented Jan 12, 2024 •

edited