Add text on DP formal analysis and its assumptions #271
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -2338,6 +2338,45 @@ for a number of reasons: | |||||||||||||||
| Without allocating [=privacy budget=] for new data, | ||||||||||||||||
| sites could exhaust their budget forever. | ||||||||||||||||
|
|
||||||||||||||||
| ### Formal Analysis of Privacy Properties and Their Limitations ### {#formal-analysis} | ||||||||||||||||
martinthomson marked this conversation as resolved.
|
||||||||||||||||
| The paper [[PPA-DP-2]] provides formal analysis of the mathematical privacy guarantees | ||||||||||||||||
| afforded by *per-site budgets* and by *safety limits*. Per-site | ||||||||||||||||
bmcase marked this conversation as resolved.
Outdated
|
||||||||||||||||
| budgets include [=site=] in the [=privacy unit=], whereas safety limits exclude it | ||||||||||||||||
| thereby enforcing a global individual DP guarantee. | ||||||||||||||||
|
|
||||||||||||||||
| The analysis shows that *per-site individual DP guarantees* hold under a restricted system | ||||||||||||||||
bmcase marked this conversation as resolved.
Outdated
|
||||||||||||||||
| model that makes two assumptions, which may not always be satisfied in practice: | ||||||||||||||||
|
|
||||||||||||||||
| 1. *No cross-site adaptivity in data generation.* A site’s queryable data stream (impressions | ||||||||||||||||
| and conversions) must be generated independently of past DP query results from other sites. | ||||||||||||||||
| 1. *No leakage through cross-site shared limits.* Queries from one site must not affect which | ||||||||||||||||
| reports are emitted to others. | ||||||||||||||||
|
|
||||||||||||||||
| Assumption 1 is necessary because the system involves multiple sites that could interact | ||||||||||||||||
|
||||||||||||||||
| Assumption 1 is necessary because the system involves multiple sites that could interact | |
| The assumption that sites cannot adapt their queries is necessary | |
| because the system involves multiple sites that could interact |
Outdated
| learns, from DP measurements, to make an ad more effective, a user may convert on their site | |
| learns generally-applicable information that helps them make their ads more effective, | |
| that will make it more likely that their ads are attributed for conversions, | |
| as opposed to a competitor. |
Here, "DP measurements" refers to measureConversion() as well. Watch out for that.
This is a case where we are talking about aggregate results from many devices that you get back from the aggregation service.
I feel we need a term in the spec for the final aggregate DP results that go back to the advertiser. Maybe "query" is too general, but something like "DP attribution results" or "DP measurements" should be clear, and maybe we need to define that somewhere in the intro.
Took a stab at defining "attribution result" in the intro and trying to use this whenever we mean the final outputs learned by the advertiser.
Outdated
This is part of the assumption, but I think that the main challenge here is different. Sites might gain an understanding that a particular visitor to each site in a set is the same person (due to federated login, same email address, or anything including navigation tracking which we can't stop). AND THEN they decide to pool their per-site budgets to use the API to extract more information about that person. In that case, we have no defense from the per-site budget. Sites are only limited by their ability to link activity across sites (which is too easy, as noted) and then the global budget.
So we should acknowledge that limitation as well as the more theoretical one here.
@martinthomson I added a paragraph at the end of this section to capture this additional challenge an adversary is faced with. Let me know if you think that still needs any adjusting.
In addition to facing the safety limits discussed above, an attacker using multiple colluding sites to gain
more DP budget about users is also limited in practice by its ability to link a user across sites.
This limitation does not itself provide a theoretical DP benefit but does impose a significant
challenge to the attacker when the user agent has made it difficult to link users across sites.
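To make the pooling concern concrete, here is a minimal sketch of the arithmetic, not taken from the spec or from [[PPA-DP-2]]. The `BudgetLedger` class, the `pooled_spend` helper, the site names, and all epsilon values are hypothetical and chosen only for illustration. It assumes the colluding sites have already linked the user, so each one can spend its own per-site budget against the same person.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetLedger:
    """Toy per-device ledger: a per-site budget plus a global safety limit.

    Both limits are hypothetical illustration values, not spec defaults.
    """
    per_site_epsilon: float
    global_epsilon: float
    spent_by_site: dict = field(default_factory=dict)

    @property
    def spent_globally(self) -> float:
        # The global limit ignores site: it sums spending across all sites.
        return sum(self.spent_by_site.values())

    def try_spend(self, site: str, epsilon: float) -> bool:
        """Deduct epsilon for `site` only if both limits still allow it."""
        site_spent = self.spent_by_site.get(site, 0.0)
        if site_spent + epsilon > self.per_site_epsilon:
            return False
        if self.spent_globally + epsilon > self.global_epsilon:
            return False
        self.spent_by_site[site] = site_spent + epsilon
        return True


def pooled_spend(ledger: BudgetLedger, sites: list, epsilon_per_query: float) -> float:
    """Total epsilon a set of colluding sites can spend on one linked user."""
    total = 0.0
    for site in sites:
        while ledger.try_spend(site, epsilon_per_query):
            total += epsilon_per_query
    return total


# One site alone is capped by its per-site budget.
alone = pooled_spend(BudgetLedger(1.0, 4.0), ["a.example"], 0.5)

# Ten colluding sites exceed any single per-site budget and are
# stopped only by the global safety limit.
colluding_sites = [f"site{i}.example" for i in range(10)]
pooled = pooled_spend(BudgetLedger(1.0, 4.0), colluding_sites, 0.5)

print(alone, pooled)
```

In this toy model the colluders are bounded only by the global limit, which matches the point raised in the thread. It also shows why shared limits need Assumption 2: once the first four sites exhaust the global limit, the remaining sites get nothing, so spending by one site changes what is emitted to others.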
Outdated
| Assumption 2 is necessary when we have shared limits that span multiple sites. An example of | |
| An assumption that sites are unable to coordinate their use of the API is necessary | |
| when we have shared limits that span multiple sites. An example of |
What I'm trying to say here is just that if you want to have shared limits you have to make Assumption 2 for the per-site budgets to hold.
Outdated
Again, queries... If you want to use that word, it might be necessary to explain it up front.
here we can use measureConversion as this is talking about what happens on a single device.
Outdated
Consider defining this as [=global safety limits=], per above.
| By contrast, the analysis shows that *safety limits* -- which operate at global level, | |
| excluding [=site=] from the [=privacy unit=] -- can be implemented to deliver *sound global individual | |
| DP guarantees* regardless of whether either assumption is satisfied. | |
| The analysis shows that [=global safety limits=] -- | |
| which do not have a [=site=]-specific [=privacy unit=] -- | |
| deliver sound individual DP guarantees | |
| without relying on either of these assumptions. |
Importantly, after introducing the analyses and some context, this is the first thing I would say. It's a simple statement that is easy to understand.
sure, we can move this up top if we want to start with what holds without any assumptions.