The Big Red Data Blog: e-Privacy: A Framework for Data-Publishing against Realistic Adversaries (Part II)

Recall the setting from my previous posting. We want to publish a table while maintaining the privacy of the individuals in the table. So far, prior work has considered only extreme adversaries: either very weak or extremely strong, with nothing in between.
If we only protect privacy against weak adversaries, then a smarter adversary will be able to breach the privacy of individuals in practice. For example, an adversary who knows that "it is flu season and there are likely to be many elderly patients with flu symptoms in the hospital" may already be smart enough to breach the privacy of individuals.
On the other hand, if we protect against extremely strong adversaries, then our published table is close to useless.
That is why we would like to protect against realistic adversaries in the middle ground. So what do "realistic" adversaries look like?

In our VLDB paper, we introduce e-privacy, which protects against adversaries who have collected some statistics about the population and use them to form their beliefs about individuals. An adversary can have the following kinds of knowledge:

Knowledge about the general population. The adversaries obtain this knowledge from other sources, for example, from articles they have read that contain survey data, or by looking at other datasets.
Knowledge about specific individuals in the table. The adversaries obtain this knowledge as in prior work, for example, by knowing some facts about their friends or neighbors.

As we just outlined, the adversary starts with some beliefs about the world. Upon seeing the released table, the adversary may revise her beliefs. For example, the adversary may know the rate of cancer in the general population and assume that it is the same for all patients, but upon seeing the data from a hospital she might change her mind. How much an adversary changes her beliefs depends on what we call the stubbornness of the adversary. Infinitely stubborn adversaries treat their prior beliefs as facts, and thus they do not change their beliefs at all. Adversaries with finite stubbornness can be convinced that their prior beliefs were incorrect; the more stubborn they are, the more data it takes.
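To make the role of stubbornness concrete, here is a minimal sketch of one way such belief revision could be modeled. It assumes (this is an illustration, not necessarily the exact formalization in the paper) that the prior belief acts like a number of "pseudo-observations" equal to the stubbornness, which are then averaged with the counts seen in the published table. The function name posterior_belief, the dictionary layout, and all numbers are hypothetical.

```python
import math


def posterior_belief(prior, counts, stubbornness):
    """Revise a prior belief after seeing counts from a published table.

    prior:        dict mapping each value to the adversary's prior probability
    counts:       dict mapping each value to how often it appears in the table
    stubbornness: weight of the prior, measured in pseudo-observations
                  (assumed model; math.inf means the prior is never revised)
    """
    if stubbornness <= 0:
        raise ValueError("stubbornness must be positive")
    if math.isinf(stubbornness):
        # Infinitely stubborn: the prior is treated as fact.
        return dict(prior)
    n = sum(counts.values())
    # Weighted average of prior pseudo-observations and observed counts.
    return {
        value: (stubbornness * p + counts.get(value, 0)) / (stubbornness + n)
        for value, p in prior.items()
    }


# Hypothetical numbers: the adversary believes 10% of people have cancer,
# but 30 of the 100 hospital records show cancer.
prior = {"cancer": 0.1, "other": 0.9}
counts = {"cancer": 30, "other": 70}

for s in (1, 100, math.inf):
    print(s, posterior_belief(prior, counts, s))
```

With a stubbornness of 100, the belief in cancer moves from 0.1 to roughly 0.2; with a stubbornness of 1 it moves almost all the way to the observed rate of 0.3; the infinitely stubborn adversary stays at 0.1.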

In the paper, we explain how to take a table and generalize it in order to protect against these adversaries. This was not possible if you wanted to protect against the super strong adversaries. Now we can determine which tables we can publish while defending against realistic adversaries.
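For readers unfamiliar with generalization, the sketch below shows the basic idea of coarsening a record before publication. It is only an illustration of what "generalizing a table" means, not the algorithm from the paper, and it does not check whether the result satisfies e-privacy. The column names (age, zip, disease), the bucket width, and the sample record are all hypothetical.

```python
def generalize_row(row, age_bucket=10, zip_digits=3):
    """Coarsen the identifying attributes of one record.

    Ages are replaced by a range of width age_bucket, and zip codes keep
    only their first zip_digits characters. The sensitive value is kept.
    """
    low = (row["age"] // age_bucket) * age_bucket
    zip_code = row["zip"]
    return {
        "age": f"{low}-{low + age_bucket - 1}",
        "zip": zip_code[:zip_digits] + "*" * (len(zip_code) - zip_digits),
        "disease": row["disease"],
    }


# Hypothetical record, for illustration only.
record = {"age": 67, "zip": "14850", "disease": "flu"}
print(generalize_row(record))
```

Coarser buckets hide individuals better but carry less information, which is the trade-off between privacy and utility described above.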

We believe that e-privacy with its defense against realistic adversaries is very useful in practice, and thus we hope that you will want to experiment with it and try it out on your own data. We are eager to learn about your experience!

For more details on the framework and an experimental evaluation of the information content of data published with e-privacy, please refer to our paper. If you want to know more about our projects on data privacy you can find more information here.

Friday, September 4, 2009

Cornell University Database Group