
01 September 2020
3 minute read

Differential Privacy: what is it?


The protection of sensitive data is an issue that grows more important every day: Differential Privacy is the state of the art in protecting personal information.

Recent events involving Facebook and Cambridge Analytica have given further prominence to a problem that had only marginally touched individuals, but that companies already knew very well.

A breach of sensitive data can cause devastating financial losses and compromise an organization's reputation for years. From severe business and data losses to fines and clean-up costs, data breaches have far-reaching consequences.

According to the Cost of a Data Breach Report, conducted by the Ponemon Institute and sponsored by IBM Security, in 2019 the average total cost of a data breach reached almost $4 million, based on data from 507 companies across 16 geographical areas and 17 different industries.

The Netflix case

Methods to anonymize data have been developed over the years, but the evolution of hacking techniques puts even the most advanced of them to the test. Just ask Netflix.

In 2006 Netflix promised a one-million-dollar prize to anyone able to develop the best collaborative filtering algorithm. Two researchers from the University of Texas at Austin, however, took up the challenge from the opposite direction.

Netflix released a selection of its users' data for the contest, stripping out all personally identifiable information: no names, no personal details. The researchers were nevertheless able to de-anonymize a number of users from the nominally anonymous data set. All they had to do was scan IMDb, the reference site for movie ratings and reviews, and compare its rating patterns with those in the Netflix data. The result? The two researchers identified 80% of the users in the anonymized Netflix database.

Differential Privacy in a nutshell. What is it and how does it work?

To simplify before going any further: Differential Privacy exists to rule out this kind of reverse-engineering operation. It does so by adding background "noise", so that attackers cannot work on clean data, only on information that has been deliberately scrambled and is unusable on its own.

Let us now see in detail how this mechanism works and why it is the future of sensitive data protection.

Distract, confuse and limit: this is the secret of Differential Privacy

Differential Privacy rests on a rigorous mathematical framework that uses two mechanisms to protect personal or confidential information within data sets:

  • A small amount of statistical "noise" is added to each result to mask the contribution of individual data points. This noise protects an individual's privacy without significantly affecting the accuracy of the answers extracted by analysts and researchers.
  • The amount of information revealed by each query is calculated and subtracted from an overall privacy budget, so that further queries that could permanently compromise personal privacy are blocked.

Thanks to these mechanisms, Differential Privacy makes it impossible to deduce accurate and precise information about any particular person, hiding the trail of data that person has released while using a service.
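To make the two mechanisms above a little more concrete, here is a minimal sketch in Python. It is only an illustration, not Goodcode's implementation: the class and variable names are hypothetical, and the Laplace noise with scale 1/epsilon for a counting query (sensitivity 1), together with the simple budget bookkeeping, are standard textbook choices.

```python
import numpy as np

class PrivateCounter:
    """Illustrative only: answers counting queries with Laplace noise
    and tracks an overall privacy budget (epsilon)."""

    def __init__(self, data, total_budget=1.0):
        self.data = data                # list of records (e.g. 0/1 flags)
        self.remaining = total_budget   # overall privacy budget

    def noisy_count(self, epsilon=0.1):
        # Each query "spends" epsilon; refuse once the budget is gone.
        if epsilon > self.remaining:
            raise RuntimeError("Privacy budget exhausted: query blocked")
        self.remaining -= epsilon

        true_count = sum(self.data)
        # A counting query changes by at most 1 if one person is added or
        # removed (sensitivity 1), so Laplace noise with scale 1/epsilon
        # is enough to mask any single individual's contribution.
        noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
        return true_count + noise

# Example: 1,000 users, 300 of whom answered "yes" to a sensitive question.
users = [1] * 300 + [0] * 700
counter = PrivateCounter(users, total_budget=1.0)
print(counter.noisy_count(epsilon=0.1))   # close to 300, but never exact
```

Each answer is useful in aggregate yet noisy enough that no single user's presence can be inferred, and once the budget is spent the counter simply stops answering.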

What is noise and how is it "added" to our sensitive data?

This is the most technical and fascinating part of the whole process. The example comes from the journal Foundations and Trends in Theoretical Computer Science and is based on a simple question, any question, that a person must answer with "yes" or "no". Before the final answer is recorded, the so-called "noise" is introduced: a random element that steps in to scramble the answer.

Let's imagine flipping a coin before the answer is recorded: if it lands heads, the real answer goes into the model. If it lands tails, a second coin is flipped, and the recorded answer is "yes" or "no" depending on whether that flip comes up heads or tails.
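A minimal simulation of this coin-flip scheme (known in the literature as randomized response) might look like the sketch below. The numbers and function names are illustrative, not taken from any real survey: the point is that each individual answer remains deniable, yet the true share of "yes" answers can still be recovered in aggregate, because a true rate p shows up as roughly p/2 + 1/4 of recorded "yes" answers.

```python
import random

def randomized_response(true_answer: bool) -> bool:
    """Record an answer using the two-coin scheme described above."""
    if random.random() < 0.5:        # first coin: heads -> tell the truth
        return true_answer
    return random.random() < 0.5     # tails -> second coin decides yes/no

# Simulate 10,000 people, 30% of whom would truthfully answer "yes".
true_answers = [random.random() < 0.30 for _ in range(10_000)]
recorded = [randomized_response(a) for a in true_answers]

# Undo the known bias: observed "yes" rate ~ p/2 + 1/4, so p ~ 2*(rate - 0.25).
observed_rate = sum(recorded) / len(recorded)
estimated_p = 2 * (observed_rate - 0.25)
print(f"Estimated share of true 'yes' answers: {estimated_p:.2%}")
```

No single recorded answer proves anything about the person who gave it, since it may well be the product of the second coin; only the aggregate statistic is meaningful.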

A friend has your back

Partners of Bureau Veritas, and passionate for over twenty years about cybersecurity and everything that revolves around a computer and its applications, we at Goodcode specialize in implementing Differential Privacy systems for companies that want to add an extra layer of security to the protection of their own sensitive data and that of their customers.

To do so, simply contact us here.

We have already assisted many companies hit by the theft of data crucial to their business, helping them carry out recovery operations and return to full operation. But we believe the safest way to take care of your business is to avoid running that kind of risk in the first place.
At Goodcode, we protect you from bad guys... what else are friends for?

You want friendly software?