Anonymized Credit Card Data Really Isn't Very Anonymous


Credit card companies often strip your details from their records and then share the data with third parties, claiming that it's anonymized. But a new study from MIT reveals that analysis of just four purchases made on your card can identify you with more than 90 percent accuracy, even when your details are removed.


The study used three months of credit card transactions made by 1.1 million people. The researchers analyzed the transactions by time and location, then used a small number of known purchase details to work out who, from the pool of more than 1 million people, had made them.
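To make the matching step concrete, here is a minimal sketch in Python. It is not the researchers' code: the record layout, the pseudonymous ids, the toy data, and the function names `candidates` and `full_trace` are all assumptions made for illustration. It only shows the general idea of filtering a data set down to the people whose history contains every known purchase.

```python
from collections import defaultdict

# Toy "anonymized" records: (pseudonymous id, date, shop).
# Every value here is invented for illustration.
TRANSACTIONS = [
    ("u1", "09-23", "bakery"),
    ("u1", "09-24", "restaurant"),
    ("u1", "09-23", "shoe store"),
    ("u2", "09-23", "bakery"),
    ("u2", "09-24", "grocery"),
    ("u3", "09-24", "restaurant"),
]


def candidates(transactions, known_points):
    """Return the ids whose history contains every known (date, shop) point."""
    history = defaultdict(set)
    for user, date, shop in transactions:
        history[user].add((date, shop))
    wanted = set(known_points)
    return sorted(user for user, points in history.items() if wanted <= points)


def full_trace(transactions, user):
    """Return every (date, shop) record belonging to one pseudonymous id."""
    return [(date, shop) for u, date, shop in transactions if u == user]


known = [("09-23", "bakery"), ("09-24", "restaurant")]
matches = candidates(TRANSACTIONS, known)
print(matches)

# If exactly one id matches, the rest of that person's history is exposed.
if len(matches) == 1:
    print(full_trace(TRANSACTIONS, matches[0]))
```

In this toy data set, one known purchase (the bakery visit) still leaves two candidates, while adding the second narrows it to one. Each extra known point can only shrink the candidate list, which is why a handful of purchases can be enough.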

One result shows purchases being made in a bakery one day and a restaurant the next. The team found just one person who could have made the purchases, "and we now know all of his other transactions, such as the fact that he went shopping for shoes and groceries on 23 September, and how much he spent," they explain to the Associated Press.


The team found that they need only four purchases to identify an individual in the anonymized credit card records, or three purchases if the prices are known. The study also revealed that it's easier to identify women using the technique, though the researchers can't yet explain why.

The study goes to show that the sense of privacy provided by anonymized data is somewhat of an illusion. Even without any of our details to identify us, all it takes is careful use of metadata (in this case, our shop visits) to identify us completely. Gulp. [Science via AP]

Image by Shutterstock/Valerie Potapova


DISCUSSION

One result shows purchases being made in a bakery one day and a restaurant the other. The team found just one person that could have made the purchases, "and we now know all of his other transactions, such as the fact that he went shopping for shoes and groceries on 23 September, and how much he spent," they explain to Associated Press.

Ok, this sounded rather far-fetched. How are they going to identify someone who made two random purchases purely from those purchases? Then I read the whole paper:

For example, let's say that we are searching for Scott in a simply anonymized credit card data set (Fig. 1). We know two points about Scott: he went to the bakery on 23 September and to the restaurant on 24 September. Searching through the data set reveals that there is one and only one person in the entire data set who went to these two places on these two days.

Well no shit, that is not exactly hard to figure out if you already know he went there. How is it not anonymous data when you take away the knowledge of the who and when?