Your credit card data says a lot about you—so much, in fact, that scientists can figure out exactly who you are by looking at just a few purchases.
A new MIT study shows researchers can pinpoint your identity with over 90 percent accuracy when they see four credit card purchases. That number falls to three if the price is included, according to the report published in the journal Science.
It mattered little that the credit card companies had attempted to anonymize the records, which covered 1.1 million people over three months. The finding calls into question the methods these firms use to maintain customer privacy.
Banks removed names, credit card numbers, shop addresses, and transaction times. The researchers could only work with metadata: transaction amounts, shop type, and a code in place of a person’s name.
Researchers quickly homed in on people’s individual spending patterns and correlated the metadata with information from outside sources.
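To see why a handful of purchases can be enough, consider a rough sketch of the kind of uniqueness test this sort of study relies on. It is not the researchers' code: the function name `unicity`, the record layout (a code, a shop type, and an amount), and the toy data are all assumptions made for illustration. The idea is to pick a few purchases from one person's history and count how many people in the data set have all of them.

```python
import random
from collections import defaultdict


def unicity(records, points=4, seed=0):
    """Fraction of people singled out by `points` of their own purchases.

    `records` is an iterable of (code, shop_type, amount) tuples. This
    layout is a hypothetical stand-in for the anonymized bank data.
    """
    rng = random.Random(seed)

    # Group purchases by the code that replaces each person's name.
    traces = defaultdict(set)
    for code, shop_type, amount in records:
        traces[code].add((shop_type, amount))
    if not traces:
        return 0.0

    singled_out = 0
    for code, trace in traces.items():
        # Pretend an outside source reveals a few of this person's purchases.
        known = set(rng.sample(sorted(trace), min(points, len(trace))))
        # Count every person whose history contains all of those purchases.
        matches = [c for c, t in traces.items() if known <= t]
        if len(matches) == 1:
            singled_out += 1
    return singled_out / len(traces)


# Toy data: u1 and u2 have identical histories, u3 stands out.
toy_records = [
    ("u1", "bakery", 5), ("u1", "cafe", 3),
    ("u2", "bakery", 5), ("u2", "cafe", 3),
    ("u3", "bakery", 5), ("u3", "gym", 20),
]
print(unicity(toy_records, points=2))
```

In this toy set, only one of the three people can be singled out from two purchases. Real spending histories are far more varied, which is why a few purchases go such a long way.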
A 2013 Scientific Reports study did much the same thing with cellphone data, looking only at metadata on a user’s movements and quickly figuring out exactly who the person was. Both studies shared a lead author, Yves-Alexandre de Montjoye.
Law enforcement often examines the metadata of Americans who have committed no crime, presenting the practice as a compromise between privacy and investigation. What this study and others like it show is that the idea that metadata protects privacy is an illusion or a lie.
“It is not surprising to those of us who spend our time doing privacy research,” Lorrie Faith Cranor, director of the CyLab Usable Privacy and Security Laboratory at Carnegie Mellon University, told the Associated Press. “But I expect it would be surprising to most people, including companies who may be routinely releasing de-identified transaction data, thinking it is safe to do so.”
The research suggests that all companies with big troves of customer data, such as credit card firms, phone companies, and taxi cab companies, have to be much more careful before releasing any of it, whether to the public, to other companies, or to the academic researchers who have increasingly used such databases in recent years.
“Without such safeguards, rich databases could remain off limits,” Science reports. “Take, for example, the data MIT has accumulated from its massive open online courses. It’s an information trove that education researchers dream of having: a record of the entire arc of the learning process for millions of students,” says Salil Vadhan, a computer scientist at Harvard University. “But the data are under lock and key, partly out of fears of a prospective privacy breach.”