
Blue Coat Photos/Flickr (CC-BY-SA)

Data harvesting firm leaked personal data on 48 million people

The company disputes the details of the leak.


Phillip Tracy


A company that scrapes data from social media to build detailed profiles left a file filled with the personal information of 48 million people out in the open.

Security researchers at UpGuard discovered that LocalBlox, a data harvesting firm similar to Cambridge Analytica, failed to protect the mounds of information it collected without permission from Facebook, LinkedIn, Twitter, and Zillow.

On its website, the company touts that it “automatically crawls, discovers, extracts, indexes, maps and augments data in a variety of formats from the web and from exchange networks.” This is all done to create a comprehensive “3-dimensional” profile of users to sell to marketers.


As if that wasn’t bad enough, the company failed to safeguard that data. UpGuard found information on 48 million people sitting in a publicly accessible Amazon Web Services (AWS) S3 bucket. The 1.2-terabyte file contained names, physical addresses, dates of birth, job histories, Twitter handles, and even IP and email addresses, among other things. LocalBlox was made aware of the vulnerability in February and secured the file within a few hours.

But amid the Cambridge Analytica privacy scandal hovering over Facebook, this incident calls into question not just whether data companies are capable of securing information, but how they obtain that data in the first place. UpGuard points out in a blog post how easy it was for LocalBlox to harvest data from Facebook, while some of the other sources it used were more mysterious. For example, it purchased marketing databases and “information caches” from payday loan operators, but it marked the source of other dataset fields only with short identifiers like “ex.”

The amount of information taken is equally alarming, though it should be noted that all the data collected came from public profiles. Still, the kind of comprehensive profile LocalBlox put together is certainly troubling. The firm is capable of packaging together someone’s profile photos, location, skills, and other information most people don’t want in the hands of an unknown third party.

It goes to show that companies are targeting your data to monetize it in any way possible. Too often, those firms care little about securing that information, even if only to keep it for themselves.

“When aggregated together at scale, your psychographic data can be used to influence you,” UpGuard notes. “It is what makes exposures of this nature so dangerous, and also what drives not only the business model of LocalBlox but of the entire data analytics industry.”

Interestingly, when ZDNet reached out to LocalBlox’s chief technology officer, Ashfaq Rahman, he alleged that Chris Vickery, director of cyber risk research at UpGuard, had “hacked into” the publicly accessible bucket and said “most” of the 48 million records were fabricated or used for testing. He did not say why he swiftly restricted access to the file or what percentage of the information was real.

Facebook, LinkedIn, Twitter, and Zillow all responded to ZDNet’s request for comment by emphasizing that any data scraping without consent violates their platforms’ policies.


*First Published:

The Daily Dot