Wikipedia’s data retention guidelines come with a loophole that allows them to keep data about you indefinitely.
Remember that Wikipedia wormhole you fell down the other night? Maybe it started with you looking at a list of every single episode of Glee and somehow ended with reading the biographical entry on John Wayne Gacy. You might not remember, and you might not want anyone else to know about it. But your history might still be on Wikipedia’s servers.
The free online encyclopedia is tracking the viewing patterns of some, but not all, of its users. And although the records are to be kept no longer than 90 days, it may retain an altered version of that data indefinitely.
Earlier this year, the Wikimedia Foundation—the non-profit organization that backs the user-edited, online encyclopedia—issued new data retention guidelines to let users know what kind of information they are tracking. Like many Web entities, the WMF has turned to transparency in order to respond to the growing public backlash against big data.
“The Foundation’s overall aim is to retain the minimum amount of information necessary in order to support the needs of the Foundation and the wider Wikimedia movement,” said Jay Walsh, a WMF spokesman.
In supporting “the needs of the Foundation,” Wikimedia automatically collects some personal information from visitors—even if they aren’t logged into a Wikipedia account. Such personal information includes visitors’ IP addresses and other data that “could be used to personally identify you.”
However, the policy states that all personal information will be kept for a maximum of 90 days, before it is either “deleted, aggregated, or anonymized.” In aggregating data, user information is combined with other data to illustrate broader trends, while anonymizing data removes the parts of the information that can identify particular users. In both cases, this allows WMF to keep data for more than 90 days. But the foundation admits that neither of these two processes can “completely eliminate the risk of re-identification.”
And this is the part that concerns some users, like Wikipedian Wnt.
“[A]ccording to the policy, not only do they retain it 90 days, but they then can retain it indefinitely by ‘anonymizing’ the IP addresses by ‘encrypting’ the ‘most specific’ part of the IP address, a process which they admit may not actually protect identity,” Wnt wrote during a recent exchange with Wikipedia cofounder Jimmy Wales on his talk page.
Wnt is concerned that the anonymizing/aggregating policy leaves a loophole for Wikipedia to permanently retain data that is vulnerable to hackers or possible subpoena by law enforcement. He argues that even after data has been anonymized, it would be possible for technically well-equipped individuals to reconstruct IP addresses and identify users.
“With these records acknowledged and their existence legitimized, there is no reason why they can’t start filing papers, cracking codes, and lining up access dates for whatever reasons they may have.”
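To make Wnt's reconstruction argument concrete, here is a minimal Python sketch. It is not the Wikimedia Foundation's actual method, which the policy does not spell out; it assumes a hypothetical scheme in which the last octet of an IPv4 address is replaced by an unsalted SHA-256 hash. The function names `anonymize_ip` and `recover_ip` are invented for illustration. Because the last octet has only 256 possible values, anyone holding the record can simply try them all.

```python
import hashlib
from typing import Optional


def anonymize_ip(ip: str) -> str:
    """Hypothetical scheme: keep the first three octets, hash the last one."""
    prefix, last_octet = ip.rsplit(".", 1)
    digest = hashlib.sha256(last_octet.encode()).hexdigest()
    return f"{prefix}.{digest}"


def recover_ip(anonymized: str) -> Optional[str]:
    """Brute-force the hashed octet by trying all 256 possible values."""
    prefix, digest = anonymized.rsplit(".", 1)
    for candidate in range(256):
        if hashlib.sha256(str(candidate).encode()).hexdigest() == digest:
            return f"{prefix}.{candidate}"
    return None


# 203.0.113.57 is a reserved documentation address, not a real user.
record = anonymize_ip("203.0.113.57")
print(recover_ip(record))
```

A scheme that uses a secret key would resist this particular attack for as long as the key stays secret, which is why the concern extends to anyone who could obtain both the records and the key.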
In responding to Wnt’s concerns, Jimmy Wales noted that only a fraction of user actions, about 1 out of every 1,000, are randomly selected for tracking. He also pointed out that 90 days is merely a maximum, and that most personal data is processed in a couple of days.
The Wikimedia Foundation would not reveal to the Daily Dot, however, what percentage of this data is actually deleted and what percentage is anonymized or aggregated and kept past 90 days.
Wikimedia was also tight-lipped about how often it is compelled to turn over information to law enforcement, with Walsh saying that “the Foundation complies with legitimate and lawful requests from enforcement agencies when it is necessary to do so.”
In April, Wikimedia published guidelines pertaining to requests for user information. In them, WMF states that information requests are “relatively rare” and that each one is handled with discretion. The Foundation also has a policy of notifying users of a request for their information before it’s given out, but it sometimes forgoes that notification at the behest of law enforcement.
When it comes to the question of whether Wikimedia should be tracking user information at all, Walsh defended the practice, saying it was important for the overall growth of the site.
“The data the Foundation collects is critical in the development of new products that help them reach a wider international audience of readers, and to make it easier and more rewarding for people to begin contributing to the projects,” he said.
Photo Cary Bass/Flickr (CC BY-SA 2.0)
Tim Sampson is a reporter who focused on the technology, business, and politics beats. He's also an established comedy writer, with work on Comedy Central and in The Onion and ClickHole.