
Tumblr reportedly blocked archivists from saving blogs before the NSFW purge

In the end, it wasn't enough time to save every Tumblr blog.

 

Ana Valens


Posted on Dec 19, 2018   Updated on May 20, 2021, 11:06 pm CDT

Archivist Jason Scott works for the Internet Archive and runs a separate, unaffiliated project, Archive Team, which is dedicated to preserving dying or endangered websites. So when Tumblr announced on Dec. 3 that it was banning all NSFW content from the site, the team had a mere two weeks to save as much content as it could before the ban began on Dec. 17.

But apparently, Tumblr wasn’t happy with the Archive Team’s plans. According to a tweet from Scott, Tumblr “mass IP-banned entire swarths of @archiveteam volunteers and warrior instances” on Dec. 15, mere days before the ban.

“Some of our folks will do some work to see if they can get a little more out, but Tumblr has spoken: Get the hell off our lawn, we got some deletin’ to do,” Scott tweeted on Saturday.

Granted, Scott’s Archive Team wasn’t permanently banned. The crew got back to work just two days later after “a crack team of activist archivists… turned everything up to 11.”

As of Dec. 17, Scott reported that 646,000,000 Tumblr URLs had been added to the Internet Archive’s Wayback Machine, with over 210,000 blogs either in the pipeline or already in the archive. However, this pales in comparison to the sheer number of blogs on Tumblr: The site reports that it hosts millions of blogs with 167.5 billion posts.

Admittedly, Scott was pretty forthcoming about the challenges that archiving Tumblr presents. In a Twitter thread from Dec. 12, he explained that the Archive Team really had seven days to archive Tumblr blogs, not 14, because of complications with downloading websites from Tumblr’s “hellscape of code.” He suggested the final archive was “not going to be complete and it might not even be a significant percentage.”

“When Oath/Verizon/Tumblr slams the lid down starting December 17th, that’s it. We can’t do anything more on our end,” Scott tweeted. “We can only see what’s public on the internet, anyway. That means we’re moving at EMT speed during a flaming fire of the worse kind of internet shutdown.”

Scott and the Archive Team’s struggles point to a much larger problem with preserving data that is hosted and controlled by major corporations. By giving incredibly short notice before the NSFW content ban, Verizon, which now owns Tumblr, held the upper hand over internet history activists, essentially preventing huge portions of the site from being saved for posterity.

As Scott told Fast Company, archivists are usually given “30 or 60 days or 90 days warning,” and with an estimated 400 to 800 terabytes’ worth of data affected by the ban, there’s only so much content that can be saved. It’s as if Verizon snapped its fingers, wiped out porn from Tumblr, and can now pretend it never existed in the first place.

It’s a sobering reminder that when a major company quickly pulls the plug on a huge chunk of data, it also removes huge portions of the internet with it, damaging archivists’ ability to preserve our modern times for future generations.

“I expect screaming about all aspects of how this goes on,” Scott tweeted, addressing critics who wanted more privacy control over archived blogs. “I wish a tiny bit of that ire could be aimed towards the fact that this entire crisis is completely made up and the result of a truly random decision. I wouldn’t trust those organizations with any personal data.”

The Daily Dot reached out to Tumblr and Scott for comment and is awaiting a response.

H/T Techdirt

*First Published: Dec 19, 2018, 11:01 am CST