Yes, Google is reading your email, and yes, you should care

If you missed the news, it would be understandable. Headlines about a sex offender arrest don’t often inspire a click so much as a nod. Yes, justice is being meted out to unsavory types, keep up the good work out there.

But the implication of one particular arrest last week, that of Houston resident and convicted sex offender John Henry Skillern, sets a dangerous precedent—and one worth paying attention to. Skillern’s last conviction was 20 years ago and now, after Google tipped off authorities, he faces charges for promotion and possession of child pornography.

“I would never be able to find that,” Houston Metro Detective David Nettles admitted to local news station KHOU. “I really don’t know how they do their job, but I’m just glad they do it.”

In the case of Skillern, Google was complying with a federal law that binds all “electronic communication service providers” to report evidence of child pornography, which is not protected under the First Amendment. It’s worth noting, of course, that Google’s action isn’t an outlier; it’s just the case we happened to hear about, seemingly because it blew up on a local news outlet. (If the law is working as intended, why don’t we hear about cases like this one on a regular basis?)

The process by which a Gmail account gets flagged for child pornography is a fairly straightforward one. Working with the National Center for Missing & Exploited Children (NCMEC), Google cross-references images on its servers with those in a database of known photos of child exploitation. Photos in the database are assigned a unique digital fingerprint through a process called hashing. When a copy in circulation shows up on a service like Gmail (and therefore on Google’s servers), it produces a match, and a flag. Google implemented the process in 2008, but how many matches and arrests have been produced to date isn’t public knowledge. Google does say that “this evidence is regularly used to convict criminals.”
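To make the fingerprint-matching idea concrete, here is a minimal sketch in Python. It is not Google’s or NCMEC’s actual system, whose details aren’t public; it swaps in a plain SHA-256 hash, which only matches byte-for-byte identical copies, and the function names and sample byte strings are hypothetical stand-ins for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest of the raw bytes (an assumed stand-in for
    whatever fingerprinting scheme a real provider uses)."""
    return hashlib.sha256(data).hexdigest()


def build_database(known_files):
    """Store only the fingerprints of known files, never the files themselves."""
    return {fingerprint(data) for data in known_files}


def is_flagged(attachment: bytes, database) -> bool:
    """An attachment is flagged when its fingerprint is already in the database."""
    return fingerprint(attachment) in database


# Hypothetical placeholder bytes standing in for previously catalogued files.
known = build_database([b"known-file-1", b"known-file-2"])

print(is_flagged(b"known-file-1", known))    # an exact copy of a known file
print(is_flagged(b"brand-new-file", known))  # material not yet in the database
```

The second call hints at the limitation discussed next: anything not already in the database produces no match at all.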

As others have noted, the current protocol incentivizes the creation of new pornographic material that isn’t yet indexed—likely a nightmare scenario for the folks who created the system to begin with. And according to the NCMEC, over 2.5 million reports have hit its CyberTipline since 1998, though arrests resulting from those tips are only occasionally reported in its “success” stories blog at a rate of a few per year.

Oddly, in response to the case, Google reassured us that we can continue to plot other kinds of crimes through our inboxes. Burglary, for instance, according to a statement distributed to press outlets like the New York Times:

It is important to remember that we only use this technology to identify child sexual abuse imagery, not other email content that could be associated with criminal activity (for example using email to plot a burglary).

But considering its steady, quiet practice of overhauling the terms of service agreement we happily sign off on to use its products, Google’s current position offers little peace of mind. That’s not just Google’s fault, though the company’s deep relationship with the NSA remains as murky as ever. It’s worth remembering that the agency’s definition of suspected terrorism is an ever-widening gyre.

Few would argue that current U.S. laws go too far in protecting children from sexual exploitation, but our justice system just happens to line up with public opinion on that one. Sexual crimes against minors have long been particularly reviled by the law, the public and even other violent offenders, but the slope is slippery.

Consider other kinds of crimes. Polling conducted by Pew and others consistently finds that the majority of Americans are worried about the NSA’s invasive digital data collection in the name of the war on terror. And beyond being complicit in the government’s unlawful, ineffective methods for ferreting out terror suspects, Google and other technology companies kowtow to greedy, misguided entertainment industry titans like the MPAA and RIAA, abiding by the terms laid out in The Digital Millennium Copyright Act (DMCA) in order to minimize their liability.

As PCWorld’s Mark Hachman writes:

Google has typically worked with law enforcement and regulatory agencies, developing a Content ID database to help speed take-down requests for copyrighted information from YouTube, for example. With ContentID, the content holders themselves can notify Google and ask for a copy of Frozen, for example, to be removed.

Of course, the DMCA still leads to absurd abuses like the time Prince decided (and then un-decided) to sue 22 of his own fans for $1 million a pop. Meanwhile, systemic white collar crime continues to seam rip the fabric of our economy largely unpunished. Not all laws are created equal, nor are all criminals. The uncomfortable reality is that law enforcement is subjective and more often than not mediated by where the money is or isn’t flowing.

From a Terms of Service change Google made in April, 2014 (new language in green):

Notably, the word “analyze” wasn’t present in Google’s ToS from 2007, nor its 2012 revision. In light of the ever-shifting legal and technological landscape, it’s a good time to take a long, hard look at what you’ve been willfully ignorant of. When does the end justify the means? And who gets to decide? And, most importantly, are we all screwed?

The answer is “maybe.” To try and figure out why, I spoke with Brennan Novak, a hacker and designer who works on Mailpile, a Web client that seeks to “bring encryption and privacy to the masses”—a noble cause and a noble team, from what I know of them.

What kind of precedent do you think the recent Google-prompted arrest sets?

BN: Google has to comply with the law, and while the law often works for the greater good, the question remains, what if the law requires them to do other snooping which is more controversial?

The precedent set by this revelation is a chilling one for sure, but hopefully it will be good for privacy advocates who feel Google is far too large and centralized and seek alternatives which are becoming more and more available.

Do we know what other kinds of illegal activity Google and email providers scan for?

BN: We really have no way of knowing what type of analysis Google is doing on everyone’s data. Is it just robots trying to find keywords for advertising or are they building psychological profiles of people and working with scientists like the Facebook experiment?

Consider things like Bitcoin or marijuana, both of which have varying degrees of legality in different territories. How would Google create algorithms that autoscan emails while also adhering to due process? Traditionally, this requires a warrant, which is supposed to be issued on a case by case basis and only after an assessment that there is probable cause to believe a crime has been committed. Individuals should be secure in their homes and their possessions. Warrantless searches and private law enforcement are not legal, and that is a line that is seemingly being crossed here.

What are people’s biggest misconceptions about email privacy?

BN: I think most people feel that email is one of the most private communication channels out there and this is largely from the fact that emails do not exist on public pages like the rest of the Web—email happens at this other protocol layer of the internet, SMTP. However, just because it’s not viewable from a web browser doesn’t mean emails don’t pass through multiple servers owned by people with malicious intent.

Is there any hope for privacy in email clients? Even if we encrypt and host our own email, will it inevitably end up in the hands of companies like Google based on the people we contact?

BN: There is absolutely hope. Encrypting email is a great step forward especially if we can get large masses of people doing it by default. True, encrypted email sent through Google’s servers will still have readable metadata (to:, cc:, from:, and subject: lines), but that is still a huge improvement as Google will not be able to perform bulk analysis of the encrypted content. Also, independently hosting email closes this gap even more.

My project, Mailpile, is making encryption easier on many levels. There are lots of efforts to make safer transfer protocols that don’t leak metadata, like SMTorP and Darkmail.
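Novak’s point about metadata can be illustrated with a short Python sketch using the standard library’s email package. It assumes the body has already been encrypted by some outside tool (the ciphertext string, the addresses and the helper names here are hypothetical placeholders), and it shows that the headers still travel in readable form.

```python
from email.message import EmailMessage


def build_encrypted_message(sender, recipient, subject, ciphertext):
    """Wrap an already-encrypted body in an ordinary email.
    Only the body is protected; the headers are set in plain text."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(ciphertext)
    return msg


def visible_metadata(msg):
    """Collect the headers any server relaying the message could read."""
    names = ("From", "To", "Cc", "Subject")
    return {name: msg[name] for name in names if msg[name] is not None}


# Placeholder ciphertext; a real message would carry actual encrypted output.
body = "-----BEGIN PGP MESSAGE-----\n(placeholder)\n-----END PGP MESSAGE-----"
message = build_encrypted_message(
    "alice@example.com", "bob@example.com", "Lunch plans", body
)

print(visible_metadata(message))
```

Even with the body unreadable, the sender, recipient and subject line remain exposed, which is the gap Novak says newer protocols are trying to close.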

If you could tell someone the single simplest step toward privacy, what would you recommend?

BN: Start learning the basics of how digital communication works. Learn the differences between cloud/server and local computer/client. If you’re adventurous go attend a CryptoParty. The people hosting those events are very knowledgeable and friendly.

Novak’s project will be available in beta on August 13 (or September 13 for the 1.0 release) so giving Mailpile a look is a good start too. For more about email privacy, the service offers a privacy FAQ with enough depth to tide you over until the formal launch. However you choose to do it, it’s high time to start considering who is privy to all of those digital postcards you slap a stamp on and fling off toward servers unknown.

Illustration by Jason Reed

Wake up.

Taylor Hatmaker