Open Source and the Big Data dilemma

It’s been 15 years since advocates of a liberal approach to software met at a conference in Palo Alto, Calif., and emerged with the term “Open Source.” At the time, the driving issues centered on control of the software, but they’ve since shifted hard toward a concern with the data that’s generated by all those apps. One reason for that change is a product often seen as Open Source’s signal victory, the Android operating system.

A Brief History of Software Movements

Before Open Source, the principal force resisting the tendency of commercial software makers to close off the inner workings of their products was the Free Software movement. Begun 15 years before Open Source, it culminated with the 1991 release of the first Linux-based operating system. The institutional core of Free Software was (and remains) the GNU Project, providing legal permissions that allow users to run, share, and modify the source code of software bearing GNU licenses.

If the goal of opening up the operating system motivated GNU activists, then the Browser Wars were the occasion that unified Open Source. With the release of Explorer 4 in 1997, Microsoft amped up its attempts to gain the upper hand over Netscape Navigator, then the most popular browser on the market. By distributing Explorer as the default browser on Windows, Microsoft implicitly acknowledged that the commercial frontier had shifted from desktops to online and sought to use its dominant operating system to corner the browser market as well.

Netscape struck back with what was, at the time, a stunning gambit. The next year—just months before the Department of Justice opened its three-year antitrust case against Microsoft—Netscape announced its intention to make Navigator’s source code public. In the long term, the browser’s fate was already sealed, but the move established the Mozilla project, which refashioned the Navigator core into the modern Firefox browser. At the same time, it inspired a conference of software enthusiasts to advocate broader commercial adoption of the Free Software ethos under the name Open Source.

The emergence of the Open Source Initiative quickly opened a rift among the faithful. Intent on diplomacy, the Open Source faction praises the practical benefits of public source code as an inducement to businesses. By contrast, Free Software loyalists retain a more ideological bent, emphasizing the ethical arguments for favoring “free-as-in-freedom.” The distinction may not seem especially clear from the outside, but luminaries like Richard Stallman have long argued that the points of contention between Free Software and Open Source have practical consequences, like restrictions on what consumers can actually do with the software they use.

Just as the Browser Wars changed the circumstances for Free Software, setting the stage for Open Source, the Smartphone Wars have unexpectedly turned the tables on both. In many ways, they have eclipsed the conflict between the two, altogether changing the ideological stakes. As with so much in modern life, the underlying culprit is data.

From Open Source to Closed Data

At present, the most widely used smartphone OS on the market is Android, built on the Linux kernel. Because it’s distributed as Open Source software, Free and Open Source advocates have tended to see Android as proof of both the practical advantages and virtuousness of freely distributable source code. By contrast, Android’s two major competitors—Apple’s iOS and Microsoft’s Windows Phone—are both proprietary and can only be distributed by a licensed vendor, on an approved device.

That contrast was made all the sharper by the initial victory Apple scored by being first to bring touch-based smartphones to market. As a recently published excerpt from Wired writer Fred Vogelstein’s upcoming book Dogfight explains, the 2007 announcement of the first-generation iPhone sent the Android team (already deep in development on a prototype called Sooner) back to the drawing board.

By the end of the year, they had formed the Open Handset Alliance (OHA), a consortium of design firms and hardware manufacturers collaborating on open standards for mobile devices. As the cornerstone of those standards, Android is the primary beneficiary of the Alliance. Yet, members of the OHA, which includes most major producers of mobile phones and tablets, are restricted from designing devices for use with incompatible versions—which is why Amazon was forced to contract with a laptop developer to produce its Android-based Kindle Fire tablet. In this case, Open Source doesn’t quite equate to free-as-in-freedom.

If that were the extent of the complications introduced by Android, though, we’d be looking at a rather straightforward variation on the old Free Software vs. Open Source debate. The bigger issue stems from the financial interest behind Android. When Android Inc. formed in 2003, one of its initial backers was Google; two years later, Google acquired the startup outright with the goal of producing a Google-branded smartphone.

What would the search giant want with its own smartphone? It’s simple: a bigger market. Google had already made a very profitable business out of advertising—a recent study shows it commanding as much as 80 percent of the online market—but in the early years of the new century, telephone service providers were the gatekeepers of mobile phone platforms. If Google wanted command of the mobile search market, including the potential for advertising on mobile platforms, it needed to break that stranglehold with its own product.

Even beyond display ads, though, a Google-owned mobile platform serves as a valuable source of data about its users. Google can collect that data and put it toward uses that increase the value of Google services. One well-known case is the use of smartphone location tracking to calculate street traffic density on Google Maps.

More generally, though, the data Android users provide (often without their knowledge) contributes to the profiles Google builds in order to generate more and better targeted advertising. If Google can connect the dots between your smartphone and your desktop—as, for example, when you share bookmarks between one version of Chrome and another—then it can convert the data it collects from your smartphone (like your recent visit to a local proctologist) into advertising that you see when you log into YouTube on your home computer.

Choose What You’ll Lose

From Google’s perspective, then, the investment is justified by the way it feeds the company’s Big Data machine. As recently as a year ago, many of us would have regarded that as an ultimately harmless, if occasionally creepy, practice. That was before the Edward Snowden revelations showed how Big Data provided not only the model but also much of the content for a more worrying encroachment on our privacy.

Compare that to the iOS model, which builds revenue mostly by collecting a portion of app and in-app sales. Even as Android has cornered upwards of 80 percent of the mobile market, iOS continues to profit simply because its App Store continues to sell more apps than Google Play. The upshot is that, because Apple has no appreciable stake in building revenue through advertising, it also has a much smaller vested interest in hoarding data about its users. That, in turn, makes it harder for the company to intrude on the privacy of its users, as well as easier to exempt itself from many FISA court requests.

The tradeoff is a much more closed ecosystem—one that legally reserves to Apple control of what happens on devices that operate on iOS. At times, Apple has exercised that hegemony in ways that have struck users and developers alike as anti-egalitarian, as when its App Store removed apps designed to help Chinese iPhone users circumvent government censorship. Such moves seem to confirm the concerns of Free Software and Open Source advocates alike.

As a result, mainstream consumers are all but forced to choose between proprietary systems that limit their ownership and Open Source alternatives that exploit their data. Up to now, the former has mostly meant iOS and the once-ubiquitous BlackBerry OS; the latter camp includes not just OHA-compliant versions of Android but also closed source competitors like Facebook’s Home frontend and (to the extent that Microsoft relies on it to flesh out the Big Data promise of Bing) Windows Phone as well.

In short, the conflict that defines the new generation of operating systems is not Free Software vs. Open Source. Rather, the key issue today concerns how we choose between control of our devices and ownership of our data.

A new generation of Open Source platforms (most promising, perhaps, the nonprofit Mozilla Foundation’s Firefox OS) offers some hope of resolving that dilemma. So far, though, those alternatives are only available on a narrow range of devices, most of them released only in select European markets. Until they gain more widespread adoption, U.S. consumers may be forced to choose between the need for privacy and the imperatives of control.

Illustration by Jason Reed

