At its F8 developer conference, the embattled social media giant revealed how it uses Instagram uploads and hashtags to train object recognition models. Using hundreds of graphics processors, Facebook trained on 3.5 billion images labeled with 17,000 hashtags and created machine learning models that beat top-of-the-line industry benchmarks.
Facebook says its best model achieved 85.4 percent accuracy on ImageNet, an image database used to test object recognition software. That’s about 1 to 2 percentage points higher than other leading software.
“We rely almost entirely on hand-curated, human-labeled data sets. If a person hasn’t spent the time to label something specific in an image, even the most advanced computer vision systems won’t be able to identify it,” said Mike Schroepfer, Facebook’s chief technology officer.
The difficulty of using hashtags as labels is determining which ones are relevant to the content of a photo. People add tags for many reasons, and not all of them indicate what’s in the image. To filter out the irrelevant ones, Facebook created its own system that prioritizes relevant tags, a necessary step in creating what it calls the “large-scale hashtag prediction model.”
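To make the filtering idea concrete, here is a minimal sketch in Python. It is not Facebook’s system, whose details are not described here. It assumes a small, hypothetical vocabulary that maps visual concepts to the hashtags treated as synonyms for them, and it simply drops any tag that falls outside that vocabulary. The names `VOCABULARY`, `relevant_labels` and `label_counts` are invented for illustration.

```python
from collections import Counter

# Hypothetical vocabulary (an assumption for this sketch):
# canonical visual concept -> hashtags treated as synonyms for it.
VOCABULARY = {
    "dog": {"dog", "dogs", "puppy"},
    "beach": {"beach", "seaside"},
    "pizza": {"pizza"},
}

# Reverse lookup: individual hashtag -> canonical label.
TAG_TO_LABEL = {
    tag: label for label, tags in VOCABULARY.items() for tag in tags
}


def relevant_labels(hashtags):
    """Return the sorted, de-duplicated labels for tags found in the vocabulary.

    Tags that do not name a known visual concept (for example "#tbt")
    are ignored, because they say nothing about what is in the photo.
    """
    labels = set()
    for tag in hashtags:
        key = tag.lstrip("#").lower()
        if key in TAG_TO_LABEL:
            labels.add(TAG_TO_LABEL[key])
    return sorted(labels)


def label_counts(posts):
    """Count how many posts carry each label; each post counts once per label."""
    counts = Counter()
    for hashtags in posts:
        counts.update(relevant_labels(hashtags))
    return counts


if __name__ == "__main__":
    print(relevant_labels(["#Puppy", "#tbt", "#love", "#beach"]))
```

In this toy version, a post tagged `#Puppy #tbt #love #beach` keeps only the labels `beach` and `dog`. A real pipeline would need a far larger vocabulary and some way to handle tags that are visual but misleading.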
While all this may sound impressive, Facebook’s machine learning practices raise some serious privacy concerns. As you may now be aware, Facebook has data points on billions of users spread across its popular social networks. The company says it uses only publicly posted content, not anything in your inbox or anything you’ve restricted to friends.
It’s also notable that the new algorithms are designed specifically for machine learning, not to predict your next posts or bombard you with relevant advertisements. But as Facebook attempts to reduce the damage from its ongoing data scandal by increasing transparency, this photo scanning method brings up hard questions about whether people know what their posts are being used for.
From a technical standpoint, Facebook has achieved an impressive feat: cleaning a mass of unorganized data and turning it into a useful software tool. By accurately recognizing objects in images, Facebook can improve its search tools and even combat abuse.