
Image via Stephen Nellis/Twitter

Microsoft’s new bot can recognize certain images—and guesses badly about the rest

Bots are our (awkward) future.

 

Jay Hathaway


Posted on Apr 19, 2016   Updated on May 26, 2021, 10:26 pm CDT

Microsoft took a lot of heat over the dismal failure of Tay.AI, its teen-girl Twitter chatbot, which pranksters quickly conditioned into a vulgar racist. But Tay wasn’t the company’s only experiment with machine learning. Witness CaptionBot, which is currently being trained to caption images automatically. Sometimes it works. More often, it gets things hilariously wrong.

According to Microsoft, the CaptionBot AI is a combination of Microsoft’s Computer Vision API (for image analysis), Bing Image Search (as a library to compare against), and Emotion API (for assessing facial expressions). 

CaptionBot’s strong suit seems to be recognizing celebrity faces, which makes sense considering how often they’re photographed. 

But it has a little trouble with other things…

Unlike Tay, the bot isn’t prone to saying explicitly racist things, and Microsoft planned ahead by teaching it to ignore photos of Hitler and some other potential disasters. Still, if Microsoft learned anything from Tay, it should have been that trolls will go to absurd lengths to elicit upsetting results from a bot. CaptionBot can be baited, too:

But for every gruesome mistake, there are several funny ones:

https://twitter.com/DarkBunnyTees/status/720599164468518912

https://twitter.com/stephmelnick96/status/720706798077022208

And when in doubt, the AI just assumes everything is giraffes:

https://twitter.com/HappyHorseSkull/status/720382892703023104

Accordingly, here is perhaps the best drawing ever of two giraffes:

CaptionBot.AI

CaptionBot still has a lot more learning to do, but for now we may as well laugh at its growing pains.

*First Published: Apr 19, 2016, 10:00 am CDT