Image via Stephen Nellis/Twitter

Microsoft’s new bot can recognize certain images—and guesses badly about the rest

Bots are our (awkward) future.

Feb 29, 2020, 8:29 am*


Jay Hathaway

Microsoft took a lot of heat over the dismal failure of its teen girl Twitter chatbot, Tay.AI, which pranksters quickly conditioned to become a vulgar racist, but Tay wasn’t its only experiment with machine learning. Witness CaptionBot, which is currently being trained to automatically caption images. Sometimes it works. More often, it gets things hilariously wrong.

According to Microsoft, the CaptionBot AI is a combination of Microsoft’s Computer Vision API (for image analysis), Bing Image Search (as a library to compare against), and Emotion API (for assessing facial expressions). 

CaptionBot’s strong suit seems to be recognizing celebrity faces, which makes sense considering how often they’re photographed. 

But it has a little trouble with other things…

Unlike Tay, the bot can’t be coaxed into saying explicitly racist things, and Microsoft planned ahead by teaching it to ignore photos of Hitler and some other potential disasters. Still, if Microsoft learned anything from Tay, it should be that trolls will go to absurd lengths to elicit upsetting results from a bot. CaptionBot can be baited, too:

But for every gruesome mistake, there are several funny ones:

And when in doubt, the AI just assumes everything is giraffes:

Accordingly, here is perhaps the best drawing ever of two giraffes:


CaptionBot still has a lot more learning to do, but for now we may as well laugh at its growing pains.

*First Published: Apr 19, 2016, 10:00 am