Last week, Microsoft released an app designed to help the blind and visually impaired “see” what is going on around them. Called Seeing AI, the app uses computer vision to describe people, text, and objects by analyzing whatever is in view of a user’s smartphone camera. It relays that information back to the user with a computer assistant that talks through their phone’s speakers. Microsoft claims its app is capable of reading short text and documents and analyzing products, scenes, people, and (eventually) currency, but there’s only one way to really find out.
I tested it to see how well it could recognize and describe different settings and objects, and I was pleasantly surprised by the results.
Seeing AI: Reading short text
In my first test, I asked the app to read a few paragraphs from a Daily Dot article on Hyperloop One. I blew up the article on my 27-inch monitor, grabbed an iPhone 7 Plus, pressed on the app’s “Short Text” icon, and pointed my camera at the screen.
The app immediately started reading text to me in a female robotic voice. That promising start was quickly followed by persistent annoyance. The voice on the other end awkwardly skipped large sections of text and constantly repeated itself.
Here’s the block of text I asked Seeing AI to read.
Elon Musk’s vision for high-speed travel in giant steel tubes just got one step closer to reality.
Hyperloop One completed the first full-scale test of its hyperloop pod in vacuum conditions, the company announced earlier today. The full-sized hyperloop vehicle got nowhere near “airline speeds,” topping out at 70 mph during its run down the company’s test track in Nevada.
This is what it came up with.
Elon Musk’s vision for travel. Elon Musk’s high-speed travel in giant steel. Elon Musk. Elon Musk’s vision for high speed travel in giant steel got one step closer to reality. One completed test of its in vacuum conditions. The company announced earlier today the airlines speeds topping out at 70.
Clearly, the app struggled to read text from a screen—or it’s just really, really into Elon Musk. But I didn’t give up on the “Reading” function, figuring most of the app’s troubles came down to the poor visibility of a monitor and layout of the webpage. So I grabbed one of my favorite Haruki Murakami novels—Colorless Tsukuru Tazaki and His Years of Pilgrimage—and gave the app another spin.
The app read two paragraphs from the book without skipping a beat or making a single error. Sure, its intonation was a bit awkward, and its dry tone was devoid of any emotion, but the words were all there.
“Short Text” has a lot of potential. It’s extremely fast and works well when text has simple formatting. But its inconsistencies keep me from recommending it.
Seeing AI: Document
The “Document” feature was created to provide audible guidance to help the visually impaired take photos of documents. It recognizes when something is a document and uses audio cues to help the user align the paper into their camera’s view. For example, it might say “left edge not visible.” Once you’ve gotten it aligned, the app automatically snaps a photo and scans the document’s text and formatting onto your phone. From there, it should read whichever sections of the text you swipe on.
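To make the alignment cues concrete, here is a minimal Python sketch of how such guidance could be derived. It is not Microsoft's implementation. It assumes a hypothetical detector has already produced a bounding box for the page, clipped to the camera frame, and it treats an edge as "not visible" when the box touches the matching frame border. The names Box and alignment_cues are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical detected document bounds, in pixels, clipped to the frame."""
    left: float
    top: float
    right: float
    bottom: float


def alignment_cues(doc: Box, frame_width: float, frame_height: float,
                   margin: float = 0.0) -> List[str]:
    """Return spoken cues for every document edge that appears cut off.

    Assumption: an edge lying on (or within `margin` of) the frame border
    means the real page edge is outside the camera's view.
    """
    cues = []
    if doc.left <= margin:
        cues.append("left edge not visible")
    if doc.top <= margin:
        cues.append("top edge not visible")
    if doc.right >= frame_width - margin:
        cues.append("right edge not visible")
    if doc.bottom >= frame_height - margin:
        cues.append("bottom edge not visible")
    return cues


# A page whose left side runs off a 1080x1920 frame.
print(alignment_cues(Box(0, 50, 600, 900), 1080, 1920))
```

An empty list would mean all four edges are in view, which is the point at which an app like this could snap the photo automatically.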
It occurred to me that this could be a fantastic way to overcome the limitations of the “Short Text” feature. To test that theory, I took the same text from the article before, printed it out, and gave it a go using Document.
Here’s a screengrab of the document.
The app nailed it, all the way down to the punctuation.
Unfortunately, I was not able to get it to read the text to me. Hopefully Microsoft fixes this feature in the future.
Seeing AI: Product
There are a number of apps capable of scanning barcodes and pulling up product information, so I felt pretty confident Seeing AI would do a good job this round. It didn’t disappoint.
The most intuitive element is how it guides users to product barcodes. You will start hearing a beep if a barcode comes into view of your camera. That beeping gets faster and faster as you get closer to the label. It takes some getting used to, but once you’ve gotten the hang of where your camera is in relation to the item you are holding, it works pretty well.
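The speeding-up beep can be sketched in a few lines of Python. This is an illustration only, not how Seeing AI actually works: it assumes closeness is estimated from the fraction of the camera frame that the barcode fills, and the function name, the linear mapping, and the one-second and tenth-of-a-second limits are all invented for the example.

```python
def beep_interval(barcode_fraction: float, slowest: float = 1.0,
                  fastest: float = 0.1) -> float:
    """Seconds to wait between beeps.

    barcode_fraction is the assumed share of the frame covered by the
    barcode (0.0 = barely visible, 1.0 = fills the frame). The interval
    shrinks linearly from `slowest` to `fastest` as that share grows.
    """
    closeness = min(max(barcode_fraction, 0.0), 1.0)  # clamp to [0, 1]
    return slowest - (slowest - fastest) * closeness


# The beeps come more quickly as the barcode fills more of the frame.
for fraction in (0.05, 0.25, 0.5, 0.9):
    print(fraction, round(beep_interval(fraction), 3))
```

A real app would probably smooth the estimate across frames so the rhythm does not jitter as the hand moves.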
I grabbed a pair of Sennheiser headphones to see if it would tell me which model I was carrying. It was spot-on.
I then pulled out some fiber caps to see its potential for providing important information about medicines and supplements. Not only did it tell me the brand and product, it also gave me the capsule count.
I’m pleased with the results.
Seeing AI: Person
A hallmark feature of the app is its ability to tell users who’s around them, what those people look like, how old they are, and how they are feeling.
The app said I was “happy” in this photo of me smiling.
Unfortunately, it added six years to my age. It got closer to my actual age when I frowned but said I looked more “neutral” than “sad.”
Here I am with a super-frown. That seemed to fix things (though definitely not my image).
I then tried to trick the app with a pair of sunglasses, but it one-upped me with its response and punished me further by saying I was 31 years old.
Despite my bitter attitude, I’d give Person a solid B rating.
Seeing AI: Scene
As the name implies, this experimental setting tells a user what’s going on around them. Just aim your camera and snap a photo of any scene and it will analyze it for you. The results of my test ranged from on-point to wildly inaccurate.
Here I am sitting on a bench in the park. The app’s description isn’t wrong, but additional details would have made it more useful.
It did a great job with this photo of some guy’s puppy, only forgetting the words “cute” and “beefy.”
This next description could be very useful for the visually impaired. Perhaps as camera technology advances (HoloLens?), it will be able to tell users how far away they are from objects.
It had some trouble identifying a treadmill.
Overall, the app did a solid job, especially considering it was only just released and a number of its features are still in beta. I could see this being a useful tool for the blind and visually impaired, but there’s still a lot of work for Microsoft to do to iron out some of its annoyances and inconsistencies.
That being said, Seeing AI is a truly novel idea—one designed to help the blind and visually impaired see the world by listening.
Seeing AI is available today for free on iOS. Microsoft has not announced plans to launch the app on Android.