
A night of drinking with beer recommendation apps

It can be hard for a person to define their palate, so how well could a computer fare?

 

AJ Dellinger

Tech

Posted on Jan 30, 2015   Updated on May 29, 2021, 3:59 pm CDT

Walk into your local liquor store or look down the length of the bar at your favorite pub and you will find a glass bottle menagerie. The popularity of craft beers has been skyrocketing, propelled along by an explosion of microbreweries. It’s been a boon for booze lovers, but it can make it hard to find a starting point. So where do you turn for suggestions?

That’s where a couple apps would like to offer up their services. For all those times you find yourself staring aimlessly at aisles of beer, there’s Next Glass. And when you’re stuck scrolling down a beer list with no end in sight, there’s Picky Pint. These apps are fun, new, and engaging, but most importantly: Do they work?

To find out, I sat down with Dan Schwalbach, a beer manager at Steve’s Wine, Beer, Spirits in Madison, Wis., and a Cicerone-certified beer server. (Full disclosure: he’s also my cousin.) We also brought in two guinea pigs, Kelsey and Hannah, to test the apps’ ratings.

The test 

For the decidedly unscientific study, I purchased a mix-and-match six-pack of ales with varying scores from the beer rating site BeerAdvocate, plus a late addition of a favorite from the beer expert. The lineup was as follows, presented for your judgment and beer snobbery:

  • New Glarus Spotted Cow (85 BeerAdvocate Score)

  • Deschutes Black Butte Porter (94)

  • Widmer Brothers Okto Festival Ale (79)

  • Left Hand Brewing Company Milk Stout (90)

  • Blue Moon Belgian White (78)

  • Great Lakes Brewing Company Christmas Ale (89)

  • Lagunitas Brewery Daytime (90)

The first lesson learned in this experience was that it’s surprisingly difficult to find a particularly bad craft beer. While popular picks like America’s best-selling beer, Bud Light, usually sank into “poor” or “awful” territory, the vast majority of the options in the extensive beer fridge at the grocery store were rated higher than 70, the cutoff for an “OK” rating.


“There are a lot of mediocre breweries that have distribution because they have some sort of marketability, whether that’s because they’re local, have quirky names and packaging, or otherwise,” explains Schwalbach. “But at the same time, the quality and innovation of craft beer is rising constantly with the popularity of it so what was considered a good beer might not seem as good now, just because of context.”

He also explained that, while the market is growing to potentially unstable levels, there are plenty of worthwhile drinks to be… drunk. “There’s a lot of good beer out there now, and I think eventually stores start running out of shelf space to carry everything and maybe some breweries suffer because of that.”

Fixing the limitations of current beer distribution is a long-term goal of Next Glass, but its immediate function is helping users pick something worth drinking. It does so with a Pandora-like system, breaking down every element of a beer through scientific analysis and learning the chemistry of a user’s taste preferences through machine learning.


“It’s all proprietary. We don’t pull anything from anywhere (Beer Advocate, RateBeer, or otherwise). We collect 20,000+ chemical attributes from our high-res LC mass spectrometer for each bottle and record more than a dozen objective non-chemical attributes (like label design, stopper type, region, price, etc.) to drive our recommendations,” explained Next Glass COO Trace Smith.

There was some skepticism within the testing group about how accurate Next Glass could truly be. After all, it can be hard for a person to define their palate, so how well could a computer fare?

Smith is plenty confident in his product, and with good reason: “When we deliver a score above 85 to a user, they rate that beer with three or four stars 96 percent of the time, so it’s been very accurate thus far,” he said.

“The machine learning algorithms take into account all of the data, so these subtle tastes are accounted for,” Smith stated. If a drinker can’t quite articulate what they like, odds are with enough use, this app can do it for them.
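Next Glass hasn’t published its algorithms, so the following is only a minimal sketch of a content-based recommender in that spirit: each beer is reduced to a numeric feature vector (standing in for the chemical attributes Smith describes), a simple per-user regression is fit on that user’s star ratings, and the predicted rating is mapped onto a 0–100 score. The beer names, feature values, and ratings below are invented for illustration.

```python
import numpy as np

# Hypothetical feature vectors standing in for Next Glass's chemical attributes
# (bitterness, roast character, sweetness, strength), scaled 0-1. All values invented.
beer_features = {
    "porter_a": np.array([0.40, 0.90, 0.55, 0.60]),
    "wit_b":    np.array([0.15, 0.05, 0.45, 0.45]),
    "ipa_c":    np.array([0.95, 0.10, 0.20, 0.65]),
    "stout_d":  np.array([0.35, 0.85, 0.70, 0.55]),
}

# One user's 1-4 star ratings for beers they have already tried.
user_ratings = {"wit_b": 4, "ipa_c": 1, "stout_d": 3}

def fit_user_model(ratings, features, reg=0.1):
    """Ridge-regularized least squares: stars ~ feature vector + bias."""
    X = np.array([features[name] for name in ratings])
    y = np.array([ratings[name] for name in ratings], dtype=float)
    X = np.hstack([X, np.ones((len(X), 1))])  # add a bias column
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ y)

def personal_score(weights, feature_vec):
    """Clamp the predicted star rating to 1-4 and map it onto a 0-100 score."""
    stars = float(np.append(feature_vec, 1.0) @ weights)
    stars = min(max(stars, 1.0), 4.0)
    return round((stars - 1.0) / 3.0 * 100, 1)

w = fit_user_model(user_ratings, beer_features)
print(personal_score(w, beer_features["porter_a"]))  # personalized score for an untried beer
```

Whatever Next Glass actually does under the hood, the important property is the same: the score is personal to the drinker, and it only gets sharper as more ratings come in.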

Picky Pint takes a less sophisticated approach. The app uses reviews from RateBeer, a crowdsourced scoring system that uses a Bayesian weighted mean so “more ratings increase the score’s validity.” The scores are used as a general guide, adjusted based on customizable user preferences.
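RateBeer doesn’t publish its exact parameters, but the idea behind a Bayesian weighted mean is simple: a beer with only a handful of ratings gets pulled toward the site-wide average, and the pull fades as ratings accumulate. A minimal sketch with made-up numbers (the minimum-ratings threshold and site mean are assumptions, not RateBeer’s real values):

```python
def bayesian_weighted_mean(beer_mean, num_ratings, site_mean, min_ratings=10):
    """Shrink a beer's raw average toward the site-wide mean when ratings are sparse."""
    v, m = num_ratings, min_ratings
    return (v / (v + m)) * beer_mean + (m / (v + m)) * site_mean

# A 4.5-average beer with 3 ratings vs. one with 300 (site-wide mean assumed to be 3.2):
print(bayesian_weighted_mean(4.5, 3, 3.2))    # 3.5 (heavily shrunk toward the site mean)
print(bayesian_weighted_mean(4.5, 300, 3.2))  # ~4.46 (barely shrunk)
```

That is the sense in which “more ratings increase the score’s validity”: a beer with thousands of reviews keeps nearly its raw average, while an obscure one can’t ride a few enthusiastic ratings to the top of the list.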

“The app makes suggestions based on a beer’s rating and your set preferences (which now includes just bitterness, but we’re looking to expand that in the future). If you’ve set in your preferences that you don’t enjoy very bitter beers, for instance, the app makes sure it doesn’t suggest very bitter beers to you, and they’ll be greyed out and won’t receive a medal, even if it’s the best at the bar according to its score,” Picky Pint creator Mike Parks explained.

The bitterness ratings are expressed in International Bitterness Units (IBUs) and are accompanied by a color-coded system in the app. But does that necessarily work?
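Picky Pint’s internals aren’t public either, but the behavior Parks describes reduces to a filter-and-sort: grey out anything beyond the user’s bitterness ceiling, then medal the top three remaining beers by score. A rough sketch; the scores reuse the BeerAdvocate numbers above, while the IBU figures and the double IPA are invented for illustration:

```python
# Each entry: (name, score, IBU).
menu = [
    ("Black Butte Porter",   94, 30),
    ("Spotted Cow",          85, 20),
    ("Milk Stout",           90, 25),
    ("Okto Festival Ale",    79, 28),
    ("Imaginary Double IPA", 96, 85),
]

MAX_IBU = 40  # the user's bitterness ceiling from their preferences

eligible   = [b for b in menu if b[2] <= MAX_IBU]  # within tolerance
greyed_out = [b for b in menu if b[2] > MAX_IBU]   # still shown, but never medaled

medals = sorted(eligible, key=lambda b: b[1], reverse=True)[:3]
for medal, (name, score, ibu) in zip(("gold", "silver", "bronze"), medals):
    print(f"{medal}: {name} (score {score}, {ibu} IBU)")
print("greyed out:", [name for name, _, _ in greyed_out])
```

Note that the highest-scoring beer on this imaginary menu never medals because it sits above the IBU ceiling, which is exactly the “even if it’s the best at the bar” behavior Parks describes.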

A bitter pill 

Schwalbach found the emphasis on bitterness interesting, if somewhat misguided. He suggested that bitterness could be tough to quantify depending on a person’s palate.

“You can have a 35 IBU that tastes more bitter depending on the other ingredients in the beer,” he said. “Most people who don’t have a palate for bitterness probably won’t know what IBUs are.”

Parks’ reasoning was simple: Bitterness is a big barrier to entry for craft beers. “For newcomers to craft beer, the level of bitterness of many craft beers is often the first thing they notice when given recommendations from friends…I’ve had personal experience in this too when I was becoming interested in craft beer, where my beer-savvy friends would suggest a beer they’d love, but I’d find it way too bitter to drink. Having the beer bitterness show up prominently is a good early-warning system for the newcomer, who may not know, for example, about which styles are especially hoppy.


“The database we use has a pretty thorough data set on beer bitterness, and the fact that bitterness is expressed as a number makes it much easier to sort and deal with on the technical side than many other subjective metrics such as the style or descriptive adjectives like ‘fruity,’ ‘floral,’ ‘hoppy,’ ‘heavy,’ etc. All of these contribute to how prominently we show the bitterness data versus other information.”

As the beers were cracked open and the testing process began, we quickly ran into the limitations of both apps.

Technical issues

Schwalbach and Kelsey each rated upwards of 20 beers in Next Glass before starting, while Hannah rated just 10. We quickly discovered that the more ratings, the better, and that giving only positive feedback does no good.

While Next Glass tosses beers at you and asks for ratings on a four-star system, Hannah went rogue and gave primarily positive reviews to drinks she knew she liked: mostly hard ciders and fruitier beers.

The result: Next Glass assumed she’d drink just about anything put in front of her. Every score was a 90 or higher.

I asked Smith at Next Glass if he’d seen this happen. “We haven’t seen users rating one type of beer cause all scores to skew highly,” he said. “Of course, if you rate everything with four stars, the algorithms are going to assume you like everything!

“Knowing what one does not like is as valuable to us as knowing what one does like. I can say that if she’s only rating ciders/fruit beers and is expecting a lot of differentiation when she scans/looks for scores for stouts and porters, she’s not going to get as much as someone who has rated a breadth of stouts and porters.”

Likewise, Schwalbach found some of his early scores to be off base. Next Glass scored the Black Butte Porter as a 73.7 for him, considerably lower than he expected. “I ranked one darker beer and gave it two stars. It probably would have given a better rating if I had more in my profile,” he said.

Kelsey had the most luck with Next Glass. Though none of the seven drinks scanned scored above the near-sure-thing threshold of 85, she found three of the five beers with available ratings to be accurate to her personal tastes. It’s likely no coincidence that she rated the most beers.

Picky Pint, by contrast, provides no individualized score. Designed specifically to help make a selection off a beer list at a restaurant or bar, the app scans a photo of the available drinks and picks a top three based on the user’s bitterness preferences.

Appy hour 

I had the tasters rank the beers according to their personal preference after testing and compared it to the top three suggestions generated by Picky Pint.

It managed to select Schwalbach’s top three in the same order that he chose them. Given that he had the best idea of what he liked and had the best understanding of the IBU system, this perhaps is to be expected—and it should be noted that the three selections also had the highest overall scores from both RateBeer and BeerAdvocate.

For Kelsey and Hannah, success with Picky Pint proved to be more elusive.

Picky Pint selected the Black Butte Porter as the best choice for both, a decision both disagreed with, and managed only one selection that matched their respective rankings.

And therein lie the pitfalls of bitterness. As Schwalbach had suggested at the start of the process, it’s difficult to reconcile the IBU score with a beer’s actual overall taste. While Kelsey identified her preference as anywhere from under 20 to 40 IBUs, one of her top choices, New Glarus Spotted Cow, is identified in the app as being in the 40 to 60 range. (This information appears to be incorrect. Most information available online indicates Spotted Cow falls somewhere around 20 IBU.)

Picky Pint also had some issues on the technical side. On iOS it ran smoothly, though it would occasionally clear an inputted beer menu upon leaving the app. On Android devices, we ran into a bug that led to a failure to generate ratings. Parks was well aware of the issues when asked about them.

“The Android app tends to work better on newer phones (better cameras, more RAM), and we’ve had some technical problems (especially with older Samsung devices) around image processing, which we’re working through and updating as we have fixes,” he said.

Hannah, the owner of the Android device that had such challenges, commented that she found the app to be irritating. For context, this was said after the sixth beer of the evening was cracked open.


The experience with Next Glass was better received. It had some trouble scanning labels—especially when it came to Great Lakes’ Christmas Ale, which it struggled to differentiate from other Great Lakes labels—but when the augmented display worked it was incredibly cool. It definitely has the layer of polish that makes it appealing, a “wow” factor that makes it worth taking your phone out.

The apps both have some aging to do (though, fortunately, both apps’ founders clearly care about beer and their companies). With Next Glass, it’s clear how the app progresses with you. The more beer you end up sampling, the better the suggestion engine gets. For Picky Pint, it’ll take the user better understanding their own taste buds to figure out what they like while Parks continues to tweak his product.

“I like a pretty wide variety of styles of beers and [the apps] may just not have enough information yet. I’d have to rate more beers. It might take a long time to do. I don’t know if I would do it,” Schwalbach said.

He also brought up a bigger issue for both these apps, and any service entering into the beer rating space: The goliath that is Untappd.

Untapped

“I don’t think either one could pull me away from Untappd,” he said. “I have all my information there already. Next Glass has a cool interface, but I think a lot of people who do go out and try a lot of beers already have an idea of what they like.”

Schwalbach noted that Untappd’s suggestion engine is anything but perfect and had at times suggested he try extraordinarily rare beers; once he was recommended an ale produced seasonally by a California-based brewery in batches so small that it was sold only directly from the brewers. Obviously that’s not an ideal recommendation.

It also points to the reality that all of these apps can cohabitate on your phone: Next Glass for bottle browsing, Picky Pint for scanning beer lists, and Untappd for keeping track of it all. Much like the beers themselves, there is an app for every occasion. If only someone could combine them all…

And of course, if you have the option, get advice from an expert. At the end of the day, Schwalbach’s suggestions for the testers matched with their personal picks five out of six times.  

Every once in a while, you can’t beat the human touch.

Photo by AJ Dellinger 
