This Ph.D. student developed an algorithm to find the best rapper

Photo via EMR/Flickr (CC BY 2.0)

‘I wanted to illustrate how rappers actually put a lot of effort to their lyrics in order to make them sound good.’

Ever since the Sugarhill Gang hired a live band to play Chic’s “Good Times” on a loop so it could rap over the music, hip-hop fans have waged an impossible conversation. Like the baseball in The Sandlot, it’s a circular, never-ending game to anoint and elevate voices to all-time great status: Who’s the best rapper? 

The problem with this conversation is that once you establish ground rules and objective patterns, two things happen. First, the culture becomes a tepid, used tub that grows cranky and discourages innovation (as modern geniuses like Kendrick Lamar too often feel a duty to chase ghosts left by Nas and Tupac). 

Second, academic dorks make it a pathological endeavor to license pointless studies about frivolous factors like vocabulary. There is no correlation between word count and greatness in a medium constructed around charisma and performance, but that’s why I love the Raplyzer.

The Raplyzer is an algorithm built to analyze rhymes and patterns. Specifically, the program scans rap lyrics for conventional, multi-syllabic rhyme patterns. It exists simply to measure who uses the most of them in their writing. 

In rap lyrics, assonance, where words don’t necessarily have the same ending but share a vowel sound, is the most typical form of rhyming nowadays. In multi-syllable rhymes (multis), it is not only the last syllable but multiple syllables that share a vowel sound. For example: 

“This is a job – I get paid to sling some raps,
What you made last year was less than my income tax”

The program is the creation of Aalto University Ph.D. student Eric Malmi. For it, Malmi used a speech synthesizer called eSpeak to phonetically transcribe rap lyrics and detect rhymes. It then assigns each artist a “rhyme factor,” which is the average length of their rhymes (the granular methodology is worth a read).
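
To make the idea concrete, here is a minimal Python sketch of an average-rhyme-length score. It is not Malmi's code. The real Raplyzer transcribes lyrics with eSpeak, while this toy version assumes the vowel sounds have already been written out by hand. The function names, the vowel labels, and the 15-word lookback window are all illustrative assumptions.

```python
# A toy "rhyme factor": for each word, find the longest run of vowel sounds
# (counted backwards from the end of the word) that also ends an earlier word,
# then average those lengths over all words.
# Assumption: vowels are hand-transcribed; the real tool uses eSpeak output.

def longest_vowel_match(vowels, end_a, end_b):
    """Count matching vowels walking backwards from positions end_a and end_b.

    end_b must come before end_a. The two runs are not allowed to overlap.
    """
    length = 0
    while (
        end_b - length >= 0
        and end_a - length > end_b
        and vowels[end_a - length] == vowels[end_b - length]
    ):
        length += 1
    return length


def rhyme_factor(words, lookback=15):
    """Average longest vowel match per word.

    words: list of words, each given as a list of vowel labels.
    lookback: how many previous words to compare against (an assumed value).
    """
    vowels = []      # every vowel sound in the lyric, in order
    word_ends = []   # index in `vowels` of each word's final vowel
    for word in words:
        if not word:  # skip words with no vowel sound
            continue
        vowels.extend(word)
        word_ends.append(len(vowels) - 1)

    if not word_ends:
        return 0.0

    lengths = []
    for i, end_a in enumerate(word_ends):
        best = 0
        for end_b in word_ends[max(0, i - lookback):i]:
            best = max(best, longest_vowel_match(vowels, end_a, end_b))
        lengths.append(best)
    return sum(lengths) / len(lengths)


# "sling some raps" / "income tax", with made-up vowel labels.
lyric = [["I"], ["V"], ["a"], ["I", "V"], ["a"]]
print(rhyme_factor(lyric))  # 1.0
```

In this toy lyric, “tax” closes a three-vowel match with “sling some raps,” and “income” closes a two-vowel match, so the five words average out to 1.0. A lyric whose lines only matched on the final vowel would score lower, which is the intuition behind ranking artists by this number.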

“My research area is data mining, which means that I work on developing algorithms for extracting useful information from massive data sets,” Malmi tells the Daily Dot via email. “I enjoy both programming and rap music a lot so I thought it would be cool if I could somehow combine the two. Also, I think many people who don’t listen to rap themselves fail to appreciate how technically elaborate rap lyrics often are. So I wanted to illustrate how rappers actually put a lot of effort to their lyrics in order to make them sound good.”

Perhaps my favorite nugget in the study is that controversial Chicago teenager Chief Keef posted a rhyme factor that landed him at number seven. Keef is loathed by purists for his drill sound, codeine-mushy stylings, violent imagery, and the perception that his is a dumbed-down commercial approach. But in the agnostic world of data mining, not so much.

“I was slightly surprised to find out how young some of the artists with the highest Rhyme factors are, like Earl Sweatshirt and Chief Keef,” Malmi says. “I have to admit that I don’t know [Chief Keef’s] music particularly well but based on the little I’ve listened to recently, I think part of that criticism might be due to people not agreeing with the lifestyle his lyrics reflect. This should, of course, be distinguished from the technical skills of the rapper.”

Malmi himself makes for an expertly unbiased researcher on this particular project. He’s a big Finnish rap fan, but was only familiar with “about 30 of the 94” artists analyzed (going into the project, he says Eminem, Jay Z, Lecrae, and Jedi Mind Tricks were personal American favorites). Two Finnish guys, Redrama and Paleface, were likewise high-charting.

Rhyme factor rankings

1. Inspectah Deck, 1.187
2. Rakim, 1.180
3. Redrama, 1.168
4. Shai Linne, 1.152
5. Earl Sweatshirt, 1.152
6. AZ, 1.144
7. Chief Keef, 1.144
8. ASAP Rocky, 1.132
9. Paleface, 1.132
10. Tech N9ne, 1.127
11. Kool G Rap, 1.123
12. MF Doom, 1.120
13. Slaughterhouse, 1.117
14. Sage Francis, 1.115
15. Elzhi, 1.105
16. Vinnie Paz, 1.097
17. R.A. The Rugged Man, 1.096
18. Andy Mineo, 1.087
19. Talib Kweli, 1.083
20. 2 Chainz, 1.075

Of course, no half-interested fan would argue that Earl Sweatshirt is a more important artist than guys he beat like Biggie (number 30) and Scarface (number 42). Clearly, there are limitations; voice and message can’t always come across. Even global favorite Eminem ranked a paltry 39th, after all.

“Many readers were expecting Eminem to be in the top but he was ranked only 39th,” Malmi says. “He definitely uses a lot of multis and internal rhymes, but knowing that he often constructs them by bending, that is by pronouncing the words differently, this was not that big of a surprise to me.”

Malmi is aware of the Raplyzer’s fundamental limitations, too: “I think the greatest limitation was the fact that some artists bend the words more often than others and Raplyzer only recognizes the standard pronunciation… in order to analyze different aspects of charisma and delivery, like how well does the artist manage to stay on beat, would require the algorithm to analyze audio files instead of merely looking at the lyrics. This should definitely be doable but perhaps not within the time limits of a hobby project.”

For his part, Malmi is working on another component called Battle Bot that can detect “meaningful rhymes.” He hopes to eventually apply his research in artificial intelligence to build an original, A.I.-rooted program that can write you a “coherent” rap song.

Earl Sweatshirt better watch his back.
