Which Artists Were Omitted From The “Largest Vocabulary In Hip-Hop” Charts?

05.05.14


If you listen to rap and your friends know it, there’s a 90% chance that sometime in the past 24 hours you’ve received a link to the set of infographics labeled “The Largest Vocabulary in Hip Hop,” which ranks rappers by the size of their vocabularies. I got it in my inbox early Sunday morning and, thinking it wouldn’t really flood the net until Monday, I marked the email as one to check later.

Wrong. That link has been sent to me no fewer than eight times since yesterday morning.

The premise, as explained by creator Matt Daniels, data scientist and coder:

“35,000 words covers 3-5 studio albums and EPs. I included mixtapes if the artist was just short of the 35,000 words. Quite a few rappers don’t have enough official material to be included (e.g., Biggie, Kendrick Lamar). As a benchmark, I included data points for Shakespeare and Herman Melville, using the same approach (35,000 words across several plays for Shakespeare, first 35,000 of Moby Dick).

“I used a research methodology called token analysis to determine each artist’s vocabulary. Each word is counted once, so pimps, pimp, pimping, and pimpin are four unique words. To avoid issues with apostrophes (e.g., pimpin’ vs. pimpin), they’re removed from the dataset. It still isn’t perfect. Hip hop is full of slang that is hard to transcribe (e.g., shorty vs. shawty), compound words (e.g., king shit), featured vocalists, and repetitive choruses.”
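
For the curious, the counting Daniels describes can be roughly sketched in a few lines of Python. This is not his code. The function name, the lowercasing step and the "letters and digits" definition of a word are all assumptions made for illustration; only the 35,000-word cap, the apostrophe removal and the count-each-word-once rule come from his description.

```python
import re

WORD_LIMIT = 35_000  # sample size quoted by Daniels


def unique_words(text, limit=WORD_LIMIT):
    """Return the set of distinct words in the first `limit` words of text."""
    # Drop straight and curly apostrophes so "pimpin'" and "pimpin" match.
    # Lowercasing is an assumption; the source does not mention case.
    cleaned = re.sub(r"['’‘]", "", text.lower())
    # Assumption: a word is any run of letters or digits.
    tokens = re.findall(r"[a-z0-9]+", cleaned)
    # Only the first `limit` words count toward the vocabulary.
    return set(tokens[:limit])


sample = "Pimps pimp pimping pimpin' pimpin pimp"
print(len(unique_words(sample)))  # 4 unique words
```

Note that nothing here merges "shorty" with "shawty" or handles compound words, which is exactly the imperfection Daniels admits to.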

With those parameters in mind, were there any artists who didn’t make the cut but should have? I know next to nil about Aesop Rock, so I can’t speak on the number one ranking, but seeing the Wu at number two was somewhat expected, as was seeing Kool Keith and Twista placed so high. Still, I feel like a few artists are possibly missing from the database. Or better yet, which dark horse picks made the list?

To see the full charts and accompanying breakdowns, click here.
