Using Data to Pick the Best Tracks to Promote on a Playlist

Previous Posts in Series:

1) How a Series of Unfortunate Events Unlocked East Africa’s Largest Music Catalog Release

2) Learning from Related Artists to Discover New Genres/Tags

~4 minute Read

“What tracks in my catalog are most appropriate for this playlist?”


Like any label or content owner in the digital age, we wanted to work out with Melodica how best to get their content discovered and heard by anyone curious. As Abdul Karim, the Director of Melodica, said when discussing this catalog’s digital release:


“We are one of the lucky few to have such a library of African music and so we must help people find the melodies that are in their hearts. The world is bigger than one Kenyan customer.”


But how could we get this unique content heard by more people?


The answer, especially in the streaming era, often points to Playlists. We needed to get our content onto more playlists. (Said every Label, ever)

Our content ranged from upbeat Indian Ocean dance grooves to a danceable style of Western Kenyan Blues. 


We searched Spotify for Playlists using terms we thought might be relevant to the catalog, like “Afrobeat”, and found the two playlists below to be promising candidates:

Afrobeat Essentials

African Blues

There was a lot of variety on these playlists, but was there enough diversity in styles that our content would fit in? And with our catalog constantly growing in size, how could we find the tracks, among thousands, that would be the “most appropriate” for or “most similar” to the songs already in the playlist?


We could rely on the “gut instinct” of someone close to the catalog. But that usually means only the small segment of the catalog they know best ever gets promoted. We could also spend days listening to the playlist with trained ears and then listening to our entire catalog to find close matches.


We wanted a method that used data science to tell us mathematically which of our songs were most appropriate for this playlist, and which playlist songs they were most similar to.


To do this, we use Spotify’s Track Attributes API to pull down attribute data for every track in our catalog. This data describes a track on a 0 to 1 scale with attributes like Acousticness, Instrumentalness, Valence (happiness), Energy, Danceability and more. When joined with other attributes such as Tempo and Key, there are 10 total factors on which we can judge “similarity”.
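One practical wrinkle is that attributes like Tempo and Key are not on the same 0 to 1 scale as the others. As a rough sketch (not necessarily how our module handles it), the snippet below turns one track's attributes into a list of numbers and rescales the out-of-range ones. The record, its field names, its values and the ranges in RANGES are all illustrative assumptions, and only seven of the ten factors are shown.

```python
# Hypothetical attribute record for one track; names and values are made up.
track = {
    "acousticness": 0.82,
    "danceability": 0.64,
    "energy": 0.41,
    "instrumentalness": 0.03,
    "valence": 0.77,
    "tempo": 112.0,  # beats per minute, so not on a 0 to 1 scale
    "key": 7,        # pitch class 0..11, so not on a 0 to 1 scale
}

# Assumed ranges for the attributes that need rescaling.
RANGES = {"tempo": (0.0, 250.0), "key": (0, 11)}

FIELDS = ["acousticness", "danceability", "energy",
          "instrumentalness", "valence", "tempo", "key"]


def to_vector(record, fields):
    """Return the record as a list of floats, each rescaled into [0, 1]."""
    vector = []
    for name in fields:
        value = float(record[name])
        if name in RANGES:
            low, high = RANGES[name]
            value = (value - low) / (high - low)
            value = min(1.0, max(0.0, value))  # clamp anything out of range
        vector.append(value)
    return vector


vector = to_vector(track, FIELDS)
print(vector)
```

Without some rescaling like this, a difference of 20 beats per minute would swamp every other attribute in a distance calculation.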

Example of the Output from the Attributes API Endpoint




After reviewing a number of methods to calculate “similarity”, we settled on a good ol’ fashioned “distance calculation”. You might remember the distance formula between two points in two dimensions from your geometry class; this is like that, but in 10-dimensional space.
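For the curious, here is a minimal sketch of that distance calculation, assuming the plain Euclidean distance (the square root of the summed squared differences). The function name euclidean_distance and the sample numbers are ours for illustration; the same function works for 2 dimensions or 10.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of dimensions."""
    if len(a) != len(b):
        raise ValueError("both points need the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The geometry-class case: a 3-4-5 right triangle in two dimensions.
print(euclidean_distance([0, 0], [3, 4]))  # 5.0

# The same function on two made-up 10-attribute tracks.
track_a = [0.8, 0.1, 0.6, 0.7, 0.9, 0.4, 0.5, 0.2, 0.3, 0.6]
track_b = [0.7, 0.2, 0.6, 0.5, 0.9, 0.4, 0.6, 0.2, 0.1, 0.6]
print(euclidean_distance(track_a, track_b))
```

The smaller the number, the more alike the two tracks are across the attributes being compared. Python 3.8 and later also ship math.dist, which does the same calculation.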

What 3 attributes look like to visualize how distance could be calculated between 3 dimensions






We did some analyses to figure out which dimensions were the most important to measure. Things like “time_in_ms” (the length of the song in milliseconds) were not that crucial, so we settled on what we felt were the 10 most important (subscribe to learn which!).




After some trial and error, we finally developed our “Playlist Similarity Module”. In v1 of this module, you can supply any Spotify playlist ID, and our software pings Spotify to pull down all of the tracks on that playlist, as well as their associated Track Attributes.





We then do some data science magic (the distance calculation described earlier), and we’re able to figure out which of our tracks are the most mathematically similar to the tracks in the selected playlist, based on the differences across all the Track Attributes. An example of this output for the initial Melodica uploads is below:






We wanted this feature to help us research Playlists we thought would be great homes for our catalog. And when a catalog contains hundreds of thousands of tracks, as some of our clients’ catalogs do, we wanted a method that could score the entire catalog to find the most similar tracks.
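To give a flavor of what scoring a whole catalog can look like, here is a simplified sketch rather than our production module. It assumes every track has already been turned into a list of attribute values, and it scores each catalog track by its distance to the nearest playlist track; using the nearest track (rather than, say, the playlist average) is a choice made for this example. The function name rank_catalog, the track IDs and the three-attribute vectors are all made up.

```python
import math


def rank_catalog(catalog, playlist, top_n=3):
    """Rank catalog tracks by their distance to the closest playlist track.

    catalog and playlist map a track ID to a list of attribute values.
    Returns (catalog_id, closest_playlist_id, distance) tuples, closest first.
    """
    scored = []
    for track_id, vector in catalog.items():
        # Find the playlist track this catalog track sits closest to.
        nearest_id, nearest_distance = min(
            ((pid, math.dist(vector, pvec)) for pid, pvec in playlist.items()),
            key=lambda pair: pair[1],
        )
        scored.append((track_id, nearest_id, nearest_distance))
    scored.sort(key=lambda row: row[2])
    return scored[:top_n]


# Made-up data with 3 attributes per track to keep the example short.
catalog = {
    "catalog_a": [0.9, 0.8, 0.7],
    "catalog_b": [0.1, 0.2, 0.3],
    "catalog_c": [0.5, 0.5, 0.5],
}
playlist = {
    "playlist_x": [0.85, 0.8, 0.75],
    "playlist_y": [0.2, 0.2, 0.2],
}

ranking = rank_catalog(catalog, playlist)
for catalog_id, playlist_id, distance in ranking:
    print(f"{catalog_id} is closest to {playlist_id} (distance {distance:.3f})")
```

Each row answers both questions at once: how well a catalog track fits the playlist, and which playlist track it most resembles. This version compares every catalog track with every playlist track, so very large catalogs would likely need something faster.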



This tool has helped us discover new artists and genres to cross-promote with. We now have ideas for audiences, based on these genres and artists, that might enjoy our content!


Subscribe to learn how we combined our learnings from Spotify’s Related Artists and our Playlist Similarity Tool to dive even deeper into potential new audiences. We’ll also learn more about which Track Attributes make songs sound “similar” for Playlists/DJs!