What sets successful songs apart from the rest?

This project is an attempt to answer that question, at least provisionally, using data from Spotify.

Spotify, and other streaming services, don't presently publish streaming data on the vast majority of songs in their catalog, but they have cracked the window open just a little with Spotify Charts—a website that publishes daily and weekly charts capturing the top 200 songs on the platform.

Crucially, this is the only place on the internet where you can see a song's actual performance with audiences in terms of streams over time. The charts go back less than two years (exactly how far back varies a bit by region and by weekly vs. daily cadence), so there's no information on streaming prior to mid-2015. Even so, Spotify Charts is the best dataset on music streaming that’s currently available to the public.

I gathered about 16 months of data from it to see what I could learn. Here's the first thing I found:

Most Hit Songs Have a Short Lifespan

Every Song on the Spotify Global Top 200 over 16 months

Notes: Each line in this chart represents the streaming performance of a single song for each week that song cracked the global Top 200 chart (1,180 songs in total). These songs are graphed relative to their first week on the chart. You might also notice this graph has a sort of "shelf" at the bottom, where the lines disappear between 1 and 2 million streams per week. This represents the bottom of the Top 200 chart. Once a song drops off the chart, there's no way to know how many plays it gets after that point. I wrote a small utility to grab these charts and combine them into the master spreadsheet that drives this chart and the other visualizations in this project.
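To make the "relative to their first week" idea concrete, here is a minimal sketch of how chart rows could be re-indexed so that every song starts at week 0. The row layout (song id, integer week index, streams) and the function name are assumptions for illustration, not the format of the actual spreadsheet.

```python
from collections import defaultdict


def align_to_first_week(rows):
    """Re-index each song's weeks so its first charting week is week 0.

    rows: iterable of (song_id, week_index, streams) tuples, where
    week_index is an integer count of chart weeks (hypothetical layout).
    Returns {song_id: {weeks_since_first_charting: streams}}.
    """
    by_song = defaultdict(dict)
    for song_id, week_index, streams in rows:
        by_song[song_id][week_index] = streams

    aligned = {}
    for song_id, weeks in by_song.items():
        first_week = min(weeks)
        aligned[song_id] = {
            week - first_week: streams for week, streams in weeks.items()
        }
    return aligned


# Toy rows, not real chart data.
rows = [
    ("song_a", 10, 2_500_000),
    ("song_a", 11, 3_100_000),
    ("song_b", 14, 1_800_000),
    ("song_b", 16, 1_200_000),  # week 15 is missing: the song was off the chart
]
aligned = align_to_first_week(rows)
```

Weeks with no entry simply stay absent, which mirrors the "shelf": once a song is off the chart there is no stream count to plot.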

So, while a few songs have a long lifespan on the charts, the vast majority are on and then off again in a matter of a few short weeks. (About 50% drop off the charts in less than five weeks, 70% in less than ten.) Their trajectory continues, but under the surface—once they drop off the charts, they disappear from our view.
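A figure like "about 50% in less than five weeks" can be computed with a simple count. The sketch below assumes lifespan means the number of weeks a song appeared on the chart. The input mapping and the numbers in it are toy values, not the real dataset.

```python
def share_with_short_lifespan(weeks_on_chart, threshold):
    """Fraction of songs that charted for fewer than `threshold` weeks.

    weeks_on_chart: {song_id: number of weeks the song appeared on the chart}
    (a hypothetical summary of the chart data).
    """
    if not weeks_on_chart:
        raise ValueError("no songs to summarize")
    short = sum(1 for weeks in weeks_on_chart.values() if weeks < threshold)
    return short / len(weeks_on_chart)


# Toy values for illustration only.
weeks_on_chart = {"song_a": 2, "song_b": 4, "song_c": 7, "song_d": 30}
under_five = share_with_short_lifespan(weeks_on_chart, 5)
under_ten = share_with_short_lifespan(weeks_on_chart, 10)
```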

Artists, of course, can release albums or EPs with multiple songs aimed at the charts, and if they're big enough, even some "throwaway" tracks can show up in the dataset (e.g. the single weirdest song in the dataset, "Frank's Song" by Kanye West).

To get a better sense of which artists are capturing an audience of a given size, here's what that same graph looks like when you count streams by artist instead of by song:

Every Artist that Cracked the Charts

streams per week for all artists

Notes: Each line is the streaming performance of a single artist across all their charting songs for a given week — 383 artists in total. Unlike the first chart, this one is anchored in real time, not relative to any particular song's first charting date.
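Counting by artist rather than by song amounts to summing streams over all of an artist's charting songs within each calendar week. Here is a minimal sketch, assuming each row carries an artist name, a song title, a week label, and a stream count. These field choices are assumptions, and the rows are toy data.

```python
from collections import defaultdict


def streams_by_artist_week(rows):
    """Sum streams across all of an artist's charting songs for each week.

    rows: iterable of (artist, song, week, streams) tuples (hypothetical layout).
    Returns {(artist, week): total_streams}.
    """
    totals = defaultdict(int)
    for artist, _song, week, streams in rows:
        totals[(artist, week)] += streams
    return dict(totals)


# Toy rows for illustration only.
rows = [
    ("Artist X", "Song 1", "2016-01-01", 4_000_000),
    ("Artist X", "Song 2", "2016-01-01", 1_500_000),
    ("Artist X", "Song 1", "2016-01-08", 3_800_000),
    ("Artist Y", "Song 3", "2016-01-01", 2_000_000),
]
artist_totals = streams_by_artist_week(rows)
```

Because the key uses the real calendar week rather than weeks since first charting, the result is anchored in real time, as in the artist chart.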

The major limitation of this dataset is the invisibility of songs before and after they crack the charts. There's a world of fascinating music below the surface that we just can't see—even the songs that do well enough to show up in the dataset tend to peak quickly, then drop off the charts in short order.

Still, for songs that do surface for more than a few weeks, the biggest hits of their time, we have just enough data to start to compare them. Here, we have the three biggest hits from each of the top three artists in the dataset by total streams: Drake, Justin Bieber, and The Weeknd.

Song-to-Song Comparison

Nine of the biggest hit songs in the dataset

Streams per week from each song's first charting week

Notes: Each song is represented as a set of concentric circles, whose colors correspond to the song's streaming performance for each week in the dataset. Like the first line graph, these circles start from the first week a song charted. The smallest, innermost circle is the song's first week on the chart, so songs that experience a late surge in popularity will show rings of color a few weeks out from the center. The lightest colored circles represent weeks where the song was not on the chart. (Most songs were not on the charts for the full span of the dataset, either because they dropped off, or because they hadn't been released yet when Spotify started publishing these data.) Hover over a circle for details on the song's performance that week.

All the above charts are derived from the performance data I scraped from Spotify Charts, but Spotify also publishes an excellent API, which exposes detailed metadata about every song in its database. If you know which songs you’re interested in, the Spotify API can tell you an awful lot about them, which means we can start to look at the features of successful pop songs as a group. Here's what the songs look like when we compare total streams to a few of Spotify's algorithmically derived audio features:

The Properties of Popular Music

total streams vs. Spotify audio features

Notes: Each dot represents a song—the higher up a dot is, the more users streamed it while it was on the charts. Use the search bar to isolate individual artists, and hover over a dot to highlight its position on each of the other plots. I wrote another small utility to grab these metadata features from the Spotify API for each song on the charts.
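As a rough sketch of what fetching features for many songs might involve, the snippet below only builds batched request URLs and makes no network calls. It assumes the Spotify Web API's audio-features endpoint accepts a comma-separated `ids` parameter of up to 100 track IDs per request, which is how the public documentation described it when I last checked. Treat the URL, the limit, and the endpoint's current availability as assumptions to verify. The function name is hypothetical, and this is not the utility mentioned above.

```python
from urllib.parse import urlencode

# Assumed endpoint and batch limit; verify against current Spotify documentation.
AUDIO_FEATURES_URL = "https://api.spotify.com/v1/audio-features"
MAX_IDS_PER_REQUEST = 100


def audio_feature_urls(track_ids, batch_size=MAX_IDS_PER_REQUEST):
    """Build one request URL per batch of unique track IDs (no network access)."""
    unique_ids = list(dict.fromkeys(track_ids))  # drop duplicates, keep order
    urls = []
    for start in range(0, len(unique_ids), batch_size):
        batch = unique_ids[start:start + batch_size]
        query = urlencode({"ids": ",".join(batch)})
        urls.append(AUDIO_FEATURES_URL + "?" + query)
    return urls


# Placeholder IDs, not real Spotify track IDs.
urls = audio_feature_urls(["a", "b", "a", "c"], batch_size=2)
```

Each URL would still need an authenticated request (an OAuth bearer token) before Spotify returns any data.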

As individual measures, Spotify’s derived characteristics tell the kind of stories you might expect to hear about the music that makes the charts—more successful music is more danceable than not, for example, and music that sounds more electronic is better represented than acoustic music.

But taken as a group, these characteristics paint a picture of remarkable diversity—when songs at the extremes of these measures can still succeed (and in some cases climb to the top of the charts), you have a pretty vibrant pop ecosystem.

Finally, looking at these features in light of performance has the potential to reflect how popular music is changing over time. It's a first step toward systematizing a view of something that is notoriously subjective and changeable: our collective taste.

At Least Some of Our Taste in Music Is Seasonal

moving average of song characteristics

Spotify Audio Features (Jul 2015 - Jun 2016)

Notes: Each line represents the average value of an audio feature (e.g., energy) for each week in the year from July 2015 through June 2016. The average is weighted by popularity, so songs with more streams in a given week have more influence on the average than those with fewer.
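The popularity weighting described in the notes can be written as a stream-weighted mean per week: sum each song's feature value times its streams, then divide by the week's total streams. The sketch below assumes rows shaped as dictionaries with `week`, `streams`, and a feature key. The field names and the numbers are illustrative, not taken from the dataset.

```python
from collections import defaultdict


def weekly_weighted_average(rows, feature):
    """Stream-weighted average of one audio feature for each week.

    rows: iterable of dicts with "week", "streams", and `feature` keys
    (hypothetical field names).
    Returns {week: weighted_average}.
    """
    weighted_sum = defaultdict(float)
    stream_total = defaultdict(int)
    for row in rows:
        weighted_sum[row["week"]] += row[feature] * row["streams"]
        stream_total[row["week"]] += row["streams"]
    return {
        week: weighted_sum[week] / total
        for week, total in stream_total.items()
        if total > 0  # skip weeks with no streams to avoid dividing by zero
    }


# Toy rows for illustration only.
rows = [
    {"week": "2015-12-18", "streams": 3_000_000, "energy": 0.8},
    {"week": "2015-12-18", "streams": 1_000_000, "energy": 0.4},
    {"week": "2015-12-25", "streams": 2_000_000, "energy": 0.5},
]
energy_by_week = weekly_weighted_average(rows, "energy")
```

In the first toy week the result lands near 0.7 rather than the unweighted 0.6, because the song with three times the streams pulls the average toward its own value.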

There is presently less than two years of data available on song performance. While we can look at these features over weeks and months, aggregate cultural preferences move over years and decades—there simply isn't enough information yet to draw any strong conclusions about trends or seasonality.

While this is an early view, it is heartening that one notorious cultural effect is fairly clear: just look at what Christmas music does to those averages! Energy and danceability drop for the better part of a month in December. It's a tantalizing hint of what we might be able to see with a few more years of data. Here's hoping Spotify continues to publish these charts.