This project started with me stuck in the car, listening to the same music to and from work every day. I got sick of it and wanted to find out just how much music was being overplayed, so I designed some scripts to gather data from each radio station in South Australia.
It started out with a single Python script pulling data from each station's online media player (which shows the song currently playing) and storing it in a database. Using some simple database rules it was easy to prevent duplicate entries, resulting in reasonably accurate data.
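As a rough illustration of the kind of database rule that can prevent duplicates, here is a minimal sketch using SQLite. It is not the original script: the table layout, the `open_db` and `record_play` names, and the assumption that each player reports a start time for the current song are all hypothetical.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a play log with a uniqueness rule on each play."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS plays (
            station    TEXT NOT NULL,
            artist     TEXT NOT NULL,
            title      TEXT NOT NULL,
            started_at TEXT NOT NULL,  -- assumed to come from the player feed
            UNIQUE (station, artist, title, started_at)
        )
        """
    )
    return conn


def record_play(conn, station, artist, title, started_at):
    """Store one play; return True if it was new, False if already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO plays (station, artist, title, started_at) "
        "VALUES (?, ?, ?, ?)",
        (station, artist, title, started_at),
    )
    conn.commit()
    # rowcount is 1 for a real insert and 0 when the UNIQUE rule skipped it.
    return cur.rowcount == 1
```

With a rule like this the script can poll a player as often as it likes: seeing the same song on several polls in a row only ever produces one row.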
On Thursday 14th of August I posted a link to my findings on various social media websites and on Friday 15th of August The Advertiser and Tone Deaf published my findings.
The feedback was astounding. Depressingly, almost no-one was surprised by the results of the monitoring; even listening to most radio stations infrequently, it's easy to tell repetition is constant.
Within 2 days of the findings being posted on social media and published on news websites, Nova 91.9 reduced the information it provided publicly, to the point where it no longer updated its 'now playing' feed.
I do not know if this was a mistake, a technical issue or perhaps a deliberate choice made by the team at Nova, but the feed came back online after 6 days, or 2 hours after a quick tweet.
After releasing the initial findings to the public there was a large amount of interest in tracking more stations, and a huge response from people in other states to start tracking their radio stations.
However, my simple database structure was not up to the task of tracking all these new stations; it needed reworking to store more fields, like callsign, and to use multiple tables to keep everything clean.
With the suggestions from people all over Australia, I've increased the number of stations monitored from 6 to 32.
There was a complication, however: most radio stations don't sanitise their output, causing (among other things) incorrect play counts and duplicate artists.
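To give an idea of what a multi-table layout with a callsign field could look like, here is a hedged sketch in SQLite. The real schema is not shown in this post, so every table and column name below (including `create_schema`) is an assumption made for illustration.

```python
import sqlite3

# Hypothetical layout: one row per station, artist and track,
# with plays pointing at them by id instead of repeating text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS stations (
    id       INTEGER PRIMARY KEY,
    callsign TEXT NOT NULL UNIQUE,
    name     TEXT NOT NULL,
    state    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS artists (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS tracks (
    id        INTEGER PRIMARY KEY,
    artist_id INTEGER NOT NULL REFERENCES artists(id),
    title     TEXT NOT NULL,
    UNIQUE (artist_id, title)
);
CREATE TABLE IF NOT EXISTS plays (
    id         INTEGER PRIMARY KEY,
    station_id INTEGER NOT NULL REFERENCES stations(id),
    track_id   INTEGER NOT NULL REFERENCES tracks(id),
    started_at TEXT NOT NULL,
    UNIQUE (station_id, track_id, started_at)
);
"""


def create_schema(conn):
    """Create the tables on an open SQLite connection."""
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES rules
    conn.executescript(SCHEMA)
```

Keeping artists and tracks in their own tables means a name is stored once, so correcting a misspelt artist fixes the play counts for every station at the same time.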
In order to solve this I used the MusicBrainz JSON Web Service and checked every unique occurrence of every track and artist combo.
Using the Levenshtein distance I then calculated how close the original data was to what Spotify suggested. If the result was close enough I used the new data, and in the rare cases where it did not match anything I used my own sanitisation and look-up methods to keep data identical between the radio stations.
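For anyone curious how a "close enough" check can work, here is a minimal sketch of the Levenshtein distance and a threshold test on top of it. The `similarity` and `choose_name` helpers, the case-insensitive comparison and the 0.8 threshold are assumptions for illustration, not the values actually used on this site.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions or substitutions
    needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to a 0..1 score, ignoring case and outer spaces."""
    a, b = a.casefold().strip(), b.casefold().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def choose_name(original, suggested, threshold=0.8):
    """Return the suggested name if it is close enough to the original,
    otherwise None to signal that a fallback clean-up is needed."""
    if suggested is not None and similarity(original, suggested) >= threshold:
        return suggested
    return None
```

Dividing by the longer string's length keeps the threshold meaningful for both short and long titles: one wrong character matters far more in a four-letter artist name than in a forty-character track title.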
Currently the radio statistics pages show data from the last 30 days in order to keep things fresh.