Last Sunday was the Finnish Music Day. That reminded me of the historically important data set of the Helsinki Philharmonic Orchestra. The data includes details not just from the concerts performed by the Orchestra itself, but also from all concerts it has arranged.
The timing was appropriate in another way, too. A few days ago I wrote a short review of the new book Using OpenRefine. The tool is not completely new to me, but it’s been a while since I last launched it, a couple of years I guess. At that time it was still called Google Refine.
After opening a new OpenRefine project with the data file, I first transformed all date strings to a proper date format. Then I merged and edited a few composer name variants with clustering (using both of the available methods), and finally exported three columns of the data (id, date, composer) with the help of the custom tabular exporter.
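For readers who want to see roughly what these two clean-up steps do, here is a small Python sketch. It is not what OpenRefine runs internally, just a simplified take on the key-collision ("fingerprint") idea behind one of its clustering methods, plus a plain date parser. The function names, the sample names and the `%d.%m.%Y` date format are my own assumptions for illustration; the real data file may well use a different format.

```python
import re
import unicodedata
from collections import defaultdict
from datetime import datetime


def fingerprint(name):
    """Simplified key-collision key (an approximation, not OpenRefine's code):
    fold accents to ASCII, lowercase, drop punctuation, sort the unique tokens."""
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^\w\s]", " ", folded.lower())
    return " ".join(sorted(set(cleaned.split())))


def cluster_names(names):
    """Group spelling variants that share a fingerprint; keep groups with 2+ variants."""
    groups = defaultdict(set)
    for name in names:
        groups[fingerprint(name)].add(name)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


def parse_date(text, fmt="%d.%m.%Y"):
    """Turn a date string into a date object; the default format is an assumption."""
    return datetime.strptime(text.strip(), fmt).date()


if __name__ == "__main__":
    # Made-up sample values, not taken from the orchestra data.
    print(cluster_names(["Jean Sibelius", "Sibelius, Jean", "W. A. Mozart"]))
    print(parse_date("24.09.1882"))
```

A key-collision approach like this only catches variants that differ in word order, case, accents or punctuation. Real misspellings need a distance-based method, which is why it pays to try both kinds of clustering.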
This small web app offers an interactive peek into the data, but only a subset of it. The number of different composers is very large, and I didn’t even consider curating all of the data. Therefore, I ended up presenting summary statistics of only those composers whose works have been played in at least 50 concerts during some decade. Note that the user experience of the reactive plots is somewhat hampered by the fact that the colours do not remain the same from one plot to the next. I’m still a newbie user of the famous R ggplot2 library, so bear with me.
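The "at least 50 concerts during some decade" rule can be sketched in a few lines. This is Python written for illustration, not the code behind the app. It assumes that each exported row is an (id, date, composer) triple, that the id identifies a concert, and that the date has already been parsed; the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def concerts_per_decade(rows):
    """Map (composer, decade) to the number of distinct concert ids."""
    seen = defaultdict(set)
    for concert_id, concert_date, composer in rows:
        decade = concert_date.year // 10 * 10  # e.g. 1957 -> 1950
        seen[(composer, decade)].add(concert_id)
    return {key: len(ids) for key, ids in seen.items()}


def frequent_composers(rows, min_concerts=50):
    """Composers who reach min_concerts in at least one decade."""
    counts = concerts_per_decade(rows)
    return {composer for (composer, _decade), n in counts.items() if n >= min_concerts}


if __name__ == "__main__":
    # Tiny made-up sample with a lowered threshold, just to show the shape of the data.
    rows = [
        (1, date(1950, 1, 1), "A"),
        (1, date(1950, 1, 1), "A"),  # two works in the same concert count once
        (2, date(1955, 1, 1), "A"),
        (3, date(1961, 1, 1), "A"),
        (4, date(1950, 1, 1), "B"),
    ]
    print(frequent_composers(rows, min_concerts=2))
```

Counting distinct ids rather than rows matters here, because a composer can have several works in one concert and would otherwise be counted more than once for the same evening.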
Behind the tab labelled Other stats, you’ll find a non-reactive table that shows which composers – in all of the data – have been most popular on a concert basis.