Finding Musically Meaningful Words using Sparse Canonical Correlation Analysis

Work by David Torres, Douglas Turnbull, Luke Barrington, and Prof. Gert Lanckriet

One of our main projects in CAL is the modeling of words that humans use to describe music. These models are learned by extracting song labels from examples of music and associated text. The quality of these song labels largely dictates the quality of our statistical models; therefore, we would like to explore methods that discover high-quality, "musically meaningful" words.

Naturally, branding a word as "musically meaningful" is a subjective task, but not all labels have the same degree of subjectivity. Based on human surveys that we conducted, we found that some classes of labels are used fairly consistently. For instance, it is no surprise that instrumentation labels such as "this song contains vocals" seem to be used objectively by people.

In this project, we attempt to discover objective words by finding words that are highly correlated with the audio content of the songs they describe. Words that are correlated with audio may serve as better objective song labels, since the correlation may be the reason a human would use the label in the first place. Examples could be instrumentation words as described above, or even emotional labels such as "this song is happy", which may exhibit some kind of audio structure.

A statistical tool that is helpful for doing this is sparse canonical correlation analysis (sparse CCA). Sparse CCA looks for strong correlations between two fundamentally different representations (views) of the same data. In this case, the labels collected for a song are one view, and the audio signal of the song (represented by audio features) is the other. Intuitively, sparse CCA discovers a subset of a collection of words that is highly correlated with the audio signal.
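
To make the idea concrete, here is a minimal Python sketch on synthetic data. It is not the D.C. programming method of the publication listed below. It swaps in a simpler, commonly used approximation: alternating power iterations on the word/audio cross-covariance matrix with soft-thresholding, treating the covariance within each view as the identity. The function names (standardize, soft_threshold, sparse_cca), the penalty values, the toy vocabulary, and the data are all assumptions made for illustration.

```python
import numpy as np

def standardize(M):
    # Center each column and scale it to unit variance.
    return (M - M.mean(axis=0)) / (M.std(axis=0) + 1e-12)

def soft_threshold(a, lam):
    # Shrink entries toward zero; entries with |a| <= lam become exactly zero.
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_cca(X, Y, lam_x, lam_y, n_iter=100):
    """Illustrative sparse CCA (hypothetical helper, not the authors' code).

    X: (n_songs, n_words) label view, Y: (n_songs, n_feats) audio view.
    Returns sparse weight vectors u (words) and v (audio features).
    """
    Xs, Ys = standardize(X), standardize(Y)
    C = Xs.T @ Ys / Xs.shape[0]            # cross-covariance, words x feats
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    u, v = U[:, 0], Vt[0]                  # start from the dense solution
    for _ in range(n_iter):
        u_new = soft_threshold(C @ v, lam_x)
        if np.linalg.norm(u_new) == 0:     # penalty too large: keep last u
            break
        u = u_new / np.linalg.norm(u_new)
        v_new = soft_threshold(C.T @ u, lam_y)
        if np.linalg.norm(v_new) == 0:
            break
        v = v_new / np.linalg.norm(v_new)
    return u, v

# Synthetic example (made-up data): a latent factor z drives two "words"
# and two audio features; every other column is pure noise.
rng = np.random.default_rng(0)
n_songs = 500
z = rng.standard_normal(n_songs)
words = ["vocals", "happy", "word_c", "word_d", "word_e", "word_f"]  # toy vocabulary
X = rng.standard_normal((n_songs, 6))
X[:, 0] = z + 0.3 * rng.standard_normal(n_songs)
X[:, 1] = z + 0.3 * rng.standard_normal(n_songs)
Y = rng.standard_normal((n_songs, 4))
Y[:, 0] = z + 0.3 * rng.standard_normal(n_songs)
Y[:, 1] = -z + 0.3 * rng.standard_normal(n_songs)

u, v = sparse_cca(X, Y, lam_x=0.3, lam_y=0.3)
selected = [words[i] for i in np.flatnonzero(u)]
corr = np.corrcoef(standardize(X) @ u, standardize(Y) @ v)[0, 1]
print("selected words:", selected)
print("correlation of projections:", round(float(corr), 3))
```

Words whose weight in u is nonzero form the selected subset; raising lam_x makes the selection stricter, and lam_x = 0 falls back to a dense solution in which every word receives some weight.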


Relevant Publications
Sriperumbudur, Torres & Lanckriet (2007) - Sparse Eigen Methods by D.C. Programming. To appear in International Conference on Machine Learning (ICML), 2007.