Kernel Combination for Musical Feature Integration

We apply a new machine learning tool, kernel combination, to the task of semantic music retrieval. We describe a large music corpus with four feature sets that cover acoustic content and social context, and we derive one kernel matrix from each feature set. Each kernel is used to train a support vector machine (SVM) classifier for each semantic tag (e.g., ‘aggressive’, ‘classic rock’, ‘distorted electric guitar’) in a large tag vocabulary. We first examine the performance of each feature kernel on its own and then show how to learn an optimal linear combination of the kernels using semi-definite programming. For a large number of tags, SVMs trained on the combined kernel achieve better retrieval performance than SVMs trained on the best individual kernel. In addition, the weight assigned to each kernel in the linear combination reflects the relative importance of the corresponding feature set when predicting a tag.
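To make the idea of a weighted kernel combination concrete, the following is a minimal sketch in plain Python. It builds one kernel per feature set and forms the combination K = w_1 K_1 + ... + w_m K_m with non-negative weights that sum to one. Everything in it is an assumption made for illustration: the toy feature values, the RBF kernel, and the function names are hypothetical. The weights here come from a simple kernel-target alignment heuristic, which is a stand-in and not the semi-definite programming procedure used in this work.

```python
import math

def rbf_kernel(X, gamma=1.0):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(v * v for row in K for v in row))
    return num / (norm_k * n)  # ||yy^T||_F equals n for labels in {-1, +1}

def alignment_weights(kernels, y):
    """Heuristic weights (assumption: NOT the SDP solution), normalized to sum to 1."""
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(kernels)] * len(kernels)  # fall back to uniform weights
    return [s / total for s in scores]

def combine(kernels, weights):
    """Weighted sum of kernel matrices: K = sum_k weights[k] * kernels[k]."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: four songs, one tag, two one-dimensional feature sets.
y = [1, 1, -1, -1]                          # +1 means the tag applies to the song
features_a = [[0.0], [0.1], [1.0], [1.1]]   # separates the two classes well
features_b = [[0.0], [1.0], [0.1], [1.1]]   # carries little information about the tag

kernels = [rbf_kernel(features_a), rbf_kernel(features_b)]
weights = alignment_weights(kernels, y)
K_combined = combine(kernels, weights)
print("kernel weights:", weights)
```

In this toy setting the more informative feature set receives the larger weight, which mirrors the interpretation of the learned weights described above. The combined matrix could then be passed to any SVM implementation that accepts a precomputed kernel.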

Subset of 61 words from the CAL500 vocabulary used for kernel combination experiments


Relevant Publications
Barrington, Yazdani, Turnbull, and Lanckriet. Combining Feature Kernels for Semantic Music Retrieval. To appear in ISMIR 2008.
Lanckriet, Cristianini, Bartlett, El Ghaoui, and Jordan. Learning the kernel matrix with semi-definite programming. Journal of Machine Learning Research, 5:27–72, 2004.