Greedy Sparse Approximation and the Automatic Description of Audio and Music Data

About

The project is a two-year postdoc grant for B. L. Sturm, funded by the Danish Council for Independent Research | Technology and Production Sciences.

Abstract

Yet to be developed are automatic ways to describe audio and music data to a degree such that one can search their content as easily as text. Though low-level methods are widely used, they face a glass ceiling. Greedy methods of sparse approximation provide attractive complements because they adapt models to data, and thus provide content descriptions more concrete than low-level inner products in frames. However, problems with greedy methods prohibit their broader and better application to processing audio and music data because the models they provide can be polluted by artifacts, e.g., objects that are not present in the data. Previous work addresses these problems, but at the cost of increased computational complexity and diminished freedom in the analysis, without much success in reducing artifacts, and it has yet to be applied to real audio data. This project discovers, analyzes, and incorporates novel measures of model artifacts into the sparse approximation process, and tests the new methods in four specific applications to processing audio and music data.