Instrument learning and sparse NMD for automatic polyphonic music transcription
In this paper, an Automatic Music Transcription (AMT) algorithm based on a supervised Non-negatve Matrix Decomposition (NMD) is discussed. In particular, a novel approach for enhancing the sparsity of the solution is proposed. It consists of a two-step processing in which the NMD is solved joining a `2 regularization and a threshold filtering. In the first step, the NMD is performed with the `2 regularization in order to get an overall selection of the notes most likely appearing in the monotimbral musical excerpt. In the second step, a threshold filtering followed by another `2 regularized NMD are repeatedly performed in order to progressively reduce the dictionary matrix and to refine the notes transcription. Furthermore, a useroriented instrument learning procedure has been conceived and proposed. The proposed AMT system has been tested upon the dataset collected by the LabROSA laboratories considering the transcription of three different pianos. Moreover, it has been validated through a comparison with a regularized NMD and with three open source AMT software. The results prove the effectiveness of the proposed two-step processing in enhancing the sparsity of the solution and in improving the transcription accuracy. Moreover, the proposed system shows promising performance in both multi-F0 and note tracking tasks, obtaining in most tests better transcription accuracy than the competing algorithms.