Audio signal separation through deep generative adversarial network
Member | Category |
---|---|
Emanuele Rodola' | Reference tutor |
Recognizing and distinguishing different audio sources is easy for humans: picking out the instruments in a song and what each one is playing is a simple task even for an untrained ear, as is concentrating on a single person speaking at a dinner or a crowded event.
Recognizing and identifying the audio sources present in a mixture of signals is a relatively accessible task even for a shallow neural network, but isolating the individual sources, and hence separating their contributions, is far from easy even for an advanced neural architecture.
This project aims to design and train an innovative deep neural network capable of separating a mixture of audio signals, even signals that differ greatly from one another, so as to obtain a clean separation of the components of the raw signal. To do so, it exploits a class of machine learning frameworks called generative adversarial networks.
This approach has been widely used in the generative field to create new and original audio signals (such as songs or people's voices) starting from existing samples, but it has never been used to solve this separation problem. I strongly believe that this method, applied to this kind of problem, will lead to a new state of the art in the field of audio separation.