Ricerc@Sapienza

Audio signal separation through deep generative adversarial network

Year

2020

Proposer Michele Mancusi - Research fellow

Organizational unit

DEPARTMENT OF COMPUTER SCIENCE

ERC subsector of the project proposer

PE6_11

Research group members

Member	Category
Emanuele Rodolà	Reference tutor

Abstract

Recognizing and distinguishing different audio sources is easy for humans: even an untrained ear can pick out the various instruments in a song and what they are playing, or focus on a single speaker at a crowded dinner or event.
Recognizing and identifying the audio sources present in a mixture of signals is a relatively accessible task even for a shallow neural network, but isolating the individual sources, and hence separating their contributions, is difficult even for an advanced neural architecture.
This project aims to design and train an innovative deep neural network capable of separating a mixture of audio signals, even ones very different from each other, to obtain a clean separation of the components of the raw signal. It exploits a recent class of machine learning frameworks called generative adversarial networks (GANs).
This approach has been widely used in the generative field to create new and original audio signals (such as songs or people's voices) starting from existing samples, but it has never been used to solve this separation problem. I strongly believe that this method, applied to this kind of problem, will lead to a new state of the art in audio separation.

ERC

PE6_7, PE7_7

Keywords:

MACHINE LEARNING, ARTIFICIAL INTELLIGENCE, SIGNAL PROCESSING