HYD3A: Hypercomplex Deep Learning for 3D Audio Analysis

Year
2019
Principal Investigator: Danilo Comminiello - Associate Professor
ERC subfield of the Principal Investigator
PE6_11
Research group members
Member Category
Antonello Rizzi Structured participant in the research project
Elio Di Claudio Structured participant in the research project
Raffaele Parisi Structured participant in the research project
Simone Scardapane Structured participant in the research project
Paolo Giannitrapani PhD student / research fellow, non-structured member of the research group
Massimo Panella Structured participant in the research project
Member Qualification Department Category
Indro Spinelli M.Sc. Student DIPARTIMENTO DI INGEGNERIA DELL'INFORMAZIONE, ELETTRONICA E TELECOMUNICAZIONI Other aggregate personnel (Sapienza or external), holder of a research scholarship
Eleonora Grassucci M.Sc. Student DIPARTIMENTO DI INGEGNERIA DELL'INFORMAZIONE, ELETTRONICA E TELECOMUNICAZIONI Other aggregate personnel (Sapienza or external), holder of a research scholarship
Abstract

Next-generation technologies, ranging from driverless cars to immersive virtual reality, are expected to understand and analyse the surrounding world through a range of high-resolution sensors. In particular, 3D audio sensors will endow them with a clear spatial sense of the environment, on par with the human auditory system. Exploiting this information can provide agents and autonomous applications with the capability of localizing and conveying sounds more efficiently and with a higher level of perceptual awareness. At the same time, analysing raw 3D audio data in real time poses new and significant research and implementation challenges that prevent successful deployment. Algorithms should be able to understand the spatial distribution of audio sources in the sound field while, at the same time, allowing for efficient inference from the raw waveforms in a variety of applications.
The aim of the HYD3A project (pronounced as "idea") is to design a family of deep learning algorithms tailored to such 3D audio signals for deployment in immersive environments. To accomplish this goal, the algorithms will leverage a new generation of deep neural networks to model and learn signals in hypercomplex (e.g., quaternion) domains.
Prior research has shown that 3D audio can be naturally modelled in a hypercomplex representation. HYD3A will build upon these insights to design a set of deep networks for analysing 3D audio coming from a variety of microphone sensors. Hypercomplex deep networks have the potential to significantly reduce network complexity with respect to state-of-the-art competitors (thus simplifying their on-device implementation), while allowing for more accurate learning and optimization procedures.
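As an illustration of where these parameter savings come from (this sketch is not part of the proposal itself), a quaternion linear layer can be written as a Hamilton product in which four real weight matrices are shared across the four signal components, e.g. the W, X, Y, Z channels of a first-order Ambisonics recording. The function name, shapes, and layer sizes below are illustrative assumptions:

```python
import numpy as np

def quaternion_linear(x, W_r, W_i, W_j, W_k):
    """Apply a quaternion linear layer via the Hamilton product.

    x: array of shape (4, n) holding the (r, i, j, k) components of n
       quaternion-valued inputs (e.g. W, X, Y, Z Ambisonics channels).
    W_r, W_i, W_j, W_k: four real (m, n) weight matrices, shared
       across all four components through the Hamilton product rules.
    Returns an array of shape (4, m).
    """
    r, i, j, k = x
    out_r = W_r @ r - W_i @ i - W_j @ j - W_k @ k
    out_i = W_r @ i + W_i @ r + W_j @ k - W_k @ j
    out_j = W_r @ j - W_i @ k + W_j @ r + W_k @ i
    out_k = W_r @ k + W_i @ j - W_j @ i + W_k @ r
    return np.stack([out_r, out_i, out_j, out_k])

# Parameter comparison for mapping 4n real inputs to 4m real outputs
# (sizes chosen arbitrarily for illustration):
n, m = 64, 64
real_params = (4 * n) * (4 * m)  # unconstrained dense real layer
quat_params = 4 * (m * n)        # four shared weight matrices
# The quaternion layer needs one quarter of the real layer's weights.
```

Because the same four matrices are reused in all four output components, the layer has 4mn parameters instead of the 16mn of an unconstrained real layer over the same input and output dimensions, which is the fourfold complexity reduction that hypercomplex models exploit.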
HYD3A is expected to have a positive impact, both in research and industry, for a range of problems involving the analysis of 3D audio, including immersive sound localization, audio enhancement, acoustic scene recognition, and audio super-resolution.

ERC
PE6_11, PE7_7, PE6_7
Keywords:
MACHINE LEARNING, SIGNAL PROCESSING, NEURAL NETWORKS, 3D DATA ACQUISITION AND MODELLING, INFORMATION AND COMMUNICATION TECHNOLOGIES

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma