Next-generation technologies, ranging from driverless cars to immersive virtual reality, are expected to understand and analyse the surrounding world through a range of high-resolution sensors. In particular, 3D audio sensors will endow them with a clear spatial sense of the environment, on par with the human auditory system. Exploiting this information can provide agents and autonomous applications with the capability of localizing and conveying sounds more efficiently and with a higher level of perceptual awareness. At the same time, analysing raw 3D audio data in real time poses significant new research and implementation challenges that hinder successful deployment. Algorithms must be able to understand the spatial distribution of audio sources in the sound field while also allowing for efficient inference from the raw waveforms in a variety of applications.
The aim of the HYD3A project (pronounced 'idea') is to design a family of deep learning algorithms tailored to such 3D audio signals and suited for deployment in immersive environments. To accomplish this goal, the algorithms will leverage a new generation of deep neural networks that model and learn signals in hypercomplex (e.g., quaternion) domains.
Prior research has shown that 3D audio can be naturally modelled in a hypercomplex representation. HYD3A will build upon these insights to design a set of deep networks for analysing 3D audio captured by a variety of microphone sensors. Hypercomplex deep networks have the potential to significantly reduce network complexity with respect to state-of-the-art competitors (thus simplifying their on-device implementation), while allowing for a more accurate learning and optimization procedure.
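To illustrate the idea behind this claim, the following minimal sketch (not HYD3A code, and purely an assumed setup) packs the four channels of a first-order Ambisonics signal (W, X, Y, Z) into quaternions and applies a toy quaternion dense layer based on the Hamilton product. Because weights are shared across the four components of each quaternion, such a layer stores roughly a quarter of the real-valued parameters of an equivalent real layer acting on the flattened channels, which is the source of the complexity reduction mentioned above.

```python
import numpy as np

# Hypothetical sketch: a first-order Ambisonics (B-format) frame with channels
# W, X, Y, Z can be viewed as a quaternion-valued signal q = W + X*i + Y*j + Z*k,
# one quaternion per time sample.

def hamilton_product(q, p):
    """Hamilton product of two quaternions given as (w, x, y, z) arrays."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = p
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def quaternion_dense(x, weights):
    """Toy quaternion 'dense' layer.

    x: (n_in, 4) input quaternions; weights: (n_out, n_in, 4) quaternion weights.
    Each output quaternion is a sum of Hamilton products, so the layer stores
    4 * n_in * n_out real parameters, versus 16 * n_in * n_out for a real-valued
    layer mapping the flattened 4 * n_in inputs to 4 * n_out outputs.
    """
    n_out = weights.shape[0]
    out = np.zeros((n_out, 4))
    for o in range(n_out):
        for i in range(x.shape[0]):
            out[o] += hamilton_product(weights[o, i], x[i])
    return out

# Example: 8 Ambisonics samples mapped to 2 quaternion outputs.
rng = np.random.default_rng(0)
frame = rng.standard_normal((8, 4))      # 8 samples of (W, X, Y, Z)
W = rng.standard_normal((2, 8, 4))       # 64 real parameters
print(quaternion_dense(frame, W).shape)  # (2, 4); a real layer would need 256 weights
```

This is only a schematic of quaternion-domain processing under the assumptions stated above; HYD3A's actual architectures, layer types, and input formats are to be defined within the project.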
HYD3A is expected to have a positive impact, in both research and industry, on a range of problems involving the analysis of 3D audio, including immersive sound localization, audio enhancement, acoustic scene recognition, and audio super-resolution.