End-to-End Learning for 3D Acoustic Scene Analysis (ELeSA)
We use our auditory system not only to listen to and recognize sounds, but also to make spatial sense of the surrounding environment and navigate in it. The sense of spatial immersion in a sound field allows a listener to clearly perceive every surrounding sound, as well as to recognize an acoustic environment characterized by certain sounds. The ELeSA project focuses on 3D acoustic scene analysis and understanding, with the aim of detecting, localizing and classifying sound sources and accurately describing their nature. This goal also entails enhancing the audio quality of the signals recorded by 3D microphone arrays within the acoustic scene surrounding the user. 3D acoustic scene analysis can have a great impact on many applications, including audio virtual reality, speech and sound recognition, and safety and security. Moreover, the same approach can be applied in many other fields, from telecommunications to electronics, physics and the manufacturing industry.
To accomplish this goal, suitable and powerful algorithms can be developed and implemented based on the advanced machine learning paradigm called end-to-end learning. With traditional machine learning methods, the output of a model is only as accurate as the features selected for the specific task. However, optimal feature selection requires a priori knowledge of the input signals, which is not always available in practical applications. End-to-end learning processes raw data directly, thus enabling the processing of more complex structured data and resulting in more natural and reliable output. The main drawback of end-to-end learning is the large amount of computational resources it requires. However, this issue can be largely addressed by using high-performance GPU servers.
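The contrast between the two paradigms can be illustrated with a minimal sketch. It is not the ELeSA implementation: the function names, the toy signal, the frame sizes and the choice of log frame energy as the hand-crafted feature are all assumptions made for illustration. The first function computes a fixed feature chosen in advance by a designer, while the second applies a 1-D convolution directly to the raw samples, with kernel weights that, in an end-to-end model, would be learned jointly with the rest of the network (here they are left untrained).

```python
import math
import random

def frame_signal(signal, frame_len, hop):
    """Split a 1-D list of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def handcrafted_features(signal, frame_len=4, hop=2):
    """Traditional pipeline: a fixed, human-chosen feature (log frame energy)."""
    return [math.log(sum(s * s for s in frame) + 1e-9)
            for frame in frame_signal(signal, frame_len, hop)]

def learned_frontend(signal, kernel, hop=2):
    """End-to-end pipeline: a 1-D convolution over the raw samples.
    In a real model the kernel weights would be trained; here they are
    random placeholders (an assumption of this sketch)."""
    return [sum(w * s for w, s in zip(kernel, frame))
            for frame in frame_signal(signal, len(kernel), hop)]

# Toy raw waveform (hypothetical data, not from any recording).
raw = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]

random.seed(0)
kernel = [random.uniform(-1, 1) for _ in range(4)]  # untrained weights

fixed = handcrafted_features(raw)
learned = learned_frontend(raw, kernel)
```

In this toy signal the second frame is the sign-flipped copy of the first. The energy feature maps both frames to the same value and so discards the polarity, whereas the convolution output changes sign with the input. This illustrates how a fixed feature commits in advance to what information is kept, while a learned front-end leaves that decision to training.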
In the medium and long term, we expect the ELeSA project to have a positive impact on both industry and the research community, thanks to the wide range of solutions it can offer across several scenarios.