audio processing | Ricerc@Sapienza

ISPAMM - Intelligent Signal Processing and MultiMedia

The group's research activities include: Machine Learning for Signal Processing, Adaptive Audio Array Processing, Blind Signal Processing, Audio Processing and Computer Music, Neural Networks for Signal Processing, Optimization Algorithms for Mach

Deep recurrent neural networks for audio classification in construction sites

In this paper, we propose a Deep Recurrent Neural Network (DRNN) approach based on Long Short-Term Memory (LSTM) units for the classification of audio signals recorded in construction sites. Five classes of multiple vehicles and tools, normally used in construction sites, have been considered. The input provided to the DRNN consists of the concatenation of several spectral features, such as MFCCs, mel-scaled spectrogram, chroma and spectral contrast. The proposed architecture and the feature extraction have been described.
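The feature concatenation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each feature is a matrix with one row per coefficient and one column per time frame, and that all features share the same frame count. The function name `concatenate_features` and the toy feature sizes are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

Matrix = List[List[float]]  # rows = feature coefficients, columns = time frames


def concatenate_features(features: Dict[str, Matrix]) -> Matrix:
    """Stack feature matrices along the feature axis and return one
    vector per time frame (shape: n_frames x total_features)."""
    frame_counts = {len(rows[0]) for rows in features.values()}
    if len(frame_counts) != 1:
        raise ValueError("all features must share the same number of frames")
    # Put the rows of every feature matrix one under the other.
    stacked = [row for rows in features.values() for row in rows]
    # Transpose so that each element is the feature vector of a single frame.
    return [list(frame) for frame in zip(*stacked)]


# Toy input: 3 time frames, hypothetical feature sizes (not from the paper).
toy = {
    "mfcc": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],  # 2 coefficients
    "mel": [[1.0, 1.1, 1.2]],                    # 1 mel band
    "chroma": [[0.0, 1.0, 0.0]],                 # 1 pitch class
    "spectral_contrast": [[2.0, 2.1, 2.2]],      # 1 band
}
sequence = concatenate_features(toy)
print(len(sequence), len(sequence[0]))  # 3 frames, 5 features per frame
```

The resulting list has one vector per frame, which is the kind of sequence a recurrent network consumes step by step. In practice the feature matrices would come from an audio analysis library rather than from hand-written lists.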

Guest editorial special issue on computational intelligence for end-to-end audio processing

The goal of this special issue is to understand how and to what extent novel computational intelligence techniques based on the emerging end-to-end learning paradigm can be efficiently employed in Digital Audio, in the light of all aforementioned aspects.

A CNN approach for audio classification in construction sites

Convolutional Neural Networks (CNNs) have been widely used in the field of audio recognition and classification, since they often provide positive results. Motivated by the success of this kind of approach and the lack of practical methodologies for the monitoring of construction sites by using audio data, we developed an application for the classification of different types and brands of construction vehicles and tools, which operates on the emitted audio through a stack of convolutional layers.
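As a rough illustration of what "a stack of convolutional layers" does to a spectrogram-like input, the sketch below applies two single-channel convolutions, each followed by a ReLU. It is a simplification under stated assumptions: the kernels are hand-picked rather than learned, there is one channel, and there is no pooling or classification layer. The helper names (`conv2d_valid`, `relu`, `conv_stack`) and the toy values are hypothetical and do not come from the paper.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """2-D cross-correlation with stride 1 and no padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(x: Matrix) -> Matrix:
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in x]


def conv_stack(image: Matrix, kernels: List[Matrix]) -> Matrix:
    """Apply each kernel in turn, with a ReLU after every convolution."""
    out = image
    for kernel in kernels:
        out = relu(conv2d_valid(out, kernel))
    return out


# Toy 4x4 "spectrogram" (frequency x time); values are made up.
spectrogram = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
]
kernels = [
    [[-1.0, 0.0], [0.0, 1.0]],     # hand-picked diagonal difference
    [[0.25, 0.25], [0.25, 0.25]],  # hand-picked 2x2 average
]
features = conv_stack(spectrogram, kernels)
print(len(features), len(features[0]))  # each 2x2 'valid' layer shrinks the map by 1
```

Each layer without padding reduces the height and width by the kernel size minus one, so the 4x4 input becomes 3x3 and then 2x2. A trained network would use many learned kernels per layer and feed the final feature maps to a classifier.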