Ricerc@Sapienza

Interpretability in Machine Learning with Applications to Genomics and Finance

Year

2020

Proponent Aristidis Anagnostopoulos - Full Professor

Department

DIPARTIMENTO DI INGEGNERIA INFORMATICA, AUTOMATICA E GESTIONALE -ANTONIO RUBERTI-

ERC subsector of the project proponent

PE6_11

Research group members

Member	Category
Luca Becchetti	Staff member of the research group
Andrea Mastropietro	PhD student/Research fellow/Resident, non-staff member of the research group
Stefano Piersanti	PhD student/Research fellow/Resident, non-staff member of the research group

Member	Position	Affiliation	Category
Evangelos Evangelou	Assistant Professor	School of Medicine, University of Ioannina, Greece	Other Sapienza-affiliated or external personnel, research scholarship holders
Georgios Markopoulos	Postdoc	School of Medicine, University of Ioannina, Greece	Other Sapienza-affiliated or external personnel, research scholarship holders

Abstract

The high performance of neural networks in classification and prediction tasks has led to their application in practically every area where large amounts of data are available. Because of their complicated mathematical structure, they have almost always been treated as black boxes that provide no information about the mechanisms contributing to the output. Yet for some applications, two notable examples being bioinformatics and finance, it is of paramount importance to know the criteria that lead to a given output.
The goal of this project is twofold. First, it will significantly advance the state of the art by proposing, analyzing, and evaluating techniques for understanding how input features in neural-network-based classification functions interact with each other and what effect these interactions have on the network output.
Second, it will apply these findings to the fields of bioinformatics and finance. In bioinformatics, the application scenario is the discovery of epistatic interactions, that is, the detection of interactions between genes in the human genome. In finance, the application is the prediction of company default: what factors can lead a company to default within a 12-month period?
Both applications will be based on the analysis of valuable, large, high-quality datasets.
The team is composed of experts in Data Science and Big Data analysis, as well as experts in the two application areas, who will collaborate to advance computer science and to make specific contributions to their respective fields.

ERC

PE6_6

Keywords:

MACHINE LEARNING, COMPUTER ENGINEERING, BIOSTATISTICS, BIOINFORMATICS, FINANCE