Name and title of the project proposer: 
sb_p_2056714
Year: 
2020
Abstract: 

The high performance of neural networks in classification and prediction tasks has led to their application in practically every area where large amounts of data are available. Given their complicated mathematical structure, they have almost always been treated as black boxes that provide no information about the mechanisms contributing to the output. Yet for some applications, two notable examples being bioinformatics and finance, it is of paramount importance to know the criteria that lead to a given output.
The goal of this project is twofold. First, it will significantly advance the state of the art by proposing, analyzing, and evaluating techniques for understanding how input features in neural-network-based classification functions interact with each other and what effect these interactions have on the network output.
Second, it will apply these findings to the fields of bioinformatics and finance. In the former, the application scenario is the discovery of epistatic interactions, that is, the detection of interactions between genes in the human genome. In the latter, the application is the prediction of company default: which factors can lead a company to default within a 12-month period?
Both applications will be based on the analysis of valuable, large, and high-quality datasets.
The team is composed of experts in Data Science and Big Data analysis, as well as experts in the two application areas, who will collaborate to bring advances to computer science as well as specific contributions to the respective fields.

ERC: 
PE6_6
Research group members: 
sb_cp_is_2608725
sb_cp_is_2769392
sb_cp_is_2599446
sb_cp_is_2636179
sb_cp_is_2863490
sb_cp_es_393674
sb_cp_es_393675
Innovativeness: 

The wide success of neural networks over the last 10 years has made them one of the main solutions in many machine learning and data science applications. The main drawback hindering their use in some areas is the issue of "explainability": the capacity not only to give an answer but also to explain it. In certain areas, this has significantly slowed the adoption of neural networks.

The innovation of this project stems from addressing this crucial issue, and it is for this reason that we selected two applications that have suffered particularly from this phenomenon.

As discussed in the state-of-the-art section, it is only in the last 2-3 years that scientific research has addressed this topic. The reason is that the problem is technically hard, as neural networks are notoriously complicated. Thus there is a need to design tools for analyzing them.

Looking at the particular applications, most biostatistical approaches are based on classical statistical techniques (univariate and multivariate linear/logistic regression). Findings that shed light on why a classifier reaches a decision can be translated into knowledge of, for example, which mutation is responsible for increasing blood pressure by 10 mmHg.

Thus, the potential impact of this research can be enormous, not only in computer science but also in the respective application fields.

One additional important asset of this project is the quantity and quality of the data used, which are of paramount importance for the type of research carried out. For the bioinformatics application, the dataset is the largest publicly available collection of genetic information worldwide (500K patients, 73M mutations). Given that the number of potential feature interactions is enormous, a large number of patients provides the statistical power to discover even the weaker ones. For the finance application, the dataset made available by the Banca d'Italia (unfortunately, not publicly available) makes it possible to perform analysis at a scale that has not been reported in the finance literature and, as a result, to arrive at new findings.

Call code: 
2056714

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma