machine learning | Ricerc@Sapienza

Mitch: A Machine Learning Approach to the Black-Box Detection of CSRF Vulnerabilities

Cross-Site Request Forgery (CSRF) is one of the oldest and simplest attacks on the Web, yet it is still effective on many websites and it can lead to severe consequences, such as economic losses and account takeovers. Unfortunately, tools and techniques proposed so far to identify CSRF vulnerabilities either need manual reviewing by human experts or assume the availability of the source code of the web application. In this paper we present Mitch, the first machine learning solution for the black-box detection of CSRF vulnerabilities.

SF-UDA-3D: Source-Free Unsupervised Domain Adaptation for LiDAR-Based 3D Object Detection

3D object detectors based only on LiDAR point clouds hold the state of the art on modern street-view benchmarks. However, LiDAR-based detectors generalize poorly across domains due to domain shift. In the case of LiDAR, in fact, domain shift is not only due to changes in the environment and in the object appearances, as for visual data from RGB cameras, but is also related to the geometry of the point clouds (e.g., point density variations).

Digital biomarker-based individualized prognosis for people at risk of dementia

Background: Research investigating treatments and interventions for cognitive decline fails due to difficulties in accurately recognizing behavioral signatures in the presymptomatic stages of the disease. For this validation study, we took our previously constructed digital biomarker-based prognostic models and focused on the generalizability and robustness of the models.

Molecular design aided by random forests and synthesis of potent trypanocidal agents as cruzain inhibitors for Chagas disease treatment

Cruzain is an established target for the identification of novel trypanocidal agents, but how good are in vitro/in vivo correlations? This work describes the development of a random forests model for the prediction of the bioavailability of cruzain inhibitors that are Trypanosoma cruzi killers. Some common properties that characterize drug-likeness are poorly represented in many established cruzain inhibitors. This correlates with the evidence that many high-affinity cruzain inhibitors are not trypanocidal agents against T. cruzi.

Claim watching and individual claims reserving using classification and regression trees

We present an approach to individual claims reserving and claim watching in general insurance based on classification and regression trees (CART). We propose a compound model consisting of a frequency section, for the prediction of events concerning reported claims, and a severity section, for the prediction of paid and reserved amounts. The formal structure of the model is based on a set of probabilistic assumptions which allow the provision of sound statistical meaning to the results provided by the CART algorithms.

Multiresolution topological data analysis for robust activity tracking

Multidimensional sensors represent an increasingly popular, yet challenging data source in modern statistics. Using tools from the emerging branch of Topological Data Analysis (TDA), we address two issues frequently encountered when analysing sensor data, namely their (often) high dimension and their sensitivity to the reference system. We show how topological invariants provide a tool for detecting change-points that is robust with respect to both the time resolution we consider and the sensor placement.

Machine learning and network medicine: a novel approach for precision medicine and personalized therapy in cardiomyopathies

The early identification of pathogenic mechanisms is essential to predict the incidence and progression of cardiomyopathies and to plan appropriate preventive interventions. Noninvasive cardiac imaging such as cardiac computed tomography, cardiac magnetic resonance, and nuclear imaging plays an important role in diagnosis and management of cardiomyopathies and provides useful prognostic information. Most molecular factors exert their functions by interacting with other cellular components, thus many diseases reflect perturbations of intracellular networks.

Coronavirus disease (COVID-19): a machine learning bibliometric analysis

Background/Aim: To evaluate the research trends in coronavirus disease (COVID-19). Materials and Methods: A bibliometric analysis was performed using a machine learning bibliometric methodology. Information regarding publication outputs, countries, institutions, journals, keywords, funding and citation counts was retrieved from the Scopus database. Results: A total of 1883 eligible papers were returned. An exponential increase in COVID-19 publications occurred in the last months.

Evaluating the predictions of the protein stability change upon single amino acid substitutions for the FXN CAGI5 challenge

Frataxin (FXN) is a highly conserved protein found in prokaryotes and eukaryotes that is required for efficient regulation of cellular iron homeostasis. Experimental evidence associates amino acid substitutions of the FXN to Friedreich Ataxia, a neurodegenerative disorder. Recently, new thermodynamic experiments have been performed to study the impact of somatic variations identified in cancer tissues on protein stability.

Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images

Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and