clustering

Facing Big Data by an agent-based multimodal evolutionary approach to classification

Multi-agent systems have recently gained a lot of attention for solving machine learning and data mining problems. Furthermore, their peculiar divide-and-conquer approach is appealing when large datasets have to be analyzed. In this paper, we propose a multi-agent classification system able to tackle large datasets, where each agent independently explores a small random portion of the overall dataset, searching for meaningful clusters in the subspaces where they are well-formed (i.e., compact and populated).
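As a rough illustration of the idea of an agent that sees only a random portion of the data in a random subspace and judges clusters by compactness and cardinality, a minimal sketch follows. The function names, the centroid-based compactness measure, and the thresholds `max_spread` and `min_size` are assumptions made for this example; the abstract does not specify how the paper defines them.

```python
import random


def agent_view(dataset, sample_size, subspace_size, rng):
    """Return a random subset of rows restricted to a random subset of features.

    Hypothetical helper: the abstract only says each agent explores a small
    random portion of the dataset in some subspace.
    """
    rows = rng.sample(range(len(dataset)), sample_size)
    dims = sorted(rng.sample(range(len(dataset[0])), subspace_size))
    view = [[dataset[r][d] for d in dims] for r in rows]
    return view, rows, dims


def compactness(cluster):
    """Average Euclidean distance of the cluster points from their centroid."""
    n, m = len(cluster), len(cluster[0])
    centroid = [sum(p[d] for p in cluster) / n for d in range(m)]
    return sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(m)) ** 0.5 for p in cluster
    ) / n


def is_well_formed(cluster, max_spread, min_size):
    """Assumed criterion: enough points (populated) and small spread (compact)."""
    return len(cluster) >= min_size and compactness(cluster) <= max_spread


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.random() for _ in range(5)] for _ in range(100)]
    view, rows, dims = agent_view(data, sample_size=10, subspace_size=2, rng=rng)
    print(dims, is_well_formed(view, max_spread=0.5, min_size=5))
```

Because each agent touches only `sample_size` rows and `subspace_size` features, the cost per agent is independent of the full dataset size, which is what makes the divide-and-conquer scheme attractive for large datasets.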

Graph Fourier transform for directed graphs based on Lovász extension of min-cut

A key tool to analyze signals defined over a graph is the so-called Graph Fourier Transform (GFT). Alternative definitions of the GFT have been proposed, based on the eigen-decomposition of either the graph Laplacian or the adjacency matrix. In this paper, we introduce an alternative approach, valid for the general case of directed graphs, that builds the graph Fourier basis as the set of orthonormal vectors that minimize a well-defined continuous extension of the graph cut size, known as the Lovász extension.
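To make the notion of a continuous extension of the cut size concrete, the sketch below evaluates the Lovász extension of the directed cut function, which is the sum over edges of the edge weight times max(x_i - x_j, 0), and checks that it coincides with the cut size on indicator vectors. The convention that `adj[i][j]` is the weight of the edge from node i to node j, the function names, and the 4-node example graph are assumptions of this example. The sketch only evaluates the objective; it does not perform the orthonormality-constrained minimization that builds the Fourier basis.

```python
from itertools import combinations


def cut_size(adj, subset):
    """Total weight of the edges leaving `subset` (edge i -> j weighs adj[i][j])."""
    inside = set(subset)
    return sum(
        adj[i][j] for i in inside for j in range(len(adj)) if j not in inside
    )


def directed_variation(adj, x):
    """Lovász extension of the cut size: sum_ij adj[i][j] * max(x[i] - x[j], 0)."""
    n = len(adj)
    return sum(
        adj[i][j] * max(x[i] - x[j], 0.0) for i in range(n) for j in range(n)
    )


def indicator(subset, n):
    """0/1 vector that is 1 exactly on the nodes in `subset`."""
    return [1.0 if i in subset else 0.0 for i in range(n)]


def extension_matches_cut(adj):
    """Check that the extension equals the cut size on every indicator vector."""
    n = len(adj)
    return all(
        directed_variation(adj, indicator(set(s), n)) == cut_size(adj, s)
        for k in range(n + 1)
        for s in combinations(range(n), k)
    )


# Hypothetical directed graph used only for illustration.
ADJ = [
    [0, 2, 0, 0],
    [0, 0, 1, 0],
    [3, 0, 0, 1],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print(extension_matches_cut(ADJ))
    print(directed_variation(ADJ, [0.5, 0.5, 0.5, 0.5]))  # constant signal
```

A constant signal has zero variation under this measure, so it plays the role of the lowest-frequency basis vector, in analogy with the constant eigenvector of the Laplacian in the undirected case.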

A learning intelligent system for classification and characterization of localized faults in Smart Grids

The worldwide power grid can be thought of as a System of Systems deeply embedded in a time-varying, non-deterministic and stochastic environment. The availability of ubiquitous and pervasive technology for heterogeneous data gathering and information processing in Smart Grids enables new methodologies for the challenging task of fault detection and modeling. In this study, a fault recognition system for the Medium Voltage feeders operating in the power grid of Rome, Italy, is presented.

An agent-based algorithm exploiting multiple local dissimilarities for clusters mining and knowledge discovery

We propose a multi-agent algorithm able to automatically discover relevant regularities in a given dataset, determining at the same time the set of configurations of the adopted parametric dissimilarity measure that yield compact and separated clusters. Each agent operates independently by performing a Markovian random walk on a weighted graph representation of the input dataset. Such a weighted graph representation is induced by a specific parameter configuration of the dissimilarity measure adopted by an agent for the search.
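The following sketch illustrates how a parameter configuration of a dissimilarity measure can induce a weighted graph on which a Markovian random walk is run. The weighted Euclidean dissimilarity, the exponential kernel with its `scale` parameter, and all function names are assumptions chosen for this example; the abstract does not state which dissimilarity or edge weighting the algorithm uses.

```python
import math
import random


def weighted_dissimilarity(a, b, weights):
    """Weighted Euclidean dissimilarity; `weights` is the parameter configuration."""
    return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))


def transition_matrix(points, weights, scale=1.0):
    """Row-stochastic matrix of the graph induced by one parameter configuration.

    Edge weights are exp(-dissimilarity / scale) (an assumed kernel), with no
    self-loops, so the walker tends to move between similar patterns.
    """
    n = len(points)
    rows = []
    for i in range(n):
        sims = [
            0.0
            if i == j
            else math.exp(-weighted_dissimilarity(points[i], points[j], weights) / scale)
            for j in range(n)
        ]
        total = sum(sims)
        rows.append([s / total for s in sims])
    return rows


def random_walk(transitions, start, steps, rng):
    """Markovian walk: the next node depends only on the current node."""
    path = [start]
    for _ in range(steps):
        current = path[-1]
        nxt = rng.choices(range(len(transitions)), weights=transitions[current])[0]
        path.append(nxt)
    return path


if __name__ == "__main__":
    # Hypothetical data: feature 0 and feature 1 suggest different groupings.
    points = [(0.0, 0.0), (0.1, 5.0), (3.0, 0.0), (3.1, 5.0)]
    P = transition_matrix(points, weights=(1.0, 0.0))
    print(random_walk(P, start=0, steps=10, rng=random.Random(0)))
```

With `weights=(1.0, 0.0)` the walker starting at point 0 mostly moves to point 1, while with `weights=(0.0, 1.0)` it mostly moves to point 2: different parameter configurations expose different cluster structures in the same data, which is the reason for searching over configurations and clusters at the same time.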

A generalized framework for ANFIS synthesis procedures by clustering techniques

The application of machine learning and soft computing techniques to function approximation is a widely explored topic in the literature. Neural networks, evolutionary algorithms and support vector machines have proved to be very effective, although these models suffer from a very low level of interpretability by human operators. Conversely, Adaptive Neuro Fuzzy Inference Systems (ANFISs) have proved to be very accurate models that also offer a considerable degree of interpretability. In this paper, a general framework for ANFIS training by clustering is proposed and investigated.

A cluster-based dissimilarity learning approach for localized fault classification in Smart Grids

Modeling and recognizing faults and outages in a real-world power grid is a challenging task, in line with the modern concept of Smart Grids. The availability of Smart Sensors and data networks makes it possible to “x-ray scan” the states of the power grid. The present paper deals with a system for recognizing fault states, described by heterogeneous information, in the real-world power grid managed by the ACEA company in Italy.

Frame-by-frame Wi-Fi attack detection algorithm with scalable and modular machine-learning design

The popularity of Wi-Fi networks coupled with the intrinsic vulnerability of wireless interfaces has promoted the investigation and proposal of traffic analysis and anomaly detection algorithms targeted to that application. We propose a scalable and modular algorithm architecture to set up a lightweight classifier, able to detect malicious frames with high reliability, allowing a simple implementation and suitable for real-time operations.

Characterizing the heterogeneity of European higher education institutions combining cluster and efficiency analyses

The heterogeneity of Higher Education (HE) Institutions is one of the main critical issues in properly assessing systemic performance. We adopt a multi-level perspective by combining national (macro) and institution (micro) level data and analyses. We combine clustering and efficiency analysis to characterize the heterogeneity of HE systems (at the national level) by exploiting micro level data. We also show the potential of using micro level data to characterize national level performance.

An INDCLUS-type model for occasion-specific complementary partitions

An INDCLUS-type model is presented for partitioning units in three-way proximity data, taking into account the systematic differences among the pairwise similarities observed on different occasions. In particular, the proximity structure of each occasion is assumed to consist of two complementary partitions: a subset of units defines groups common to all occasions, while the remaining units are allocated to groups specific to each occasion.

Rootclus. Searching for "ROOT CLUSters" in three-way proximity data

In the context of three-way proximity data, an INDCLUS-type model is presented to address the issue of subject heterogeneity regarding the perception of pairwise object similarity. The model, termed ROOTCLUS, allows for the detection of a subset of objects whose similarities are described in terms of non-overlapping clusters (ROOT CLUSters) common across all subjects. For the other objects, subject-specific individual partitions are allowed, whose clusters are linked one-to-one to the Root clusters.

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma