granular computing | Ricerc@Sapienza

Complexity vs. performance in granular embedding spaces for graph classification

The most distinctive trait of structural pattern recognition in the graph domain is the ability to deal with the organization of, and relations between, the constituent entities of the pattern. Even if this can be convenient and/or necessary in many contexts, most state-of-the-art classification techniques cannot be deployed directly in the graph domain without first embedding graph patterns into a metric space. Granular Computing is a powerful information processing paradigm that can be employed to drive the automatic synthesis of embedding spaces from structured domains.

An enhanced filtering-based information granulation procedure for graph embedding and classification

Granular Computing is a powerful information processing paradigm for synthesizing advanced pattern recognition systems in non-conventional domains. In this paper, a novel procedure for the automatic synthesis of suitable information granules is proposed. The procedure leverages a joint sensitivity-vs-specificity score that accounts for the meaningfulness of candidate information granules for each class considered in the classification problem at hand.
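As a rough illustration of a per-class sensitivity-vs-specificity score for one candidate granule, the sketch below uses Youden's J statistic (sensitivity + specificity - 1) as the joint score. This choice is an assumption made for illustration; the abstract does not state how the two quantities are combined. The function name `class_scores` and the boolean "pattern contains granule" encoding are likewise hypothetical.

```python
def class_scores(contains, labels):
    """Score one candidate granule against every class.

    contains[i] is True if pattern i contains the granule,
    labels[i] is the class of pattern i.
    The joint score is Youden's J (an assumption, not the paper's formula).
    """
    scores = {}
    for c in sorted(set(labels)):
        pairs = list(zip(contains, labels))
        tp = sum(1 for has, y in pairs if has and y == c)
        fn = sum(1 for has, y in pairs if not has and y == c)
        tn = sum(1 for has, y in pairs if not has and y != c)
        fp = sum(1 for has, y in pairs if has and y != c)
        sensitivity = tp / (tp + fn)  # tp + fn > 0 because c occurs in labels
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        scores[c] = sensitivity + specificity - 1.0
    return scores


# Toy data: the granule appears in every "a" pattern and in no "b" pattern.
contains = [True, True, False, False]
labels = ["a", "a", "b", "b"]
scores = class_scores(contains, labels)
```

Under this assumed score, a granule that is highly informative for a class scores close to 1 for that class, and a filtering step could keep only granules whose score exceeds a per-class threshold.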

Mining m-grams by a granular computing approach for text classification

Text mining and text classification are gaining more and more importance in AI-related research fields. Researchers are particularly focused on classification systems based on structured data (such as sequences or graphs), facing the challenge of synthesizing interpretable models by exploiting gray-box approaches. In this paper, a novel gray-box text classifier is presented. Documents to be classified are split into their constituent words, or tokens. Frequent groups of m tokens (or m-grams) are suitably mined by adopting the Granular Computing framework.
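A minimal sketch of the idea of mining frequent m-grams and using them as an embedding alphabet follows. It assumes that m-grams are contiguous token groups, that "frequent" means appearing in at least `min_docs` documents, and that tokenization is a lowercase whitespace split. None of these details is given in the abstract, and all function names are hypothetical.

```python
from collections import Counter


def mgrams(tokens, m):
    """Return all contiguous groups of m tokens."""
    return [tuple(tokens[i:i + m]) for i in range(len(tokens) - m + 1)]


def mine_frequent_mgrams(documents, m, min_docs):
    """Keep the m-grams that occur in at least min_docs documents."""
    doc_freq = Counter()
    for doc in documents:
        # A set is used so each document counts an m-gram at most once.
        doc_freq.update(set(mgrams(doc.lower().split(), m)))
    return sorted(g for g, n in doc_freq.items() if n >= min_docs)


def embed(document, alphabet, m):
    """Map a document to a vector of occurrence counts over the alphabet."""
    counts = Counter(mgrams(document.lower().split(), m))
    return [counts[g] for g in alphabet]


docs = ["the cat sat on the mat", "the cat ate the fish", "a dog sat on the rug"]
alphabet = mine_frequent_mgrams(docs, m=2, min_docs=2)
vectors = [embed(d, alphabet, 2) for d in docs]
```

Each document becomes a fixed-length vector, so any standard classifier can be trained on `vectors`, and each feature stays readable as a specific m-gram.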

On the optimization of embedding spaces via information granulation for pattern recognition

Embedding spaces are one of the mainstream approaches when dealing with structured data. In the last decade, Granular Computing has emerged as a powerful paradigm for the automatic synthesis of embedding spaces that, at the same time, yield an interpretable model on top of meaningful entities known as information granules. Usually, in these contexts, one aims at finding the smallest set of information granules in order to boost model interpretability while keeping satisfactory performance.

An ecology-based index for text embedding and classification

Natural language processing and text mining applications have gained growing attention and diffusion in the computer science and machine learning communities. In this work, a new embedding scheme is proposed for solving text classification problems. The embedding scheme relies on a statistical assessment of relevant words within a corpus using a compound index originally proposed in ecology: this makes it possible to spot relevant parts of the overall text (e.g., words) on top of which the embedding is performed following a Granular Computing approach.

Exploiting cliques for granular computing-based graph classification

The most fascinating aspect of graphs is their ability to encode the information contained in the inner structural organization of their constituent elements. Learning from graphs belongs to so-called Structural Pattern Recognition, from which Graph Embedding emerged as a successful method for processing graphs by evaluating their dissimilarity in a suitable geometric space.

Granular computing techniques for bioinformatics pattern recognition problems in non-metric spaces

Computational intelligence and pattern recognition techniques are gaining more and more attention as the main computing tools in bioinformatics applications. This is because biology, by definition, deals with complex systems, and computational intelligence can be considered an effective approach when facing the general problem of complex systems modelling. Moreover, most data available on shared databases are represented by sequences and graphs, thus demanding the definition of meaningful dissimilarity measures between patterns, which are often non-metric in nature.

The universal phenotype

Commentary on: Martino, A, Giuliani, A, Todde, V, Bizzarri, M, Rizzi, A, 2019, “Metabolic Networks Classification Knowledge Discovery by Information Granulation” Computers in Biology and Chemistry, pp. 107187. DOI: 10.1016/j.compbiolchem.2019.107187

(Hyper)Graph embedding and classification via simplicial complexes

This paper investigates a novel graph embedding procedure based on simplicial complexes. Inherited from algebraic topology, simplicial complexes are collections of increasing-order simplices (e.g., points, lines, triangles, tetrahedrons) which can be interpreted as possibly meaningful substructures (i.e., information granules) on top of which an embedding space can be built by means of symbolic histograms. In the embedding space, any Euclidean pattern recognition system can be used, possibly equipped with feature selection capabilities in order to select the most informative symbols.
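The symbolic-histogram embedding described above can be sketched as follows. The sketch assumes the clique complex of the graph (every fully connected node subset is a simplex), simplices up to triangles, and the sorted tuple of node labels as the symbol for a simplex. These are illustrative assumptions rather than details taken from the paper, and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def simplices(edges, labels, max_dim=2):
    """Enumerate simplices of the clique complex up to dimension max_dim.

    labels maps node -> label; each simplex is returned as a sorted label tuple.
    Brute-force enumeration, suitable only for small graphs.
    """
    adj = {v: set() for v in labels}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    found = []
    for k in range(1, max_dim + 2):  # k nodes form a (k - 1)-simplex
        for nodes in combinations(sorted(labels), k):
            if all(b in adj[a] for a, b in combinations(nodes, 2)):
                found.append(tuple(sorted(labels[n] for n in nodes)))
    return found


def symbolic_histogram(graph, alphabet):
    """Count how often each alphabet symbol occurs among the graph's simplices."""
    counts = Counter(simplices(*graph))
    return [counts[s] for s in alphabet]


# Two toy labelled graphs: a triangle and a single edge.
g1 = ([(0, 1), (1, 2), (0, 2)], {0: "C", 1: "C", 2: "O"})
g2 = ([(0, 1)], {0: "C", 1: "O"})
alphabet = sorted(set(simplices(*g1)) | set(simplices(*g2)))
h1 = symbolic_histogram(g1, alphabet)
h2 = symbolic_histogram(g2, alphabet)
```

The resulting vectors `h1` and `h2` live in a Euclidean space whose axes are the symbols in `alphabet`, so a feature selection step can drop the least informative symbols before classification.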

Stochastic information granules extraction for graph embedding and classification

Graphs are data structures able to efficiently describe real-world systems and, as such, have been extensively used in recent years by many branches of science, including machine learning engineering. However, the design of efficient graph-based pattern recognition systems is bottlenecked by the intrinsic problem of how to properly match two graphs. In this paper, we investigate a granular computing approach for the design of a general-purpose graph-based classification system.