AraCorPy (Arabish Corpus with Python)

Year
2019
Proponent -
Department
ERC subsector of the project proponent
SH4_9
Research group members
Member Category
Arianna D'Ottone Reference tutor
Abstract

The aim of this project is to establish an international collaboration between the proponent and the research group GETALP (Study Group for Machine Translation and Automated Processing of Languages and Speech) of the Laboratoire d'Informatique de Grenoble (LIG) of the Université Grenoble Alpes.
The collaboration aims to make available, in Open Access, the linguistic data resulting from my PhD research on the Tunisian dialect, through a web interface that allows users to interact with the data. The information to be made available through the system is: POS tagging, stemming, lemmatization, glossing, transliteration, and diatopic and diachronic information (at least initially; further levels of interaction with the data could be added in the future). Through the web platform, users would be able to enter search strings in Italian, English, or Tunisian, the latter in any of its writing systems: Romanization, Arabic characters, or Arabish (Tunisian written in Roman script and numerals, as used by native speakers on social networks). As designed, the system is not intended to provide an automatic translation of the strings, but rather a set of essential, concise information that allows users to reconstruct for themselves a potentially perfect translation, in English or Italian, of the text entered.
To support the collaboration with the GETALP group, the proponent will undertake parallel training in computational linguistics, following the training path outlined in the appropriate sections.
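
The search behaviour described above, in which a query in any supported script or language returns annotations instead of a translation, could be sketched roughly as follows. This is only an illustrative sketch: the `Entry` fields, the `LEXICON` contents, and the `normalize` and `lookup` helpers are hypothetical names introduced here, the two entries are toy examples and not data from the corpus, and an in-memory dictionary stands in for whatever storage the actual platform would use.

```python
import unicodedata
from dataclasses import dataclass


# Hypothetical record layout; the field names are assumptions,
# not the project's actual schema.
@dataclass(frozen=True)
class Entry:
    arabish: str        # Roman script with numerals, as used on social networks
    arabic: str         # Arabic characters
    romanization: str   # scholarly transliteration
    pos: str            # part-of-speech tag
    lemma: str
    gloss_en: str
    gloss_it: str


# Toy entries for illustration only; not taken from the project's corpus.
LEXICON = [
    Entry("7out", "حوت", "ḥūt", "NOUN", "ḥūt", "fish", "pesce"),
    Entry("9alb", "قلب", "qalb", "NOUN", "qalb", "heart", "cuore"),
]


def normalize(text: str) -> str:
    """Make keys comparable: Unicode NFC, trimmed, case-insensitive."""
    return unicodedata.normalize("NFC", text).strip().casefold()


def build_index(entries):
    """Map every written form and gloss of an entry to that entry."""
    index = {}
    for entry in entries:
        keys = (entry.arabish, entry.arabic, entry.romanization,
                entry.gloss_en, entry.gloss_it)
        for key in keys:
            bucket = index.setdefault(normalize(key), [])
            if entry not in bucket:  # avoid duplicates when two forms coincide
                bucket.append(entry)
    return index


INDEX = build_index(LEXICON)


def lookup(query: str):
    """Return the annotated entries matching a query in any script or language."""
    return INDEX.get(normalize(query), [])


if __name__ == "__main__":
    for query in ("7out", "قلب", "Pesce"):
        for entry in lookup(query):
            print(query, "->", entry.pos, entry.lemma, entry.gloss_en, entry.gloss_it)
```

Indexing every written form under one normalized key keeps the lookup independent of the script the user happens to type in, which matches the stated goal of returning information about a string and leaving the translation to the user.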

ERC
SH4_9, SH3_12, SH4_8
Keywords:
ARABIC STUDIES, DIALECTOLOGY, COMPUTATIONAL LINGUISTICS, TEXTUAL DATA ANALYSIS, DIGITAL COMMUNICATION

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma