This project aims to establish an international collaboration between the applicant and GETALP (Study Group for Machine Translation and Automated Processing of Languages and Speech), a research group of the Laboratoire d'Informatique de Grenoble (LIG) at the Université Grenoble Alpes.
The collaboration aims to make the linguistic data resulting from my PhD research on the Tunisian dialect available in Open Access, through a web interface that allows users to interact with the data. The information to be made available through the system is: POS tagging, stemming, lemmatization, glossing, transliteration, and diatopic and diachronic information (at least initially; further levels of interaction with the data could be added in the future). Through the search engine of the web platform, users will be able to enter linguistic strings in Italian, English or Tunisian, the latter in any of its writing systems: Romanization, Arabic characters, or Arabish (Tunisian written in Roman script and numbers, as used by native speakers on social networks). As designed, the system is not intended to provide an automatic translation of the strings, but rather a set of essential, concise information that allows users to reconstruct for themselves a potentially perfect translation, in English or Italian, of the text entered.
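As a purely illustrative sketch of the kind of query handling described above, the following Python fragment guesses the writing system of a search string and looks it up in a toy lexicon. Everything in it is an assumption made for illustration: the names `detect_script`, `Annotation`, `LEXICON` and `lookup`, the field layout, and the single sample entry are hypothetical and are not taken from the project data or from any existing GETALP tool. The script detection is deliberately naive: a string counts as Arabish only if a digit is attached to a Latin letter, so Arabish words written without digits would be classified as plain Latin script.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Any character in the basic Unicode Arabic block (U+0600 to U+06FF).
ARABIC_CHARS = re.compile(r"[\u0600-\u06FF]")
# A digit directly next to a Latin letter, as in "3aslema".
ARABISH_DIGIT = re.compile(r"[A-Za-z][0-9]|[0-9][A-Za-z]")


def detect_script(query: str) -> str:
    """Return a rough guess of the writing system: 'arabic', 'arabish' or 'latin'."""
    if ARABIC_CHARS.search(query):
        return "arabic"
    if ARABISH_DIGIT.search(query):
        return "arabish"
    return "latin"


@dataclass
class Annotation:
    """Hypothetical record holding some of the information levels listed above."""
    form: str
    pos: str
    lemma: str
    gloss_en: str
    gloss_it: str


# Toy lexicon with one illustrative entry (not drawn from the project corpus).
LEXICON = {
    "3aslema": Annotation(
        form="3aslema", pos="INTJ", lemma="3aslema",
        gloss_en="hello", gloss_it="ciao",
    ),
}


def lookup(query: str) -> Tuple[str, Optional[Annotation]]:
    """Return the detected script and the matching annotation, if any."""
    key = query.strip().lower()
    return detect_script(key), LEXICON.get(key)
```

In this sketch, `lookup("3aslema")` returns the detected script together with the stored annotation, while an unknown string returns its detected script and `None`. An actual implementation would presumably map the different writing systems onto a shared representation before lookup, which is not attempted here.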
To support the collaboration with the GETALP group, the applicant will undertake parallel training in computational linguistics, following the training path outlined in the relevant sections.
This research has high innovative potential, since it aims to create a virtual interface that can readily adapt to developments in research. The project aims to combine the efforts of researchers and experts in different fields, both in the structural phase of implementing the interface and in the phase of making research data available, so that all those working on Tunisian who wish to do so can make the results of their research available in Open Access.
As already seen in the section dedicated to the state of the art, no similar tool exists for the Tunisian dialect. Moreover, this project combines dialectological knowledge, the result of direct contact with Tunisian culture, with current language technologies.
Another innovative aspect is that the project deals with the written variety of Tunisian used online, Arabish (see above), which is still very little studied.
Future developments of this collaboration in general, and of the virtual interface in particular, are necessarily open-ended, since the tool lends itself to a wide range of analyses, developments and extensions in various areas, such as supporting the teaching of Tunisian Arabic or including other Arabic dialects in the data system.