The Knowledge Acquisition Bottleneck Problem in Multilingual Word Sense Disambiguation

04 Pubblicazione in atti di convegno
Pasini Tommaso

Word Sense Disambiguation (WSD) is the task of identifying the meaning of a word in a given context. It lies at the base of Natural Language Processing as it provides semantic information for words. In the last decade, great strides have been made in this field and much effort has been devoted to mitigate the knowledge acquisition bottleneck problem, i.e., the problem of semantically annotating texts at a large scale and in different languages. This issue is ubiquitous in WSD as it hinders the creation of both multilingual knowledge bases and manually-curated training sets. In this work, we first introduce the reader to the task of WSD through a short historical digression and then take the stock of the advancements to alleviate the knowledge acquisition bottleneck problem. In that, we survey the literature on manual, semi-automatic and automatic approaches to create English and multilingual corpora tagged with sense annotations and present a clear overview over supervised models for WSD. Finally, we provide our view over the future directions that we foresee for the field.

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma