Proceedings DOI: 10.1109/CiSt56084.2023

The paper "Challenges and Progress in Constructing Arabic Dialect Corpora and Linguistic tools: A Focus on Moroccan and Tunisian Dialects" by Ouafae Nahli (CNR-ILC | CWALM Research Unit 2 Leader), Elisa Gugliotta (CNR-ILC | CWALM Research Unit 2), Nadia Khlif (University Mohammed I of Morocco | CNR-ILC | CWALM Research Unit 2) and Giulia Benotto (CNR-ILC | CWALM Research Unit 2) was published in the proceedings of the 2023 7th IEEE Congress on Information Science and Technology (CiSt).

Given the lack of resources for Arabic dialects, the construction of corpora, lexical resources, and tools is a non-trivial challenge. The paper describes the authors' in-progress work to address these deficiencies. The work starts with the Moroccan and Tunisian dialects, for which the authors aim to provide annotated corpora and corpus-based lexical resources. They also aim to extend an existing morphological engine with linguistic resources built ad hoc for each dialect. In addition, they aim to develop a component to be integrated into the morphological engine to better address linguistic and sociolinguistic characteristics while preserving the integrity of dialectal texts.

Paper DOI: 10.1109/CiSt56084.2023.10410009 | Insertion into IEEE Xplore: 05/02/2024 | Publisher: IEEE

A Paper on Arabic Dialects Published in IEEE CiSt’23 Proceedings