CWALM (A lexical corpus-based model of Contemporary Written Arabic) is a project that aims to create a lexicographic resource for Contemporary Written Arabic (CWA). The resource takes into account materials whose features are attested in real-world written Arabic texts, irrespective of any prior classification based on their linguistic nature.

CWALM will provide the scientific community of Social Sciences and Humanities with:

  • a new theoretical approach that will move beyond the traditional description of the Arabic linguistic system in terms of diglossia and will interpret Arabic as a linguistic complex;

  • a final test model intended to be the first large-scale validated CWA resource, providing objective and substantial data for testing competing theories on the linguistic status of the Arabic language and for demonstrating that the model can be extended to cover CWA completely;

  • a lexicographic resource that, in the long term, may have a positive social impact on the inclusion of Arabic-speaking communities and play a crucial role in fostering social dialogue with and within Arabic-speaking minorities in Italy.

CWALM is co-funded by the Italian Ministry of University and Research under the PRIN 2020 Funding Programme. Its partners are Roma Tre University, the Institute for Computational Linguistics “A. Zampolli” of the National Research Council of Italy, and the Free University of Languages and Communication IULM.

CWALM started on 1st June 2022 and will end on 31st May 2025.