Arabic has traditionally been described in terms of diglossia: two distinct levels of the same language, an upper, written, formal one (Classical/Standard Arabic) and a lower, oral, informal one (the varieties of spoken Arabic, the so-called Arabic dialects), which speakers mix through ‘code-switching’ or ‘code-mixing’.

The project aims to create a lexicographic resource for Contemporary Written Arabic (CWA) that draws on materials attested in real-world written Arabic texts, regardless of any preliminary classification of those materials on the basis of their linguistic nature.

Theoretical Approach

The project will provide a new theoretical approach that moves beyond the traditional description of the Arabic linguistic system in terms of diglossia and interprets Arabic as a linguistic complex.

This new approach may allow Arabic to be analyzed in the same way that other languages have been analyzed for some decades within the corpus linguistics tradition: as a language whose lexicon and grammar can be described in a variety-neutral way, on the basis of a representative corpus defined according to a series of external, objective criteria (such as timespan, genres, and areas).
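As a purely illustrative sketch of what corpus selection by external criteria could look like, the Python snippet below filters texts by timespan, genre, and area, then counts word forms without consulting any variety label. The `Document` fields, the function names, and the sample texts are all hypothetical; the project's actual corpus design and tooling are not described here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A written text with external metadata only (hypothetical schema)."""
    text: str
    year: int
    genre: str
    area: str


def select_corpus(documents, years, genres, areas):
    """Keep documents that match the external criteria.

    No 'standard' vs 'dialect' label is consulted at any point.
    """
    first, last = years
    return [
        d for d in documents
        if first <= d.year <= last and d.genre in genres and d.area in areas
    ]


def lexicon_counts(corpus):
    """Count whitespace-separated word forms across the selected corpus."""
    counts = Counter()
    for doc in corpus:
        counts.update(doc.text.split())
    return counts


# Invented sample data, for illustration only.
documents = [
    Document("الكتاب جديد", 2015, "news", "Egypt"),
    Document("الكتاب حلو", 2018, "blog", "Lebanon"),
    Document("كتاب قديم", 1950, "news", "Egypt"),
]

corpus = select_corpus(
    documents,
    years=(2000, 2020),
    genres={"news", "blog"},
    areas={"Egypt", "Lebanon"},
)
counts = lexicon_counts(corpus)
```

In this sketch the 1950 text is excluded by the timespan criterion alone, and the remaining word forms enter the same frequency list whatever register they might traditionally be assigned to. A real pipeline would need proper Arabic tokenization and lemmatization rather than whitespace splitting.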

Final Test Model

The project will produce a final test model intended to be the first large-scale validated CWA resource, providing objective and substantial data to:

  • test competing theories on the linguistic status of the Arabic language;
  • prove the extensibility of the model to the complete coverage of CWA.

Final Outcome

The resulting lexical resource design will encourage new approaches to Arabic language teaching and learning, overcoming the long-standing issue of diglossia.

Given its objective, corpus-based description of CWA, the planned lexicographic resource may in the long term play a crucial role in fostering social dialogue with and within Arabic-speaking minorities in Italy.

In fact, the outcome of the project may have a positive social impact on the inclusion of Arabic-speaking communities, since it will contribute to producing teaching and learning materials that are closer to the actual linguistic reality of native Arabic speakers. This is also relevant to the existing (but still very limited) presence of Arabic among the languages taught in Italian high schools.