ARABIC LEXICON

Morphology of words in Arabic

In most languages, common nouns, adjectives and verbs can take many different forms in a sentence, depending on the grammatical rules of the language. This is especially true of Arabic, where a single root of three consonants can generate hundreds of different forms.
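The way a three-consonant root combines with vowel patterns can be sketched in a few lines of Python. This is only an illustration, not the method used to build the Arabic Lexicon database: the function name `apply_pattern`, the slot notation (digits 1, 2, 3 standing for the three root consonants) and the Latin transliteration are assumptions made for the example.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Fill the slots 1, 2, 3 of a pattern with the consonants of a root.

    Hypothetical notation: the root is given as three transliterated
    consonants ("ktb"), and every other character of the pattern is
    copied unchanged.
    """
    if len(root) != 3:
        raise ValueError("this sketch only handles three-consonant roots")
    return "".join(
        root[int(ch) - 1] if ch in "123" else ch
        for ch in pattern
    )


# A handful of illustrative patterns with rough glosses for the root k-t-b.
PATTERNS = {
    "1a2a3a": "he wrote",
    "1aa2i3": "writer",
    "ma12uu3": "written",
    "1i2aa3": "book",
}

if __name__ == "__main__":
    for pattern, gloss in PATTERNS.items():
        print(apply_pattern("ktb", pattern), "-", gloss)
```

Real Arabic morphology also involves prefixes, suffixes, clitics and spelling changes, so a full lexicon has to go well beyond this kind of simple slot filling.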

While traditional dictionaries cover only a small fraction of the forms found in texts, our technology has been used to generate a database of 65,000 entries with their 6 million forms, covering more than 98% of the forms found in any sort of text (literature, newspaper articles, etc.); the remaining 2% includes proper names.
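A coverage figure of this kind is typically the share of word forms in a text that are found in the database. The sketch below shows that calculation on toy data; the function name `coverage`, the transliterated tokens and the tiny form set are assumptions for illustration and do not reproduce how the 98% figure was measured.

```python
def coverage(tokens: list[str], known_forms: set[str]) -> float:
    """Return the fraction of tokens that appear in the set of known forms."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in known_forms)
    return hits / len(tokens)


if __name__ == "__main__":
    # Toy data: three inflected forms are known, one proper name is not.
    known_forms = {"kataba", "kitaab", "maktuub"}
    tokens = ["kataba", "kitaab", "baariis", "maktuub"]
    print(f"coverage: {coverage(tokens, known_forms):.0%}")
```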

Arabic Lexicon interfaces with Unitex, an open-source corpus processing system developed by the Gaspard Monge Laboratory (LIGM, UPEM).

Unitex Arabic has been presented to prestigious organizations such as the Al-Ghazali Institute of La Grande Mosquée de Paris and L’Institut du Monde Arabe. It can now be used in a wide range of domains, such as text editors, digitization of printed documents, data mining of Arabic web content and e-learning of Arabic.

 

Applications

  • Orthographic correction
  • Automatic word completion while typing
  • E-reputation analysis on web sites
  • E-learning of the Arabic language
  • Digitization of documents

 

Competitive advantages

  • Accuracy
  • Exhaustiveness
  • Responsiveness

 

Intellectual property

Copyright

 

Keywords

Semitic languages - Arabic - Orthography - Grammar - Unitex
