DiaLEX
Arabic Dialects Full-Form Lexicon
Covers all major Arabic dialects
Currently over 140 million entries
Ideal for NLP, including MT and speech
Overview
While Modern Standard Arabic is the official language of the 22 Arab League nations, Arabs normally use one of the 30 or so modern dialects to communicate with family and friends. However, Arabic dialects have neither a formal written form nor a standard orthography, and as a result few applications and technologies support them.
Our Arabic Dialects Full-Form Lexicon, or DiaLEX, has been developed to address this lack of support. DiaLEX is a comprehensive computational lexicon covering several major Arabic dialects and subdialects, including Egyptian, Kuwaiti, Qatari, Emirati, Saudi Arabian Najdi, Saudi Arabian Hejazi, and Palestinian.
Based on ArabLEX, our full-form lexicon for Modern Standard Arabic, DiaLEX will cover all inflected, declined, and cliticized wordforms. It is ideally suited for morphological analysis, machine translation, and speech technology applications.
Distinctive Features
- Extremely comprehensive: currently over 140 million entries
- Exhaustively lists all inflected, declined, and cliticized wordforms
- Numerous orthographic variants
- Includes high frequency proper nouns
- Fully vocalized and unvocalized Arabic
- Accurate phonemic and phonetic transcriptions and transliteration
- All wordforms are cross-referenced to their lemma
Sample for Egyptian Arabic
ARABIC_V | ARABIC_U | LEMMA | GEN | NUM | NPG2 |
---|---|---|---|---|---|
بِيتْ | بيت | بِيتْ | M | S | 000 |
اِلْبِيتْ | البيت | بِيتْ | M | S | 000 |
بِيتِي | بيتي | بِيتْ | M | S | S1C |
بِيتَكْ | بيتك | بِيتْ | M | S | S2M |
بِيتِكْ | بيتك | بِيتْ | M | S | S2F |
بِيتُو | بيتو | بِيتْ | M | S | S3M |
بِيتُهْ | بيته | بِيتْ | M | S | S3M |
بِيتْهَا | بيتها | بِيتْ | M | S | S3F |
بِيتْنَا | بيتنا | بِيتْ | M | S | P1C |
بِيتْكُو | بيتكو | بِيتْ | M | S | P2C |
بِيتْكُمْ | بيتكم | بِيتْ | M | S | P2C |
بِيتْهُمْ | بيتهم | بِيتْ | M | S | P3C |
بِيتِينْ | بيتين | بِيتْ | M | D | 000 |
اِلْبِيتِينْ | البيتين | بِيتْ | M | D | 000 |
بِيتِينِي | بيتيني | بِيتْ | M | D | S1C |
بِيتِينَكْ | بيتينك | بِيتْ | M | D | S2M |
بِيتِينِكْ | بيتينك | بِيتْ | M | D | S2F |
بِيتِينُو | بيتينو | بِيتْ | M | D | S3M |
بِيتِينُهْ | بيتينه | بِيتْ | M | D | S3M |
بِيتِينْهَا | بيتينها | بِيتْ | M | D | S3F |
بِيتِينَّا | بيتينا | بِيتْ | M | D | P1C |
بِيتِينْكُو | بيتينكو | بِيتْ | M | D | P2C |
بِيتِينْكُمْ | بيتينكم | بِيتْ | M | D | P2C |
بِيتِينْهُمْ | بيتينهم | بِيتْ | M | D | P3C |
بِيُوتْ | بيوت | بِيتْ | M | P | 000 |
بُيُوتْ | بيوت | بِيتْ | M | P | 000 |
اِلْبِيُوتْ | البيوت | بِيتْ | M | P | 000 |
اِلْبُيُوتْ | البيوت | بِيتْ | M | P | 000 |
بِيُوتِي | بيوتي | بِيتْ | M | P | S1C |
بُيُوتِي | بيوتي | بِيتْ | M | P | S1C |
بِيُوتَكْ | بيوتك | بِيتْ | M | P | S2M |
بُيُوتَكْ | بيوتك | بِيتْ | M | P | S2M |
بِيُوتِكْ | بيوتك | بِيتْ | M | P | S2F |
بُيُوتِكْ | بيوتك | بِيتْ | M | P | S2F |
بِيُوتُو | بيوتو | بِيتْ | M | P | S3M |
بِيُوتُهْ | بيوته | بِيتْ | M | P | S3M |
بُيُوتُو | بيوتو | بِيتْ | M | P | S3M |
بُيُوتُهْ | بيوته | بِيتْ | M | P | S3M |
بِيُوتْهَا | بيوتها | بِيتْ | M | P | S3F |
بُيُوتْهَا | بيوتها | بِيتْ | M | P | S3F |
بِيُوتْنَا | بيوتنا | بِيتْ | M | P | P1C |
بُيُوتْنَا | بيوتنا | بِيتْ | M | P | P1C |
بِيُوتْكُو | بيوتكو | بِيتْ | M | P | P2C |
بِيُوتْكُمْ | بيوتكم | بِيتْ | M | P | P2C |
بُيُوتْكُو | بيوتكو | بِيتْ | M | P | P2C |
بُيُوتْكُمْ | بيوتكم | بِيتْ | M | P | P2C |
بِيُوتْهُمْ | بيوتهم | بِيتْ | M | P | P3C |
بُيُوتْهُمْ | بيوتهم | بِيتْ | M | P | P3C |
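
The noun rows above can be read programmatically. The following is a minimal sketch, not DiaLEX's actual delivery format or API: it assumes the pipe-delimited layout shown in the sample, and it assumes that an NPG2 code such as `S2M` encodes number, person, and gender of the possessive suffix, with `000` meaning no suffix. The function names and the code-letter meanings are illustrative assumptions. The sketch also checks that removing the diacritics from `ARABIC_V` reproduces `ARABIC_U`.

```python
import unicodedata

# Three rows copied from the Egyptian Arabic noun sample above.
SAMPLE = """\
بِيتْ | بيت | بِيتْ | M | S | 000 |
بِيتَكْ | بيتك | بِيتْ | M | S | S2M |
بِيتِينَّا | بيتينا | بِيتْ | M | D | P1C |
"""

FIELDS = ["arabic_v", "arabic_u", "lemma", "gen", "num", "npg2"]

# Assumed meanings of the code letters; the source does not define them.
NUMBER = {"S": "singular", "P": "plural"}
GENDER = {"M": "masculine", "F": "feminine", "C": "common"}


def parse_rows(text):
    """Split pipe-delimited lines into dictionaries keyed by FIELDS."""
    rows = []
    for line in text.splitlines():
        # Remove the trailing pipe, then split the remaining cells.
        cells = [c.strip() for c in line.strip().rstrip("|").split("|")]
        if len(cells) == len(FIELDS):
            rows.append(dict(zip(FIELDS, cells)))
    return rows


def strip_diacritics(word):
    """Drop nonspacing marks (short vowels, sukun, shadda) from a word."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def decode_npg(code):
    """Expand a code like 'S2M'; return None for '000' (no suffix)."""
    if code == "000":
        return None
    number, person, gender = code
    return {
        "number": NUMBER[number],
        "person": int(person),
        "gender": GENDER[gender],
    }


if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        consistent = strip_diacritics(row["arabic_v"]) == row["arabic_u"]
        print(row["arabic_u"], decode_npg(row["npg2"]), consistent)
```

Using the Unicode category `Mn` rather than a hand-written list of marks keeps the devocalization step short, at the cost of also removing any other nonspacing mark that might appear in the data.
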

ARABIC | LEMMA | TENSE | VOICE | NPG |
---|---|---|---|---|
كَتَبْتْ | كَتَبْ | perfect indicative | A | S1C |
كَتَبْتْ | كَتَبْ | perfect indicative | A | S2M |
كَتَبْتِي | كَتَبْ | perfect indicative | A | S2F |
كَتَبْ | كَتَبْ | perfect indicative | A | S3M |
كَتَبِتْ | كَتَبْ | perfect indicative | A | S3F |
كَتَبْنَا | كَتَبْ | perfect indicative | A | P1C |
كَتَبْتُوا | كَتَبْ | perfect indicative | A | P2C |
كَتَبْتُو | كَتَبْ | perfect indicative | A | P2C |
كَتَبُوا | كَتَبْ | perfect indicative | A | P3C |
كَتَبُو | كَتَبْ | perfect indicative | A | P3C |
أَكْتِبْ | كَتَبْ | imperfect subjunctive | A | S1C |
اَكْتِبْ | كَتَبْ | imperfect subjunctive | A | S1C |
تِكْتِبْ | كَتَبْ | imperfect subjunctive | A | S2M |
تِكْتِبِي | كَتَبْ | imperfect subjunctive | A | S2F |
يِكْتِبْ | كَتَبْ | imperfect subjunctive | A | S3M |
تِكْتِبْ | كَتَبْ | imperfect subjunctive | A | S3F |
نِكْتِبْ | كَتَبْ | imperfect subjunctive | A | P1C |
تِكْتِبُوا | كَتَبْ | imperfect subjunctive | A | P2C |
تِكْتِبُو | كَتَبْ | imperfect subjunctive | A | P2C |
يِكْتِبُوا | كَتَبْ | imperfect subjunctive | A | P3C |
يِكْتِبُو | كَتَبْ | imperfect subjunctive | A | P3C |
بَكْتِبْ | كَتَبْ | imperfect indicative | A | S1C |
بِتِكْتِبْ | كَتَبْ | imperfect indicative | A | S2M |
بِتِكْتِبِي | كَتَبْ | imperfect indicative | A | S2F |
بِيِكْتِبْ | كَتَبْ | imperfect indicative | A | S3M |
بِتِكْتِبْ | كَتَبْ | imperfect indicative | A | S3F |
بِنِكْتِبْ | كَتَبْ | imperfect indicative | A | P1C |
بِتِكْتِبُوا | كَتَبْ | imperfect indicative | A | P2C |
بِتِكْتِبُو | كَتَبْ | imperfect indicative | A | P2C |
بِيِكْتِبُوا | كَتَبْ | imperfect indicative | A | P3C |
بِيِكْتِبُو | كَتَبْ | imperfect indicative | A | P3C |
اِكْتِبْ | كَتَبْ | imperative | A | S2M |
اِكْتِبِي | كَتَبْ | imperative | A | S2F |
اِكْتِبُوا | كَتَبْ | imperative | A | P2C |
اِكْتِبُو | كَتَبْ | imperative | A | P2C |
هَكْتِبْ | كَتَبْ | simple future | A | S1C |
حَكْتِبْ | كَتَبْ | simple future | A | S1C |
هَتِكْتِبْ | كَتَبْ | simple future | A | S2M |
حَتِكْتِبْ | كَتَبْ | simple future | A | S2M |
هَتِكْتِبِي | كَتَبْ | simple future | A | S2F |
حَتِكْتِبِي | كَتَبْ | simple future | A | S2F |
هَيِكْتِبْ | كَتَبْ | simple future | A | S3M |
حَيِكْتِبْ | كَتَبْ | simple future | A | S3M |
هَتِكْتِبْ | كَتَبْ | simple future | A | S3F |
حَتِكْتِبْ | كَتَبْ | simple future | A | S3F |
هَنِكْتِبْ | كَتَبْ | simple future | A | P1C |
حَنِكْتِبْ | كَتَبْ | simple future | A | P1C |
هَتِكْتِبُوا | كَتَبْ | simple future | A | P2C |
هَتِكْتِبُو | كَتَبْ | simple future | A | P2C |
حَتِكْتِبُوا | كَتَبْ | simple future | A | P2C |
حَتِكْتِبُو | كَتَبْ | simple future | A | P2C |
هَيِكْتِبُوا | كَتَبْ | simple future | A | P3C |
هَيِكْتِبُو | كَتَبْ | simple future | A | P3C |
حَيِكْتِبُوا | كَتَبْ | simple future | A | P3C |
حَيِكْتِبُو | كَتَبْ | simple future | A | P3C |
اِتْكَتَبْتْ | كَتَبْ | perfect passive | P | S1C |
اِتْكَتَبْتْ | كَتَبْ | perfect passive | P | S2M |
اِتْكَتَبْتِي | كَتَبْ | perfect passive | P | S2F |
اِتْكَتَبْ | كَتَبْ | perfect passive | P | S3M |
اِتْكَتَبِتْ | كَتَبْ | perfect passive | P | S3F |
اِتْكَتَبْنَا | كَتَبْ | perfect passive | P | P1C |
اِتْكَتَبْتُوا | كَتَبْ | perfect passive | P | P2C |
اِتْكَتَبْتُو | كَتَبْ | perfect passive | P | P2C |
اِتْكَتَبُوا | كَتَبْ | perfect passive | P | P3C |
اِتْكَتَبُو | كَتَبْ | perfect passive | P | P3C |
أَتْكِتِبْ | كَتَبْ | imperfect passive | P | S1C |
اَتْكِتِبْ | كَتَبْ | imperfect passive | P | S1C |
تِتْكِتِبْ | كَتَبْ | imperfect passive | P | S2M |
تِتْكِتْبِي | كَتَبْ | imperfect passive | P | S2F |
يِتْكِتِبْ | كَتَبْ | imperfect passive | P | S3M |
تِتْكِتِبْ | كَتَبْ | imperfect passive | P | S3F |
نِتْكِتِبْ | كَتَبْ | imperfect passive | P | P1C |
تِتْكِتْبُوا | كَتَبْ | imperfect passive | P | P2C |
تِتْكِتْبُو | كَتَبْ | imperfect passive | P | P2C |
يِتْكِتْبُوا | كَتَبْ | imperfect passive | P | P3C |
يِتْكِتْبُو | كَتَبْ | imperfect passive | P | P3C |
هَتْكِتِبْ | كَتَبْ | future passive | P | S1C |
حَتْكِتِبْ | كَتَبْ | future passive | P | S1C |
هَتِتْكِتِبْ | كَتَبْ | future passive | P | S2M |
حَتِتْكِتِبْ | كَتَبْ | future passive | P | S2M |
هَتِتْكِتْبِي | كَتَبْ | future passive | P | S2F |
حَتِتْكِتْبِي | كَتَبْ | future passive | P | S2F |
هَيِتْكِتِبْ | كَتَبْ | future passive | P | S3M |
حَيِتْكِتِبْ | كَتَبْ | future passive | P | S3M |
هَتِتْكِتِبْ | كَتَبْ | future passive | P | S3F |
حَتِتْكِتِبْ | كَتَبْ | future passive | P | S3F |
هَنِتْكِتِبْ | كَتَبْ | future passive | P | P1C |
حَنِتْكِتِبْ | كَتَبْ | future passive | P | P1C |
هَتِتْكِتْبُوا | كَتَبْ | future passive | P | P2C |
هَتِتْكِتْبُو | كَتَبْ | future passive | P | P2C |
حَتِتْكِتْبُوا | كَتَبْ | future passive | P | P2C |
حَتِتْكِتْبُو | كَتَبْ | future passive | P | P2C |
هَيِتْكِتْبُوا | كَتَبْ | future passive | P | P3C |
هَيِتْكِتْبُو | كَتَبْ | future passive | P | P3C |
حَيِتْكِتْبُوا | كَتَبْ | future passive | P | P3C |
حَيِتْكِتْبُو | كَتَبْ | future passive | P | P3C |
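
Because every wordform is listed together with its lemma and features, analyzing a form can be reduced to a dictionary lookup. The sketch below illustrates the idea on a few rows copied from the verb sample above; `build_index` and `analyze` are hypothetical helper names, not part of any DiaLEX tooling. A single surface form can map to several analyses, as the first two rows show.

```python
from collections import defaultdict

# Rows copied from the Egyptian Arabic verb sample above:
# (wordform, lemma, tense, voice, NPG)
ROWS = [
    ("كَتَبْتْ", "كَتَبْ", "perfect indicative", "A", "S1C"),
    ("كَتَبْتْ", "كَتَبْ", "perfect indicative", "A", "S2M"),
    ("كَتَبْتِي", "كَتَبْ", "perfect indicative", "A", "S2F"),
    ("تِكْتِبْ", "كَتَبْ", "imperfect subjunctive", "A", "S2M"),
    ("تِكْتِبْ", "كَتَبْ", "imperfect subjunctive", "A", "S3F"),
]


def build_index(rows):
    """Map each wordform to the list of analyses recorded for it."""
    index = defaultdict(list)
    for form, lemma, tense, voice, npg in rows:
        index[form].append(
            {"lemma": lemma, "tense": tense, "voice": voice, "npg": npg}
        )
    return index


def analyze(index, form):
    """Return all analyses of a wordform, or an empty list if unknown."""
    return list(index.get(form, []))


if __name__ == "__main__":
    index = build_index(ROWS)
    for analysis in analyze(index, ROWS[0][0]):
        print(analysis)
```

Returning every matching analysis, rather than picking one, leaves disambiguation to a later step that can use sentence context.
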
Practical Applications
CJKI’s full-form lexicons can bring the following benefits to various NLP applications:
- Machine translation: greatly enhanced translation quality
- Morphological analysis: significantly simplified algorithms
- Pedagogical applications: automatic conjugation systems
- Named-entity recognition (NER): dramatically improved
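
As an illustration of the automatic conjugation use case, a full-form lexicon can be indexed in the generation direction, from lemma and features to wordforms. This is a minimal sketch under stated assumptions: the rows are copied from the Egyptian Arabic verb sample earlier on this page, and `build_conjugation_table` and `conjugate` are hypothetical names rather than an existing API. Orthographic variants are kept in the order in which they are listed.

```python
from collections import defaultdict

LEMMA = "كَتَبْ"

# Rows copied from the Egyptian Arabic verb sample:
# (wordform, lemma, tense, NPG)
ROWS = [
    ("هَكْتِبْ", LEMMA, "simple future", "S1C"),
    ("حَكْتِبْ", LEMMA, "simple future", "S1C"),
    ("بَكْتِبْ", LEMMA, "imperfect indicative", "S1C"),
    ("اِكْتِبْ", LEMMA, "imperative", "S2M"),
    ("اِكْتِبُوا", LEMMA, "imperative", "P2C"),
    ("اِكْتِبُو", LEMMA, "imperative", "P2C"),
]


def build_conjugation_table(rows):
    """Map (lemma, tense, NPG) to every wordform listed for that slot."""
    table = defaultdict(list)
    for form, lemma, tense, npg in rows:
        table[(lemma, tense, npg)].append(form)
    return table


def conjugate(table, lemma, tense, npg):
    """Return all listed variants for a slot, or an empty list."""
    return list(table.get((lemma, tense, npg), []))


if __name__ == "__main__":
    table = build_conjugation_table(ROWS)
    print(conjugate(table, LEMMA, "simple future", "S1C"))
```

No conjugation rules are applied here: the answer for each slot is simply read from the listed forms, which is what makes the lookup approach simple.
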
Related Resources
ArabLEX
Arabic Full-Form Lexicon: includes all inflected, declined, and conjugated forms
APD: Arabic Phonetic Database
Phonemic transcriptions for core Arabic vocabulary
Palestinian Arabic Text-to-Speech System
A TTS system developed specifically for Palestinian Arabic