Arab Names in Arabic

Over 220,000 vocalized variants

Every name is normalized and vocalized

Covers common spelling mistakes

Overview

The complexity of the Arabic script gives rise to numerous spelling variants and spelling errors, which cause problems in Arabic information processing. To address this, CJKI has developed the Database of Arab Names in Arabic (DANA), a one-of-a-kind resource that covers several hundred thousand Arabic script variants and common spelling mistakes.

A key feature of DANA is that every Arabic name is normalized and vocalized to produce a database of error-free, fully sanitized Arabic canonical forms. The vocalization was performed by a team of editors with the aid of tools and interfaces designed to achieve maximum efficiency.
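To illustrate the kind of variation that normalization has to absorb, here is a minimal Python sketch that reduces an Arabic name to a rough comparison key by stripping vowel marks and folding a few commonly confused letters. It is not CJKI's method and does not reflect DANA's actual contents or format; the function name `match_key` and the particular folding rules are assumptions chosen for illustration only.

```python
import re
import unicodedata

# Harakat (U+064B..U+0652), superscript alef (U+0670), and tatweel (U+0640).
_MARKS = re.compile("[\u064B-\u0652\u0670\u0640]")

# Illustrative folding table (an assumption, not DANA's rule set).
_FOLD = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda       -> bare alef
    "\u0671": "\u0627",  # alef wasla            -> bare alef
    "\u0649": "\u064A",  # alef maqsura          -> yeh
    "\u0629": "\u0647",  # teh marbuta           -> heh
})


def match_key(name: str) -> str:
    """Return a rough, unvocalized key for comparing Arabic spellings."""
    name = unicodedata.normalize("NFC", name)  # compose combining sequences
    name = _MARKS.sub("", name)                # drop vowel marks and tatweel
    name = name.translate(_FOLD)               # fold commonly confused letters
    return " ".join(name.split())              # collapse whitespace


# Vocalized "Ahmad" with hamza vs. the bare, unvocalized spelling.
vocalized = "\u0623\u064E\u062D\u0652\u0645\u064E\u062F"
bare = "\u0627\u062D\u0645\u062F"
print(match_key(vocalized) == match_key(bare))
```

A key like this only groups spellings for matching. It discards the vowel information that a vocalized canonical form preserves, which is why a curated, vocalized resource and a light normalizer serve different purposes.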

The canonical forms serve as the basis both for creating accurate romanized variants for our Database of Arab Names (DAN), which contains over 6.5 million romanized variants, and for generating Arabic orthographic variants for DANA.

Practical Applications

DANA can be used for a variety of applications, including:

Machine translation

Compliance and risk management

Anti-money laundering and fraud detection

Database cleansing and normalization

Information retrieval and query processing

Entity recognition and extraction

Anti-terror and immigration control

Related Resources

DANA is available as a standalone database, or it can be paired with our Database of Arab Names (DAN) and our other Arab proper noun databases.

DAN

Database of Arab Names

Arabic personal names and their romanized variants

XOFAC

Expanded OFAC

Variants of Arab names in OFAC’s SDN list

DAFNA

Foreign Names in Arabic

Non-Arab names in Arabic and their variants