Spanish Wordlist

Over 100,000 entries

Covers proper nouns and technical terms

Ideal for speech technology

Overview

CJKI maintains comprehensive monolingual wordlists for Chinese, Japanese, Korean (CJK) and Arabic covering some 30 million entries, a figure expected to exceed 45 million soon.

Our Spanish Wordlist (SWL) covers about 100,000 canonical forms of general vocabulary and includes part-of-speech codes and semantic classification type codes. This database is suitable for a variety of NLP applications, including information retrieval applications like search engines, morphological analysis tools like tokenizers, and speech technology applications like text-to-speech synthesis.
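As a rough illustration of how an application might consume entries of this kind, the sketch below loads a small tab-separated sample and indexes it by canonical form. The file layout, the column order, the code values (such as "N", "V", "common", "place") and all function names are assumptions made for this example only; they do not describe the actual SWL format.

```python
import csv
import io
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    lemma: str     # canonical form
    pos: str       # part-of-speech code (hypothetical values)
    sem_type: str  # semantic classification type code (hypothetical values)


# Hypothetical sample data: lemma <TAB> POS code <TAB> semantic type code.
SAMPLE = "casa\tN\tcommon\nMadrid\tN\tplace\ncorrer\tV\tcommon\n"


def load_entries(text):
    """Parse tab-separated text into Entry records, skipping malformed rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [Entry(*row) for row in reader if len(row) == 3]


def index_by_lemma(entries):
    """Map each lowercased canonical form to the entries that share it."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.lemma.lower()].append(entry)
    return index


entries = load_entries(SAMPLE)
index = index_by_lemma(entries)
```

A tokenizer or search component could then look up index["madrid"] to retrieve the part-of-speech and semantic type codes for a token. Lowercasing the key is a design choice for case-insensitive lookup; the original casing stays available in each entry.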

The related SFULEX (Spanish Full-Form Lexicon) contains about 1,000,000 entries in the monolingual edition and 26,000,000 entries in the bilingual edition.

Spanish Sample

Sample coming soon

Practical Applications

CJKI’s Comprehensive Wordlists are being used by some of the world’s leading IT companies for a variety of natural language processing applications, including:

Information retrieval

Morphological analysis

Word segmentation