Author: Diaa Eldin Mohamed Mohamed Elsayed Abofayed / Title: Standardization, enrichment, and computerization of Arabic-English Dictionaries

Title

Standardization, enrichment, and computerization of Arabic-English Dictionaries

Publisher

Diaa Eldin Mohamed Mohamed Elsayed Abofayed

Author

Diaa Eldin Mohamed Mohamed Elsayed Abofayed

Publication date

2016

Number of pages

135 leaves

Index

Only 14 of 153 pages are available for public view

Abstract

Building software systems or applications for natural language processing (NLP) often requires large quantities of rich information, which is stored in a lexicon, lexical database, or knowledge base. Constructing these lexicons or databases manually requires linguistic experts and takes considerable time, cost, and effort. Consequently, a need has emerged for automatic methods of constructing lexicons or lexical databases from machine-readable dictionaries (MRDs). This research field goes in two directions: computerizing traditional dictionaries directly, and extracting linguistic information from them to build lexicons or databases. In the early days of MRD research, the ultimate goal was to convert any traditional dictionary into a complete, standalone lexicon or lexical knowledge base. Unfortunately, MRD research concluded that MRDs are neither efficient nor sufficient, in terms of the quantity and quality of their information, for building such a resource. Consequently, a lexicon should be built from more than one lexical resource, such as dictionaries and corpora. Furthermore, the need for diverse resources to build a lexicon makes it necessary to share, integrate, and standardize these resources. Although MRD research has declined and research interest has shifted to corpora as better sources of lexical information, MRD research remains important for languages that are poor in linguistic resources, such as Arabic. Arabic also has a particular reason to pursue MRD research: it has a rich and vast heritage of old and modern traditional dictionaries that need to be structured, computerized, and standardized. This thesis proposes and implements a general methodology for the computerization, enrichment, and standardization of Arabic dictionaries in general and Arabic-English dictionaries in particular.
The study comprises three tasks: (a) structuring the definitions of an Arabic-English dictionary and extracting lexical information from them; (b) enriching the linguistic information in the definitions already automated in the first task, by supplementing incomplete information or supplying new information; and (c) ISO standardization of all information in the dictionary definitions as well as the enriching information.