Title
Data integration framework for multi-objective queries
Publisher
Ali Eid Ali Zidane Elqutaany
Author
Ali Eid Ali Zidane Elqutaany
Preparation committee
Researcher / Ali Eid Ali Zidane Elqutaany
Supervisor / Osman Hegazi
Supervisor / Ali H. Elbastawissy
Examiner / Osman Hegazi
Publication date
2019
Number of pages
161 leaves
Language
English
Degree
Doctorate
Specialization
Information Systems
Date of approval
17/11/2019
Place of approval
Cairo University - Faculty of Computers and Information - Information Systems
Index
Only 14 pages are available for public view

Abstract

Nowadays, organizations cannot satisfy their information needs from a single data source, and the presence of multiple data sources across an organization fuels the need for data integration. Users of a data integration system pose their queries in terms of an integrated schema and expect duplicate-free and complete answers. To meet these expectations, data integration cannot be limited to retrieving answers from the sources; it must also detect and resolve the data quality problems that arise from the integration. Three processes are mandatory for an integration framework to provide duplicate-free and complete answers to users' queries: data integration, entity matching, and entity resolution. Existing data integration frameworks perform these processes independently of one another: the data is integrated from the sources, then duplicates are detected regardless of how the data was integrated, and finally the duplicates are resolved regardless of how the other two processes were performed. In this thesis, a new data integration framework is introduced that provides complete and duplicate-free answers to users' queries by performing all of its processes with complete interfacing and interleaving. This interfacing and interleaving between the processes provides significant enhancements in the effectiveness and completeness of the answers. The most crucial component of any data integration framework is the mapping of the data sources to the integrated schema. Hence, the first contribution of the proposed framework is a new mapping approach that maps not only the elements of the integrated schema, as existing approaches do, but also the other elements required for detecting and resolving duplicates. This approach facilitates future extensibility of the integration system and provides a linkage between the processes of the framework.
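
The idea of mapping elements beyond the integrated schema can be illustrated with a minimal Python sketch. It is not the framework from the thesis: the source names, field names, sample records, and the rule of keeping the longest non-empty value are all assumptions made for illustration, and the three steps run one after another rather than interleaved. The sketch only shows how an extra mapped element (here a hypothetical `email` field, absent from the integrated schema) can support entity matching and entity resolution.

```python
from collections import defaultdict

# Hypothetical integrated schema exposed to users.
INTEGRATED_SCHEMA = ["name", "city"]

# Hypothetical per-source mappings: source field -> target element.
# "email" is not part of the integrated schema; it is mapped only so that
# duplicate detection and resolution can use it.
MAPPINGS = {
    "crm": {"full_name": "name", "town": "city", "mail": "email"},
    "billing": {"customer": "name", "city": "city", "email_addr": "email"},
}

# Hypothetical sample data for the two sources.
SOURCES = {
    "crm": [{"full_name": "Ann Lee", "town": "", "mail": "ann@example.com"}],
    "billing": [
        {"customer": "A. Lee", "city": "Cairo", "email_addr": "ANN@example.com"},
        {"customer": "Bob Roy", "city": "Giza", "email_addr": "bob@example.com"},
    ],
}


def integrate(sources, mappings):
    """Translate every source record into the mapped target elements."""
    records = []
    for source_name, rows in sources.items():
        mapping = mappings[source_name]
        for row in rows:
            record = {target: row.get(field, "") for field, target in mapping.items()}
            record["_source"] = source_name
            records.append(record)
    return records


def match(records, key="email"):
    """Group records that share the same normalized matching key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key].strip().lower()].append(record)
    return list(groups.values())


def resolve(group, schema):
    """Merge one group of duplicates, keeping the longest value per attribute
    (an assumed resolution rule, chosen only to keep the sketch short)."""
    return {attr: max((r.get(attr, "") for r in group), key=len) for attr in schema}


def answer_query(sources, mappings, schema):
    """Run integration, matching, and resolution, returning schema-level answers."""
    groups = match(integrate(sources, mappings))
    return [resolve(group, schema) for group in groups]


answers = answer_query(SOURCES, MAPPINGS, INTEGRATED_SCHEMA)
```

In this sketch the two records for the same customer are merged because they share an email address, even though the user-facing schema contains only `name` and `city`.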