Title
Clustering and Relating Research Papers using Self-Organizing Maps
Author
Ahmed, Reham Fathy Mahmoud.
Preparation committee
Researcher / Reham Fathy Mahmoud Ahmed
Supervisor / Hani M. K. Mahdi
Supervisor / Cherif Ramzi Salama
Examiner / Hani M. K. Mahdi
Examiner / Cherif Ramzi Salama
Publication date
2021.
Number of pages
122p
Language
English
Degree
Master's
Specialization
Electrical and Electronic Engineering
Publisher
Approval date
1/1/2021
Place of approval
Ain Shams University - Faculty of Engineering - Computer and Systems Engineering
Index
Only 14 of 120 pages are available for public view

Abstract

Thesis Summary
A Self-Organizing Map (SOM) is a powerful tool for data analysis, clustering, and dimensionality reduction. It is an unsupervised artificial neural network that maps a set of n-dimensional vectors onto a two-dimensional topographic map. Being unsupervised, a SOM needs little input to be deployed successfully: only its own parameters, such as its size, number of iterations, and initial learning rate. The quality and accuracy of the solution a SOM produces depend on choosing the right values for these parameters. Several attempts have been made to use the genetic algorithm to optimize these parameters for random inputs or for specific applications such as the traveling salesman problem. To the best of the authors' knowledge, no roadmap for selecting these parameters has been presented in the literature. In this thesis, we present the first results of a proposed roadmap for optimizing these parameters using the genetic algorithm, and we show its effectiveness by applying it to the classical color clustering problem as a case study.
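To make the role of these parameters concrete, the following is a minimal SOM sketch in plain Python, applied to a handful of RGB colors. It is an illustration only and not the implementation used in the thesis: the function names `train_som` and `best_matching_unit`, the exponential decay of the learning rate and neighborhood radius, and the Gaussian neighborhood are all assumptions chosen for brevity. The map size (`rows`, `cols`), `iterations`, and `initial_lr` correspond to the parameters named above.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the node whose weight vector is closest to x."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows, cols, iterations, initial_lr, seed=0):
    """Train a rows x cols SOM on equal-length vectors (illustrative only)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per map node, initialised uniformly in [0, 1).
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    initial_radius = max(rows, cols) / 2.0
    for t in range(iterations):
        # Assumed schedule: learning rate and radius decay exponentially.
        decay = math.exp(-t / iterations)
        lr = initial_lr * decay
        radius = initial_radius * decay
        x = rng.choice(data)
        br, bc = best_matching_unit(weights, x)
        for r in range(rows):
            for c in range(cols):
                # Gaussian neighborhood centred on the best matching unit.
                grid_dist_sq = (r - br) ** 2 + (c - bc) ** 2
                influence = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
                w = weights[r][c]
                for i in range(dim):
                    w[i] += lr * influence * (x[i] - w[i])
    return weights


# Hypothetical input: a few RGB colors with components in [0, 1].
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
          (1.0, 1.0, 0.0), (0.0, 1.0, 1.0), (0.5, 0.5, 0.5)]
som = train_som(colors, rows=6, cols=6, iterations=500, initial_lr=0.5)
print(best_matching_unit(som, (1.0, 0.0, 0.0)))
```

Changing `rows`, `cols`, `iterations`, or `initial_lr` changes how the colors are laid out on the map, which is why a systematic way to choose them matters.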
Given the huge number of published research papers, retrieving relevant information is a difficult task for any researcher. Effective clustering algorithms can help improve and simplify the retrieval process. After testing the proposed approach on the case study, we applied it to the automatic clustering of text documents. The method was used to cluster three datasets of scientific papers by their keywords, and similar research papers were mapped closer to each other.
This thesis is divided into seven chapters. Chapter 1 introduces the research. Chapter 2 discusses document clustering: it defines document clustering, highlighting the difference between clustering and classification, and then elaborates on the text document clustering problem, the text document pre-processing steps, word embedding, clustering algorithms, and clustering techniques. Chapter 3 introduces Self-Organizing Maps, their properties, topologies, steps, and applications. Chapter 4 briefly explains the genetic algorithm, describing its steps, crossover and mutation operators, selection methods, advantages and disadvantages, and applications. Chapter 5 describes the proposed method and how it can be applied to any clustering problem, such as the color case study or the clustering of research papers. Chapter 6 lists the results obtained for both clustering problems, showing that the proposed method outperforms previous clustering techniques. Chapter 7 concludes the thesis and discusses potential directions for future work.