Author: Ahmed, Reham Fathy Mahmoud / Title: Clustering and Relating Research Papers using Self-Organizing Maps

Title

Clustering and Relating Research Papers using Self-Organizing Maps

Author

Ahmed, Reham Fathy Mahmoud

Preparation committee

Researcher / Reham Fathy Mahmoud Ahmed

Supervisor / Hani M. K. Mahdi

Supervisor / Cherif Ramzi Salama

Examiner / Hani M. K. Mahdi

Examiner / Cherif Ramzi Salama

Publication date

2021

Number of pages

122p

Language

English

Degree

Master's

Specialization

Electrical and Electronic Engineering

Publisher

Approval date

1/1/2021

Place of approval

Ain Shams University - Faculty of Engineering - Computer and Systems Engineering

Abstract

Thesis Summary
A Self-Organizing Map (SOM) is a powerful tool for data analysis, clustering, and dimensionality reduction. It is an unsupervised artificial neural network that maps a set of n-dimensional vectors to a two-dimensional topographic map. Being unsupervised, SOMs need little input to be successfully deployed. The only inputs needed by a SOM are its own parameters, such as its size, its number of iterations, and its initial learning rate. The quality and accuracy of the solution offered by a SOM depend on choosing the right values for these parameters. Several attempts have been made to use genetic algorithms to optimize these parameters, either for random inputs or for specific applications such as the traveling salesman problem. To the best of the authors' knowledge, no roadmap for selecting these parameters has been presented in the literature. In this thesis, we present the first results of a proposed roadmap for optimizing these parameters using a genetic algorithm, and we show its effectiveness by applying it to the classical color clustering problem as a case study.
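The idea of tuning SOM parameters with a genetic algorithm can be illustrated with a minimal sketch. It is not the thesis's roadmap: the function names (`train_som`, `quantization_error`, `evolve`), the parameter ranges, the linear decay schedules, truncation selection, uniform crossover, and the use of quantization error as the fitness are all assumptions chosen for brevity. Quantization error alone tends to favor larger maps, so a practical fitness function would likely need additional terms.

```python
import numpy as np

def train_som(data, size, iterations, lr0, seed=0):
    """Train a size x size SOM on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((size, size, data.shape[1]))
    rows, cols = np.indices((size, size))      # grid coordinates of every unit
    sigma0 = max(size / 2.0, 1.0)
    for t in range(iterations):
        frac = t / iterations
        lr = lr0 * (1.0 - frac)                # learning rate decays linearly to 0
        sigma = sigma0 * (1.0 - frac) + 0.5    # neighbourhood radius shrinks toward 0.5
        x = data[rng.integers(len(data))]      # pick one random input vector
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)  # best matching unit
        grid_d2 = (rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))              # Gaussian neighbourhood
        weights += lr * h[:, :, None] * (x - weights)
    return weights

def quantization_error(data, weights):
    """Mean distance from each sample to its best matching unit."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    return float(d.min(axis=1).mean())

def random_individual(rng):
    """One candidate: (map size, iterations, initial learning rate). Ranges are assumed."""
    return (int(rng.integers(2, 9)), int(rng.integers(50, 501)),
            float(rng.uniform(0.05, 1.0)))

def fitness(ind, data):
    size, iterations, lr0 = ind
    return quantization_error(data, train_som(data, size, iterations, lr0))

def evolve(data, pop_size=8, generations=5, seed=0):
    """Tiny genetic algorithm over SOM parameters; lower fitness is better."""
    rng = np.random.default_rng(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda ind: fitness(ind, data))
        parents = scored[: pop_size // 2]      # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            a, b = parents[i], parents[j]
            # Uniform crossover: each gene is copied from one of the two parents.
            child = [a[k] if rng.random() < 0.5 else b[k] for k in range(3)]
            # Mutation: occasionally resample a single gene.
            if rng.random() < 0.3:
                k = int(rng.integers(3))
                child[k] = random_individual(rng)[k]
            children.append(tuple(child))
        pop = parents + children
    return min(pop, key=lambda ind: fitness(ind, data))

if __name__ == "__main__":
    colors = np.random.default_rng(1).random((100, 3))   # random RGB colours
    best = evolve(colors)
    print("best (size, iterations, lr0):", best)
    print("quantization error:", quantization_error(colors, train_som(colors, *best)))
```

Each individual encodes the three parameters named in the abstract, so the same loop could in principle be pointed at a different dataset by swapping the `data` array.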
With the huge number of published research papers, retrieving relevant information is a difficult task for any researcher. Effective clustering algorithms can help improve and simplify the retrieval process. After testing the proposed approach on the case study, we applied it to the automatic clustering of text documents. The proposed method is applied to cluster three datasets of scientific papers using their keywords. Similar research papers were mapped closer to each other.
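Before papers can be clustered by keyword, each paper has to be turned into a numeric vector. The sketch below assumes a simple binary bag-of-keywords encoding; the thesis mentions word embedding, so its actual representation may differ. The paper names and keyword lists are hypothetical, and `keyword_matrix` and `cosine_similarity` are illustrative helpers rather than anything taken from the thesis.

```python
import numpy as np

# Hypothetical papers and keywords, for illustration only.
papers = {
    "paper_a": ["self-organizing map", "clustering", "neural network"],
    "paper_b": ["clustering", "self-organizing map", "genetic algorithm"],
    "paper_c": ["traveling salesman problem", "genetic algorithm"],
}

def keyword_matrix(papers):
    """Return (names, vocabulary, binary matrix) with one row per paper."""
    names = sorted(papers)
    vocab = sorted({kw.lower() for kws in papers.values() for kw in kws})
    index = {kw: j for j, kw in enumerate(vocab)}
    matrix = np.zeros((len(names), len(vocab)))
    for i, name in enumerate(names):
        for kw in papers[name]:
            matrix[i, index[kw.lower()]] = 1.0   # 1 if the paper lists the keyword
    return names, vocab, matrix

def cosine_similarity(u, v):
    """Cosine of the angle between two keyword vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

names, vocab, matrix = keyword_matrix(papers)
sim_ab = cosine_similarity(matrix[0], matrix[1])   # two shared keywords
sim_ac = cosine_similarity(matrix[0], matrix[2])   # no shared keywords
print(names, sim_ab, sim_ac)
```

Rows of `matrix` are the kind of n-dimensional input vectors a SOM consumes; papers that share keywords have more similar rows and would be expected to land on nearby map units.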
This thesis is divided into seven chapters as follows. Chapter 1 introduces the research in this thesis. Chapter 2 discusses document clustering: it defines document clustering, highlighting the difference between clustering and classification, and then elaborates on the text document clustering problem and its details, the text document pre-processing steps, word embedding, clustering algorithms, and clustering techniques. Chapter 3 introduces Self-Organizing Maps, their properties, topologies, steps, and applications. Chapter 4 briefly explains genetic algorithms, describing their steps, crossover and mutation operators, selection methods, advantages and disadvantages, and applications. Chapter 5 describes the proposed method and how it can be applied to any clustering problem, such as the colors case study or the clustering of research papers. Chapter 6 lists the results obtained for both clustering problems, showing that we outperform previous clustering techniques. Chapter 7 concludes the thesis and discusses potential directions for future work.