Title
Enhancing Feature Selection for High Dimensional Data
Author
Insawi, Jomana Yousef Ragheb.
Committee
Researcher / Jomana Yousef Ragheb Insawi
Supervisor / Arabi El-Sayed Keshk (عربي السيد كشك)
Examiner / Anas Youssef (أنس يوسف)
Examiner / Ahmed Sharaf El-Din (أحمد شرف الدين)
Subject
Computer Science - Vocational guidance. Electronic data processing - Vocational guidance.
Publication date
2023.
Number of pages
176 p.
Language
English
Degree
Doctorate
Specialization
Computer Science
Date of approval
20/7/2022
Place of approval
Menoufia University - Faculty of Computers and Information - Department of Computer Science

Abstract

Feature selection is the process of finding a minimal subset of features from the original dataset by removing redundant and irrelevant features. It aims to improve the performance of the machine learning algorithm, minimize computational time, reduce memory requirements, and reduce the complexity of building the classification model. Complete search is an effective approach to feature selection, but it evaluates every combination of features to select the relevant ones. Because of this high cost, complete search is not feasible, especially for high-dimensional datasets. Meta-heuristic algorithms have therefore been widely used for high-dimensional search spaces: their stochastic behavior helps them select relevant, near-optimal features without searching all combinations and without being trapped in local optima.
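To make the cost of complete search concrete: a dataset with n features has 2^n - 1 non-empty feature subsets, so the number of candidates doubles with every added feature. The short Python sketch below only counts and enumerates subsets; the function names are illustrative and do not come from the thesis.

```python
from itertools import combinations


def count_nonempty_subsets(n_features: int) -> int:
    """Number of candidate subsets a complete search would have to evaluate."""
    return 2 ** n_features - 1


def enumerate_subsets(features):
    """Yield every non-empty subset; only practical for a handful of features."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)
```

For example, `count_nonempty_subsets(20)` is already above one million, and each candidate in a wrapper setting requires training and scoring a classifier, which is why enumeration stops being practical long before the feature counts typical of high-dimensional data.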
This thesis proposes new hybrid wrapper-based meta-heuristic approaches to feature selection that select the minimum number of features without significantly reducing classification performance on high-dimensional datasets with different characteristics. These hybrid approaches combine the exploration ability of Particle Swarm Optimization with the exploitation capability of Grey Wolf Optimization to prevent the search from being trapped in local optima.
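The abstract does not give the update rules of the proposed hybrids, so the following is only a generic sketch of one way PSO-style velocity updates can be combined with GWO-style leader guidance over binary feature masks. Every formula, parameter (`w`, `n_agents`, `n_iters`, the 0.5 threshold), and name here is an assumption made for illustration, not the thesis's method.

```python
import math
import random


def to_mask(position):
    """Threshold a continuous position in [0, 1]^d into a 0/1 feature mask."""
    mask = [1 if x > 0.5 else 0 for x in position]
    if sum(mask) == 0:  # keep at least one feature so the mask stays usable
        mask[position.index(max(position))] = 1
    return mask


def hybrid_pso_gwo(fitness, n_features, n_agents=10, n_iters=30, w=0.5, seed=0):
    """Minimise fitness(mask) over binary masks; returns (best_mask, best_fit)."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(n_features)] for _ in range(n_agents)]
    vel = [[0.0] * n_features for _ in range(n_agents)]
    best_mask, best_fit = None, math.inf

    def rank():
        """Score all agents, update the global best, return positions best-first."""
        nonlocal best_mask, best_fit
        scored = sorted(((fitness(to_mask(p)), list(p)) for p in pos),
                        key=lambda pair: pair[0])
        if scored[0][0] < best_fit:
            best_fit, best_mask = scored[0][0], to_mask(scored[0][1])
        return [p for _, p in scored]

    for t in range(n_iters):
        leaders = rank()[:3]              # GWO-style alpha, beta, delta
        a = 2.0 * (1.0 - t / n_iters)     # GWO coefficient, decays from 2 to 0
        for i in range(n_agents):
            for d in range(n_features):
                pull = 0.0
                for leader in leaders:
                    A = a * (2.0 * rng.random() - 1.0)
                    C = 2.0 * rng.random()
                    # GWO-style estimate of where this leader suggests moving
                    target = leader[d] - A * abs(C * leader[d] - pos[i][d])
                    pull += rng.random() * (target - pos[i][d])
                # PSO-style velocity: inertia plus attraction toward the leaders
                vel[i][d] = w * vel[i][d] + pull / len(leaders)
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
    rank()  # score the final positions as well
    return best_mask, best_fit
```

Any function that maps a 0/1 mask to a number to be minimised can be passed as `fitness`; in a wrapper setting it would train and score a classifier on the selected columns.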
For evaluation, our work is assessed on nine of the most challenging medical datasets, which are high-dimensional and contain few instances. Our work is also compared against well-known meta-heuristic algorithms that have been used in recent research on feature selection: Genetic Algorithm, Particle Swarm Optimization, Grey Wolf Optimization, Ant Lion Optimization, Whale Optimization Algorithm, Bat Optimization, Dragonfly Algorithm, Harris Hawks Optimization, Gravitational Search Algorithm, Teaching Learning Based Optimization, and Salp Optimization Algorithm. The K-Nearest Neighbors (KNN) classification algorithm is used to measure the classification performance of all approaches in this thesis. The results show that our approaches outperformed the other optimizers in selecting minimal sets of relevant features with high classification accuracy. This indicates that our approaches succeeded in reaching a globally optimal feature subset without being trapped in a local optimum.
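As an illustration of how a wrapper approach can score a candidate feature subset with KNN, the sketch below computes leave-one-out KNN accuracy on the selected columns and folds it into a single value to minimise. The abstract does not state the fitness function, the value of k, or the validation scheme, so the weighted form `alpha * error + (1 - alpha) * selected_ratio`, the defaults `alpha=0.99` and `k=5`, and all function names are assumptions.

```python
import math
from collections import Counter


def knn_loo_accuracy(X, y, mask, k=5):
    """Leave-one-out accuracy of k-NN using only features where mask[j] == 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        query = [xi[j] for j in cols]
        # Euclidean distance from the held-out sample to every other sample
        neighbours = sorted(
            ((math.dist(query, [xr[j] for j in cols]), y[r])
             for r, xr in enumerate(X) if r != i),
            key=lambda pair: pair[0],
        )
        votes = Counter(label for _, label in neighbours[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)


def wrapper_fitness(X, y, mask, alpha=0.99, k=5):
    """Lower is better: trades classification error against subset size."""
    error = 1.0 - knn_loo_accuracy(X, y, mask, k)
    selected_ratio = sum(mask) / len(mask)
    return alpha * error + (1.0 - alpha) * selected_ratio
```

With `alpha` close to 1, accuracy dominates and the subset-size term mainly breaks ties between masks of similar accuracy in favor of the smaller one.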