Title
Mining Customer Reviews on the Web /
Author
Sayed, Alhassan Mohamed Mabrouk.
Preparation committee
Researcher / الحسن محمد مبروك
Supervisor / صلاح محمد معوض
Supervisor / محمد سيد قايد
Supervisor / Rebeca P. Diaz Redondo
Subject
Consumer behavior -- Data processing. Consumers -- Research -- Data processing. Data mining. Marketing research.
Publication date
2021.
Number of pages
151 p.
Language
English
Degree
Master's
Specialization
Computer Science Applications
Publisher
Date of approval
2/1/2021
Place of approval
Beni-Suef University - Faculty of Science - Mathematics and Computer Science
Index
Only 14 pages are available for public view

from 171

Abstract

E-commerce (EC) websites now provide a large amount of useful information (e.g., product details and customer opinions) that exceeds human cognitive processing capacity. In this context, we propose a novel approach for comparing a set of products of the same type on EC websites. It provides Summarization and Exploration Opinions (SEOpinion) for each product. Our approach combines web scraping with two main phases: Hierarchical Aspect Extraction (HAE) and Hierarchical Aspect-based Opinion Summarization (HAOS). First, the web scraper crawls product information from EC websites. The extracted product information comprises two types of data, product details and customer reviews, which are the inputs to the HAE and HAOS phases, respectively. Product details are embedded in the site template (HTML tags), while customer reviews are attached as plain text. Using the crawled product details, the HAE phase then constructs a hierarchical aspect set that describes the product. In parallel, the HAOS phase uses the customer reviews to produce the opinion summarization. Our approach aims to improve the performance of opinion summarization by applying Deep Learning (DL) techniques based on BERT embeddings. We conducted several experiments to test the feasibility of DL models in both phases. For these tests, we built a corpus in the computer domain (specifically, laptops) using information gathered from the top five EC websites in this area. The experimental results showed that the recurrent neural network (RNN) achieved better results (F1-measures of 77.4% and 82.6% for the first and second phases, respectively) than the convolutional neural network (CNN) and the support vector machine (SVM).
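
The data flow described in the abstract (product details taken from HTML tags feed aspect extraction, while plain-text reviews feed summarization) can be illustrated with a small sketch. This is not the thesis's implementation: the sample markup, the `SpecTableParser` class, and the `group_reviews_by_aspect` function are hypothetical, and simple keyword matching stands in for the BERT-based deep learning models, purely to show the shape of the inputs and outputs.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical product-details markup; real EC site templates differ per site.
SAMPLE_HTML = """
<table>
  <tr><th>Display</th></tr>
  <tr><td>Screen size</td><td>15.6 in</td></tr>
  <tr><td>Resolution</td><td>1920x1080</td></tr>
  <tr><th>Battery</th></tr>
  <tr><td>Battery life</td><td>8 h</td></tr>
</table>
"""


class SpecTableParser(HTMLParser):
    """Treats each <th> as a parent aspect and the first <td> of a row as a child aspect."""

    def __init__(self):
        super().__init__()
        self.hierarchy = defaultdict(list)
        self._tag = None          # cell tag currently open, if any
        self._parent = None       # most recent parent aspect
        self._cells_in_row = 0    # number of <td> cells seen in the current row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells_in_row = 0
        if tag in ("th", "td"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._tag is None:
            return
        if self._tag == "th":
            self._parent = text
            self.hierarchy[text]  # register the parent even if it has no children
        else:
            self._cells_in_row += 1
            # Only the first cell names an aspect; later cells hold its value.
            if self._cells_in_row == 1 and self._parent is not None:
                self.hierarchy[self._parent].append(text)


def group_reviews_by_aspect(hierarchy, reviews):
    """Keyword-matching placeholder for a learned aspect classifier (an assumption)."""
    grouped = defaultdict(list)
    for review in reviews:
        lowered = review.lower()
        for parent, children in hierarchy.items():
            terms = [parent] + children
            if any(term.lower() in lowered for term in terms):
                grouped[parent].append(review)
    return dict(grouped)


parser = SpecTableParser()
parser.feed(SAMPLE_HTML)
hierarchy = dict(parser.hierarchy)

reviews = [
    "The battery life is great.",
    "Resolution is sharp but the colors look dull.",
]
summary = group_reviews_by_aspect(hierarchy, reviews)
```

Here `hierarchy` plays the role of the hierarchical aspect set and `summary` groups review sentences under the top-level aspect they mention. In the approach the abstract describes, both steps would be handled by trained models rather than string matching.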