Author: Sabbeh, Sahar Fawzy./ Title: Web mining for web personalization /

Title

Web mining for web personalization /

Author

Sabbeh, Sahar Fawzy.

Committee

Researcher / Sahar Fawzy Sabbeh

Supervisor / علاء الدين محمد رياض

Supervisor / حمدي كمال المنير

Examiner / مفرح محمد سالم

Examiner / محي محمد هدهود

Subject

web personalization.

Publication date

2008.

Number of pages

123 p.

Language

English

Degree

Master's

Specialization

Information Systems

Date of approval

1/1/2008

Place of approval

Mansoura University - Faculty of Computers and Information - Information Systems

Index

Only 14 of 123 pages are available for public view

Abstract

There is no doubt that the World Wide Web is now considered a main source of information, and its importance is increasing noticeably. This huge and heterogeneous volume of information makes it difficult for users to navigate the web. To address this information overload problem, web sites provide personalized recommendations to end users to achieve efficient navigation. Personalization provides users with the information they need without their asking for it explicitly. In this context, Web Mining has been shown to be a practical technique for discovering information hidden in Web-related data. In particular, Web Usage Mining extracts knowledge from web users' navigational data using Data Mining (DM) techniques. The knowledge extracted from the analysis of historical information in a web server log can be used to develop personalization.
One of the most promising applications of web personalization is recommender systems. Recommender systems aim to provide personalized links to users, without an explicit user request, by tracking users' navigation through the server log (usage data).
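As a rough illustration of how usage data can drive link recommendations, the sketch below counts which page most often follows another within a session and suggests the most frequent followers. It is not the method of the thesis: the function names, the session format (lists of page paths assumed to be already extracted from a server log), and the sample paths are all hypothetical.

```python
from collections import Counter, defaultdict


def build_transition_counts(sessions):
    """Count how often each page is followed directly by another page."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Pair every page with the page visited immediately after it.
        for current_page, next_page in zip(pages, pages[1:]):
            counts[current_page][next_page] += 1
    return counts


def recommend_links(counts, current_page, k=3):
    """Return up to k pages most often visited right after current_page."""
    followers = counts.get(current_page, Counter())
    return [page for page, _ in followers.most_common(k)]


# Hypothetical sessions, assumed to be reconstructed from a server log.
sessions = [
    ["/home", "/courses", "/courses/db"],
    ["/home", "/courses", "/courses/ai"],
    ["/home", "/library"],
    ["/courses", "/courses/db"],
]
counts = build_transition_counts(sessions)
print(recommend_links(counts, "/courses"))
```

Because the counts are updated one session at a time, a structure like this could in principle be maintained while requests arrive, although a real system would also need session identification and log cleaning, which are omitted here.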
Typically, recommender systems are made up of two phases. The first, usually executed off-line, analyzes server access logs in order to find suitable patterns. The second, usually executed online, involves pattern analysis that classifies active requests according to the off-line analysis. The two-phase recommender system suffers from loosely coupled integration with the web server's ordinary activity and from asynchronous cooperation among its components: the off-line component has to be run periodically to keep the patterns up to date, but the frequency of the updates is a problem that has to be solved on a case-specific basis.
In the e-learning domain, existing systems have the disadvantage of isolation: they do not take into consideration the open web environment or the similarity of the presented learning documents to user needs.
In this work, we try to address the abovementioned shortcomings by presenting a completely online recommender system that collapses the offline and online modules of the typical recommender system into a single module, providing users with a set of recommended pages and a set of ranked documents thought to be relevant to the user request. The user request can be identified either implicitly, through system observation of user navigation, or explicitly, through an ordinary user search interface. Additionally, the proposed system integrates the surrounding open web, namely Google Scholar, into the recommendation generation process. Thus, system outputs are as follows:
1- A set of ranked pages based on the analysis of usage data.
2- A set of ranked documents from a local database. These documents are ranked by their relevance to the user request, which, as noted, can be acquired implicitly or explicitly.
3- A set of learning documents from a remote web site, namely Google Scholar. Thus, instead of users interacting directly with the open web, the system provides them with a set of recommendations from a remote site that are thought to be relevant, without their asking for them explicitly.
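The abstract does not state how relevance to the user request is computed. As one common possibility, and purely as an assumption, the sketch below ranks documents by TF-IDF cosine similarity to a query. The function names, the whitespace tokenizer, the smoothed IDF formula, and the sample documents are illustrative choices, not details taken from the thesis.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def rank_documents(query, documents):
    """Rank documents by TF-IDF cosine similarity to the query, best first."""
    doc_tokens = [tokenize(doc) for doc in documents]
    n_docs = len(documents)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))

    def idf(term):
        # Smoothed IDF so that unseen query terms do not divide by zero.
        return math.log((1 + n_docs) / (1 + df[term])) + 1.0

    def vectorize(tokens):
        return {term: count * idf(term) for term, count in Counter(tokens).items()}

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0 or norm_b == 0:
            return 0.0
        return dot / (norm_a * norm_b)

    query_vec = vectorize(tokenize(query))
    scored = [(cosine(query_vec, vectorize(tokens)), i)
              for i, tokens in enumerate(doc_tokens)]
    # Sort by descending score; ties keep the original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(documents[i], score) for score, i in scored]


# Hypothetical local document titles and a hypothetical user request.
documents = [
    "web usage mining of server logs",
    "introduction to database systems",
    "recommender systems for web personalization",
]
ranked = rank_documents("web personalization recommender", documents)
for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

The same scoring function could be applied to an explicit search query or to text derived from observed navigation, which matches the two ways the abstract says a user request can be acquired.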
In order to measure the relevance of the ranked recommendations, three performance measures are used to evaluate the recommended documents, namely Precision, Recall, and F-measure. Results on a sample of learning documents against a set of user queries showed that ranked materials are nearly 80% efficient with respect to the user request. In addition, a comparison between the relevance of the ranked documents presented by the proposed system and the non-ranked documents presented by the old system shows that the ranked documents are more relevant to the user request.
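For reference, the three measures named above can be computed as in the following minimal sketch. It assumes the balanced form of the F-measure (the harmonic mean of precision and recall); the abstract does not say which variant was used. The document identifiers are made up for illustration and are not results from the thesis.

```python
def precision_recall_f(recommended, relevant):
    """Compute precision, recall, and balanced F-measure for one query."""
    recommended = set(recommended)
    relevant = set(relevant)
    hits = len(recommended & relevant)  # recommended documents that are relevant

    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: 5 documents recommended, 6 judged relevant, 4 in common.
recommended = ["d1", "d2", "d3", "d4", "d5"]
relevant = ["d1", "d2", "d3", "d4", "d7", "d8"]
precision, recall, f_measure = precision_recall_f(recommended, relevant)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

In an evaluation over several queries, these per-query values would typically be averaged.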