Title
Ranking algorithms on the World Wide Web /
Author
M.farag, fatima el-sayed A.mouti.
Subject
Information systems.
Publication date
2004
Number of pages
1 vol. (various pagings)
Index
Only 14 pages are available for public view

from 205


Abstract

The web became popular in less than ten years and has grown exponentially to an estimated total of more than 15 billion pages. This growth poses a difficult scalability problem for web users: how to retrieve the most relevant documents, and how to rank them so that the documents most closely related to the user's query appear at the top of the answer set, sparing users from browsing the entire result set, which is practically impossible. Ranking techniques were introduced to solve this problem, and many are described in the literature. The early techniques extracted the most identifying keywords from the content of web pages. Although such techniques are successful in the field of bibliometrics, they had limited success on the web because of the diversity and large volume of the data available there. Techniques based on the link structure between web pages, such as PageRank and Kleinberg's algorithm, were introduced later and showed some success. However, both of these techniques still tend to return many non-relevant documents, which makes their retrieval precision very low. We developed a hybrid technique that combines content-based and link-based ranking schemes; specifically, it uses the vector-space model and the PageRank model. We ran seven experiments to test the validity and usefulness of this technique against the others. The results indicate that our technique is better than some of the other techniques, specifically in terms of efficiency.
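The idea of combining a vector-space score with PageRank can be illustrated with a minimal sketch. The abstract does not say how the thesis combines the two scores, so the weighted sum below, the weight `alpha`, the raw term-frequency vectors, and the function names `pagerank`, `cosine`, and `hybrid_rank` are all assumptions made for illustration, not the author's method.

```python
import math
from collections import Counter


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Assumes every link target also appears as a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            # A page with no out-links spreads its rank over all pages.
            targets = outs if outs else pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] += share
        rank = new_rank
    return rank


def cosine(query_terms, doc_terms):
    """Cosine similarity between raw term-frequency vectors."""
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def hybrid_rank(query, docs, links, alpha=0.5):
    """Order pages by alpha * content score + (1 - alpha) * link score.

    The linear combination and `alpha` are illustrative assumptions.
    """
    pr = pagerank(links)
    max_pr = max(pr.values())
    scores = {}
    for page, text in docs.items():
        content = cosine(query.lower().split(), text.lower().split())
        # Scale PageRank to [0, 1] so both scores are comparable.
        scores[page] = alpha * content + (1.0 - alpha) * pr[page] / max_pr
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy collection: three pages and their links.
docs = {
    "a": "web ranking algorithms",
    "b": "cooking recipes",
    "c": "ranking web pages with links",
}
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranking = hybrid_rank("web ranking", docs, links)
```

In this toy collection, page `c` has the highest PageRank, but page `a` matches the query more closely, so the combined score places `a` first. Setting `alpha=0` reduces the ordering to PageRank alone, and `alpha=1` reduces it to content similarity alone.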
We also carried out a comparative study of the most popular web search engines and measured their effectiveness to determine which of them performs best, and thereby to help determine which ranking strategy works better on the web. In this comparison, Google performed best in terms of precision and recall, followed by Go, which suggests that link-based strategies work well on web collections in general. However, achieving a substantial improvement requires combining the link-based technique with the content of the web pages.
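For reference, precision and recall can be computed as in the short sketch below. It uses the standard set-based definitions; the abstract does not describe how the thesis measured them (for example, at what result cutoff), and the function name `precision_recall` and the document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: 4 results returned, 3 documents judged relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
```

Here two of the four returned documents are relevant, so precision is 2/4, and two of the three relevant documents were found, so recall is 2/3.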