Title
Developing a Predictive Model for Message
Propagation on Online Social Networks
Author
Elsharkawy, Sarah Abdelwahab Ali.
Committee
Researcher / Sarah Abdelwahab Ali Elsharkawy
Supervisor / Mohamed Ismail Roushdy
Supervisor / Ghada Nasr Ali Hassan
Supervisor / Tarek Mohamed Nabhan
Publication date
2018
Number of pages
159 p.
Language
English
Degree
Doctorate
Specialization
Computer Science (miscellaneous)
Date of approval
1/1/2018
Place of approval
Ain Shams University - Faculty of Computer and Information Sciences - Computer Science

Abstract

In online social networks such as Twitter, tweeting allows users to share a
variety of content with their followers. As tweets are retweeted from user to
user, large cascades of tweet propagation are formed. The growth of cascades
over time signals the popularity, or lack thereof, of the subject matter. The k-core
of an information graph is a common measure of a node's connectedness in diverse
applications. The k-core decomposition algorithm categorizes nodes into k-shells
based on their connectivity. Previous research claimed that the super-spreaders
are those located at the k-core of a social graph and that nodes become less
important as they are assigned to k-shells farther from the k-core.
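
The k-shell assignment described above can be illustrated with a minimal sketch
of the standard peeling procedure: repeatedly remove the node of lowest
remaining degree and record the largest degree seen so far as its core number.
The function names core_numbers and k_shells and the toy edge list are
assumptions made for illustration; they are not taken from the thesis.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected graph given as edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops are ignored in this sketch
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])  # core numbers never decrease while peeling
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def k_shells(core):
    """Group nodes by core number: {k: set of nodes in the k-shell}."""
    shells = defaultdict(set)
    for node, k in core.items():
        shells[k].add(node)
    return dict(shells)


# Toy graph (hypothetical): a triangle a-b-c with a tail c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
shells = k_shells(core_numbers(edges))
innermost = shells[max(shells)]  # the nodes the abstract calls "the k-core"
```

In this toy graph the triangle nodes form the innermost shell (k = 2) and the
tail nodes fall into the 1-shell. The linear scan for the minimum keeps the
sketch short; a bucket-based queue would be the usual choice for large graphs.
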
A meme represents an idea or a topic that spreads among users of an online
social network. Current research on modelling information diffusion in social
media focuses on studying retweet cascades of individual tweets independently.
However, as a meme spreads, it evolves, and users adopt the meme in varying
manners. While retweet cascades can model the propagation of a single piece of
information among users, they are not useful in studying the propagation of the
whole meme.
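
As a rough illustration of a single-tweet retweet cascade, the sketch below
builds the cascade from (retweeter, source) pairs and measures its size and
depth. The input format and the function name cascade_stats are hypothetical;
the thesis's own cascade construction models are not specified in this abstract.

```python
from collections import defaultdict, deque


def cascade_stats(root, retweets):
    """Return (size, depth) of the cascade rooted at the original poster.

    retweets is an iterable of (retweeter, source_user) pairs for one tweet,
    where source_user is the account the retweeter received the tweet from.
    """
    children = defaultdict(list)
    for retweeter, source in retweets:
        children[source].append(retweeter)

    size, depth = 0, 0
    seen = {root}
    queue = deque([(root, 0)])
    while queue:  # breadth-first walk from the original poster
        user, level = queue.popleft()
        size += 1
        depth = max(depth, level)
        for child in children.get(user, ()):
            if child not in seen:  # count each user once
                seen.add(child)
                queue.append((child, level + 1))
    return size, depth


# Hypothetical cascade: b and c retweet a, then d retweets b.
stats = cascade_stats("a", [("b", "a"), ("c", "a"), ("d", "b")])
```

Here the cascade has four users and a depth of two. Tracking a whole meme, as
the abstract argues, would require linking many such cascades together.
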
In this thesis, we aim to study information diffusion from a wider perspective,
in which the propagation of a meme is tracked rather than that of individual
tweets. We also investigate the influence effect of the super-spreaders,
located at the k-core, on meme cascade growth.
First, the cascade growth of retweet cascades and the various features that
govern the growth are studied. We pose the question of whether the same feature
set can be used for cascade growth prediction of any dataset on Twitter.
Two types of growth prediction are addressed: structural and temporal. First,
a definition of structural and temporal growth is devised. Then, an approach for
selecting the best features for each dataset, in order to improve accuracy, is
proposed. We present and discuss the most discriminating features
in predicting cascade growth and provide evidence that the pre-selection of features
improved the accuracy of the prediction task on the datasets. Moreover,
evidence is found that the features governing cascade growth vary from one
dataset to another.
Next, we generalize the modelling of retweet cascades to the modelling of the
diffusion of a meme. To construct the meme adoption graph (MAG), messages
related to a meme are identified from the social network stream. Then, a recent
clustering algorithm is utilized to automatically extract and cluster tweets.
Next, three epidemic cascade construction models are evaluated and compared
for constructing the MAG and representing meme diffusion. Also, a set of structural
characteristics derived from the MAG that describe the underlying meme
adoption pattern is proposed. An empirical study, using four real-world Twitter
datasets, is performed to demonstrate the effectiveness of the proposed MAG.
Moreover, we work towards evaluating the influence span of the social media
super-spreaders, located at the k-core, in terms of the number of k-shells that
their influence is capable of reaching. Our methodology is based on the
observation that the k-core size is directly correlated with the graph size under
certain conditions. These conditions are explained, and the correlation is utilized
to assess the effectiveness of the k-core nodes for influence dissemination. The
experimental results show a high correlation between the k-core size
and the sizes of the inner k-shells in the examined datasets. However, the
correlation starts to decrease in the outer k-shells. Further investigation has shown
that the less correlated k-shells exhibited a higher presence of spam
accounts.
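
One way to picture the correlation analysis is sketched below: given shell
sizes measured on several graphs, it computes the Pearson correlation between
the k-core size and the size of each shell further out. Using Pearson's
coefficient, indexing shells by their distance from the innermost core, and the
sample numbers are all assumptions for illustration; none of them is taken from
the thesis or its results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def core_shell_correlations(shell_sizes_per_graph):
    """Correlate k-core size with each outer shell's size across graphs.

    shell_sizes_per_graph[i][j] is the size of the shell j steps away from the
    innermost core of graph i, so index 0 is the k-core itself (assumed layout).
    """
    shared_depth = min(len(sizes) for sizes in shell_sizes_per_graph)
    core_sizes = [sizes[0] for sizes in shell_sizes_per_graph]
    return {
        offset: pearson(core_sizes, [s[offset] for s in shell_sizes_per_graph])
        for offset in range(1, shared_depth)
    }


# Made-up shell sizes for three graphs, innermost shell first.
sample = [[10, 20, 7], [20, 40, 3], [30, 60, 9]]
correlations = core_shell_correlations(sample)
```

With these made-up numbers the shell next to the core tracks the core size
exactly, while the outer shell correlates only weakly, which mirrors the
qualitative pattern the abstract reports.
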
Finally, the effectiveness of using the k-core nodes as seed nodes for influence
maximization is inspected. A measure is proposed to estimate the relative
strength of the k-core as an influence source among the other sources of influence
contributing to cascade development. We propose combining that measure
with the correlation between the inner k-core size and the cascade
size to determine the influence domination of the k-core nodes, and hence the
effectiveness of targeting these specific nodes for influence maximization.