Author: Elsharkawy, Sarah Abdelwahab Ali. / Title: Developing a Predictive Model for Message Propagation on Online Social Networks /

Title

Developing a Predictive Model for Message
Propagation on Online Social Networks /

Author

Elsharkawy, Sarah Abdelwahab Ali.

Preparation committee

Researcher / Sarah Abdelwahab Ali Elsharkawy

Supervisor / Mohamed Ismail Roushdy

Supervisor / Ghada Nasr Ali Hassan

Supervisor / Tarek Mohamed Nabhan

Publication date

2018

Number of pages

159 p.

Language

English

Degree

Doctorate (PhD)

Specialization

Computer Science (miscellaneous)

Date of approval

1/1/2018

Place of approval

Ain Shams University - Faculty of Computer and Information Sciences - Computer Science

Index

Only 14 of 159 pages are available for public view

Abstract

In online social networks such as Twitter, tweeting allows users to share a
variety of content with their followers. As tweets are retweeted from user to
user, large cascades of tweet propagation are formed. The growth of a cascade
over time signals the popularity, or lack thereof, of its subject matter. The k-core
of an information graph is a common measure of node connectedness in diverse
applications. The k-core decomposition algorithm categorizes nodes into k-shells
based on their connectivity. Previous research claimed that the super-spreaders
are the nodes located at the k-core of a social graph, and that nodes become less
important the farther their k-shell lies from the k-core.
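
As an illustration of the k-core decomposition mentioned above, the following is a minimal sketch of the standard peeling procedure, not the thesis's own implementation. It assumes an undirected graph given as a list of edge pairs; the function names `k_shell_decomposition` and `k_core_nodes` and the toy node labels are hypothetical.

```python
from collections import defaultdict


def k_shell_decomposition(edges):
    """Return {node: k-shell index} for an undirected graph given as edge pairs.

    Nodes are peeled in order of increasing degree: a node removed while the
    current threshold is k belongs to the k-shell.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Raise the threshold to the smallest degree still present.
        k = max(k, min(degree[n] for n in remaining))
        queue = [n for n in remaining if degree[n] <= k]
        while queue:
            n = queue.pop()
            if n not in remaining:
                continue  # already peeled via a duplicate queue entry
            shell[n] = k
            remaining.discard(n)
            for m in adj[n]:
                if m in remaining:
                    degree[m] -= 1
                    if degree[m] <= k:
                        queue.append(m)
    return shell


def k_core_nodes(shell):
    """Return the nodes in the innermost shell (the k-core with maximal k)."""
    k_max = max(shell.values())
    return {n for n, k in shell.items() if k == k_max}


# Toy graph (illustrative only): a triangle a-b-c with a pendant node d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
toy_shell = k_shell_decomposition(toy_edges)
toy_core = k_core_nodes(toy_shell)
```

In this toy graph the pendant node `d` falls in the 1-shell, while the triangle forms the innermost 2-shell, which is where the cited prior research would place the super-spreaders.
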
A meme represents an idea or a topic that spreads among users of an online
social network. Current research on modelling information diffusion in social
media focuses on studying retweet cascades of individual tweets independently.
However, as a meme spreads, it evolves, and users adopt the meme in varying
manners. While retweet cascades can model the propagation of a single piece of
information among users, they are not useful in studying the propagation of the
whole meme.
In this thesis, we aim to study information diffusion from a wider perspective,
in which the propagation of a whole meme is tracked rather than that of individual
tweets. We also investigate the influence of the super-spreaders,
located at the k-core, on meme cascade growth.
First, the cascade growth of retweet cascades and the various features that
govern the growth are studied. We pose the question of whether the same feature
set can be used for cascade growth prediction of any dataset on Twitter.
Two types of growth prediction are addressed: structural and temporal. First,
a definition of structural and temporal growth is devised. Then, an approach is
proposed that selects the best of these features for each dataset to improve
prediction accuracy. We present and discuss the most discriminating features
for predicting cascade growth and provide evidence that pre-selecting features
improved the accuracy of the prediction task on the datasets. Moreover, we find
evidence that the features governing cascade growth vary from one dataset
to another.
Next, we generalize the modelling of retweet cascades to a modelling of the
diffusion of a meme. To construct the meme adoption graph (MAG), messages
related to a meme are identified from the social network stream. Then, a recent
clustering algorithm is used to automatically extract and cluster tweets.
Next, three epidemic cascade construction models are evaluated and compared
for constructing the MAG and representing meme diffusion. In addition, a set of
structural characteristics derived from the MAG, which describe the underlying meme
adoption pattern, is proposed. An empirical study using four real-world Twitter
datasets is performed to demonstrate the effectiveness of the proposed MAG.
Moreover, we work towards evaluating the influence span of the social media
super-spreaders, located at the k-core, in terms of the number of k-shells that
their influence is capable of reaching. Our methodology is based on the observation
that the k-core size is directly correlated with the graph size under certain
conditions. These conditions are explained, and the correlation is used to assess
the effectiveness of the k-core nodes for influence dissemination. The results
of the experiments show a high correlation between the k-core size
and the sizes of the inner k-shells in the examined datasets. However, the correlation
starts to decrease in the outer k-shells. Further investigation showed
that the less correlated k-shells exhibited a higher presence of spam
accounts.
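
The correlation analysis described above could be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which correlation coefficient is used or how observations are collected, so the Pearson coefficient, the idea of one observation per graph snapshot, and the names `pearson` and `shell_correlations` are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def shell_correlations(core_sizes, shell_sizes):
    """Correlate the k-core size with each k-shell's size across snapshots.

    core_sizes:  k-core size per snapshot (assumed input format).
    shell_sizes: {shell label: list of that shell's size per snapshot}.
    """
    return {label: pearson(core_sizes, sizes)
            for label, sizes in shell_sizes.items()}
```

Given per-snapshot sizes, a call such as `shell_correlations(core_sizes, {"inner": inner_sizes, "outer": outer_sizes})` would return one coefficient per shell, which could then be compared between inner and outer shells.
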
Finally, the effectiveness of using the k-core nodes as seed nodes for influence
maximization is examined. A measure is proposed to estimate the relative
strength of the k-core as an influence source among the other sources of influence
contributing to cascade development. We propose combining that measure
with the correlation between the inner k-core size and the cascade
size to determine the influence domination of the k-core nodes, and hence the
effectiveness of targeting these specific nodes for influence maximization.