الفهرس | Only 14 pages are availabe for public view |
Abstract Online social networks (OSNs) have become one of the most important sources of personal data. Today, millions of people use social media sites such as Twitter, Facebook, and LinkedIn to communicate with one another. The increasing amount of data collected from these sites attract the interest of analysts and researchers in extracting knowledge from them. Thus, sharing social network data for analysis purposes has become necessary, as structural data analysis and analyzing interactions can assist a range of sectors, including marketing and business. Social data contains a lot of sensitive information about people, so publishing it in its raw form without perturbation could leave it vulnerable to many attacks regarding privacy. Network data anonymization is a successful procedure used by researchers to prevent an adversary from revealing the user’s identity. Such an attack is called a re-identification attack. However, Anonymization is a tricky task where the original graph structure should be maintained as much as feasible within the anonymization process. The methodology of this thesis seeks to produce a high utility method that could preserve the topological structure of the graph during the anonymization process. Most existing solutions used edge-perturbation methods directly without any concern regarding the structural information of the graph. While that preserving the graph structure during the anonymization process requires keeping the most important edges in the graph without any modifications. The Proposed Framework introduces a high utility K-degree anonymization method (KDA) that could utilize edge betweenness centrality (EBC) as a measure to map the edges that have a central role in the graph. To demonstrate the efficiency of this work, the experimental results proved that preserving the edges with high EBC values during the modification process of the graph would lead the anonymization algorithm to better preservation for the most important structural properties of the graph such as Average Path Length (APL), Closeness (CLN), and Betweenness (BTW). This method also proved its efficiency for preserving the community structure for many algorithms such as Girvan Newman (GN), Label Propagation (LP), and Walktrap (WT) as a trade-off between graph utility and privacy. |