Title
Novel classification feature sets for source code plagiarism detection of java files /
Publisher
Eman Hosam Adel Elsayed ,
Author
Eman Hosam Adel Elsayed
Preparation committee
Researcher / Eman Hosam Adel Elsayed
Supervisor / Magda B. Fayek
Supervisor / Amir F. Sorial
Supervisor / Mayada M. Ali
Publication date
2021
Number of pages
85 P. ;
Language
English
Degree
Master's
Specialization
Systems and Control Engineering
Approval date
24/10/2021
Place of approval
Cairo University - Faculty of Engineering - Computer Engineering
Index
Only 14 pages are available for public view


Abstract

In programming learning environments, the pressure of delivering many assignments makes plagiarism the easiest way out. Plagiarism threatens the learning process and undermines fair evaluation, so fast, automatic, and accurate detection of source code plagiarism is essential. This research proposes novel classification feature sets to detect whether a Java file is plagiarized. The proposed feature sets are based on using histograms to summarize the similarity matrix of function signatures and on comparing the lexical code similarity of each individual class pair. To test their effectiveness, a source code plagiarism dataset consisting of 12K Java files was used. The results show a 4% improvement in F-Measure. Re-annotating the dataset improves F-Measure by 7.5%.
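
The histogram idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how signature similarity is measured or how many bins are used, so the token-level Jaccard similarity, the five bins, and all function names below are assumptions chosen only for illustration; they are not the thesis's actual method.

```python
import re


def signature_tokens(signature):
    """Split a function signature into a set of lowercase identifier tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", signature.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed measure, not from the thesis)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_matrix(sigs_a, sigs_b):
    """Pairwise similarity between every signature in file A and file B."""
    return [
        [jaccard(signature_tokens(x), signature_tokens(y)) for y in sigs_b]
        for x in sigs_a
    ]


def histogram_features(matrix, bins=5):
    """Summarize a variable-size matrix as a fixed-length normalized histogram."""
    counts = [0] * bins
    total = 0
    for row in matrix:
        for value in row:
            # A similarity of exactly 1.0 falls into the last bin.
            index = min(int(value * bins), bins - 1)
            counts[index] += 1
            total += 1
    if total == 0:
        return [0.0] * bins
    return [count / total for count in counts]


# Hypothetical signatures extracted from two Java files.
file_a = ["int add(int a, int b)", "void print(String s)"]
file_b = ["int add(int a, int b)", "double area(double r)"]

matrix = similarity_matrix(file_a, file_b)
features = histogram_features(matrix)
print(features)
```

Whatever the number of functions in each file, the histogram has a fixed length, which is what allows it to serve as an input vector for a classifier.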