Abstract More than 30 million biomedical research articles have been published, and this information is widely distributed and plays a significant role in the advancement of biomedical science. Mining relations between medical entities and understanding their interactions is essential for a variety of tasks, including fundamental biomedical research, drug discovery, and precision medicine. In the proposed work, two models are introduced, both aimed at solving the relation extraction task. The first approach uses pre-trained language models, rather than traditional feature extraction techniques, to obtain word embeddings of the input text, and then feeds the extracted embeddings to machine learning classifiers to perform the classification. This classification model was tested and evaluated on two benchmark datasets, ChemProt and DDI, showing that traditional machine learning techniques can compete with recent approaches: the model achieves an F1-score of 74.6% on ChemProt and 73.7% on DDI, higher than competing models. The second proposed model aims to solve a real-world problem. Its pipeline first retrieves articles, mainly abstracts, from PubMed (a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics); the resulting abstracts are filtered and preprocessed to extract predefined relations, and the generated dataset is then used to fine-tune a Bidirectional Encoder Representations from Transformers (BERT) model to carry out relation extraction. With this method, our model achieved an F1-score of 85.8% on the ChemProt dataset and 88.5% on the DDI dataset.
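The first approach described above, feeding pre-trained language model embeddings into a conventional classifier, can be sketched as follows. This is a minimal illustration, not the thesis implementation: the `embed` function stands in for a pre-trained encoder such as BERT (which would return contextual sentence embeddings) and here produces fixed random vectors purely so the sketch is self-contained and runnable; the sentences and labels are likewise placeholders for ChemProt/DDI relation instances.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

def embed(sentences):
    """Stand-in for a pre-trained language model encoder.

    The real pipeline would run each sentence through BERT and take a
    contextual embedding (e.g. the [CLS] vector, 768 dimensions); here
    we return fixed random vectors only to keep the sketch runnable."""
    return rng.normal(size=(len(sentences), 768))

# Placeholder relation-extraction instances: in the thesis these would be
# ChemProt/DDI sentences with marked entity pairs and relation labels.
sentences = [f"sentence {i}" for i in range(200)]
labels = rng.integers(0, 2, size=200)  # 0 = no relation, 1 = relation

# Extract embeddings once, then train a traditional ML classifier on them.
X = embed(sentences)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
preds = clf.predict(X_test)
print(f"F1-score: {f1_score(y_test, preds):.3f}")
```

The design point is the separation of concerns: the language model is used only as a frozen feature extractor, so any scikit-learn classifier (logistic regression, SVM, random forest) can be swapped in without touching the embedding step.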