Author: Alhathik, Haitham Ahmed Mohammed / Title: DATA ANALYSIS TOOLS FOR BIOINFORMATICS

Title

DATA ANALYSIS TOOLS FOR BIOINFORMATICS

Author

Alhathik, Haitham Ahmed Mohammed

Committee

Researcher / Haitham Ahmed Mohammed

Supervisor / ياسر مصطفى قدح

Supervisor / ناهد حسين سلومة

Examiner / ساميى عبد الرازق مشالى

Publication date

2007

Number of pages

136 p.

Language

English

Degree

Master's

Specialization

Systems and Control Engineering

Date of approval

1/5/2007

Place of approval

Cairo University - Faculty of Engineering - Electrical Communications

Abstract

Bioinformatics, the application of computational techniques to analyze the information associated
with biomolecules on a large scale, has firmly established itself as a discipline in molecular
biology. It encompasses a wide range of subject areas, from structural biology and genomics to gene
expression studies. Proteomics is the branch of bioinformatics that studies the structure,
interactions, and functions of proteins within cells and organisms using computational and
statistical approaches. Discovering protein interactions and their binding sites plays an
important role in research on biological activity and in drug design. The number of available
protein structures still lags far behind the number of known protein sequences, which makes it
important to predict interactions using only sequence information.
Our goal is to identify protein interactions and extract the binding sites from primary structure
features alone. The main feature used in this work is the sequence alignment score obtained by
applying the Smith-Waterman algorithm to the two protein sequences. A statistical t-test shows a
significant difference between the alignment scores of interacting and non-interacting protein
pairs. Non-parametric classifiers are also used to predict protein interactions.
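
As a rough illustration of the feature described above, the following Python sketch computes a Smith-Waterman local alignment score for two sequences and a two-sample t statistic for two groups of scores. The abstract does not state the scoring scheme or the t-test variant, so the match, mismatch, and gap values and the choice of Welch's unequal-variance form are assumptions made here; protein work would normally use a substitution matrix such as BLOSUM62 and possibly affine gap penalties.

```python
from math import sqrt
from statistics import mean, variance


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b.

    Uses a linear gap penalty; the default scores are illustrative
    assumptions, not values taken from the thesis.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            cell = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def welch_t(x, y):
    """Welch's t statistic for two independent samples (needs >= 2 values each)."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))


# Toy usage: score two made-up sequence pairs (not real proteins).
pairs = [("MKTWACGT", "KTWACG"), ("MKTWACGT", "PPPLLLQQ")]
scores = [smith_waterman_score(p, q) for p, q in pairs]
```

With scores collected for known interacting and non-interacting pairs, `welch_t(interacting_scores, non_interacting_scores)` gives the statistic; turning it into a p-value requires the t distribution, which a statistics library would provide.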
The identification of binding sites between interacting proteins is investigated using two
novel techniques, both based on sequence mutation analysis. The investigations show that the
factors affecting the sequence alignment of mutant sequences need to be studied in more detail
before binding site extraction can be developed further.
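
The abstract does not describe the two techniques themselves. Purely to make the general idea of sequence mutation analysis concrete, the sketch below substitutes one residue at a time and records how much a pairwise similarity score drops; positions with large drops would be candidates for closer inspection. The function names, the substitute residue, and the stand-in scorer (longest common substring length) are all hypothetical choices for illustration and are not the thesis's method; any alignment score, such as Smith-Waterman, could be passed in as `score_fn`.

```python
def longest_common_substring(a, b):
    """Length of the longest common substring: a simple stand-in scorer."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            # Extend the run ending at the previous characters, or reset to 0.
            curr.append(prev[j - 1] + 1 if ca == cb else 0)
        best = max(best, max(curr))
        prev = curr
    return best


def mutation_scan(seq, partner, score_fn, substitute="A"):
    """Score drop caused by mutating each position of seq in turn.

    Each residue is replaced by `substitute` (or by "G" when it already
    equals `substitute`); both choices are assumptions for this sketch.
    """
    base = score_fn(seq, partner)
    drops = []
    for i, residue in enumerate(seq):
        replacement = substitute if residue != substitute else "G"
        mutant = seq[:i] + replacement + seq[i + 1:]
        drops.append(base - score_fn(mutant, partner))
    return drops


# Toy usage on made-up sequences (not real proteins).
drops = mutation_scan("MKTW", "KTW", longest_common_substring)
```

A scan like this is sensitive to how the scorer reacts to single substitutions, which is one way to read the abstract's remark that the factors affecting the alignment of mutant sequences need further study.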
The results of this thesis can be considered an important step toward addressing the problem of
extracting the binding sites of interacting proteins in future work, taking into account the
results and limitations outlined here.