Abstract

Bioinformatics, the application of computational techniques to analyze the information associated with biomolecules on a large scale, has firmly established itself as a discipline in molecular biology. It encompasses a wide range of subject areas, from structural biology and genomics to gene expression studies. Proteomics is the branch of bioinformatics that studies the structure, interactions, and functions of proteins within cells and organisms using computational and statistical approaches. Discovering protein interactions and their binding sites plays an important role in research on biological activity and in drug design. The number of available protein structures still lags far behind the number of known protein sequences, which makes it important to predict interactions using sequence information alone. Our goal is to identify protein interactions and extract the binding sites using only primary structure features. The main feature used in this work is the sequence alignment score obtained by applying the Smith-Waterman algorithm to the two protein sequences. A t-test shows a statistically significant difference between the alignment scores of interacting proteins and those of non-interacting proteins. Non-parametric classifiers are also used to predict protein interactions. The identification of binding sites between interacting proteins is investigated using two novel techniques, both based on sequence mutation analysis. The investigations show that the factors affecting the sequence alignment of the mutant sequences need to be studied in more detail before binding site extraction can be developed further. The results of this thesis can be considered an important step toward addressing the problem of extracting the binding sites of interacting proteins, taking into account the results and limitations outlined in this work.
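To make the alignment-score feature concrete, the following is a minimal sketch of a Smith-Waterman local alignment score in Python. It is illustrative only and is not the implementation used in the thesis: the function name `smith_waterman_score`, the simple match/mismatch scoring, and the linear gap penalty are all assumptions, whereas work on protein sequences would more typically use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Scoring values are hypothetical defaults chosen for illustration.
    """
    # prev holds the previous row of the dynamic-programming matrix.
    prev = [0] * (len(seq_b) + 1)
    best = 0
    for a in seq_a:
        curr = [0]  # first column is always 0 for local alignment
        for j, b in enumerate(seq_b, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            up = prev[j] + gap        # gap in seq_b
            left = curr[j - 1] + gap  # gap in seq_a
            # Clamping at 0 lets an alignment restart anywhere.
            cell = max(0, diag, up, left)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    # Two toy sequences sharing the local region "ACDE".
    print(smith_waterman_score("XXACDEXX", "YYACDEYY"))
```

Under this sketch, the score computed for each protein pair would serve as the single feature whose distributions are compared between interacting and non-interacting pairs.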