Abstract Metagenomics holds great promise for better understanding the function and diversity of viral and microbial communities, because only a minority of viruses and microorganisms can be isolated in pure culture. Identifying viral sequences is an essential step in analyzing metagenomic data. Although various methods use homology and statistical approaches to identify viral sequences, they face limitations because genomic databases are limited and viral genomes are highly diverse. In this thesis, an attention-based deep neural network model is used to identify viral reads in metagenomic data. The method can be used to purify mixed metagenomic data from viral contamination. The proposed model outperforms state-of-the-art tools for viral identification from high-throughput sequences on the same test data. Based on these results, the model could help in understanding viruses in various microbial communities and in discovering new viruses.