Author: Mohamed, Sara Maher Salem./ Title: Enhancement text to image synthesis using deep learning /

Title

Enhancement text to image synthesis using deep learning /

Author

Mohamed, Sara Maher Salem.

Preparation committee

Researcher / Sara Maher Salem Mohamed

Supervisor / ماهر شديد زايد

Supervisor / محمد لؤي رمضان

Publication date

2021

Number of pages

103 p.

Language

English

Degree

Master's

Specialization

Theoretical Computer Science

Date of approval

1/1/2021

Place of approval

Benha University - Faculty of Science - Department of Mathematics

Only 14 pages are available for public view


Abstract

Text-to-image synthesis is a rapidly developing technology with many real-life applications. The main objective of this thesis is to develop an intelligent system that converts Arabic text into realistic images, which helps raise the efficiency of image and text processing. Text-to-image synthesis resembles language translation: just as similar semantics can be encoded in two different languages, images and text are two different languages that encode related information. Nonetheless, the two problems differ substantially, because text-to-image and image-to-text conversion are highly multimodal problems. In this work, we propose a model for generating realistic 256×256 images from Arabic text descriptions. The relation between an Arabic word in a sentence and the corresponding part of an image is modeled with the Deep Attentional Multimodal Similarity Model (DAMSM). The DAMSM learns two neural networks that map sub-regions of the image and the Arabic words of the sentence into a common semantic space. It achieves strong performance for both the Arabic text encoder and the image encoder.
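The word-to-region matching described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes word and region features have already been projected into the same space by some encoders, and the function name `word_region_attention` and the sharpening factor `gamma` are hypothetical choices made for illustration. It is also a simplified, one-directional variant of attention-based matching, not the full DAMSM loss.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def word_region_attention(words, regions, gamma=5.0):
    """Attend from each word vector to image-region vectors.

    words and regions are lists of equal-length, non-zero feature
    vectors assumed to live in one shared semantic space.
    Returns (attention, relevance): attention[i][j] is the weight of
    region j for word i, and relevance[i] is the cosine similarity
    between word i and its attention-weighted region context.
    gamma is a hypothetical sharpening factor for the softmax.
    """
    unit_regions = [_normalize(r) for r in regions]
    attention, relevance = [], []
    for word in words:
        w = _normalize(word)
        # Cosine similarity between this word and every region.
        sims = [_dot(w, r) for r in unit_regions]
        # Numerically stable softmax over regions.
        peak = max(sims)
        weights = [math.exp(gamma * (s - peak)) for s in sims]
        total = sum(weights)
        attn = [x / total for x in weights]
        # Region context: attention-weighted sum of region vectors.
        context = [
            sum(a * r[d] for a, r in zip(attn, regions))
            for d in range(len(word))
        ]
        attention.append(attn)
        relevance.append(_dot(w, _normalize(context)))
    return attention, relevance
```

In a trained system the per-word relevance values would typically be pooled into a single image-sentence matching score; that pooling step is left out here.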
This work presents Generative Adversarial Networks (GANs) and their application to the problem of text-to-image synthesis.
The thesis also explains how current state-of-the-art text-to-image models work at the intersection of computer vision and natural language models.
The model is trained from scratch on the Modified-Arabic dataset, which is derived from the Caltech-UCSD Birds 200-2011 (CUB) dataset. The proposed model offers a new approach to converting Arabic text into realistic images. A key shift is the use of Arabic, for the first time, to convert text into realistic images.
The approach is implemented in Python. Experimental results show that the proposed text-to-image system performs better than recent Arabic text-to-image systems. The proposed model reports an Inception Score of 3.42 ± 0.05 on the CUB dataset.
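For readers unfamiliar with the metric, the Inception Score is commonly defined as the exponential of the average KL divergence between each image's predicted class distribution p(y|x) and the marginal p(y). The sketch below shows only that arithmetic. It assumes the class probabilities have already been produced by a pretrained classifier; the function name `inception_score` and the `eps` smoothing constant are illustrative assumptions, not details from the thesis.

```python
import math


def inception_score(probs, eps=1e-12):
    """Compute exp(mean_x KL(p(y|x) || p(y))).

    probs is a list of per-image class-probability lists, each
    summing to 1. eps is a small constant (an assumption of this
    sketch) that keeps the logarithm finite for zero probabilities.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    kl_sum = 0.0
    for p in probs:
        # KL divergence between this image's prediction and p(y).
        kl_sum += sum(
            pj * (math.log(pj + eps) - math.log(marginal[j] + eps))
            for j, pj in enumerate(p)
        )
    return math.exp(kl_sum / n)
```

The score is 1 when every image receives the same prediction and grows as predictions become both confident and varied. A figure reported with a ± spread is conventionally the mean and standard deviation over several splits of the generated images; whether the thesis follows that convention is not stated in the abstract.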