Abstract

Human facial expression recognition is one of the most challenging tasks in social communication. It plays a crucial role in computer vision and human-machine interaction, and it is an active research area with wide applications in medicine, crime investigation, marketing, online learning, automobile safety, and video games. The first part of this thesis describes a deep-neural-network-based framework for recognizing the seven main types of facial expression found in all cultures: anger, disgust, fear, happiness, sadness, surprise, and neutrality. The proposed methodology involves four stages: (a) pre-processing the FER2013 dataset by relabeling it to avoid misleading results and removing non-face and non-frontal images; (b) designing an efficient, stable Cycle Generative Adversarial Network (CycleGAN) that performs unsupervised expression-to-expression translation and training it with a new cycle-consistency loss; (c) generating new images to overcome class imbalance, especially for the disgust class; and (d) building the deep neural network architecture for facial expression recognition, using the pre-trained VGG-Face model with VGGFace weights. The model has been tested on both the original FER2013 dataset and the modified, balanced version. The designed model has also been used to identify facial expressions in real-time images after detecting faces with Multi-task Cascaded Convolutional Networks (MTCNN). Results show that the model is robust: it recognizes a facial expression in 0.44 seconds, and the average test accuracy increased from 64% on the original FER2013 dataset to 91.76% on the modified, balanced version using the same transfer-learning model.
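The cycle-consistency term that CycleGAN adds to the adversarial objective can be sketched in a few lines of NumPy. The thesis trains with a new, modified cycle-consistency loss whose exact form is not reproduced in this abstract, so the standard L1 formulation below, with identity generators as stand-ins, is purely illustrative:

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F, lam=10.0):
    """Standard L1 cycle-consistency term of CycleGAN:
    lam * (||F(G(x)) - x||_1 + ||G(F(y)) - y||_1), averaged per element.
    G maps domain X -> Y (e.g. neutral -> disgust), F maps Y -> X."""
    forward = np.mean(np.abs(F(G(x)) - x))   # x -> Y -> back to X
    backward = np.mean(np.abs(G(F(y)) - y))  # y -> X -> back to Y
    return lam * (forward + backward)

# Toy batches of 48x48 grayscale faces (the FER2013 image size) and
# identity mappings standing in for the two trained generators, so the
# reconstruction is perfect and the loss is exactly zero.
x = np.random.rand(4, 48, 48, 1)
y = np.random.rand(4, 48, 48, 1)
identity = lambda t: t
print(cycle_consistency_loss(x, y, identity, identity, lam=10.0))  # 0.0
```

Minimizing this term forces each generator to preserve identity-relevant content while changing only the expression, which is what makes the unpaired expression-to-expression translation usable for augmenting the minority disgust class.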
The second part of the thesis covers the design of a GPU-accelerated system for facial expression recognition in real-time video sequences. Any such system comprises two basic stages: face detection, which localizes faces, and facial expression recognition, which classifies the expression. Unfortunately, face detection algorithms require intensive computation, which makes them a poor fit for real-time video. To capture real-time video streams in Python, the open-source OpenCV library has been used. To overcome the processing limitations, computations should be pushed to the graphics processing unit (GPU) using NVIDIA's Compute Unified Device Architecture (CUDA). The pre-built OpenCV packages, however, ship without CUDA support and so cannot achieve optimal performance on GPU-enabled workstations. The OpenCV CUDA module, a set of classes and functions that exposes CUDA's computational capabilities, bridges this gap between the available hardware and Python libraries that use the CPU as backend, and it is a powerful tool for fast implementation of CUDA-accelerated computer vision algorithms. This part of the thesis therefore covers compiling the OpenCV library from source with CUDA and cuDNN support, which is the cornerstone of the real-time facial expression system for video streams. In the facial expression recognition stage, the model trained on the relabeled, balanced FER2013 dataset has been used. The designed scheme was employed in real-time video processing to classify frames into one of the universal facial expressions: anger, disgust, fear, happiness, sadness, surprise, and neutrality. For the face detection stage, Haar cascade and deep-learning detectors were tested using both the CPU and the GPU as backend, and the results were compared.
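A typical CMake configuration for such a from-source build looks roughly like the following. The directory layout and the CUDA_ARCH_BIN compute capability are machine-specific assumptions, not values taken from the thesis:

```shell
# Run from an empty build/ directory next to the opencv and
# opencv_contrib checkouts (paths here are illustrative).
# OPENCV_DNN_CUDA enables the CUDA backend for the cv2.dnn module;
# set CUDA_ARCH_BIN to your GPU's compute capability.
cmake -D CMAKE_BUILD_TYPE=RELEASE \
      -D WITH_CUDA=ON \
      -D WITH_CUDNN=ON \
      -D OPENCV_DNN_CUDA=ON \
      -D CUDA_ARCH_BIN=7.5 \
      -D OPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules \
      -D BUILD_opencv_python3=ON \
      ../opencv
make -j"$(nproc)"
sudo make install
```

After installation, a DNN face detector can be moved to the GPU at run time with `net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)` and `net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)`.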
In terms of the frames-per-second (FPS) metric for the overall video pipeline, which includes both face detection and facial expression recognition, using the GPU as backend yielded a large improvement for both the Haar and deep-learning detectors, thanks to the CUDA module of the newly compiled OpenCV. Using OpenCV's Deep Neural Network (DNN) module with NVIDIA GPUs, CUDA, and cuDNN, the feature-based (Haar) approach improved from 7.41 FPS on the CPU to 23.12 FPS, a 3.12x speedup (312.01% of CPU throughput). The deep-learning-based approach improved from 30.30 FPS on the CPU to 51.43 FPS on the GPU (169.74% of CPU throughput). Deep learning is recommended for the face detection stage, as it was found to be both more accurate and faster than the Haar cascade.
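The reported percentages are the GPU throughput expressed relative to the CPU baseline, which is easy to check directly from the FPS figures (the helper name is illustrative):

```python
def throughput_percent(fps_cpu: float, fps_gpu: float) -> float:
    """GPU frame rate as a percentage of the CPU baseline frame rate."""
    return round(fps_gpu / fps_cpu * 100, 2)

# FPS figures reported in the thesis for the two face-detection backends.
print(throughput_percent(7.41, 23.12))   # Haar (feature-based): 312.01
print(throughput_percent(30.30, 51.43))  # DNN (deep-learning):  169.74
```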