Spoken Arabic digits recognition using deep learning / AbdulAziz Saleh Mahfoudh Ba Wazir

Date of Award

9-1-2018

Thesis Type

masters

Document Type

Thesis (Restricted Access)

Divisions

eng

Department

Faculty of Engineering

Institution

University of Malaya

Abstract

This dissertation proposes a speech recognition model for spoken Arabic digits based on a recurrent neural network (RNN). The speech signal is first processed for noise reduction and digit separation, and Mel-frequency cepstral coefficients (MFCCs) are then extracted as the signal representation. The extracted features of each spoken digit are fed into a network with long short-term memory (LSTM) cells. LSTM cells can learn long-term temporal dependencies and mitigate the vanishing gradient problem associated with standard RNNs. A dataset of 1040 samples of spoken Arabic digits from different dialects is used in this study: 840 samples are used to train the network and the remaining 200 samples are used for testing. Model training is carried out on a GPU. The learning parameters of the LSTM model are tuned to improve accuracy, reaching 94% during training. Testing of the best-tuned model shows that it is 69% accurate in recognizing spoken Arabic digit samples. The highest per-digit accuracy, 80%, is obtained for the digit zero.
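The pipeline described in the abstract (a sequence of MFCC frames fed through LSTM cells, followed by a ten-way digit decision) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the weights are random and untrained, the input frames are synthetic stand-ins for real MFCCs, and the sizes (13 coefficients per frame, 16 hidden units, 40 frames) as well as all names such as `LSTMCell` and `classify` are assumptions chosen for illustration only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


class LSTMCell:
    """Standard LSTM cell: input (i), forget (f), output (o) gates and candidate (g)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [h_prev, x].
        self.W = {k: init_matrix(n_hidden, n_hidden + n_in, rng) for k in "ifog"}
        self.b = {k: [0.0] * n_hidden for k in "ifog"}

    def step(self, x, h, c):
        z = h + x  # list concatenation: [h_prev, x]
        pre = {
            k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
            for k in "ifog"
        }
        i_gate = [sigmoid(v) for v in pre["i"]]
        f_gate = [sigmoid(v) for v in pre["f"]]
        o_gate = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        # The cell state is updated additively, which is what lets gradients
        # survive over long sequences.
        c_new = [f * cp + i * g for f, cp, i, g in zip(f_gate, c, i_gate, cand)]
        h_new = [o * math.tanh(cn) for o, cn in zip(o_gate, c_new)]
        return h_new, c_new


def classify(frames, cell, W_out):
    # Run the whole utterance through the cell, then map the final hidden
    # state to probabilities over the ten digits.
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    for x in frames:
        h, c = cell.step(x, h, c)
    return softmax(matvec(W_out, h))


rng = random.Random(0)
N_MFCC = 13     # assumed number of coefficients per frame
N_HIDDEN = 16   # assumed hidden size
N_FRAMES = 40   # assumed utterance length in frames

cell = LSTMCell(N_MFCC, N_HIDDEN, rng)
W_out = init_matrix(10, N_HIDDEN, rng)

# Synthetic frames standing in for the MFCCs of one spoken digit.
frames = [[rng.uniform(-1.0, 1.0) for _ in range(N_MFCC)] for _ in range(N_FRAMES)]

probs = classify(frames, cell, W_out)
predicted_digit = max(range(10), key=probs.__getitem__)
print(predicted_digit, [round(p, 3) for p in probs])
```

With untrained weights the predicted digit is arbitrary; the sketch only shows how a variable-length sequence of feature frames is reduced to a single ten-class decision.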

Note

Dissertation (M.A.) - Faculty of Engineering, University of Malaya, 2018.

9521-abdulaziz.pdf (1693 kB)

This document is currently not available here.
