Abstract

Human-computer interaction (HCI) is a discipline concerned with the design, evaluation, and implementation of interactive computing systems for human use. It supports several interaction styles, such as command-line interfaces, menus, and natural language. Speech is one of the main forms of communication in daily life, and for centuries people have tried to build machines that understand and produce speech as naturally as humans do. Some people have disabilities that prevent them from using conventional devices to interact with a computer; speech can be a good alternative for them, and for other users as well. Automatic speech recognition (ASR) systems fall into several categories according to the kind of utterance they are designed to recognize. These categories are based on the ability of the ASR system to determine when a speaker starts and finishes an utterance. An ASR system should also be able to distinguish spoken words from one another correctly. This thesis presents a model for isolated-word speech recognition. The most important parts of a speech recognition system are feature extraction and the recognition method. The main objective is to extract, characterize, and recognize the acoustic information in the speech signal. Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in both speech recognition and speaker recognition. Recently, wavelet features, and wavelet packet features (WPF) in particular, have been used and have provided better recognition rates than MFCCs. The multiresolution capability of wavelet packets, together with their orthonormal bases, allows effective manipulation of the frequency subbands. Most research has applied WPF to speaker recognition; here it is applied to speech recognition. Admissible wavelet packet (AWP) features provide a better recognition rate than many other features.
In the proposed model, end point detection and wavelet packet decomposition, together with a Hamming window, are used to extract speech features. These features are the input to a Gaussian mixture hidden Markov model recognizer. The performance of the proposed model is compared with that of the AWP method. The results show that the proposed feature extraction method achieves a recognition rate of 98%, compared with 96.5% for the AWP method.
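
To make the feature extraction chain concrete, the following is a minimal sketch of its front end: end point detection, Hamming windowing, and wavelet packet decomposition into subband energies. The abstract does not give the details, so everything specific here is an assumption made for illustration: the energy-threshold end point detector, the Haar wavelet, the frame length of 256 samples with a hop of 128, the decomposition depth of 3, and all function names. The Gaussian mixture hidden Markov model recognizer is not shown.

```python
import math

def hamming(n):
    """Hamming window: w[k] = 0.54 - 0.46 * cos(2*pi*k / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames; an incomplete tail is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def detect_endpoints(signal, frame_len=256, hop=128, ratio=0.1):
    """Assumed scheme: keep the span from the first to the last frame whose
    short-time energy exceeds ratio * (maximum frame energy).
    Returns (start, end) sample indices, or None for a silent signal."""
    energies = [sum(x * x for x in f) for f in frames(signal, frame_len, hop)]
    if not energies or max(energies) == 0:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0] * hop, active[-1] * hop + frame_len

def haar_step(x):
    """One level of orthonormal Haar analysis: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet(x, levels):
    """Full wavelet packet tree: every node is split at every level.
    Returns the 2**levels leaf subbands in natural (not frequency) order."""
    nodes = [x]
    for _ in range(levels):
        nodes = [band for node in nodes for band in haar_step(node)]
    return nodes

def frame_features(frame, levels=3):
    """Log energy of each wavelet packet subband of a Hamming-windowed frame."""
    windowed = [a * w for a, w in zip(frame, hamming(len(frame)))]
    return [math.log(sum(c * c for c in band) + 1e-12)
            for band in wavelet_packet(windowed, levels)]

def extract_features(signal, frame_len=256, hop=128, levels=3):
    """Trim to the detected utterance, then compute per-frame features."""
    span = detect_endpoints(signal, frame_len, hop)
    if span is None:
        return []
    start, end = span
    return [frame_features(f, levels)
            for f in frames(signal[start:end], frame_len, hop)]
```

Calling `extract_features(samples)` on a list of audio samples returns one feature vector of `2**levels` values per frame, which is the kind of sequence a hidden Markov model recognizer consumes. The Haar wavelet is used only because it fits in a few lines; a practical system would more likely use a longer filter and a selected (for example admissible) subset of the tree rather than the full decomposition.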