
A study of prosodic features for Indonesian speech recognition

Office of Academic Resources, Chulalongkorn University
Description:
Utterance-type information has been used in spoken dialogue systems, speech recognition systems, and machine translation.
In a typical spoken dialogue system, a user can ask a question or give information to the system.
In turn, the spoken dialogue system should be capable of recognizing the user's intention in order to give the correct response.
In this dissertation, an automatic utterance-type recognizer is proposed to distinguish declarative questions from statements in Indonesian speech.
Since utterances of these two types have the same words in the same order and differ only in their intonation, their classification requires not only a word recognizer but also an intonation recognizer.
At first, the utterance-type recognizer is designed based on the Fujisaki model.
The utterance-type recognizer uses a combination of Fujisaki-model parameters as features to recognize the two utterance types.
The best performance of the Fujisaki-model-based utterance-type recognizer is achieved using a combination of a fractional value of F_b (F_b/100), the amplitude of the last accent command, and the magnitude of the last phrase command as the input to the neural networks.
However, the Fujisaki parameter extractor is too complicated to be implemented in an automatic recognition system.
Therefore, the utterance-type recognizer is instead developed using the polynomial coefficients of the pitch contour of the sentence's final word.
The automatic utterance-type recognizer using polynomial expansion consists of a pitch contour extractor, normalizer, feature extractor, classifier, and an automatic utterance segmentation module.
The pitch contour of each utterance type is analyzed to investigate the final word of the two utterance types.
To create the automatic utterance segmentation module, an Indonesian acoustic model is designed.
The evaluation confirms that the method using the final word and polynomial expansion is effective in distinguishing declarative questions from statements in Indonesian speech.
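
The polynomial-expansion idea described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not state the polynomial degree, the normalization, or the classifier, so the log-Hz mean normalization, the rescaling of time to [-1, 1], the default degree of 2, and the function names `normalize_contour`, `polyfit`, and `final_word_features` are all assumptions made for illustration. The input is assumed to be a per-frame pitch track of the final word, with 0 marking unvoiced frames.

```python
import math


def normalize_contour(f0_hz):
    """Return (t, log-F0) pairs for voiced frames only.

    Assumptions for this sketch: unvoiced frames are marked with f0 <= 0,
    time is rescaled to [-1, 1], and the mean log-F0 is subtracted.
    """
    voiced = [(i, math.log(f)) for i, f in enumerate(f0_hz) if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in contour")
    mean = sum(v for _, v in voiced) / len(voiced)
    span = max(len(f0_hz) - 1, 1)
    return [(2.0 * i / span - 1.0, v - mean) for i, v in voiced]


def polyfit(points, degree):
    """Least-squares polynomial fit; returns [c0, c1, ..., c_degree]."""
    m = degree + 1
    if len(points) < m:
        raise ValueError("not enough voiced frames for this degree")
    # Normal equations (A^T A) c = A^T y, stored as an augmented matrix.
    aug = [
        [sum(t ** (r + c) for t, _ in points) for c in range(m)]
        + [sum((t ** r) * y for t, y in points)]
        for r in range(m)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; contour is degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        rest = sum(aug[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (aug[r][m] - rest) / aug[r][r]
    return coeffs


def final_word_features(f0_hz, degree=2):
    """Polynomial coefficients of the final word's normalized pitch contour."""
    return polyfit(normalize_contour(f0_hz), degree)


if __name__ == "__main__":
    # Hypothetical pitch tracks in Hz (0 = unvoiced), invented for illustration.
    rising = [180.0, 0.0, 190.0, 200.0, 215.0, 235.0, 260.0]
    falling = [220.0, 210.0, 0.0, 195.0, 180.0, 170.0, 160.0]
    print("rising :", final_word_features(rising))
    print("falling:", final_word_features(falling))
```

In a sketch like this, the coefficient vector would be passed to a classifier as the feature set; the sign and size of the linear term give a rough indication of whether the final word's pitch rises or falls.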

Related Results

Multimodal Emotion Recognition and Human Computer Interaction for AI-Driven Mental Health Support (Preprint)
BACKGROUND Mental health has become one of the most urgent global health issues of the twenty-first century. The World Health Organization (WHO) reports tha...
Discontinuous noun phrases in Vietnamese
Since Vietnamese is an isolating language, word order plays an important role in identifying the function of a particular word. Yet in some contexts word order may be flexible espe...
PECULIARITIES OF SPEECH BEHAVIOR OF SEAFARERS FROM INDIA AND CHINA
The article is devoted to the investigation of peculiarities of speech behavior of Indian and Chinese seafarers. Language, being the most important communicative tool as a general ...
Prosodic Structure and Rhythmic Patterns in Zhuang Folk Songs: A Metrical Phonological Perspective
This study systematically examines the prosodic characteristics of Zhuang folk songs, an important intangible cultural heritage of China, to understand interface mechanisms between...
Individual Differences in Early Disambiguation of Prosodic Grouping
Prosodic cues help to disambiguate incoming information in spoken language perception. In structurally ambiguous coordinate utterances, such as three-name sequences, the intended g...
Interrogative prosodic structure
Abstract This chapter examines the internal prosodic structure of wh- expressions and the prosodic integration of interrogative items in Ikpana wh- questions. Insigh...
Pertinent Prosodic Features for Speaker Identification by Voice
Most existing systems of speaker recognition use “state of the art” acoustic features. However, many times one can only recognize a speaker by his or her prosodic features, especia...
On the Transcription of Kazakh Oral Text
The article examines the concept of transcription, its types and the use of transcription in oral speech. In scientific literature, transcription is presented as a transmission in ...
