A neural network based lexical stress pattern classifier
Background and Objectives:
In dysprosodic speech, the prosody does not match the expected intonation pattern and can result in robotic-like speech, with each syllable produced with equal stress. These errors are manifested through inconsistent lexical stress as measured by perceptual judgments and/or acoustic variables. Lexical stress is produced through variations in syllable duration, peak intensity and fundamental frequency.
The presented technique automatically evaluates the unequal lexical stress patterns Strong-Weak (SW) and Weak-Strong (WS) in American English continuous speech using a multi-layer feedforward neural network with seven acoustic features chosen to capture the lexical stress variability between two consecutive syllables.
Methods:
The speech corpus used in this work is the PTDB-TUG. Five female and three male speakers formed the training set, and one female and one male speaker were held out for testing. The CMU pronouncing dictionary, which marks lexical stress levels, was used to assign a stress level to each syllable of every word in the speech corpus.
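The stress-labelling step can be illustrated with a minimal sketch. It relies on the CMU dictionary convention that every vowel phone ends in a stress digit (0 = unstressed, 1 = primary, 2 = secondary). The function name `stress_pattern` and the choice to treat secondary stress as weak are assumptions made for illustration, not details taken from the study.

```python
import re

def stress_pattern(pronunciation: str) -> str:
    """Map a CMU-style ARPAbet pronunciation to a string of S/W labels.

    Vowel phones end in a stress digit; consonants carry no digit.
    Assumption of this sketch: only primary stress (1) counts as Strong,
    so secondary stress (2) is labelled Weak.
    """
    labels = []
    for phone in pronunciation.split():
        match = re.search(r"[012]$", phone)
        if match:  # only vowels (syllable nuclei) carry a stress digit
            labels.append("S" if match.group() == "1" else "W")
    return "".join(labels)

# Two bisyllabic entries as they appear in the CMU dictionary.
print(stress_pattern("T EY1 B AH0 L"))  # "table"
print(stress_pattern("AH0 B AW1 T"))    # "about"
```

Under these assumptions, "table" is labelled SW and "about" is labelled WS.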
Lexical stress is phonetically realized through the manipulation of signal intensity, the fundamental frequency (F0) and its dynamics, and the syllable/vowel duration. Seven features were calculated for each syllable in the collected speech: nucleus duration, syllable duration, mean pitch, maximum pitch over the nucleus, the peak-to-peak amplitude integral over the syllable nucleus, mean energy, and maximum energy over the nucleus.
As lexical stress errors are identified by evaluating the variability between consecutive syllables in a word, we computed the pairwise variability index ("PVI") for each acoustic measure. The PVI for any acoustic feature f_i is given by:
$$\mathrm{PVI}_i = \frac{f_{i1} - f_{i2}}{(f_{i1} + f_{i2})/2} \tag{1}$$
where $f_{i1}$ and $f_{i2}$ are the values of acoustic feature $f_i$ for the first and second syllables, respectively.
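As a worked illustration of Equation (1), the sketch below computes the PVI for a single feature and for a whole feature set. The feature names and the example values are hypothetical and chosen only to show the arithmetic.

```python
def pvi(f1: float, f2: float) -> float:
    """Pairwise variability index of one feature over two consecutive syllables.

    Implements Equation (1): the difference divided by the mean of the pair.
    """
    mean = (f1 + f2) / 2.0
    if mean == 0:
        raise ValueError("PVI is undefined when the two values sum to zero")
    return (f1 - f2) / mean

def pvi_vector(first: dict, second: dict) -> list:
    """PVI for every feature shared by two syllables, in the order of `first`."""
    return [pvi(first[name], second[name]) for name in first]

# Hypothetical nucleus durations in seconds: the first syllable is twice as long.
print(pvi(0.12, 0.06))

# Hypothetical two-feature example; a real input would carry all seven features.
syllable_1 = {"nucleus_duration": 0.12, "mean_pitch": 210.0}
syllable_2 = {"nucleus_duration": 0.06, "mean_pitch": 190.0}
print(pvi_vector(syllable_1, syllable_2))
```

For positive-valued features the index lies between -2 and 2. A positive value means the first syllable has the larger value and a negative value means the second one does, which is why the sign is informative for separating SW from WS.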
A multi-layer feedforward neural network consisting of input, hidden and output layers was used to classify the stress patterns of the words in the database.
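The abstract does not state the hidden-layer size, activation functions or training procedure, so the following is only a minimal sketch of such a classifier, not the authors' implementation. It assumes seven PVI inputs, one hidden layer of ten tanh units, a sigmoid output and plain gradient descent. The data is synthetic, generated solely to make the example runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in=7, n_hidden=10):
    # Layer sizes are assumptions; the source only names input, hidden and output layers.
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    hidden = np.tanh(X @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    prob_sw = 1.0 / (1.0 + np.exp(-logits))
    return hidden, prob_sw.ravel()

def train(params, X, y, lr=0.5, epochs=500):
    n = len(y)
    for _ in range(epochs):
        hidden, prob = forward(params, X)
        d_logits = (prob - y)[:, None] / n  # gradient of the mean cross-entropy
        d_hidden = (d_logits @ params["W2"].T) * (1.0 - hidden ** 2)
        params["W2"] -= lr * (hidden.T @ d_logits)
        params["b2"] -= lr * d_logits.sum(axis=0)
        params["W1"] -= lr * (X.T @ d_hidden)
        params["b1"] -= lr * d_hidden.sum(axis=0)
    return params

def predict(params, X):
    return (forward(params, X)[1] >= 0.5).astype(int)

# Synthetic PVI vectors: label 1 = SW (PVIs shifted positive), 0 = WS (shifted negative).
n_pairs = 200
y = rng.integers(0, 2, n_pairs)
X = rng.normal(0.0, 0.3, (n_pairs, 7)) + np.where(y[:, None] == 1, 0.5, -0.5)

params = train(init_params(), X, y.astype(float))
accuracy = (predict(params, X) == y).mean()
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

The synthetic classes are deliberately well separated, so the accuracy printed here says nothing about performance on real speech.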
Results:
The presented system achieved an overall accuracy of 87.6%. It correctly classified 92.4% of the SW stress patterns and 76.5% of the WS stress patterns.
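The three reported figures can be related by simple arithmetic. The sketch below assumes that the overall accuracy is the count-weighted average of the two per-class rates, which the abstract does not state explicitly, and solves for the share of SW items in the test set that this would imply. The resulting share is an inference under that assumption, not a reported result.

```python
overall = 0.876  # reported overall accuracy
acc_sw = 0.924   # reported rate of correctly classified SW patterns
acc_ws = 0.765   # reported rate of correctly classified WS patterns

# Assumed relation: overall = p * acc_sw + (1 - p) * acc_ws, solved for p.
share_sw = (overall - acc_ws) / (acc_sw - acc_ws)
print(f"implied share of SW patterns in the test set: {share_sw:.3f}")
```

Under this assumption roughly 70% of the test items would be SW, an imbalance that may help explain the lower accuracy on the WS class.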
Conclusions:
A feedforward neural network was used to distinguish between the SW and WS stress patterns in American English continuous speech, with an overall accuracy of 87.6%.
Hamad bin Khalifa University Press (HBKU Press)

