Symmetric Combined Convolution with Convolutional Long Short-Term Memory for Monaural Speech Enhancement
Deep neural network-based approaches have made remarkable progress in monaural speech enhancement. Nevertheless, current cutting-edge approaches remain vulnerable to complex acoustic scenarios. We propose a Symmetric Combined Convolution Network with ConvLSTM (SCCN) for monaural speech enhancement. Specifically, the Combined Convolution Block uses parallel convolution branches, comprising a standard convolution and two different depthwise separable convolutions, to reinforce feature extraction along both the depthwise and channelwise dimensions. Similarly, Combined Deconvolution Blocks are stacked to construct the convolutional decoder. Moreover, we introduce exponentially increasing dilation between convolutional kernel elements in the encoder and decoder, which expands the receptive fields. Meanwhile, grouped ConvLSTM layers are used to capture the interdependency of spatial and temporal information. The experimental results demonstrate that the proposed SCCN method achieves an average STOI of 86.00% and an average PESQ of 2.43, outperforming the state-of-the-art baseline methods and confirming its effectiveness in enhancing speech quality.
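Two of the ideas in the abstract can be illustrated with simple arithmetic: how exponentially increasing dilation widens the receptive field, and why a depthwise separable convolution is lighter than a standard one. The following is a minimal sketch, not the authors' implementation. It assumes 1-D, stride-1 convolutions without bias, and the kernel size (3), layer count (5), and channel width (64) are illustrative values that do not come from the paper. The function names are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions.

    Each layer with dilation d extends the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def standard_conv_params(c_in, c_out, kernel_size):
    """Weight count of a standard 1-D convolution (bias omitted)."""
    return c_in * c_out * kernel_size


def depthwise_separable_params(c_in, c_out, kernel_size):
    """Weight count of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = c_in * kernel_size   # one filter per input channel
    pointwise = c_in * c_out         # 1x1 convolution mixing the channels
    return depthwise + pointwise


# Assumed setup: 5 layers, kernel size 3, dilation doubling at each layer.
exponential = [2 ** i for i in range(5)]   # [1, 2, 4, 8, 16]
undilated = [1] * 5

print(receptive_field(3, exponential))          # 63
print(receptive_field(3, undilated))            # 11
print(standard_conv_params(64, 64, 3))          # 12288
print(depthwise_separable_params(64, 64, 3))    # 4288
```

Under these assumed settings, doubling the dilation at each layer grows the receptive field from 11 to 63 positions with the same number of weights, which is the sense in which dilation "expands receptive fields".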

