GD-Conformer: a Conformer-based gated dense encoder-decoder for monaural speech enhancement

Abstract: Speech enhancement improves speech quality by mitigating noise, reverberation, and echo. Existing methods struggle with amplitude-phase compensation, with capturing time-frequency features, and with high complexity. To address these issues, a gated dense encoder-decoder architecture with a two-stage Conformer, abbreviated as GD-Conformer, is proposed. It integrates a gated dense module, a two-stage residual Conformer module, a mask decoder, and a complex decoder. The gated dense module consists of two parts: a dilated dense convolution, which captures both global and local dependencies, and a gated convolution, which refines the resulting features. The two-stage residual Conformer models the time-frequency dependencies of speech while also reducing computational complexity. The mask decoder and the complex decoder restore spectral resolution while preserving speech fidelity. Experiments on the public VoiceBank+DEMAND and DNS Challenge 2020 datasets show that, compared with state-of-the-art methods, the proposed GD-Conformer achieves comparable denoising and generalization performance with fewer parameters and lower computational complexity.
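The abstract names two building blocks that a small sketch can make concrete: a gated convolution applied to dilated features, and a mask decoder paired with a complex decoder. The following is a minimal, standard-library-only illustration and not the authors' implementation. The function names, the 1-D setting, the sigmoid gate, and especially the rule for combining the two decoder outputs (masked noisy magnitude with the noisy phase, plus a complex residual) are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import cmath
import math


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def dilated_conv1d(x, kernel, dilation=1):
    """'Same'-length 1-D convolution (cross-correlation) with zero padding.

    A larger dilation spreads the kernel taps apart, widening the
    receptive field without adding weights.
    """
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + (i - center) * dilation
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def gated_conv1d(x, feature_kernel, gate_kernel, dilation=1):
    """Gated convolution (assumed form): features scaled by a sigmoid gate."""
    features = dilated_conv1d(x, feature_kernel, dilation)
    gates = dilated_conv1d(x, gate_kernel, dilation)
    return [f * sigmoid(g) for f, g in zip(features, gates)]


def combine_decoders(noisy_spec, mask, complex_residual):
    """Hypothetical combination of the two decoder outputs.

    The mask scales the noisy magnitude (noisy phase is kept), and the
    complex residual then corrects both magnitude and phase.
    """
    enhanced = []
    for y, m, r in zip(noisy_spec, mask, complex_residual):
        magnitude, phase = cmath.polar(y)
        enhanced.append(cmath.rect(m * magnitude, phase) + r)
    return enhanced


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0]
    # An all-zero gate kernel gives a gate of 0.5 everywhere.
    print(gated_conv1d(signal, [0.5, 1.0, 0.5], [0.0, 0.0, 0.0], dilation=2))
    print(combine_decoders([1 + 1j], [0.5], [0.1 - 0.1j]))
```

In a real model the kernels, mask, and residual would be learned and would operate on 2-D time-frequency feature maps; fixed toy values are used here only to show the data flow.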
Springer Science and Business Media LLC

Related Results

[RETRACTED] Rhino XL Male Enhancement v1
[RETRACTED] Rhino XL Reviews, NY USA: Studies show that testosterone levels in males decrease constantly with growing age. There are also many other problems that males face due ...
Symmetric Combined Convolution with Convolutional Long Short-Term Memory for Monaural Speech Enhancement
Deep neural network-based approaches have obtained remarkable progress in monaural speech enhancement. Nevertheless, current cutting-edge approaches remain vulnerable to complex ac...
Reference-Based Speech Enhancement via Feature Alignment and Fusion Network
Speech enhancement aims at recovering a clean speech from a noisy input, which can be classified into single speech enhancement and personalized speech enhancement. Personalized sp...
Temporal integration of monaural and dichotic frequency modulation
Frequency modulation (FM) detection at low modulation frequencies is commonly used as an index of temporal fine structure processing to demonstrate age- and hearing-related deficit...
MD2PR: A Multi-level Distillation based Dense Passage Retrieval Model
Abstract: Reranker and retriever are two important components in information retrieval. The retriever typically adopts a dual-encoder model, where queries and docume...
Easily-Extendable Line Decoder with Low Transistor Count and High Power-Delay Performance
Abstract: An easily-extendable 12-transistor 2-4 line decoder core is presented for the random-access memory interface such as translation lookaside buffer and the first lev...
Development of a combined magnetic encoder
Purpose: As a type of angular displacement sensor, the Hall-effect magnetic encoder incorporates many advantages. While compared with the photoelectric encoder, the magnetic encoder...
Binaural Hearing of Speech for Aided and Unaided Conditions
Differences in speech intelligibility and identification between binaural, monaural near ear, and monaural far ear conditions were studied in sound field conditions. Scores from li...