Extracting speech spectrogram of speech signal based on generalized S-transform
In speech signal processing, time-frequency analysis is commonly used to extract the spectrogram of a speech signal. Many algorithms do this with high-quality results, but they often lack the flexibility to adjust the resolution of the extracted spectrogram. Applications such as speech recognition and speech separation, however, frequently require spectrograms of varying resolutions, so an algorithm's ability to provide different resolutions is crucial for them. This paper introduces the generalized S-transform and explains its fundamental theory and algorithmic implementation. By adjusting parameters, the proposed method flexibly produces spectrograms with different resolutions, offering a novel and effective approach to obtaining speech spectrograms. The algorithm enhances the traditional Stockwell transform (S-transform) by incorporating a low-pass filtering function and introducing two adjustable parameters. These parameters modify the Gaussian window function of the basic S-transform, resulting in a generalized S-transform with customizable time-frequency resolution. Finally, the paper presents simulation experiments on both synthesized signals and real speech data, comparing the generalized S-transform with several commonly used spectrogram extraction algorithms. The experiments demonstrate that the generalized S-transform is feasible and effective, particularly when it is combined with the generalized fundamental frequency profile. The results indicate that the method is viable and effective for obtaining spectrograms of speech signals and has potential applications in speech feature extraction and speech recognition. The clean speech dataset used in the experiments is drawn partly from a downloadable database and partly from a recorded speech set.
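The idea of a Gaussian window controlled by two adjustable parameters can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the window formula, so the sketch assumes one common generalization in which the window's standard deviation is sigma(f) = lam / f**p, and it omits the low-pass filtering function entirely. The function name `generalized_s_transform` and the parameter names `lam` and `p` are hypothetical; setting `lam = 1` and `p = 1` recovers the standard S-transform.

```python
import numpy as np


def generalized_s_transform(x, lam=1.0, p=1.0):
    """Frequency-domain S-transform with an adjustable Gaussian window.

    ASSUMPTION: the window standard deviation is sigma(f) = lam / f**p,
    with f in cycles per sample. This parameterization is illustrative
    and is not taken from the paper; lam = p = 1 gives the standard
    S-transform. The paper's low-pass filtering function is not modeled.

    Returns a complex array of shape (N // 2 + 1, N): rows are
    non-negative frequency bins, columns are time samples.
    """
    x = np.asarray(x, dtype=float)
    n_samples = x.size
    spectrum = np.fft.fft(x)
    alpha = np.fft.fftfreq(n_samples)  # frequency offsets, cycles/sample
    n_freqs = n_samples // 2 + 1

    out = np.empty((n_freqs, n_samples), dtype=complex)
    out[0, :] = x.mean()  # zero-frequency row is the signal mean

    for k in range(1, n_freqs):
        f = k / n_samples
        sigma = lam / f**p  # time-domain window width in samples
        # Fourier transform of a unit-area Gaussian with std sigma
        window = np.exp(-2.0 * np.pi**2 * alpha**2 * sigma**2)
        shifted = np.roll(spectrum, -k)  # shifted[m] == spectrum[m + k]
        out[k, :] = np.fft.ifft(shifted * window)
    return out


# Usage: the spectrogram is the magnitude of the transform.
n = 256
t = np.arange(n)
tone = np.cos(2 * np.pi * 32 * t / n)
spectrogram_default = np.abs(generalized_s_transform(tone))
spectrogram_sharper_freq = np.abs(generalized_s_transform(tone, lam=2.0))
```

Under this assumed parameterization, a larger `lam` widens the time window, which sharpens frequency resolution at the cost of time resolution, while `p` changes how quickly the window narrows as frequency increases.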

