New pitch based techniques for speech enhancement and speaker count determination.

Item

Title
New pitch based techniques for speech enhancement and speaker count determination.
Identifier
AAI9908341
Identifier
9908341
Creator
Lewis, Michael A. F.
Contributor
Adviser: Joseph Barba
Date
1998
Language
English
Publisher
City University of New York.
Subject
Engineering, Electronics and Electrical
Abstract
In this thesis, new pitch-based techniques are investigated with the main objective of making speaker recognition applications more robust to the variable noise conditions experienced in the real world. Cochannel interference of speech signals is a common practical problem, particularly in tactical communications. We examine the problem of identifying temporal regions or frames as being either one-speaker or two-speaker speech. This identification is important in making automatic speaker and speech recognition systems more robust and is based on feature extraction and subsequent classification as performed in pattern recognition. We propose a new pitch prediction feature (PPF), which is compared with the Linear Predictive Cepstral Coefficients (LPCC) and the Mel Frequency Cepstral Coefficients (MFCC). The results show that the PPF outperforms both the LPCC and the MFCC in the closed-set and open-set cases.
The problem of automatic and accurate determination of the pitch period in noisy environments is also addressed. We propose a new pitch detection algorithm based on an iterative adaptive smoothing approach using a Gaussian Derivative (GD) filter. An adaptive Gaussian derivative pitch detector was developed to work under varying noise conditions, with variable pitch periods, and for different speakers. We compare the performance of the Dyadic Wavelet Transform (DyWT) algorithm with our new Adaptive Gaussian Derivative Filter (AGDF) algorithm for pitch detection of synthesized speech under different noise conditions and signal-to-noise ratios. The results show that the AGDF outperforms the DyWT pitch detection scheme at low signal-to-noise ratios for different types of noise.
The AGDF and DyWT algorithms are applied to speech enhancement using an Adaptive Comb Filtering (ACF) scheme. Results for the enhanced speech signals are presented, and a cepstral distance measure is used to evaluate the performance of the speech enhancement algorithms at various signal-to-noise ratios. We show that ACF speech enhancement lowered the cepstral distance of speech in the presence of colored and babble noise. Furthermore, there was a significant decrease in the cepstral distance of speech in the presence of white Gaussian noise.
Type
dissertation
Source
PQT Legacy CUNY.xlsx
Degree
Ph.D.
Item sets
CUNY Legacy ETDs