PC-Based System for Robust Speaker Recognition
Abstract
A PC-based system for robust speaker recognition is proposed. It comprises three single-level recognition methods and a two-level classifier. New procedures for voice analysis are proposed: a) robust periodicity/aperiodicity separation by neural networks; b) robust pitch period detection; c) analysis of the temporal, spectral and cepstral characteristics of speech. Several pattern recognition methods are implemented because they allow different static and dynamic characteristics of the speech parameters to be analyzed:
1) Prototype distribution maps (PDM). The PDM is used because the weight vectors of its neurons approximate the probability density function (pdf), however complex the form of the pdf is, and because less significant PDM neurons are eliminated by filtering.
2) AR-vector models (ARVM). ARVMs are used because they model the evolution of speech parameters.
3) The covariance approach combined with the arithmetic-harmonic sphericity measure. This method is used because it performs effective speaker recognition on noisy signals.
4) A two-level classifier, which combines the discriminant capabilities and classification power of the multilayer perceptron (MLP) with the pdf estimation, statistical modeling and compression power of the PDM. The first level consists of several PDMs and the second of MLP networks.
The experiments show that the proposed system is an efficient and useful tool for speaker recognition over clean and noisy signals.
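As an illustration of method 3, the arithmetic-harmonic sphericity measure between two covariance matrices X and Y of dimension p is commonly written as log[tr(Y X^-1) tr(X Y^-1)] - 2 log(p), that is, the log ratio of the arithmetic to the harmonic mean of the eigenvalues of Y X^-1. The following is a minimal sketch under stated assumptions, not the system's implementation: the function names, the synthetic feature frames and the nearest-reference decision rule are all hypothetical, and the abstract does not say which speech parameters feed the covariance matrices.

```python
import numpy as np


def covariance(frames):
    """Sample covariance of feature frames with shape (n_frames, p)."""
    return np.cov(np.asarray(frames, dtype=float), rowvar=False)


def ah_sphericity(cov_x, cov_y):
    """Arithmetic-harmonic sphericity measure between two covariance matrices.

    It is 0 when cov_y is a scalar multiple of cov_x and positive otherwise.
    """
    p = cov_x.shape[0]
    # tr(X^-1 Y) equals tr(Y X^-1); solve() avoids forming explicit inverses.
    tr_yx = np.trace(np.linalg.solve(cov_x, cov_y))
    tr_xy = np.trace(np.linalg.solve(cov_y, cov_x))
    return float(np.log(tr_yx * tr_xy) - 2.0 * np.log(p))


def identify(test_frames, references):
    """Return the reference speaker whose covariance is closest to the test data."""
    cov_test = covariance(test_frames)
    return min(references, key=lambda name: ah_sphericity(references[name], cov_test))


# Synthetic stand-in for speech features (assumption: 4-dimensional frames).
rng = np.random.default_rng(0)
scale_a = np.array([1.0, 2.0, 0.5, 1.5])
scale_b = np.array([2.0, 0.5, 1.5, 1.0])
references = {
    "speaker_a": covariance(rng.normal(size=(2000, 4)) * scale_a),
    "speaker_b": covariance(rng.normal(size=(2000, 4)) * scale_b),
}
test_frames = rng.normal(size=(500, 4)) * scale_a
decision = identify(test_frames, references)
```

Because the measure is unchanged when either matrix is multiplied by a constant, a global gain change in the features does not affect the decision in this sketch.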
Keywords
Speaker identification, Neural networks, Self-organizing map, MLP network, Two-level classifier
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.