
Linear Predictive Modelling of Speech – Constraints and Line Spectrum Pair Decomposition

Tom Bäckström

Dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the Department of Electrical and Communications Engineering for public examination and debate in Auditorium S4 at Helsinki University of Technology (Espoo, Finland) on the 5th of March, 2004, at 12 o'clock noon.

Overview in PDF format (ISBN 951-22-6947-3) [1094 KB]
The dissertation is also available in print (ISBN 951-22-6946-5).

Abstract

In an exploration of the spectral modelling of speech, this thesis presents theory and applications of constrained linear predictive (LP) models. Spectral models are essential in many applications of speech technology, such as speech coding, synthesis and recognition. At present, the prevailing approach in speech spectral modelling is linear prediction. In speech coding, spectral models obtained by LP are typically quantised using a polynomial transform called the Line Spectrum Pair (LSP) decomposition. An inherent drawback of conventional LP is its inability to include speech-specific a priori information in the modelling process.
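For orientation, conventional LP can be sketched with the textbook autocorrelation method. This is a minimal illustration and not code from the thesis: the function name `lp_autocorrelation`, the synthetic test frame and the prediction order 8 are all assumptions made for the example. The model is written as A(z) = 1 + a1 z^-1 + ... + am z^-m, and the coefficients are obtained from the normal equations built on the Toeplitz autocorrelation matrix.

```python
import numpy as np

def lp_autocorrelation(x, order):
    """Return [1, a1, ..., am] for A(z) = 1 + a1 z^-1 + ... + am z^-m."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation lags r[0], ..., r[order] of the (windowed) frame.
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Symmetric Toeplitz autocorrelation matrix of size order x order.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Normal equations: R [a1..am]^T = -[r1..rm]^T.
    tail = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], tail))

# Hypothetical test frame: two sinusoids plus a little seeded noise, Hamming-windowed.
rng = np.random.default_rng(0)
t = np.arange(240)
frame = np.sin(0.2 * np.pi * t) + 0.5 * np.sin(0.55 * np.pi * t)
frame = (frame + 0.1 * rng.standard_normal(240)) * np.hamming(240)

a = lp_autocorrelation(frame, order=8)
# Roots of A(z), i.e. poles of the all-pole model 1/A(z).
pole_radii = np.abs(np.roots(a))
```

With the autocorrelation method the poles of 1/A(z) lie inside the unit circle, so `pole_radii` should contain only values below 1. Note that nothing in this procedure lets the user inject prior knowledge about speech, which is the drawback the abstract refers to.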

This thesis, in contrast, presents different constraints applied to LP models, which are then shown to have relevant properties with respect to the root loci of the model in its all-pole form. Namely, we show that LSP polynomials correspond to time domain constraints that force the roots of the model to the unit circle. Furthermore, this result is used in the development of advanced spectral models of speech that are represented by stable all-pole filters.
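The unit-circle property can be checked numerically using the standard textbook definition of the LSP polynomials, P(z) = A(z) + z^-(m+1) A(z^-1) and Q(z) = A(z) - z^-(m+1) A(z^-1). The sketch below is only an illustration of that classical definition, not of the time domain derivation given in the thesis. The pole positions used to build the minimum-phase A(z) are hypothetical values chosen by hand.

```python
import numpy as np

# Hypothetical minimum-phase A(z): poles picked by hand inside the unit circle.
poles = np.array([0.9 * np.exp(1j * 0.3 * np.pi), 0.8 * np.exp(1j * 0.6 * np.pi)])
poles = np.concatenate((poles, poles.conj()))
a = np.poly(poles).real          # [1, a1, ..., am] with m = 4

# Coefficients of A(z) padded to degree m+1, and of z^-(m+1) A(z^-1),
# which is the same vector in reversed order.
a_ext = np.concatenate((a, [0.0]))
a_rev = a_ext[::-1]

p = a_ext + a_rev                # symmetric LSP polynomial P(z)
q = a_ext - a_rev                # antisymmetric LSP polynomial Q(z)

p_radii = np.abs(np.roots(p))
q_radii = np.abs(np.roots(q))
```

For a minimum-phase A(z), both `p_radii` and `q_radii` should equal 1 up to rounding error, and A(z) is recovered as (P(z) + Q(z)) / 2. The angles of these roots are the line spectral frequencies that speech coders typically quantise.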

The theoretical results also include a generic framework for constrained linear predictive models in matrix notation. For these models, we derive sufficient criteria for the stability of their all-pole form. Such models can be used to include a priori information in the generation of any application-specific linear predictive model. As a side result, we present a matrix decomposition rule for Toeplitz and Hankel matrices.
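One generic way to write a constrained LP model in matrix notation is as a quadratic minimisation with linear equality constraints: minimise a^T R a subject to C^T a = u, whose Lagrange-multiplier solution is a = R^-1 C (C^T R^-1 C)^-1 u. The sketch below assumes this formulation. The exact constraint form and notation used in the thesis may differ, and the names `constrained_lp`, `C` and `u` are chosen here for illustration only.

```python
import numpy as np

def constrained_lp(R, C, u):
    """Minimise a^T R a subject to C^T a = u (R symmetric positive definite)."""
    R_inv_C = np.linalg.solve(R, C)           # R^-1 C
    lam = np.linalg.solve(C.T @ R_inv_C, u)   # (C^T R^-1 C)^-1 u
    return R_inv_C @ lam

# Hypothetical autocorrelation sequence of a first-order process, r[k] = 0.9**k.
m = 3
r = 0.9 ** np.arange(m + 1)
R = np.array([[r[abs(i - j)] for j in range(m + 1)] for i in range(m + 1)])

# Conventional LP as a special case: the only constraint is a0 = 1.
C = np.zeros((m + 1, 1))
C[0, 0] = 1.0
u = np.array([1.0])

a = constrained_lp(R, C, u)
```

With the single constraint a0 = 1 this reduces to conventional LP, and for the autocorrelation sequence above the result is approximately [1, -0.9, 0, 0]. Other choices of `C` and `u` impose other linear constraints on the coefficient vector, which is how a priori information can enter the model under this assumed formulation.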

This thesis consists of an overview and the following six publications:

  1. Bäckström T., Alku P., Paatero T. and Kleijn W. B., 2004. A time domain interpretation for the LSP decomposition. IEEE Transactions on Speech and Audio Processing, accepted for publication. © 2004 IEEE. By permission.
  2. Kleijn W. B., Bäckström T. and Alku P., 2003. On line spectral frequencies. IEEE Signal Processing Letters 10, number 3, pages 75-77. © 2003 IEEE. By permission.
  3. Bäckström T. and Alku P., 2003. All-pole modeling technique based on weighted sum of LSP polynomials. IEEE Signal Processing Letters 10, number 6, pages 180-183. © 2003 IEEE. By permission.
  4. Alku P. and Bäckström T., 2004. Linear predictive method for improved spectral modeling of lower frequencies of speech with small prediction orders. IEEE Transactions on Speech and Audio Processing, accepted for publication. © 2004 IEEE. By permission.
  5. Bäckström T. and Alku P., 2003. A constrained linear predictive model with the minimum-phase property. Signal Processing 83, number 10, pages 2259-2264.
  6. Bäckström T., 2003. Root-exchange property of constrained linear predictive models. In: Proceedings of the 2003 IEEE Workshop on Statistical Signal Processing (SSP03). St. Louis, MO, USA, September 28 - October 1, 2003, pages 81-84. © 2003 IEEE. By permission.

Keywords: linear prediction, line spectrum pair, minimum-phase property, speech modelling

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

© 2004 Helsinki University of Technology

