Springer Handbook of Speech Processing, 1st Edition

  • ISBN-10: 3540491279
  • ISBN-13: 9783540491279
  • Grade Level Range: College Freshman - College Senior
  • 1176 Pages | eBook
  • Original Copyright 2007 | Published/Released January 2010
  • This publication's content originally published in print form: 2007

From common consumer products, such as cell phones and MP3 players, to more sophisticated projects, such as human-machine interfaces and responsive robots, speech technologies are ubiquitous in 21st-century life. Many think it is just a matter of time before more applications of the science of speech become inescapable in our daily lives. This handbook is intended to play a fundamental role in shaping sustainable progress in speech research and development. A quickly accessible source of application-oriented, authoritative, and comprehensive information about these technologies, it combines the established knowledge derived from research in such fast-evolving disciplines as Signal Processing and Communications, Acoustics, Computer Science, and Linguistics.

The Springer Handbook of Speech Processing is aimed at three categories of readers: graduate students; professors and active researchers in academia and research labs; and engineers in industry who need to understand or implement specific algorithms for their speech-related products. The handbook could also be used as a sourcebook for one or more graduate courses on signal processing for speech and on different aspects of speech processing and its applications.

Table of Contents

Front Cover.
Half Title Page.
Other Frontmatter.
Title Page.
Copyright Page.
List of Editors.
List of Authors.
List of Abbreviations.
1: Introduction to Speech Processing.
2: A Brief History of Speech Processing.
3: Applications of Speech Processing.
4: Organization of the Handbook.
5: Introduction to Speech Processing: References.
6: Production, Perception, and Modeling of Speech.
7: Physiological Processes of Speech Production.
8: Overview of Speech Apparatus.
9: Voice Production Mechanisms.
10: Articulatory Mechanisms.
11: Summary.
12: Physiological Processes of Speech Production: References.
13: Nonlinear Cochlear Signal Processing and Masking in Speech Perception.
14: Basics.
15: The Nonlinear Cochlea.
16: Neural Masking.
17: Discussion and Summary.
18: Nonlinear Cochlear Signal Processing and Masking in Speech Perception: References.
19: Perception of Speech and Sound.
20: Basic Psychoacoustic Quantities.
21: Acoustical Information Required for Speech Perception.
22: Speech Feature Perception.
23: Perception of Speech and Sound: References.
24: Speech Quality Assessment.
25: Degradation Factors Affecting Speech Quality.
26: Subjective Tests.
27: Objective Measures.
28: Conclusions.
29: Speech Quality Assessment: References.
30: Signal Processing for Speech.
31: Wiener and Adaptive Filters.
32: Overview.
33: Signal Models.
34: Derivation of the Wiener Filter.
35: Impulse Response Tail Effect.
36: Condition Number.
37: Adaptive Algorithms.
38: MIMO Wiener Filter.
39: Conclusions.
40: Wiener and Adaptive Filters: References.
41: Linear Prediction.
42: Fundamentals.
43: Forward Linear Prediction.
44: Backward Linear Prediction.
45: Levinson-Durbin Algorithm.
46: Lattice Predictor.
47: Spectral Representation.
48: Linear Interpolation.
49: Line Spectrum Pair Representation.
50: Multichannel Linear Prediction.
51: Conclusions.
52: Linear Prediction: References.
53: The Kalman Filter.
54: Derivation of the Kalman Filter.
55: Examples: Estimation of Parametric Stochastic Process from Noisy Observations.
56: Extensions of the Kalman Filter.
57: The Application of the Kalman Filter to Speech Processing.
58: Summary.
59: The Kalman Filter: References.
60: Homomorphic Systems and Cepstrum Analysis of Speech.
61: Definitions.
62: Z-Transform Analysis.
63: Discrete-Time Model for Speech Production.
64: The Cepstrum of Speech.
65: Relation to LPC.
66: Application to Pitch Detection.
67: Applications to Analysis/Synthesis Coding.
68: Applications to Speech Pattern Recognition.
69: Summary.
70: Homomorphic Systems and Cepstrum Analysis of Speech: References.
71: Pitch and Voicing Determination of Speech with an Extension Toward Music Signals.
72: Pitch in Time-Variant Quasiperiodic Acoustic Signals.
73: Short-Term Analysis PDAs.
74: Selected Time-Domain Methods.
75: A Short Look into Voicing Determination.
76: Evaluation and Postprocessing.
77: Applications in Speech and Music.
78: Some New Challenges and Developments.
79: Concluding Remarks.
80: Pitch and Voicing Determination of Speech with an Extension Toward Music Signals: References.
81: Formant Estimation and Tracking.
82: Historical.
83: Vocal Tract Resonances.
84: Speech Production.
85: Acoustics of the Vocal Tract.
86: Short-Time Speech Analysis.
87: Formant Estimation.
88: Summary.
89: Formant Estimation and Tracking: References.
90: The STFT, Sinusoidal Models, and Speech Modification.
91: The Short-Time Fourier Transform.
92: Sinusoidal Models.
93: Speech Modification.
94: The STFT, Sinusoidal Models, and Speech Modification: References.
95: Adaptive Blind Multichannel Identification.
96: Overview.
97: Signal Model and Problem Formulation.
98: Identifiability and Principle.
99: Constrained Time-Domain Multichannel LMS and Newton Algorithms.
100: Unconstrained Multichannel LMS Algorithm with Optimal Step-Size Control.
101: Frequency-Domain Blind Multichannel Identification Algorithms.
102: Adaptive Multichannel Exponentiated Gradient Algorithm.
103: Summary.
104: Adaptive Blind Multichannel Identification: References.
105: Speech Coding.
106: Principles of Speech Coding.
107: The Objective of Speech Coding.
108: Speech Coder Attributes.
109: A Universal Coder for Speech.
110: Coding with Autoregressive Models.
111: Distortion Measures and Coding Architecture.
112: Summary.
113: Principles of Speech Coding: References.
114: Voice Over IP: Speech Transmission Over Packet Networks.
115: Voice Communication.
116: Properties of the Network.
117: Outline of a VoIP System.
118: Robust Encoding.
119: Packet Loss Concealment.
120: Conclusion.
121: Voice Over IP: Speech Transmission Over Packet Networks: References.
122: Low-Bit-Rate Speech Coding.
123: Speech Coding.
124: Fundamentals: Parametric Modeling of Speech Signals.
125: Flexible Parametric Models.
126: Efficient Quantization of Model Parameters.
127: Low-Rate Speech Coding Standards.
128: Summary.
129: Low-Bit-Rate Speech Coding: References.
130: Analysis-by-Synthesis Speech Coding.
131: Overview.
132: Basic Concepts of Analysis-by-Synthesis Coding.
133: Overview of Prominent Analysis-by-Synthesis Speech Coders.
134: Multipulse Linear Predictive Coding (MPLPC).
135: Regular-Pulse Excitation with Long-Term Prediction (RPE-LTP).
136: The Original Code Excited Linear Prediction (CELP) Coder.
137: US Federal Standard FS1016 CELP.
138: Vector Sum Excited Linear Prediction (VSELP).
139: Low-Delay CELP (LD-CELP).
140: Pitch Synchronous Innovation CELP (PSI-CELP).
141: Algebraic CELP (ACELP).
142: Conjugate Structure CELP (CS-CELP) and CS-ACELP.
143: Relaxed CELP (RCELP) – Generalized Analysis by Synthesis.
144: eX-CELP.
145: iLBC.
146: TSNFC.
147: Embedded CELP.
148: Summary of Analysis-by-Synthesis Speech Coders.
149: Conclusion.
150: Analysis-by-Synthesis Speech Coding: References.
151: Perceptual Audio Coding of Speech Signals.
152: History of Audio Coding.
153: Fundamentals of Perceptual Audio Coding.
154: Some Successful Standardized Audio Coders.
155: Perceptual Audio Coding for Real-Time Communication.
156: Hybrid/Crossover Coders.
157: Summary.
158: Perceptual Audio Coding of Speech Signals: References.
159: Text-to-Speech Synthesis.
160: Basic Principles of Speech Synthesis.
161: The Basic Components of a TTS System.
162: Speech Representations and Signal Processing for Concatenative Synthesis.
163: Speech Signal Transformation Principles.
164: Speech Synthesis Evaluation.
165: Conclusions.
166: Basic Principles of Speech Synthesis: References.
167: Rule-Based Speech Synthesis.
168: Background.
169: Terminal Analog.
170: Controlling the Synthesizer.
171: Special Applications of Rule-Based Parametric Synthesis.
172: Concluding Remarks.
173: Rule-Based Speech Synthesis: References.
174: Corpus-Based Speech Synthesis.
175: Basics.
176: Concatenative Synthesis with a Fixed Inventory.
177: Unit-Selection-Based Synthesis.
178: Statistical Parametric Synthesis.
179: Conclusion.
180: Corpus-Based Speech Synthesis: References.
181: Linguistic Processing for Speech Synthesis.
182: Why Linguistic Processing is Hard.
183: Fundamentals: Writing Systems and the Graphical Representation of Language.
184: Problems to be Solved and Methods to Solve Them.
185: Architectures for Multilingual Linguistic Processing.
186: Document-Level Processing.
187: Future Prospects.
188: Linguistic Processing for Speech Synthesis: References.
189: Prosodic Processing.