Springer, 2012. — 136 p.
During speech production, human beings impose durational constraints and intonation patterns on the sequence of sound units to convey the intended message. This inherent ability of human beings to use prosody (duration and intonation) knowledge is naturally acquired and is difficult to articulate. But for a machine to synthesize speech from text, it is necessary to acquire, represent and incorporate this prosody knowledge. The prosody constraints are not only characteristic of the speech message and the language, but they also characterize a speaker uniquely. Even for speech recognition, human beings seem to rely on prosody cues to resolve errors in the perceived sounds. Thus the acquisition and incorporation of prosody knowledge become important for developing speech systems. This book discusses methods to capture the prosody knowledge in speech in Indian languages, and to incorporate that knowledge in speech systems.
The book presents an approach to capture the implicit duration and intonation knowledge using neural network and support vector machine models. The results are demonstrated using a labeled speech database in three Indian languages, namely Hindi, Telugu and Tamil. The prosody models are shown to carry speaker and language characteristics, besides information about the message in speech. Thus the models can be explored for speaker and language identification. An important application of prosody models is in the synthesis of speech from text, and in voice conversion. For this, the prosody knowledge captured in the models needs to be incorporated in the speech signal. The book discusses a flexible prosody modification method that exploits the nature of excitation in the speech production mechanism. The book also discusses methods to modify speech prosody in real time, to modify the formant structure, and to impose a desired pitch contour.
This book is mainly intended for speech researchers working on prosodic aspects of speech, such as the characterization, representation, acquisition and incorporation of speech prosody. It is also useful for young researchers who want to pursue research in speech processing. It can serve as a textbook or reference book for a postgraduate-level advanced speech processing course.
Prosody Knowledge for Speech Systems: A Review
Analysis of Durations of Sound Units
Modeling Duration
Modeling Intonation
Prosody Modification
Practical Aspects of Prosody Modification
Summary and Conclusions
A Coding Scheme Used to Represent Linguistic and Production Constraints