Cambridge: Cambridge University Press, 2012. 387 p.
Multimodal signal processing for human meetings: an introduction
Data collection
Microphone arrays and beamforming
Speaker diarization
Speech recognition
Sampling techniques for audio-visual tracking and head pose estimation
Video processing and recognition
Language structure
Multimodal analysis of small-group conversational dynamics
Summarization
User requirements for meeting support technology
Meeting browsers and meeting assistants
Evaluation of meeting support technology
Conclusion and perspectives