Abstract
This article demonstrates that watching speech, as well as listening to it, is an integral part of observing other people, and that perceiving someone talking relates to other aspects of face processing. The perception of speaking faces can offer insights into how human communication skills develop and into the role of the face and its actions in those skills. Cross-modal processing occurs at a relatively early cognitive stage, both for processing speech and for identifying aspects of the talker; multimodal processing is thus apparent early in cognition for faces and for speech alike. Two domains, each of which is inherently multimodal, underpin much of the ability to process communicative information. The cortical correlates of each domain can be identified and their distinctive cognitive characteristics described; the details of how these domains interact await discovery.