
A question of time: subcortico-cortical interactions in speech processing

Chairs: Sonja A. Kotz & Michael Schwartze

Max Planck Institute for Human Cognitive and Brain Sciences, Minerva Research Group Neurocognition of Rhythm in Communication, Leipzig, Germany


Speech is a transient acoustic signal and consists of energy patterns established by events with a specific formal and temporal structure. An event may be defined as a salient change in the formal structure of an acoustic signal, e.g. in pitch or amplitude, whereas temporal structure pertains to the temporal relation between successive events. Events also demarcate different units of information, which, for example, correspond to features, phonemes, or syllables in speech. Accordingly, the temporal structure of the speech signal evolves across different timescales spanning from microseconds to milliseconds and beyond. Efficient speech processing probably makes use of all available information, including temporal structure. We put forward the idea that speech processing interacts with temporal processing in order to exploit and to generate temporal structure. Starting with early sensory processing in perception, and with the planning of speech motor behavior in production, the emerging integrative framework links cortical speech processing networks to subcortical structures by incorporating cerebello-thalamo-cortical and cortico-striato-thalamo-cortical circuits. We reason that this interaction reflects a general bias to synchronize with sensory input and to allocate internal resources accordingly. While we expect this mechanism to be effective in normal speech processing, it also has considerable implications for optimizing speech processing in a number of patient populations (e.g. by manipulating the temporal regularity of the acoustic signal).

Talk 1:

The motor-sensory control of speech and its role in learning a new language

Anna Simmonds, Robert Leech, & Richard J.S. Wise
Imperial College London, UK.

Articulatory movements necessary for producing native speech are over-learned and automatic. They may involve repeated simple motor-to-sensory mappings (such as infant babbling, "bababa"), sequences of syllables (such as when congratulating a sportsman, "bravo, bravo, bravo"), and the much more complex sequences that make up connected speech ("It's a nice day today"). This hierarchy of speech motor control depends on Brodmann's areas (BA) 6, 44 and 45 in inferior frontal cortex (IFC), strongly left-lateralised. It is a central part of a distributed network, consisting of higher-order prefrontal regions, subserving the cognitive control of thoughts and internal goals of communication, and posterior cortex (temporal and inferior parietal), which stores long-term phonological, syntactic and semantic representations. In addition, there is an essential link between the IFC and the sensory regions that monitor the match between the intended motor speech goal and the one actually achieved (signaled by sensory feedback). It is, therefore, expected that learning novel (non-native) phonemes, syllables and words requires an interaction between the IFC and auditory and somatosensory cortex - the planum temporale and parietal operculum - at the temporo-parietal junction (TPJ). Previous work has shown that regions involved in integrating motor feedforward with sensory feedback signals are more active during non-native speech production, even in proficient bilinguals, relative to native speech. Recent work, using a prospective training fMRI paradigm, explored regional cortical plasticity and changes in functional connectivity as subjects underwent an intense period of training in the production of non-native words. The emphasis was on correct articulation, and the subjects received no training in the meaning or grammatical properties of the words on which they were trained. Importantly, assessments were made of each English monolingual subject's competence at acquiring a 'good' accent in the three languages on which they were trained - Spanish, German and Mandarin. The results revealed rapid neuroplastic changes in ventral and dorsal prefrontal/premotor regions and TPJ cortex and in their functional connectivity, and showed how these changes related to individual proficiency.

Talk 2:

Cortical filters meet timing in speech production

Axel Riecker
Neurology, University of Ulm, Ulm, Germany.

Speech and melody production during fMRI demonstrated opposite activation patterns for speaking and singing at the level of the intrasylvian cortex. It could be assumed that the two hemispheres operate across different time domains ("double filtering by frequency" theory: left hemisphere = segmental information consisting of syllables, vowels, etc.; right hemisphere = intonation contours of verbal utterances and musical melodies consisting of pitch, loudness, etc.). Moreover, using fMRI during passive listening to click trains, we found that several cerebral structures outside the central auditory pathways each displayed distinct activation patterns: rate-response profiles resembling high-pass (left side) or low-pass filtered (right side) signal series emerged at the level of the anterior insula. It could therefore be assumed that the intrasylvian areas of the left and right cerebral hemispheres act as high- and low-pass filters, respectively, on auditory input, in line with the double filtering by frequency theory. Moreover, these areas seem to join up with the right cerebellum and the left inferior frontal gyrus to form a network subserving parsing/timing functions within the auditory-verbal domain. This assumption was further supported by data obtained with linguistic and non-linguistic stimuli in subjects with developmental dyslexia, demonstrating that the anterior insula represents an important neural correlate of deficient temporal processing of speech and non-speech sounds in dyslexia.

Talk 3:

Easy guessing, hard listening – Neural mechanisms of speech comprehension

Jonas Obleser
MPI for Human Cognitive and Brain Sciences, Auditory Cognition Group, Leipzig, Germany.

Comprehending speech is an astonishing faculty of the human brain, especially so under adverse listening conditions. How and by which neural mechanisms do we cope so well with the fleeting percepts of speech? In addition to “facilitating” influences such as semantic context, listeners also cope with challenging listening situations by fully exploiting their sensory and cognitive resources, e.g. their working memory capacities (“compensation”). I will present data from functional MRI (fMRI) and magneto- and electroencephalography studies (M/EEG; with an emphasis on neural oscillations) that utilize acoustically degraded speech stimuli to pursue the neural underpinnings of these facilitation and compensation mechanisms in detail.

Talk 4:

Timing and Speech: inherent or distinct?

Michael Schwartze & Sonja A. Kotz
MPI for Human Cognitive and Brain Sciences, Minerva Research Group Neurocognition of Rhythm in Communication, Leipzig, Germany.

Our sense of hearing rests on the processing of events that unfold in time. Acoustic events form patterns of varying formal and temporal complexity, extending from the clicks of a metronome and Morse code to musical notes and speech. Formal structure reflects characteristics such as pitch, timbre, and loudness, while temporal structure gives rise to the concepts of succession and duration. Both are independent sources of information in auditory cognition. However, perceived regularity in either dimension can be used to generate predictions regarding the future course of events. Such predictions instantiate a powerful mechanism that allows for proactive behavior in cognition and action. Here we propose that auditory processing, and speech processing in particular, interfaces with dedicated temporal processing systems such as the cerebellum, the supplementary motor area, and the basal ganglia in order to exploit temporal regularity and to predict the temporal locus of important events. The emerging integrative subcortico-cortical framework models speech processing as a dynamic process and provides a novel perspective on the development, optimization, and functional loss of speech processing capacities.
