Elsevier

NeuroImage

Volume 59, Issue 1, 2 January 2012, Pages 788-799

Phonological manipulation between speech perception and production activates a parieto-frontal circuit

https://doi.org/10.1016/j.neuroimage.2011.07.025

Abstract

Repetition has been shown to activate the so-called ‘dorsal stream’, a network of temporo-parieto-frontal areas subserving the mapping of acoustic speech input onto articulatory-motor representations. Among these areas, a region in the posterior Sylvian fissure at the temporo-parietal boundary (also called ‘area Spt’) has been suggested to play a central role, particularly with increasing computational demands on phonological processing. Most of the relevant evidence stems from tasks requiring metalinguistic processing. To date, the relevance of area Spt in natural phonological operations based on implicit linguistic knowledge has not been investigated. We examined two types of phonological processes assumed to be lateralized differently, i.e., the processing of syllabic stress versus subsyllabic segmental processing. Subjects modified an auditorily presented pseudoword in one of two ways before reproducing it overtly: (a) by a prosodic manipulation involving a stress shift across syllable boundaries, or (b) by a segmental manipulation involving a vowel substitution. Manipulation per se was expected to engage area Spt. Segmental compared to prosodic processing was expected to reveal predominantly left-lateralized activation, while prosodic compared to segmental processing was expected to result in bilateral or right-lateralized activation.

Contrary to expectation, activation in area Spt did not vary with increased phonological processing demand. Instead, area Spt was engaged regardless of whether subjects simply repeated a pseudoword or performed a phonological manipulation before reproduction. However, for both segmental and prosodic stimuli, reproduction after manipulation (compared to repetition) activated the left intraparietal sulcus and left inferior frontal cortex. We propose that these parieto-frontal regions are recruited when the task requires phonological manipulation over and above the more automated transfer of auditory into articulatory verbal codes, which appears to involve area Spt.

When directly contrasted with prosodic manipulation, segmental manipulation resulted in increased activation predominantly in left inferior frontal areas. This may be due to an increased demand on phonological sequencing operations at the subsyllabic phoneme level. Contrasted with segmental manipulation, prosodic manipulation did not result in increased activation, which may be due to a lower degree of morphosyntactic processing and to syllable-level processing.

Highlights

► Is area Spt specifically involved in naturally occurring phonological manipulation?
► Phonological manipulation in verbal reproduction engages parietal cortex and IFG.
► Area Spt is engaged in auditory–motor mapping regardless of task complexity.

Introduction

For decades, the link between acoustic speech information and the conceptual system has been central to research on the processing of auditory–verbal information. In contrast, the interface between auditory speech perception and the speech motor system has only recently become the focus of intensive research. Recent studies show that an auditory–motor link develops and is strengthened during a period of speech acquisition in childhood, when knowledge is acquired about how sound translates into articulation (Kuhl, 2000). There is strong evidence that the link between speech perception and production persists into adulthood. Experiments using auditory feedback perturbations have shown that adults subconsciously modify their own speech productions to counteract artificially induced shifts in pitch or formant frequency (Houde and Jordan, 2002, Larson et al., 2000, Purcell and Munhall, 2006, Tourville et al., 2008), suggesting that perceptual input may play a crucial role in guiding speech motor programming (Guenther, 2006). In support of this notion, adult speakers have been shown to unintentionally imitate incidental acoustic properties of linguistic stimuli during repetition (Kappes et al., 2009).

The auditory–motor integration of speech is thought to involve temporo-parietal and frontal regions, usually referred to as the “dorsal stream” (Hickok and Poeppel, 2004). Within the dorsal stream, a special role is ascribed to a region in the posterior Sylvian fissure at the temporo-parietal boundary (i.e., area Spt). As part of the left posterior planum temporale, area Spt has been shown to activate both during speech perception and production, suggesting that its specific role is to map acoustic speech signals onto articulatory representations (Buchsbaum et al., 2001, Hickok and Poeppel, 2004, Hickok and Poeppel, 2007, Okada and Hickok, 2006, Papathanassiou et al., 2000). Likewise, posterior superior temporal cortex has been suggested to map incoming auditory information onto stored templates derived from auditory experience, which are then used to generate a program for a motor response (Warren et al., 2005).

The strong link between sensory and motor functions implicated in speech processing may be licensed by distinct temporo-frontal fiber pathways, as suggested by several structural imaging studies. The arcuate fascicle, a fiber bundle connecting posterior temporal with inferior frontal structures, shows stronger structural maturation in the left compared to the right hemisphere between childhood and adolescence (Paus et al., 1999). According to the authors, this structural superiority may facilitate the fast bidirectional transfer of information between auditory and motor regions in the left hemisphere. A strongly left lateralized temporo-frontal white matter pathway was also shown by several diffusion tensor imaging (DTI) studies in adults (Barrick et al., 2007, Buchel et al., 2004, Parker et al., 2005).

One of the DTI studies tracking the arcuate fascicle is consistent with the assumption that Spt is part of this dorsal stream system (Catani et al., 2005). In addition to the direct temporo-frontal tract, this study reported a second indirect temporo-frontal pathway, consisting of two tracts with a connecting relay in inferior parietal cortex (i.e. BA39/BA40). According to the authors, the indirect pathway is used whenever an intervening stage, such as phonological recoding, occurs between auditory input and articulatory output. Other fiber tracking studies have independently shown structural connectivity between the supramarginal gyrus (Parker et al., 2005) or inferior parietal lobule/intraparietal sulcus (Frey et al., 2008), and inferior frontal as well as superior temporal regions, respectively.

The left temporo-parietal junction and specifically area Spt (e.g. Hickok, 2009) have been proposed to be involved in tasks which require temporary storage of phonological information (Buchsbaum and D'Esposito, 2008, Jacquemot and Scott, 2006). This is because area Spt is assumed to “act as an auditory–motor interface that serves to bind acoustic representations of speech with articulatory counterparts” (Buchsbaum and D'Esposito, 2008, p. 13). In fact, several functional imaging studies have revealed an increased involvement of Spt in tasks with increased phonological processing demands. Activation in the dorsal posterior temporal plane is influenced by word length, showing greater BOLD signal changes for multisyllabic compared to monosyllabic words in a covert object naming task (Okada et al., 2003), during the silent rehearsal phase of word pairs in a verbal working memory task (Buchsbaum et al., 2005a), and during covert rehearsal of nonsense sentences in which verbs and nouns had been replaced by pseudowords (Hickok et al., 2003). Sensory–motor integration for phonological information has recently been proposed to be a prominent function of area Spt (Hickok, 2009).

Thus, area Spt may play a central role in the mapping of acoustic speech input onto articulatory-motor representations, particularly when there is an increased computational demand on phonological processing. In prior studies, activation of components of the dorsal stream was shown predominantly in phonological tasks with a relatively high verbal working memory load (Buchsbaum et al., 2001, Buchsbaum et al., 2005a, Burton et al., 2000, Heim et al., 2003). Some of these studies used relatively artificial tasks requiring metalinguistic processing, such as subvocal rehearsal (Buchsbaum et al., 2005a) or explicit phoneme discrimination (Ashtari et al., 2004, Burton et al., 2000, Jacquemot et al., 2003, Zaehle et al., 2008). In addition, some of the studies selectively examined either receptive (Burton et al., 2000, Heim et al., 2003) or expressive phonological processes (Okada et al., 2003). The question arises, therefore, whether Spt is involved in natural phonological operations such as stress shifts across syllable boundaries (as in China versus Chinese) or vowel changes (as in woman versus women), operations which are usually performed according to implicit linguistic knowledge. Thus, the primary goal of the present study was to examine the involvement of area Spt in performing phonological manipulations, using naturalistic phonological processes requiring relatively low verbal working memory resources.

Auditory–motor integration generally is assumed to rely on left-hemispheric dorsal stream structures (e.g. Hickok and Poeppel, 2007). While prelexical auditory processing involves the superior temporal gyri of both hemispheres (Hickok, 2009, Hickok and Poeppel, 2004, Hickok et al., 2008), later auditory–verbal processing stages are presumed to involve left lateralized pathways (e.g. Hickok and Poeppel, 2000). However, the lateralization of dorsal stream activation during auditory–motor integration may depend on the nature of the phonological process. Available evidence suggests differential lateralization of segmental versus prosodic processing. Sublexical processes requiring the sequencing of phonemic information primarily involve left hemispheric regions (Ashtari et al., 2004, Burton and Small, 2006, Burton et al., 2000, Gelfand and Bookheimer, 2003, Heim et al., 2003, Jacquemot et al., 2003, Zaehle et al., 2008). In contrast, findings regarding the lateralization of linguistic–prosodic processes have been less consistent. Some lesion studies provide evidence consistent with the processing of prosodic units in the right hemisphere (Bradvik et al., 1991, Weintraub et al., 1981), whereas other studies suggest a left-hemispheric dominance for linguistic prosody (Arciuli and Slowiaczek, 2007, Emmorey, 1987, Van Lancker, 1980). Neuroimaging evidence, comparing the production of rhythmic to isochronous syllable sequences, points to an involvement of right hemisphere regions in linguistic–prosodic processing at the supra-syllabic level (Riecker et al., 2002). Therefore, a secondary goal was to examine the lateralization of dorsal stream activation during segmental versus prosodic phonological manipulation.

To investigate the involvement of area Spt in phonological processing and to determine the role of left versus right hemisphere structures, we used a segmental and a prosodic manipulation task, both based on naturalistic phonological regularities. Subjects were asked to manipulate auditory pseudoword stimuli phonologically according to implicit linguistic knowledge (analogous to the change from ‘woman’ to ‘women’ or from ‘China’ to ‘Chinese’) before overt reproduction. Verbatim repetition of the stimulus served as the control condition. In constructing the tasks, we assumed that correct reproduction after phonological manipulation requires three steps: the stimulus first has to be analyzed sequentially, then the phonological information has to be manipulated at the prosodic or the segmental level, respectively, and finally the stimulus has to be reassembled in order to generate the target utterance. In contrast, in the verbatim repetition condition a stimulus only has to be analyzed sequentially and then reassembled for production.
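The assumed processing stages (sequential analysis, manipulation at the prosodic or segmental level, reassembly) can be illustrated with a toy model. The following Python sketch is an illustration only and not part of the study's methods: the `Syllable` structure, the function names, and the example pseudoword are hypothetical, and stress is marked by capitalization purely for display.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Syllable:
    """One analyzed syllable of a pseudoword (hypothetical representation)."""
    onset: str
    vowel: str
    coda: str = ""
    stressed: bool = False


def render(word: List[Syllable]) -> str:
    """Reassemble syllables into a string; stressed syllables are upper-cased."""
    parts = []
    for syl in word:
        text = syl.onset + syl.vowel + syl.coda
        parts.append(text.upper() if syl.stressed else text)
    return ".".join(parts)


def shift_stress(word: List[Syllable], target: int) -> List[Syllable]:
    """Prosodic manipulation: move stress to the syllable at index `target`."""
    return [replace(syl, stressed=(i == target)) for i, syl in enumerate(word)]


def substitute_vowel(word: List[Syllable], position: int, new_vowel: str) -> List[Syllable]:
    """Segmental manipulation: replace the vowel of one syllable."""
    return [
        replace(syl, vowel=new_vowel) if i == position else syl
        for i, syl in enumerate(word)
    ]


# Invented pseudoword for illustration, not a stimulus from the study.
stimulus = [Syllable("d", "o", "l", stressed=True), Syllable("m", "u")]

repetition = render(stimulus)                        # "DOL.mu"
prosodic = render(shift_stress(stimulus, 1))         # "dol.MU"
segmental = render(substitute_vowel(stimulus, 0, "e"))  # "DEL.mu"
```

In this sketch, verbatim repetition corresponds to calling `render` on the analyzed stimulus directly, whereas both manipulation conditions insert one additional transformation between analysis and reassembly.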

Our main question was whether additional phonological manipulations between speech perception and speech production would preferentially engage area Spt when compared to verbatim repetition. Our secondary question was whether the nature of the employed phonological operation, namely segmental versus prosodic manipulation, would result in preferential engagement of the left and/or the right hemisphere.

We hypothesized that repetition as well as reproduction after manipulation involves structures associated with the dorsal stream, predominantly superior temporal and inferior frontal areas. Reproduction after manipulation, as compared to verbatim repetition, was expected to reveal increased activation of Spt due to the additional phonological manipulation required. The comparison between the two phonological processes was expected to show predominantly left lateralized activation for segmental and more bilateral or right lateralized activation for prosodic manipulation.
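Predictions about hemispheric lateralization are often quantified with a lateralization index, LI = (L − R) / (L + R), where L and R are activation measures in homologous left and right regions. The following Python sketch shows the basic formula only; this section does not state which measure the authors used, and the voxel counts and the ±0.2 cut-off below are assumptions chosen for illustration, not values from the study.

```python
def lateralization_index(left: float, right: float) -> float:
    """Return LI = (L - R) / (L + R); +1 is fully left, -1 is fully right."""
    if left < 0 or right < 0:
        raise ValueError("activation measures must be non-negative")
    total = left + right
    if total == 0:
        raise ValueError("no activation in either hemisphere")
    return (left - right) / total


def classify(li: float, threshold: float = 0.2) -> str:
    """Label an LI value; the threshold is an assumed convention."""
    if li > threshold:
        return "left"
    if li < -threshold:
        return "right"
    return "bilateral"


# Hypothetical suprathreshold voxel counts for one contrast.
li = lateralization_index(left=300, right=100)  # 0.5
label = classify(li)                            # "left"
```

Under these assumptions, a predominantly left-lateralized pattern yields a clearly positive index, while a bilateral pattern yields a value near zero.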

Participants

Twenty-three healthy right-handed subjects participated in the study. One subject had to be excluded from further analysis because he misunderstood the task in one of the sessions. All subjects were native speakers of German without any history of serious medical, neurological or psychiatric illness, or hearing loss (mean age 26.8 years, range 21–36 years, eleven females). Hand preference was tested with the 10-item version of the Edinburgh Handedness Inventory (Oldfield, 1971). Subjects had

Phonemic accuracy across tasks and phonological processes

All participants reached a satisfactory level of phonemic accuracy, with a group mean of 86.8% correct responses across tasks and phonological processes (range 80.6–93.1%, SD 3.7%, see Table 2). Thus, none of the subjects had to be excluded from further analysis. Apart from the primary factor phonological process (i.e., segmental or prosodic processing), task was included as a second within-subject factor in a repeated-measures MANOVA. This analysis revealed significant main effects of task

Discussion

The present fMRI study aimed to examine whether areas associated with the dorsal stream are involved in perception–production tasks requiring relatively few verbal working memory resources. Additionally, the present study sought to examine the relative contributions of the left and right hemispheres, depending on the segmental versus prosodic nature of a phonological manipulation.

Subjects performed two different phonological tasks which required either a segmental or a prosodic manipulation of

Acknowledgments

This work was supported by a grant from the German Federal Ministry of Education and Research (BMBF-01GW0572) to WZ and AB and was carried out as part of the collaborative BMBF research project “From dynamic sensorimotor interaction to conceptual representation: Deconstructing apraxia”. We thank two anonymous reviewers for their constructive comments.

References (96)

  • F.H. Guenther. Cortical interactions underlying the production of speech sounds. J. Commun. Disord. (2006)
  • F.H. Guenther et al. Neural modeling and imaging of the cortical interactions underlying syllable production. Brain Lang. (2006)
  • S. Heim et al. Broca's area in the human brain is involved in the selection of grammatical gender for language production: evidence from event-related functional magnetic resonance imaging. Neurosci. Lett. (2002)
  • S. Heim et al. Phonological processing during language production: fMRI evidence for a shared production–comprehension network. Brain Res. Cogn. Brain Res. (2003)
  • G. Hickok. The functional neuroanatomy of language. Phys. Life Rev. (2009)
  • G. Hickok et al. Towards a functional neuroanatomy of speech perception. Trends Cogn. Sci. (2000)
  • G. Hickok et al. Dorsal and ventral streams: a framework for understanding aspects of the functional anatomy of language. Cognition (2004)
  • G. Hickok et al. Bilateral capacity for speech sound processing in auditory comprehension: evidence from Wada procedures. Brain Lang. (2008)
  • C. Hutton et al. Image distortion correction in fMRI: a quantitative evaluation. Neuroimage (2002)
  • P. Indefrey et al. Syntactic processing in left prefrontal cortex is independent of lexical meaning. Neuroimage (2001)
  • C. Jacquemot et al. What is the relationship between phonological short-term memory and speech processing? Trends Cogn. Sci. (2006)
  • J. Kappes et al. Unintended imitation in nonword repetition. Brain Lang. (2009)
  • W.J.M. Levelt et al. Do speakers have access to a mental syllabary? Cognition (1994)
  • R.C. Martin et al. An event-related fMRI investigation of phonological versus semantic short-term memory. J. Neurolinguist. (2003)
  • L. Mary et al. Extraction and representation of prosodic features for language and speaker recognition. Speech Comm. (2008)
  • T. Nichols et al. Valid conjunction inference with the minimum statistic. Neuroimage (2005)
  • K. Okada et al. Left posterior auditory-related cortices participate both in speech perception and speech production: neural overlap revealed by fMRI. Brain Lang. (2006)
  • R.C. Oldfield. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia (1971)
  • J. Pa et al. A parietal–temporal sensory–motor integration area for the human vocal tract: evidence from an fMRI study of skilled musicians. Neuropsychologia (2008)
  • R. Padovani et al. Grammatical gender in the brain: evidence from an fMRI study on Italian. Brain Res. Bull. (2005)
  • D. Papathanassiou et al. A common language network for comprehension and production: a contribution to the definition of language epicenters with PET. Neuroimage (2000)
  • G.J. Parker et al. Lateralization of ventral and dorsal auditory–language pathways in the human brain. Neuroimage (2005)
  • J.R. Pedersen et al. Origin of human motor readiness field linked to left middle frontal gyrus by MEG and PET. Neuroimage (1998)
  • C. Peschke et al. Auditory–motor integration during fast repetition: the neuronal correlates of shadowing. Neuroimage (2009)
  • H. Pihan. Affective and linguistic processing of speech prosody: DC potential studies. Prog. Brain Res. (2006)
  • S.M. Ravizza et al. Functional dissociations within the inferior parietal cortex in verbal working memory. Neuroimage (2004)
  • A. Riecker et al. Hemispheric lateralization effects of rhythm implementation during syllable repetitions: an fMRI study. Neuroimage (2002)
  • K. Rubia et al. An fMRI study of reduced left prefrontal activation in schizophrenia during normal inhibitory function. Schizophr. Res. (2001)
  • B. Rypma et al. Load-dependent roles of frontal brain regions in the maintenance of working memory. Neuroimage (1999)
  • K.D. Singh et al. Transient and linearly graded deactivation of the human default-mode network by a visual detection task. Neuroimage (2008)
  • K. Specht et al. Tracing the ventral stream for auditory speech processing in the temporal lobe by using a combined time series and independent component analysis. Neurosci. Lett. (2008)
  • F. Strand et al. Phonological working memory with auditory presentation of pseudo-words: an event related fMRI study. Brain Res. (2008)
  • J.A. Tourville et al. Neural mechanisms underlying auditory feedback control of speech. Neuroimage (2008)
  • J.E. Warren et al. Sounds do-able: auditory–motor transformations and the posterior temporal plane. Trends Neurosci. (2005)
  • M. Wilke et al. LI-tool: a new toolbox to assess lateralization in functional MR-data. J. Neurosci. Methods (2007)
  • M. Wilke et al. A combined bootstrap/histogram analysis approach for computing a lateralization index from neuroimaging data. Neuroimage (2006)
  • S.M. Wilson et al. Neural responses to non-native phonemes varying in producibility: evidence for the sensorimotor nature of speech perception. Neuroimage (2006)
  • T. Zaehle et al. Segmental processing in the human auditory dorsal stream. Brain Res. (2008)