Research - Neurofunctional Imaging
USING fMRI TO COMPARE TALKING ALOUD TO "THE LITTLE VOICE IN THE HEAD"
Jie Huang1, Thomas H. Carr2, and Yue Cao1
Departments of Radiology1 and Psychology2
Michigan State University, East Lansing, MI
SUMMARY
Vocalization-induced motion artifact has limited the ability to apply fMRI to speech production. We describe a set of techniques for motion reduction, detection, and correction intended to remove these artifacts from cortical activation during overt speech. We combined these techniques with Event-Related fMRI to compare overt speech with silent speech, focusing on Broca’s area and its right-hemisphere homologue plus two inferior regions of left and right primary motor cortex as regions of interest (Figure 1).
After training to reduce head motion, subjects performed four speech tasks, each in a run of 12 trials lasting 32 s each. Whole-brain-plus-vocal-tract images were collected by using gradient echo EPI with resolution of 7 x 3.75 x 3.75 mm. Subjects named letters overtly, named letters silently, generated animal names starting with particular letters overtly, and generated animal names silently. Image motion analyses revealed small, correctable rotations in the sagittal plane, which varied much more with subject than with task. Vocal-tract muscle movements produced large, sharply peaked signal changes in the first 4 s of each “overt” trial that could be used as the signature of articulatory motion (Figure 2). The signal changes appeared in some inferior-posterior brain voxels, but did not pass the activation threshold in the regions of interest.
Activation was analyzed using complex cross correlation techniques with phase and intensity criteria. The middle-inferior portion of primary motor cortex was activated bilaterally during overt speech but not during silent speech (Figure 3, Figure 4 and Figure 5). Activity in Broca’s area and its right homologue was more complex, responding in opposite directions to letter naming versus animal name generation (Figure 3 and Figure 6). Compared to silent speech, overt letter naming increased Broca’s area activation, both absolutely and relative to its right homologue (Figure 3a and Figure 6). This result might be taken to mean that Broca’s area is particularly important to overt speaking, perhaps because of the precise articulatory programming required to actually say a word aloud. This interpretation fails immediately, however, when animal-name generation is considered. In animal-name generation, overt speaking decreased Broca’s activation compared to silent speech, both absolutely and relative to its right homologue (Figure 3b and Figure 6). Thus overt speaking always increased the lateralization of inferior frontal activity, but in letter naming activation became more left-lateralized whereas in animal-name generation activation became slightly more right-lateralized. Though surprising, part of this pattern has been reported before using PET, lending some support to our findings and hence our methods.
INTRODUCTION
fMRI studies of cortical language functions favor “silent” task paradigms with no overt speaking, despite the importance of overt speech in linguistics, psycholinguistics, aphasiology, and everyday life, because vocalization can induce severe motion artifacts in MR images [1-3]. If the neural networks of speech production were organized in a straightforward hierarchical fashion, with overt speech equal to silent speech plus an independent motor execution process that can be turned on or off as needed without much altering the rest of the system, this would not be a problem: results obtained during silent speaking could be extrapolated rather directly to overt speaking. However, a PET study of cortical activation during silent versus overt word reading and silent versus overt picture naming by Bookheimer et al. found complex differences among the tasks [4] that suggested the possibility of different neural networks for silent speech and overt speech. In addition, neuroimaging studies of verbal working memory identify neural circuitry including regions of dorsolateral prefrontal cortex, parietal cortex, and temporal cortex that, though left-lateralized to a significant extent and sometimes also including Broca’s area, bear little resemblance to classical conceptions of the language areas of the brain [5-7]. Finally, aphasic patients often complain that the words they speak bear a poor correspondence to the words they think and intend to say [8,9]. These findings and phenomena make it an empirical question to what extent “the little voice in the head” shares neural substrate with speaking out loud. The present study combined Event-Related fMRI methodology (ER-fMRI) with a set of techniques for motion reduction, detection, and correction to further investigate cortical activation during overt speech and compare it to silent speech, with Broca’s area and primary motor cortex as particular regions of interest.
SOURCES OF ARTIFACT
What are the motion artifacts that need to be guarded against? People tend to bob their heads when speaking, producing movement in the sagittal plane that is correlated with task activity [2, 3]. In addition, muscles of the mouth, lips, tongue, jaw, and face move during speech, which can disturb the B0 magnetic field, changing the MR signals in the field of view that includes the brain even if the muscle movements occur outside the field of view [1].
ADVANTAGES OF EVENT-RELATED fMRI
In block-design studies, BOLD signals that emerge over several seconds after speaking each word are contaminated by signal changes caused by head and vocal-tract movements while speaking the following few words. In ER-fMRI, each trial consists of performance of the task followed by an extended period of rest that allows the hemodynamic response function (HRF) to develop and subside to baseline. A signal change caused by head or vocal-tract movement is likely to appear quite early in each trial timeline, while speech is actually occurring. The HRF for the neural activity that creates the speaking is delayed, not rising significantly above baseline for 2-4 s and not reaching its peak until 5-7 s after speaking. Birn et al. [10] showed the possibility of using these different temporal profiles to discriminate motion-induced false positives from true BOLD signal changes, and we pursue that possibility here.
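The contrast between the two designs can be made concrete with a small calculation. The sketch below is illustrative only: the 3 s spacing between spoken words in a block design and the 2-12 s window over which the BOLD response is taken to be of interest are assumptions, not values from this study, and `later_motion_overlap` is a hypothetical helper. The 4 s motion window and the 32 s trial length follow the text.

```python
def later_motion_overlap(onset_interval_s, n_later_events,
                         bold_window_s=(2.0, 12.0), motion_len_s=4.0):
    """Fraction of one event's BOLD window overlapped by motion from later events.

    The event of interest starts at t = 0. Later events start at multiples of
    onset_interval_s and each produces motion for motion_len_s seconds.
    bold_window_s is an assumed window in which the BOLD response is measured.
    """
    window_start, window_end = bold_window_s
    covered = 0.0
    cursor = window_start  # left edge of the part of the window not yet counted
    for k in range(1, n_later_events + 1):
        start = max(k * onset_interval_s, cursor)
        end = min(k * onset_interval_s + motion_len_s, window_end)
        if end > start:
            covered += end - start
            cursor = end
    return covered / (window_end - window_start)


# Assumed block design: a new word every 3 s.
block_fraction = later_motion_overlap(3.0, n_later_events=5)
# Event-related design from this study: one trial every 32 s.
event_related_fraction = later_motion_overlap(32.0, n_later_events=5)
```

Under these assumptions most of the BOLD window in the block design coincides with motion from subsequent words, whereas in the event-related design none of it does.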
METHODS AND MATERIALS
Seven normal right-handed native English speakers (4 male, 3 female, aged 20 to 35 years) performed four language tasks, each during a separate functional scan: (1) silently naming a visually presented letter of the alphabet; (2) overtly naming a visually presented letter; (3) silently generating an animal name starting with a given visually presented letter; and (4) overtly speaking an animal name starting with a given visually presented letter. The letter-naming tasks were done before the animal-name-generation tasks, with the order of silent versus overt speaking within each type of task counterbalanced across subjects. Two sets of stimuli, each consisting of 12 letters without duplication between sets, were used for the study, with silent and overt tasks assigned different sets. Sagittal T2*-weighted images of the whole head (including both the brain and the vocal tract) were acquired on a GE 1.5 T clinical scanner using a gradient-echo echo-planar imaging (EPI) pulse sequence (field of view 24 cm, TE/TR = 50/2000 ms, flip angle 90°, matrix size 64 x 64, slice thickness 7 mm). During scanning, each letter was displayed for 1 s, followed by a fixation point lasting 31 s. A total of 192 images per anatomic section was acquired for each functional scan. Before scanning, all subjects received a training session in the magnet to reduce head movement while speaking overtly.
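As a consistency check, the acquisition parameters above can be related to one another with simple arithmetic. The short sketch below uses only values stated in the text; the variable names are illustrative.

```python
# Values taken from the Methods and Summary.
FOV_MM = 240.0        # field of view, 24 cm
MATRIX = 64           # 64 x 64 acquisition matrix
SLICE_MM = 7.0        # slice thickness
TR_S = 2.0            # repetition time, 2000 ms
STIMULUS_S = 1.0      # letter display time
FIXATION_S = 31.0     # fixation period after each letter
N_TRIALS = 12         # one trial per letter in a stimulus set

in_plane_mm = FOV_MM / MATRIX                 # in-plane voxel size
trial_s = STIMULUS_S + FIXATION_S             # duration of one trial
images_per_trial = int(trial_s / TR_S)        # images acquired per trial
images_per_run = N_TRIALS * images_per_trial  # images per section per scan
run_s = N_TRIALS * trial_s                    # duration of one functional scan

print(f"voxel size: {SLICE_MM} x {in_plane_mm} x {in_plane_mm} mm")
print(f"{images_per_trial} images per trial, {images_per_run} per run, {run_s} s per run")
```

The results agree with the reported 7 x 3.75 x 3.75 mm resolution, the 32 s trial length, and the 192 images per anatomic section.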
DATA ANALYSIS
Head motion detection and correction: Functional images were assessed and corrected for possible in-plane translation and rotation of the head [11]. Each image in the dynamic series was registered with the first image by planar translations and rotations. The magnitudes of the translations and rotations were compared between silent and overt conditions.
Baseline correction and normalization: The signal-intensity time course of each voxel in an image series was corrected for possible slow baseline drift using zeroth-, first-, and second-order polynomials, and then normalized to allow signal averaging over voxels and subjects.
Assessment of articulatory motion: Signal intensity time courses in the vocal-tract musculature showed that the signal changes induced by articulatory motion during overt speaking occurred earlier than the delayed BOLD signal changes. The temporal profiles of the MRI signals derived from muscle movement were used to exclude articulatory motion artifacts in cortical activation.
Statistical analysis of activation: Image time courses were cross-correlated [12] with sine and cosine reference functions, voxel by voxel, to obtain a pair of complex cross-correlation coefficients (CCC) [13]. The magnitude and phase of the CCC were then calculated [13]. The phase represents the relative timing of signal changes from trial onset and was used to differentiate false positives due to motion from true BOLD activation. Activation was thresholded by CCC magnitude, with a type I error of p<0.0014 per voxel, and by phase, to exclude voxels having an early signal change.
Regions of interest (ROI) analysis: First, a quantitative assessment of activated tissue volumes was made in six cortical regions: Broca’s area and its right homologue, the left and right “mouth, lips and tongue” regions of the primary motor cortex (MLT-PMC), and the left and right “inferior vocalization” regions of the PMC (IV-PMC) (Figure 1).
Second, for each ROI, the most activated cluster of three contiguous voxels located near the center of the ROI was chosen as a mask to compare BOLD signals from each pair of the silent and the overt paradigms. A two-tailed paired t test was used to test for a difference between the silent and overt conditions.
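The role of phase in the statistical analysis can be illustrated with a minimal sketch. It is not the analysis code used in this study: it assumes sine and cosine references at the trial frequency (one cycle per 32 s trial), uses synthetic time courses, omits the baseline correction step, and uses a hypothetical magnitude threshold and early-response cutoff (`MAG_THRESHOLD`, `EARLY_CUTOFF_S`) in place of the p<0.0014 criterion. The function names are likewise illustrative.

```python
import math

TR_S = 2.0             # repetition time (from the Methods)
TRIAL_S = 32.0         # 1 s letter + 31 s fixation (from the Methods)
N_TRIALS = 12          # trials per run (from the Summary)
EARLY_CUTOFF_S = 4.0   # assumption: shorter delays are treated as motion
MAG_THRESHOLD = 0.3    # assumption: illustrative, not the p<0.0014 threshold


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def complex_cc(series):
    """Return (magnitude, delay in s) from cosine and sine correlations."""
    omega = 2.0 * math.pi / TRIAL_S
    times = [i * TR_S for i in range(len(series))]
    r_cos = pearson(series, [math.cos(omega * t) for t in times])
    r_sin = pearson(series, [math.sin(omega * t) for t in times])
    magnitude = math.hypot(r_cos, r_sin)
    phase = math.atan2(r_sin, r_cos) % (2.0 * math.pi)
    return magnitude, phase / omega  # phase converted to a delay from trial onset


def is_activated(series):
    """Apply both criteria: large enough magnitude and a late enough response."""
    magnitude, delay_s = complex_cc(series)
    return magnitude >= MAG_THRESHOLD and delay_s >= EARLY_CUTOFF_S


def run_of(trial_shape):
    """Repeat one trial's signal shape for every trial in the run."""
    return trial_shape * N_TRIALS


images_per_trial = int(TRIAL_S / TR_S)

# Synthetic delayed response: a bump centred 6 s after trial onset.
bold_trial = [0.0] * images_per_trial
bold_trial[2:5] = [0.5, 1.0, 0.5]

# Synthetic motion artifact: a sharp spike 2 s after trial onset.
motion_trial = [0.0] * images_per_trial
motion_trial[1] = 1.0

bold_mag, bold_delay = complex_cc(run_of(bold_trial))
motion_mag, motion_delay = complex_cc(run_of(motion_trial))
```

In this toy case both time courses exceed the magnitude threshold, so magnitude alone would not separate them. The phase criterion does: the early spike yields a delay of about 2 s and is rejected, while the delayed bump yields about 6 s and is kept.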
RESULTS
Head Motion Detection: In all 7 subjects, in-plane (sagittal) rotations but no translations were detected in images obtained during both silent and overt speaking conditions. For 4 subjects the detected rotations were less than 1 degree in all four conditions. In the other 3 subjects, head rotation > 1° but < 1.5° was detected in less than 1% of the total number of images. Detected head movement in images during overt speaking was not substantially worse than that during silent speaking, and therefore motion was more subject-dependent than task-dependent.
Assessment of articulatory motion: Signal intensity changes as large as 70% were observed in some voxels of the tongue and throat areas in the overt conditions, presumably induced by articulatory motion during overt speaking. During overt speaking of a letter name, the signal changes in the tongue area occurred 2-4 s earlier than the BOLD signal changes in the left MLT-PMC (Figure 2).
Regions of interest analysis: During overt speaking, both in letter naming and in animal-name generation, robust activation was observed bilaterally in the MLT-PMC and IV-PMC (Figure 3, Figure 4, and Figure 5). These regions were not activated above baseline during silent speech. Activity in Broca’s area and its right homologue was more complex, responding in opposite directions to letter naming versus animal-name generation. Overt letter naming increased Broca’s area activation, both absolutely and relative to its right homologue. In animal-name generation, overt speaking decreased Broca’s activation compared to silent speech, both absolutely and relative to its right homologue (Figure 3 and Figure 6).
DISCUSSION AND CONCLUSION
This study investigated whether in MRI-naïve subjects, articulatory motion could be controlled to a manageable level, meaning that the amount of articulatory motion could be limited and signal changes due to motion could be detected and discriminated from BOLD activation. A training session prior to scanning limited motion artifacts induced by head movement while speaking overtly, with only small rotations of the head in the sagittal plane observed. Head movement detected during overt speaking, though perhaps increased slightly in some subjects, was never substantially worse than head movement detected during silent speech. Moreover, most subjects showed approximately the same distribution of head-motion artifact during silent speech as during overt speech, indicating that under the training conditions and task demands we examined, head motion was considerably more subject-dependent than task-dependent. Image registration algorithms were able to correct the rotation-induced image misalignment we observed.
A second focus of the study, beyond development of methods for identifying and correcting motion artifact, was to assess the comparability of the language-production-related neural pathways involved in silent and overt speech. We observed strikingly different patterns of activation.
The vocalization-relevant areas of PMC, covered by the two ROIs we refer to as MLT-PMC and IV-PMC, were activated only during overt speech. In both of these motor areas activation was bilateral, consistent with known bilateral innervation of midline vocal-tract musculature [14].
Activation in Broca’s area occurred while silently speaking either a letter name or an animal name. However, while Broca’s area showed more activity in naming a letter overtly than in naming a letter silently, when the task was generating an animal name, Broca’s area actually showed less activity in speaking overtly than in speaking silently. Historically, Broca’s area has been associated with motor planning and articulatory coding of speech output [15, 16] and with syntactic processing [16-18]. In contrast, the present results suggest that if Broca’s area plays a role in phonological or articulatory coding, this role is not particular to overt production – that is, it is not tied specifically to motor output. Though perhaps surprising, this outcome is consistent with findings reported by previous PET and fMRI studies of speaking a word [2, 3, 14, 19]. Clearly much work remains to be done before the mysteries of Broca’s area are solved. Even the role of Broca’s area in producing Broca’s aphasia does not appear to be certain [20, 21].
REFERENCES
1. R. M. Birn, et al., Magn. Reson. Med. 40, 55 (1998).
2. F. Z. Yetkin, et al., Am. J. Neuroradiology 16, 1087 (1995).
3. R. Hinke, et al., Cognitive Neuroscience and Neuropsychology 4, 675 (1993).
4. S. Y. Bookheimer, et al., Human Brain Mapping 3, 93 (1995).
5. E. E. Smith, et al., Science 283, 1657 (1999).
6. B. R. Postle, et al., Proc Natl Acad Sci USA 96, 12959 (1999).
7. J. Jonides, et al., J. Cognitive Neuroscience 9, 462 (1997).
8. R. C. Marshall, et al., Aphasiology 8, 535 (1994).
9. J. Marshall, et al., Brain Lang 63, 79 (1998).
10. R. M. Birn, et al., Human Brain Mapping 7, 106 (1999).
11. Y. Cao, et al., J. Mag. Res. Imag. 3, 869 (1993).
12. P. A. Bandettini, et al., Magn. Reson. Med. 30, 161 (1993).
13. A. T. Lee, et al., Magn. Reson. Med. 33, 745 (1995).
14. R. J. Wise, et al., Lancet 353, 1057 (1999).
15. N. Geschwind, Specializations of the human brain. Sci Am 241, 180 (1979).
16. M. S. Gazzaniga, et al., Cognitive neuroscience: The biology of the mind. New York: Norton; 1998.
17. D. Caplan, et al., J. Cogn. Neurosci. 10, 541 (1998).
18. A. D. Friederici, et al., Brain Lang. 74, 289 (2000).
19. S. Y. Bookheimer, et al., In: First International Conference on Functional Mapping of the Human Brain, Paris, France, p. 429 (1995).
20. N. F. Dronkers, Nature 384, 159 (1996).
21. N. F. Dronkers, Brain Lang. 71, 59 (2000).

