Measuring acoustic cue encoding and categorization during speech processing using the auditory N1 and P3 ERP components

Toscano, J. C., & McMurray, B. (2011, November). Paper presented at the 10th Auditory Perception, Cognition, and Action Meeting, Seattle, WA.

Abstract: An important question in speech perception is whether listeners encode speech sounds in terms of continuous acoustic cues at early stages of processing or whether they perceive them only in terms of categories. Although behavioral data show that listeners are sensitive to within-category acoustic differences, their responses are still generally shaped by phoneme categories, making this question difficult to answer. Recently, we have used the event-related brain potential (ERP) technique to examine cue encoding more directly (Toscano, McMurray, Dennhardt, & Luck, 2010, Psychological Science). We found that the amplitude of the auditory N1 component varies linearly with differences in voice onset time, a cue to word-initial voicing, suggesting that listeners encode continuous cues independently of categories. The later-occurring P3 component, in contrast, shows effects of both acoustic differences and phonological categories. Here, we ask whether the N1 may provide a general tool for studying cue encoding by examining ERP responses to other sets of speech sounds. Specifically, we asked whether we could observe differences in N1 amplitude for (1) naturally produced, rather than synthesized, sounds; (2) spectral, rather than temporal, cues that distinguish other classes of phonemes (e.g., formant frequencies for vowels); and (3) word-medial acoustic differences. We also examined P3 responses, as well as the effect of task-defined phonological contrasts on the N1. The results show that differences in N1 amplitude can be clearly observed for some classes of speech sounds but are difficult to observe for others, though differences in P3 amplitude can still be seen. Thus, the N1 may serve as an index of cue encoding (in addition to other aspects of auditory processing identified by prior work). However, the specific speech sounds that we can study using this approach may depend on the complex link between the cues of interest and the neural generators of the N1.

