Task effects on prosodic prominence

Posted on 23 Mar 2013 by Joe Toscano

Buxó-Lugo, A., Toscano, J. C., & Watson, D. G. (2013, March). Paper presented at the 26th Annual CUNY Conference on Human Sentence Processing.

Abstract:

Most prosody research is conducted on spoken corpora or on speech samples collected from laboratory studies. However, laboratory studies often use referential communication tasks in which speakers talk to a computer and the communicative “stakes” are low. Previous studies have reported discourse focus effects on duration, intensity, and/or F0 (see Wagner & Watson, 2010, for a review), but these effects are typically small and the distributions of acoustic cues between different discourse contexts (e.g., given vs. new information) are highly overlapping. Thus, these cues may not provide listeners with reliable information about prosodic structure.

This could mean that the specific cues measured are not the best indicators of prosodic prominence or that talkers simply do not reliably convey this information. Another possibility is that prosodic choices are influenced by the degree to which a speaker is engaged in the task. If the task engages their attention, they might provide more informative cues than they would otherwise. If true, prosody use between interlocutors who care about conversational outcomes might look very different from prosody use in laboratory communication tasks. While corpora presumably contain speech from more natural conversations that may exhibit this property, they do not offer the level of control over prosodic context that laboratory tasks do.

We examined this issue by developing two different tasks: a low-engagement task and a high-engagement task. Critically, the words produced on experimental trials in each task are identical. The low-engagement task is similar to the referential communication tasks that are typically used in the field. The high-engagement task is a novel, goal-oriented task set in a computer game. In the game, participants move through a 3D environment and interact with objects in it to solve a series of puzzles. In both tasks, the participant is presented with color sequences that are read aloud. Each trial presents a pair of color sequences, with three colors per sequence. In the low-engagement task, the participant sees each color sequence, reads it, and proceeds to the next trial. In the high-engagement (computer game) task, the participant has to communicate with another person over a headset in order to solve a puzzle on each trial and reach the end of the game. Filler trials alternate with critical trials that involve the same color sequences used in the low-engagement task. One participant is presented with the color sequence and conveys that information to the other participant. The second participant enters the color sequence as a “code” in the game, unlocking the door for that trial. Because the same sequences are produced in each task, speakers’ productions can be compared directly across tasks.

Critically, we manipulated the information status of specific words in each sequence. For both tasks, there were scenarios in which the target words – the second word in the second sequence – were given (already mentioned), new, or carried contrastive focus. Example sequence pairs are in (1); in each, the target word is “pink”, the second word of the second sequence:

(1a): New sequences: red blue green | gray pink black
(1b): Given sequences: gray pink black | gray pink black
(1c): Contrast sequences: gray blue black | gray pink black
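
The three conditions in (1) can be read as a rule relating the first sequence to the second. The sketch below is only an illustration of that reading, not the authors' stimulus code: the function name information_status, the constant TARGET_INDEX, and the "other" fallback are hypothetical, and the rule is inferred from the three examples alone.

```python
from typing import List

# Assumption: the target is the second word of the second sequence (0-based index 1).
TARGET_INDEX = 1


def information_status(first: List[str], second: List[str]) -> str:
    """Classify the target word's status from a pair of color sequences.

    Rule inferred from examples (1a)-(1c); hypothetical, not the authors' code.
    """
    target = second[TARGET_INDEX]
    if first == second:
        # The whole sequence, including the target, was already mentioned.
        return "given"
    others_match = all(
        a == b
        for i, (a, b) in enumerate(zip(first, second))
        if i != TARGET_INDEX
    )
    if others_match and first[TARGET_INDEX] != target:
        # Only the target position changed, so the target contrasts
        # with the color it replaces.
        return "contrast"
    if target not in first:
        # The target color was not mentioned in the first sequence.
        return "new"
    return "other"


# The example pairs from (1), split into word lists.
EXAMPLES = {
    "1a": ("red blue green".split(), "gray pink black".split()),
    "1b": ("gray pink black".split(), "gray pink black".split()),
    "1c": ("gray blue black".split(), "gray pink black".split()),
}

if __name__ == "__main__":
    for label, (first, second) in EXAMPLES.items():
        print(label, information_status(first, second))
```

Under this reading, the second sequence is the same in every condition, so any acoustic difference on the target word reflects the preceding context rather than the word itself.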

We analyzed target word duration, F0, and intensity using multilevel models. For duration, we found a main effect of information status, with longer durations for the new and contrast conditions. We also found a main effect of engagement, with longer durations overall for the high-engagement task. Critically, we also found an interaction between task and information status, suggesting that the differences in duration were larger across information status conditions in the high-engagement task. For F0, we found a main effect of information status, with higher F0 for new and contrast conditions, and an effect of engagement, with overall higher F0 in the high-engagement task. Lastly, we found significantly higher intensity for the high-engagement task.
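
For readers less familiar with multilevel models, one way to write down a model of this general kind is shown below. It is a sketch under stated assumptions: the abstract does not report the predictor coding or the random-effects structure, so the dummy coding (given and low-engagement as reference levels) and the by-speaker random intercept are illustrative choices only.

```latex
% Assumed form: y_{ij} is the duration of the target word on trial i from speaker j.
% New, Contrast, and High are 0/1 indicators (assumed coding).
\begin{aligned}
y_{ij} ={}& \beta_0 + \beta_1\,\mathrm{New}_{ij} + \beta_2\,\mathrm{Contrast}_{ij}
          + \beta_3\,\mathrm{High}_{ij} \\
        &+ \beta_4\,(\mathrm{New}_{ij} \times \mathrm{High}_{ij})
          + \beta_5\,(\mathrm{Contrast}_{ij} \times \mathrm{High}_{ij})
          + u_j + \varepsilon_{ij}, \\
u_j \sim{}& \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

In this notation, the reported interaction between task and information status corresponds to the terms with coefficients \beta_4 and \beta_5: they capture how much larger the new-versus-given and contrast-versus-given duration differences are in the high-engagement task than in the low-engagement task.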

These data suggest that speaker engagement matters in the production of prominence and that speakers may provide more reliable cues to prosodic context when they are engaged in a conversation.

Tagged with: computer games, discourse, language processing, minecraft, prosodic prominence, prosody, statistical models
Posted in Presentations