One of the questions about language processing that we’re interested in is how people provide information by emphasizing different words. For example, someone might say “Give me the RED one”, and emphasize the word RED to contrast it with a different color. How does the talker indicate this in the way they pronounce the word RED? In particular, we would like to know the acoustic cues that talkers use to indicate aspects of prosody, such as the discourse status of a word (whether it was mentioned previously in a conversation or not).
However, one of the difficulties in answering this question is that we must run a controlled experiment and, at the same time, have people engage in natural conversation. Often, participants are not engaged in the task, and as a result, we do not get robust estimates of the acoustic cues they use to indicate discourse status. To overcome this, we’ve developed a computer game approach (using a game called Minecraft).
We’ve designed maps in Minecraft in which two participants work together to solve puzzles. In the puzzles, one participant has information they have to convey to their partner, and we vary the discourse status of this information. For example, in the figure below the avatars of two participants in different rooms are shown. The participant on the left has a sequence of colors (white, brown, green) that they must give to their partner as a code to unlock the door to the next room. We can then vary whether particular color sequences are new or previously mentioned.
Running the experiment in Minecraft gives us a controlled environment, and it gives participants the ability to have more natural conversations. Using this approach, we’ve been able to more accurately identify several acoustic features that people use to emphasize words. For example, talkers used word duration to indicate discourse status (new words tended to be longer than previously mentioned words), and these differences were more distinct than in similar, but less engaging, laboratory tasks. Thus, examining conversations in the game gives us a new way of studying spoken language processing and a better understanding of the acoustic cues to prosody.