Automatic question detection from prosodic speech analysis
Date
2019
Authors
Hirsch, Rachel, author
Draper, Bruce, advisor
Whitley, Darrell, advisor
Kirby, Michael, committee member
Abstract
Human-agent spoken communication has become ubiquitous over the last decade, with assistants such as Siri and Alexa used more every day. An AI agent needs to understand exactly what the user says to it and respond accurately. To respond correctly, the agent has to know whether it is being given a command or asked a question. In Standard American English (SAE), both the speaker's word choice and intonation are necessary to discern the true sentiment of an utterance. Much Natural Language Processing (NLP) research has addressed automatically determining these sentence types using word choice alone. However, intonation is ultimately the key to understanding the sentiment of a spoken sentence. This thesis characterizes the vocal prosody of utterances with a series of attributes and uses those attributes to train classifiers to detect questions. The dataset used to train these classifiers is a series of hearings before the Supreme Court of the United States (SCOTUS). Results from the prosody-trained classifiers are compared against those of a text-based classifier that uses Google Speech-to-Text transcriptions of the same dataset.
Subject
lexicon
natural language processing
sentiment detection
machine learning
human-computer interaction
prosody