VanderHoeven, Hannah G., author
Blanchard, Nathaniel, advisor
Krishnaswamy, Nikhil, advisor
Cleary, Anne M., committee member
2024-09-09
2024-09-09
2024
https://hdl.handle.net/10217/239139

Throughout various collaborative problem solving (CPS) tasks, participants may use multiple communicative modalities as they work toward a shared goal. The ability to recognize and act on these modalities is vital for a multimodal AI agent to interact with humans in a meaningful way. Modalities of interest include speech, gesture, action, pose, facial expression, and object positions in three-dimensional space. As AI becomes more commonplace in collaborative environments, there is great potential for an agent to help support learning, training, and understanding of how small groups work together to complete CPS tasks. To design a well-rounded system that best understands small group interactions, multiple modalities need to be supported. Gesture is one of many important features to consider in multimodal design, and robust gesture recognition is a key component of both multimodal language understanding and human-computer interaction. Most vision-based approaches to gesture recognition focus on static, standalone gestures that are identifiable in a single video frame. In CPS tasks, more complex gestures made up of multiple "phases" are more likely to occur; for instance, deixis, or pointing, as it is used to indicate objects and referents in a scene. In this thesis, I present a novel method for robust gesture detection based on gesture phase semantics. This method is competitive with many state-of-the-art computer vision approaches while being faster to train on annotated data.
I also present various applications of this method that utilize pointing detection in a real-world collaborative task, and I discuss in further depth the importance of robust gesture detection as a feature in multimodal agent design.

born digital
masters theses
eng
Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
gesture
multimodal problem solving
human computer interaction
communication
Robust gesture detection for multimodal problem solving
Text