
Exploring The Effects Of Multimodal Features On A Machine Learning Knowledge Tracker

dc.contributor.author: Khebour, Ibrahim, author
dc.contributor.author: Krishnaswamy, Nikhil, advisor
dc.contributor.author: Blanchard, Nathaniel, committee member
dc.contributor.author: Peterson, Christopher, committee member
dc.date.accessioned: 2026-06-08T10:31:27Z
dc.date.issued: 2026
dc.description.abstract: Conversations involve multiple channels of information exchange. Spoken language is the most common, but non-verbal cues such as gestures, body pose, and movements also play a role. These channels carry semantic information but are discrete and harder for machines to detect. Recent advances in multimodal Large Language Models (LLMs) show that incorporating additional modalities can improve performance, raising the question: how much do extra modalities contribute, and what are the limits of continually stacking them? Modeling the flow of conversation remains challenging for AI, particularly in natural, collaborative settings where non-verbal channels are prominent. To address this, TRACE was developed: a multimodal system that monitors shared knowledge in group tasks by tracking utterances, gestures, and actions. The system runs in real time using speech-only features, while an offline version integrates broader modalities, including problem-solving cues from speech, actions, and gestures. This thesis extends the live system by incorporating additional features, some of which require training new models to process visual inputs in real time. Since components may differ from the offline version, I will conduct a comparative analysis of both systems. The evaluation will highlight cases where the live version underperforms, as some loss is expected. A comparison with the current live tracker will also measure the impact of the new modalities. The Weights Task Dataset will be used for training, testing, and evaluation of action and gesture classification. Automating this process reduces the need for manual annotation and links gestures to broader semantic context, offering substantial value for future work.
dc.format.medium: born digital
dc.format.medium: masters theses
dc.identifier: khebour_colostate_0053N_19339.pdf
dc.identifier.uri: https://hdl.handle.net/10217/244745
dc.identifier.uri: https://doi.org/10.25675/3.027105
dc.language: English
dc.language.iso: eng
dc.publisher: Colorado State University. Libraries
dc.relation.ispartof: 2020-
dc.rights: Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
dc.subject: Human-Computer Interactions
dc.subject: Multimodality
dc.subject: Human-Human Interactions
dc.subject: Common Ground
dc.title: Exploring The Effects Of Multimodal Features On A Machine Learning Knowledge Tracker
dc.type: Text
dcterms.rights.dpla: This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Colorado State University
thesis.degree.level: Masters
thesis.degree.name: Master of Science (M.S.)

Files

Original bundle

Name:
khebour_colostate_0053N_19339.pdf
Size:
1.75 MB
Format:
Adobe Portable Document Format