Exploring The Effects Of Multimodal Features On A Machine Learning Knowledge Tracker
| dc.contributor.author | Khebour, Ibrahim, author | |
| dc.contributor.author | Krishnaswamy, Nikhil, advisor | |
| dc.contributor.author | Blanchard, Nathaniel, committee member | |
| dc.contributor.author | Peterson, Christopher, committee member | |
| dc.date.accessioned | 2026-06-08T10:31:27Z | |
| dc.date.issued | 2026 | |
| dc.description.abstract | Conversations involve multiple channels of information exchange. Spoken language is the most common, but non-verbal cues such as gestures, body pose, and movements also play a role. These channels carry semantic information but are discrete and harder for machines to detect. Recent advances in multimodal Large Language Models (LLMs) show that incorporating additional modalities can improve performance, raising the question: how much do extra modalities contribute, and what are the limits of continually stacking them? Modeling the flow of conversation remains challenging for AI, particularly in natural, collaborative settings where non-verbal channels are prominent. To address this, TRACE was developed, a multimodal system that monitors shared knowledge in group tasks by tracking utterances, gestures, and actions. The system runs in real time using speech-only features, while an offline version integrates broader modalities, including problem-solving cues from speech, actions, and gestures. This thesis extends the live system by incorporating additional features. Some require training new models to process visual inputs in real time. Since components may differ from the offline version, I will conduct a comparative analysis of both systems. The evaluation will highlight cases where the live version underperforms, as some loss is expected. A comparison with the current live tracker will also measure the impact of new modalities. The Weights Task Dataset will be used for training, testing, and evaluation of action and gesture classification. Automating this process reduces the need for manual annotation and links gestures to broader semantic context, offering substantial value for future work. | |
| dc.format.medium | born digital | |
| dc.format.medium | masters theses | |
| dc.identifier | khebour_colostate_0053N_19339.pdf | |
| dc.identifier.uri | https://hdl.handle.net/10217/244745 | |
| dc.identifier.uri | https://doi.org/10.25675/3.027105 | |
| dc.language | English | |
| dc.language.iso | eng | |
| dc.publisher | Colorado State University. Libraries | |
| dc.relation.ispartof | 2020- | |
| dc.rights | Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright. | |
| dc.subject | Human-Computer Interactions | |
| dc.subject | Multimodality | |
| dc.subject | Human-Human Interactions | |
| dc.subject | Common Ground | |
| dc.title | Exploring The Effects Of Multimodal Features On A Machine Learning Knowledge Tracker | |
| dc.type | Text | |
| dcterms.rights.dpla | This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). | |
| thesis.degree.discipline | Computer Science | |
| thesis.degree.grantor | Colorado State University | |
| thesis.degree.level | Masters | |
| thesis.degree.name | Master of Science (M.S.) |
Files
Original bundle
- Name: khebour_colostate_0053N_19339.pdf
- Size: 1.75 MB
- Format: Adobe Portable Document Format
