Browsing by Author "Blanchard, Nathaniel, advisor"
Now showing 1 - 11 of 11
Item Open Access
APE-V: athlete performance evaluation using video (Colorado State University. Libraries, 2021)
Roygaga, Chaitanya, author; Blanchard, Nathaniel, advisor; Beveridge, Ross, committee member; Reiser, Raoul, committee member
Athletes typically undergo regular evaluations by trainers and coaches to assess performance and injury risk. One of the most popular movements to examine is the vertical jump — a sport-independent means of assessing both lower extremity risk and power. Specifically, maximal-effort countermovement and drop jumps performed on bilateral force plates provide a wealth of metrics; however, detailed evaluation of this movement requires specialized equipment (force plates) and trained experts to interpret results, limiting its use. Computer vision techniques applied to videos of such movements are a less expensive alternative for extracting such metrics. Blanchard et al. collected a dataset of 89 athletes performing these movements and showcased how OpenPose could be applied to the data. However, athlete error calls into question 46.2% of the movements — in these cases, an expert assessor would have the athlete redo the movement to eliminate the error. Here, I augmented the Blanchard et al. dataset with expert labels of error and established benchmark performance on automatic error identification. In total, 14 different types of errors were identified by trained annotators. My benchmark models identified errors with an F1 score of 0.710 and a Kappa of 0.457 (Kappa measures accuracy over chance).

Item Open Access
Automatically detecting task unrelated thoughts during conversations using keystroke analysis (Colorado State University. Libraries, 2022)
Kuvar, Vishal Kiran, author; Blanchard, Nathaniel, advisor; Mills, Caitlin, advisor; Ben-Hur, Asa, committee member; Zhou, Wen, committee member
Task-unrelated thought (TUT), commonly known as the phenomenon of daydreaming or zoning out, is a mental state in which a person's attention moves away from the task at hand to self-generated thoughts. This state is extremely common, yet little is known about it during dyadic interactions. We built a model to detect when a person experiences TUTs while talking to another person through a chat platform, by analyzing their keystroke patterns. This model was able to differentiate between task-unrelated thoughts and task-related thoughts with a kappa of 0.343. This serves as a strong indicator that typing behavior is linked with mental states, in our case task-unrelated thoughts.

Item Open Access
EgoRoom: egocentric 3D pose estimation through multi-coordinates heatmaps (Colorado State University. Libraries, 2022)
Jung, Changsoo, author; Blanchard, Nathaniel, advisor; Beveridge, Ross, committee member; Clegg, Benjamin, committee member
Recent head-mounted virtual reality (VR) devices include fisheye lenses oriented toward users' bodies, which enable full-body pose estimation from video. However, traditional joint detection methods fail in this use case because fisheye lenses make joint depth information ambiguous, causing body parts to be self-occluded by the distorted torso. To resolve these problems, we propose a novel architecture, EgoRoom, that uses three different types of 3D heatmaps to predict body joints even when they are self-occluded. Our approach consists of three main modules. The first module transmutes the fisheye image into feature embeddings via an attention mechanism. Then, the second module utilizes three decoder branches to convert those features into a 3D coordinate system, with each branch corresponding to the xy, yz, and xz planes. Finally, the third module combines the three decoder heatmaps into the predicted 3D pose. Our method achieves state-of-the-art results on the xR-EgoPose dataset.
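To make the plane-fusion idea concrete, here is a minimal sketch of one way three planar heatmaps could be combined into a 3D joint estimate: score every voxel (x, y, z) by the product of its xy, yz, and xz projections and take the argmax. This is an illustrative consistency check only; EgoRoom's actual fusion module is learned, and the heatmap shapes here are assumptions.

```python
import numpy as np

def fuse_plane_heatmaps(h_xy, h_yz, h_xz):
    """Recover a 3D joint location from three planar heatmaps.

    h_xy, h_yz, h_xz: (D, D) arrays of per-plane confidences for one
    joint. Every voxel (x, y, z) is scored by the product of its three
    planar projections; the argmax is returned. A hand-rolled stand-in
    for a learned fusion module, for illustration only.
    """
    D = h_xy.shape[0]
    # score[x, y, z] = h_xy[x, y] * h_yz[y, z] * h_xz[x, z]
    score = h_xy[:, :, None] * h_yz[None, :, :] * h_xz[:, None, :]
    return np.unravel_index(np.argmax(score), (D, D, D))
```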
Item Open Access
Eye've seen this before: building a gaze data analysis tool for déjà vu detection (Colorado State University. Libraries, 2022)
Seabolt, Logan K., author; Blanchard, Nathaniel, advisor; Anderson, Charles, advisor; Thomas, Michael, committee member
In order to expand the understanding of the phenomenon known as déjà vu, an investigation into the use of eyetracking was needed. Drawing on an advanced eyetracking device, open-source software, and previous research into déjà vu, this thesis provides a discussion and analysis of the development of a standardized eyetracking setup for general gaze data collection and a novel gaze data conversion pipeline. The tools created for this thesis work in conjunction to collect data, convert it into easier-to-comprehend formats, and separate the results into simplified text files. The data analysis tool analyzes and formats files en masse, making it easier to process high volumes of data. These tools are designed to be accessible to professionals within and outside the field of computer science. With them, researchers can develop their own projects, layer the eyetracking code over their own, and pass the output through the data analysis tool to gather all the information they need.

Item Open Access
GAN you train your network (Colorado State University. Libraries, 2022)
Pamulapati, Venkata Sai Sudeep, author; Blanchard, Nathaniel, advisor; Beveridge, Ross, advisor; King, Emily, committee member
Zero-shot classifiers identify unseen classes — classes not seen during training. Specifically, zero-shot models classify attribute information associated with classes (e.g., a zebra has stripes but a lion does not). Lately, the use of generative adversarial networks (GANs) for zero-shot learning has significantly improved the recognition accuracy of unseen classes by producing visual features for any class. Here, I investigate how similar the visual features obtained from images of a class are to the visual features generated by a GAN. I find that, regardless of metric, the two sets of visual features are disjoint. I also fine-tune a ResNet so that it produces visual features similar to those generated by a GAN — this is novel because standard approaches do the opposite: they train the GAN to match the output of the model. I conclude that these experiments emphasize the need for a standard input pipeline in zero-shot learning, given the mismatch between generated and real features and the variation in features (and subsequent GAN performance) across different implementations of models such as ResNet-101.

Item Open Access
Intentional microgesture recognition for extended human-computer interaction (Colorado State University. Libraries, 2023)
Kandoi, Chirag, author; Blanchard, Nathaniel, advisor; Krishnaswamy, Nikhil, advisor; Soto, Hortensia, committee member
As extended reality becomes more ubiquitous, people will more frequently interact with computer systems using gestures instead of peripheral devices. However, previous work has shown that using traditional gestures (pointing, swiping, etc.) in mid-air causes fatigue, rendering them largely unsuitable for long-term use. Some of the same researchers have promoted "microgestures" — smaller gestures requiring less gross motion — as a solution, but to date there is no dataset of intentional microgestures available to train computer vision algorithms for downstream interactions with computer systems, such as agents deployed on XR headsets. As a step toward addressing this challenge, I present a novel video dataset of microgestures and classification results from a variety of ML models, showcasing the feasibility (and difficulty) of detecting these fine-grained movements, and I discuss the challenges in developing robust microgesture recognition for human-computer interaction.
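For a sense of what a classifier over such a dataset might look like, the sketch below runs a small recurrent network over per-frame hand-keypoint features. The feature count, sequence length, and architecture are hypothetical; the thesis's actual models are not specified here.

```python
import torch
import torch.nn as nn

class MicrogestureGRU(nn.Module):
    """Toy sequence classifier: (keypoints over time) -> gesture class.

    Assumes each clip is a (T, K) tensor of K flattened hand-keypoint
    coordinates per frame. An illustrative baseline, not the thesis's
    method.
    """
    def __init__(self, n_keypoint_feats=42, hidden=64, n_classes=10):
        super().__init__()
        self.gru = nn.GRU(n_keypoint_feats, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, clips):             # clips: (B, T, K)
        _, last_hidden = self.gru(clips)  # last_hidden: (1, B, hidden)
        return self.head(last_hidden[-1])

logits = MicrogestureGRU()(torch.randn(8, 30, 42))  # 8 clips, 30 frames
```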
Item Open Access
Neuralator 5000: exploring and enhancing the BOLD5000 fMRI dataset to improve the robustness of artificial neural networks (Colorado State University. Libraries, 2023)
Pickard, William Augustus, author; Blanchard, Nathaniel, advisor; Anderson, Chuck, committee member; Thomas, Michael, committee member
Artificial neural networks (ANNs) originally drew their inspiration from biological constructs. Despite the rapid development of ANNs and their seeming divergence from their biological roots, research using representational similarity analysis (RSA) shows a connection between the internal representations of artificial and biological neural networks. To further investigate this connection, human-subject functional magnetic resonance imaging (fMRI) studies using stimuli drawn from common ANN training datasets are being compiled. One such dataset is BOLD5000, which is composed of fMRI data from four subjects who were presented with stimuli selected from the ImageNet, Common Objects in Context (COCO), and Scene UNderstanding (SUN) datasets. An important area where this data can be fruitful is in improving ANN model robustness. This work seeks to enhance the BOLD5000 dataset and make it more accessible for future ANN research by re-segmenting the data from the second release of BOLD5000 into new ROIs using the vcAtlas and visfAtlas visual cortex atlases, generating representational dissimilarity matrices (RDMs) for all ROIs, and providing a new, biologically inspired set of supercategory labels specific to the ImageNet dataset. To demonstrate the utility of these new BOLD5000 derivatives, I compare human fMRI data to RDMs derived from the activations of four prominent vision ANNs: AlexNet, ResNet-50, MobileNetV2, and EfficientNet-B0. The results of this analysis show that the older, less advanced AlexNet has higher neuro-similarity than the much more recent, technically better-performing models. These results are further confirmed through Fiedler vector analysis of the RDMs, which shows a reduction in the separability of the internal representations of the biologically inspired supercategories.

Item Open Access
Robust gesture detection for multimodal problem solving (Colorado State University. Libraries, 2024)
VanderHoeven, Hannah G., author; Blanchard, Nathaniel, advisor; Krishnaswamy, Nikhil, advisor; Cleary, Anne M., committee member
Throughout various collaborative problem solving (CPS) tasks, participants may use multiple communicative modalities as they work toward a shared goal. The ability to recognize and act on these modalities is vital for a multimodal AI agent to interact with humans in a meaningful way. Potential modalities of interest include speech, gesture, action, pose, facial expression, and object positions in three-dimensional space. As AI becomes more commonplace in collaborative environments, there is great potential to use an agent to help support learning, training, and understanding of how small groups work together to complete CPS tasks. To design a well-rounded system that best understands small-group interactions, multiple modalities need to be supported. Gesture is one of many important features to consider in multimodal design. Robust gesture recognition is a key component of multimodal language understanding, in addition to human-computer interaction. Most vision-based approaches to gesture recognition focus on static, standalone gestures that are identifiable in a single video frame. In CPS tasks, more complex gestures made up of multiple "phases" are more likely to occur. One example is deixis, or pointing, as it is used to indicate objects and referents in a scene. In this thesis, I present a novel method for robust gesture detection based on gesture phase semantics. This method is competitive with many state-of-the-art computer vision approaches while being faster to train on annotated data. I also present applications of this method that use pointing detection in a real-world collaborative task, and I discuss in further depth the importance of robust gesture detection as a feature of multimodal agent design.
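For a flavor of the downstream use of pointing detection, the sketch below shows the standard geometry for resolving a deictic target: extend a ray from the wrist through the index fingertip and intersect it with a table plane. The keypoint source and plane convention are assumptions; the thesis's phase-semantics detector itself is not reproduced here.

```python
import numpy as np

def pointing_target(wrist, index_tip, table_height=0.0):
    """Intersect a pointing ray with a horizontal table plane (z = height).

    wrist, index_tip: 3D keypoints, e.g., from an off-the-shelf hand-pose
    model. Returns the 3D intersection point, or None if the ray is
    parallel to the plane or points away from it.
    """
    direction = index_tip - wrist
    if abs(direction[2]) < 1e-6:
        return None                      # ray parallel to the table
    t = (table_height - wrist[2]) / direction[2]
    return None if t < 0 else wrist + t * direction

# Example: a hand 0.5 m above the table pointing down and forward.
target = pointing_target(np.array([0.0, 0.0, 0.5]),
                         np.array([0.1, 0.1, 0.4]))  # -> (0.5, 0.5, 0.0)
```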
Item Open Access
SMOKE+: a video dataset for automated fine-grained assessment of smoke opacity (Colorado State University. Libraries, 2024)
Seefried, Ethan, author; Blanchard, Nathaniel, advisor; Sreedharan, Sarath, committee member; Roberts, Jacob, committee member
Computer vision has traditionally faced difficulties with amorphous objects like smoke, owing to their ever-changing shape and texture and their dependence on background conditions. While recent advancements have enabled simple tasks such as smoke detection and basic classification (black or white), quantitative opacity estimation in line with the assessments made by certified professionals remains unexplored. To address this gap, I introduce the SMOKE+ dataset, which features opacity labels verified by three certified experts. The dataset encompasses five distinct testing days, two data collection sites in different regions, and a total of 13,632 labeled clips. Leveraging this data, I develop a state-of-the-art smoke opacity estimation method that employs a small number of residual 3D blocks for efficient opacity estimation. Additionally, I explore the use of Mamba blocks in a video-based architecture, exploiting their ability to handle spatial and temporal data in linear time. Techniques developed during the creation of SMOKE+ were then refined and applied to a new dataset titled CSU101, designed for educational use in computer vision. In the future I intend to expand further into synthetic data, incorporating techniques into Unreal Engine or Unity to add accurate opacity labels.

Item Open Access
Using eye gaze to automatically identify familiarity (Colorado State University. Libraries, 2024)
Castillon, Iliana, author; Blanchard, Nathaniel, advisor; Sreedharan, Sarath, committee member; Cleary, Anne M., committee member
Understanding internal cognitive states is crucial not only in the realm of human perception but also in enhancing interactions with artificial intelligence. One such state is the experience of familiarity, a fundamental aspect of human perception that often manifests as an intuitive recognition of faces or places. Automatically identifying cognitive experiences could pave the way for more nuanced human-AI interaction. While other work has shown the feasibility of automatically identifying internal cognitive states such as mind wandering using eye gaze features, the automatic detection of familiarity remains largely unexplored. In this work, we employed a paradigm from cognitive psychology to induce feelings of familiarity. We then trained machine learning models to automatically detect familiarity from eye gaze measurements, both in experiments with traditional computer use (e.g., an eye tracker attached to a monitor) and in virtual reality settings, in a participant-independent manner. Familiarity was detected with Cohen's kappa values, a measure of accuracy corrected for random guessing, of 0.22 and 0.21, respectively. This work showcases the feasibility of automatically identifying feelings of familiarity and opens the door to exploring automated familiarity detection in other contexts, such as students engaged in a learning task while interacting with an intelligent tutoring system.
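Several results in this listing are reported as Cohen's kappa, accuracy corrected for chance agreement. For reference, a minimal implementation is sketched below; it agrees with sklearn.metrics.cohen_kappa_score for nominal labels.

```python
import numpy as np

def cohens_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    observed = np.mean(y_true == y_pred)
    # Expected agreement if predictions were independent of labels.
    labels = np.union1d(y_true, y_pred)
    expected = sum(np.mean(y_true == c) * np.mean(y_pred == c)
                   for c in labels)
    return (observed - expected) / (1.0 - expected)

print(cohens_kappa([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.5
```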
Item Open Access
Utilizing network features to detect erroneous inputs (Colorado State University. Libraries, 2020)
Gorbett, Matthew, author; Blanchard, Nathaniel, advisor; Anderson, Charles W., committee member; King, Emily, committee member
Neural networks are vulnerable to a wide range of erroneous inputs, such as corrupted, out-of-distribution, misclassified, and adversarial examples. Previously, separate solutions have been proposed for each of these faulty data types; however, in this work I show that the collective set of erroneous inputs can be jointly identified with a single model. Specifically, I train a linear SVM classifier to detect these four types of erroneous data using the hidden and softmax feature vectors of pre-trained neural networks. Results indicate that these faulty data types generally exhibit activation properties that are linearly separable from those of correctly processed examples. I am able to identify erroneous inputs with an AUROC of 0.973 on CIFAR10, 0.957 on Tiny ImageNet, and 0.941 on ImageNet. I experimentally validate the findings across a diverse range of datasets, domains, and pre-trained models.
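A minimal sketch of the detection setup described above: fit a linear SVM on concatenated hidden and softmax feature vectors and score held-out inputs by their distance from the hyperplane. The feature dimensions and data here are random stand-ins, so the printed AUROC will hover near chance; the thesis's features come from real pre-trained networks.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import roc_auc_score

# Hypothetical features: rows are concatenated hidden + softmax vectors
# from a pre-trained network; labels mark erroneous (1) vs. correct (0)
# inputs. Random placeholders, not the thesis's pipeline.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 512 + 10))
y = rng.integers(0, 2, size=1000)

svm = LinearSVC(C=1.0).fit(X[:800], y[:800])
scores = svm.decision_function(X[800:])   # signed distance from hyperplane
print("AUROC:", roc_auc_score(y[800:], scores))
```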