Looking under the hood: visualizing what LSTMs learn

dc.contributor.author: Patil, Dhruva, author
dc.contributor.author: Draper, Bruce, advisor
dc.contributor.author: Beveridge, J. Ross, committee member
dc.contributor.author: Maciejewski, Anthony, committee member
dc.date.accessioned: 2019-09-10T14:35:31Z
dc.date.available: 2019-09-10T14:35:31Z
dc.date.issued: 2019
dc.description.abstract: Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs) have been successful in many applications involving sequential data. The success of these models lies in the complex feature representations they learn from the training data. One criterion for trusting a model is its validation accuracy. However, this can lead to surprises when the network learns properties of the input data different from what the designer intended and/or the user assumes. As a result, we lack confidence in even high-performing networks when they are deployed in applications with novel input data, or where the cost of failure is very high. Understanding and visualizing what recurrent networks have learned therefore becomes essential. Visualizations of RNN models are better established in the field of natural language processing than in computer vision. This work presents visualizations of what recurrent networks, particularly LSTMs, learn in the domain of action recognition, where the inputs are sequences of 3D human poses, or skeletons. The goal of the thesis is to understand the properties a network learns with regard to an input action sequence, and how it will generalize to novel inputs. This thesis presents two methods for visualizing concepts learned by RNNs in the domain of action recognition, providing independent insight into the workings of the recognition model. The first visualization method shows the sensitivity of joints over time in a video sequence. The second generates synthetic videos that maximize the responses of a class label or hidden unit within a set of known anatomical constraints. These techniques are combined in a visualization tool called SkeletonVis to help developers and users gain insight into models embedded in RNNs for action recognition. We present case studies on NTU-RGBD, a popular data set for action recognition, to reveal properties learned by a trained LSTM network.
dc.format.medium: born digital
dc.format.medium: masters theses
dc.identifier: Patil_colostate_0053N_15501.pdf
dc.identifier.uri: https://hdl.handle.net/10217/197280
dc.language: English
dc.language.iso: eng
dc.publisher: Colorado State University. Libraries
dc.relation.ispartof: 2000-2019
dc.rights: Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
dc.subject: activation maximization
dc.subject: recurrent neural networks
dc.subject: action recognition
dc.subject: visualization
dc.subject: LSTM
dc.title: Looking under the hood: visualizing what LSTMs learn
dc.type: Text
dcterms.rights.dpla: This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Colorado State University
thesis.degree.level: Masters
thesis.degree.name: Master of Science (M.S.)
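
The two visualization methods summarized in the abstract both reduce to gradient computations on a trained skeleton-based LSTM classifier. The following is a minimal, hypothetical PyTorch sketch of those ideas, not the thesis implementation: the SkeletonLSTM model, the joint_sensitivity and maximize_class_response functions, the [batch, frames, joints, 3] input layout, and the smoothness penalty standing in for the thesis's anatomical constraints are all illustrative assumptions.

    import torch
    import torch.nn as nn

    class SkeletonLSTM(nn.Module):
        """Minimal LSTM classifier over 3D skeleton sequences (assumed layout)."""
        def __init__(self, num_joints=25, hidden_size=128, num_classes=60):
            super().__init__()
            self.lstm = nn.LSTM(input_size=num_joints * 3,
                                hidden_size=hidden_size, batch_first=True)
            self.fc = nn.Linear(hidden_size, num_classes)

        def forward(self, x):                        # x: [batch, frames, joints, 3]
            b, t, j, c = x.shape
            out, _ = self.lstm(x.reshape(b, t, j * c))
            return self.fc(out[:, -1])               # class scores from the last step

    def joint_sensitivity(model, sequence, target_class):
        """Per-frame, per-joint sensitivity: gradient magnitude of the target
        class score with respect to the input joint coordinates."""
        model.eval()
        x = sequence.unsqueeze(0).clone().requires_grad_(True)   # [1, T, J, 3]
        model(x)[0, target_class].backward()
        return x.grad[0].norm(dim=-1)                            # [T, J] map

    def maximize_class_response(model, target_class, num_frames=64,
                                num_joints=25, steps=200, lr=0.05):
        """Activation-maximization sketch: gradient ascent on a synthetic
        sequence so the target class score grows. The thesis constrains the
        result anatomically; the temporal-smoothness penalty here is only a
        stand-in for those constraints."""
        for p in model.parameters():
            p.requires_grad_(False)                  # optimize the input only
        seq = torch.randn(1, num_frames, num_joints, 3, requires_grad=True)
        opt = torch.optim.Adam([seq], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            score = model(seq)[0, target_class]
            smoothness = (seq[:, 1:] - seq[:, :-1]).pow(2).mean()
            (-score + 0.1 * smoothness).backward()
            opt.step()
        return seq.detach()[0]                       # synthetic [T, J, 3] sequence

Under these assumptions, joint_sensitivity(model, sequence, class_id) yields the kind of per-frame, per-joint heat map the first visualization displays, and maximize_class_response(model, class_id) produces the kind of synthetic skeleton sequence the second visualization renders.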

Files

Original bundle
Name: Patil_colostate_0053N_15501.pdf
Size: 1.7 MB
Format: Adobe Portable Document Format