Looking under the hood: visualizing what LSTMs learn
dc.contributor.author | Patil, Dhruva, author | |
dc.contributor.author | Draper, Bruce, advisor | |
dc.contributor.author | Beveridge, J. Ross, committee member | |
dc.contributor.author | Maciejewski, Anthony, committee member | |
dc.date.accessioned | 2019-09-10T14:35:31Z | |
dc.date.available | 2019-09-10T14:35:31Z | |
dc.date.issued | 2019 | |
dc.description.abstract | Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks have been successful in many applications involving sequential data. The success of these models lies in the complex feature representations they learn from the training data. One criterion for trusting a model is its validation accuracy. However, this can lead to surprises when the network learns properties of the input data different from what the designer intended and/or the user assumes. As a result, we lack confidence in even high-performing networks when they are deployed in applications with novel input data, or where the cost of failure is very high. Thus, understanding and visualizing what recurrent networks have learned becomes essential. Visualizations of RNN models are better established in the field of natural language processing than in computer vision. This work presents visualizations of what recurrent networks, particularly LSTMs, learn in the domain of action recognition, where the inputs are sequences of 3D human poses, or skeletons. The goal of the thesis is to understand the properties learned by a network with regard to an input action sequence, and how it will generalize to novel inputs. This thesis presents two methods for visualizing concepts learned by RNNs in the domain of action recognition, providing independent insight into the workings of the recognition model. The first visualization method shows the sensitivity of joints over time in a video sequence. The second generates synthetic videos that maximize the responses of a class label or hidden unit within a set of known anatomical constraints. These techniques are combined in a visualization tool called SkeletonVis to help developers and users gain insights into RNN-based models for action recognition. We present case studies on NTU-RGBD, a popular data set for action recognition, to reveal properties learned by a trained LSTM network. | |
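The first visualization method described in the abstract, joint sensitivity over time, is commonly implemented as an input-gradient saliency map. The sketch below illustrates that general idea for an LSTM skeleton classifier; the model architecture, layer sizes, and variable names are illustrative assumptions, not the thesis's actual SkeletonVis implementation (only the NTU-RGBD skeleton/class counts come from that data set's published format).

```python
# Hedged sketch: per-joint, per-frame sensitivity via input gradients.
# Assumed toy model -- NOT the thesis's architecture. NTU-RGBD provides
# 25 joints per skeleton and 60 action classes.
import torch
import torch.nn as nn

T, J, C = 30, 25, 3           # frames, joints, xyz coordinates
NUM_CLASSES = 60

class SkeletonLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=J * C, hidden_size=128, batch_first=True)
        self.fc = nn.Linear(128, NUM_CLASSES)

    def forward(self, x):                  # x: (batch, T, J*C)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])         # classify from the last time step

model = SkeletonLSTM().eval()
seq = torch.randn(1, T, J * C, requires_grad=True)   # one pose sequence

score = model(seq)[0].max()    # score of the highest-scoring class
score.backward()               # gradients of that score w.r.t. the input

# Saliency: gradient magnitude per joint, summed over xyz -> (T, J).
# Large values mark joints/frames the prediction is most sensitive to.
saliency = seq.grad.abs().view(T, J, C).sum(dim=-1)
```

A heat map of `saliency` (time on one axis, joints on the other) then shows which joints drive the prediction at each frame, which is the kind of view the abstract attributes to SkeletonVis.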
dc.format.medium | born digital | |
dc.format.medium | masters theses | |
dc.identifier | Patil_colostate_0053N_15501.pdf | |
dc.identifier.uri | https://hdl.handle.net/10217/197280 | |
dc.language | English | |
dc.language.iso | eng | |
dc.publisher | Colorado State University. Libraries | |
dc.relation.ispartof | 2000-2019 | |
dc.rights | Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright. | |
dc.subject | activation maximization | |
dc.subject | recurrent neural networks | |
dc.subject | action recognition | |
dc.subject | visualization | |
dc.subject | LSTM | |
dc.title | Looking under the hood: visualizing what LSTMs learn | |
dc.type | Text | |
dcterms.rights.dpla | This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | Colorado State University | |
thesis.degree.level | Masters | |
thesis.degree.name | Master of Science (M.S.) |
Files
Original bundle
- Name: Patil_colostate_0053N_15501.pdf
- Size: 1.7 MB
- Format: Adobe Portable Document Format