Looking under the hood: visualizing what LSTMs learn
dc.contributor.author | Patil, Dhruva, author | |
dc.contributor.author | Draper, Bruce, advisor | |
dc.contributor.author | Beveridge, J. Ross, committee member | |
dc.contributor.author | Maciejewski, Anthony, committee member | |
dc.date.accessioned | 2019-09-10T14:35:31Z | |
dc.date.available | 2019-09-10T14:35:31Z | |
dc.date.issued | 2019 | |
dc.description.abstract | Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks have been successful in many applications involving sequential data. The success of these models lies in the complex feature representations they learn from the training data. One criterion for trusting a model is its validation accuracy. However, this can lead to surprises when the network learns properties of the input data different from what the designer intended and/or the user assumes. As a result, we lack confidence in even high-performing networks when they are deployed in applications with novel input data, or where the cost of failure is very high. Thus, understanding and visualizing what recurrent networks have learned becomes essential. Visualizations of RNN models are better established in the field of natural language processing than in computer vision. This work presents visualizations of what recurrent networks, particularly LSTMs, learn in the domain of action recognition, where the inputs are sequences of 3D human poses, or skeletons. The goal of the thesis is to understand the properties learned by a network with regard to an input action sequence, and how it will generalize to novel inputs. This thesis presents two methods for visualizing concepts learned by RNNs in the domain of action recognition, providing independent insight into the workings of the recognition model. The first visualization method shows the sensitivity of joints over time in a video sequence. The second generates synthetic videos that maximize the responses of a class label or hidden unit within a set of known anatomical constraints. These techniques are combined in a visualization tool called SkeletonVis to help developers and users gain insights into RNN-based models for action recognition. We present case studies on NTU-RGBD, a popular data set for action recognition, to reveal properties learned by a trained LSTM network. | |
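The first visualization method described in the abstract, joint sensitivity over time, is commonly implemented as an input-gradient saliency map. The sketch below illustrates that general idea for an LSTM skeleton classifier; the model architecture, layer sizes, and variable names are illustrative assumptions, not the thesis's actual SkeletonVis implementation (only the NTU-RGBD skeleton/class counts come from that data set's published format).

```python
# Hedged sketch: per-joint, per-frame sensitivity via input gradients.
# Assumed toy model -- NOT the thesis's architecture. NTU-RGBD provides
# 25 joints per skeleton and 60 action classes.
import torch
import torch.nn as nn

T, J, C = 30, 25, 3           # frames, joints, xyz coordinates
NUM_CLASSES = 60

class SkeletonLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=J * C, hidden_size=128, batch_first=True)
        self.fc = nn.Linear(128, NUM_CLASSES)

    def forward(self, x):                  # x: (batch, T, J*C)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])         # classify from the last time step

model = SkeletonLSTM().eval()
seq = torch.randn(1, T, J * C, requires_grad=True)   # one pose sequence

score = model(seq)[0].max()    # score of the highest-scoring class
score.backward()               # gradients of that score w.r.t. the input

# Saliency: gradient magnitude per joint, summed over xyz -> (T, J).
# Large values mark joints/frames the prediction is most sensitive to.
saliency = seq.grad.abs().view(T, J, C).sum(dim=-1)
```

A heat map of `saliency` (time on one axis, joints on the other) then shows which joints drive the prediction at each frame, which is the kind of view the abstract attributes to SkeletonVis.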
dc.format.medium | born digital | |
dc.format.medium | masters theses | |
dc.identifier | Patil_colostate_0053N_15501.pdf | |
dc.identifier.uri | https://hdl.handle.net/10217/197280 | |
dc.language | English | |
dc.language.iso | eng | |
dc.publisher | Colorado State University. Libraries | |
dc.relation.ispartof | 2000-2019 | |
dc.rights | Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright. | |
dc.subject | activation maximization | |
dc.subject | recurrent neural networks | |
dc.subject | action recognition | |
dc.subject | visualization | |
dc.subject | LSTM | |
dc.title | Looking under the hood: visualizing what LSTMs learn | |
dc.type | Text | |
dcterms.rights.dpla | This Item is protected by copyright and/or related rights (https://rightsstatements.org/vocab/InC/1.0/). You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | Colorado State University | |
thesis.degree.level | Masters | |
thesis.degree.name | Master of Science (M.S.) |
Files
Original bundle
- Name: Patil_colostate_0053N_15501.pdf
- Size: 1.7 MB
- Format: Adobe Portable Document Format