Utilizing network features to detect erroneous inputs
Date
2020
Authors
Gorbett, Matthew, author
Blanchard, Nathaniel, advisor
Anderson, Charles W., committee member
King, Emily, committee member
Abstract
Neural networks are vulnerable to a wide range of erroneous inputs such as corrupted, out-of-distribution, misclassified, and adversarial examples. Previously, separate solutions have been proposed for each of these faulty data types; however, in this work I show that the collective set of erroneous inputs can be jointly identified with a single model. Specifically, I train a linear SVM classifier to detect these four types of erroneous data using the hidden and softmax feature vectors of pre-trained neural networks. Results indicate that these faulty data types generally exhibit activation properties that are linearly separable from those of correctly processed examples. I am able to identify erroneous inputs with an AUROC of 0.973 on CIFAR10, 0.957 on Tiny ImageNet, and 0.941 on ImageNet. I experimentally validate the findings across a diverse range of datasets, domains, and pre-trained models.
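The detection idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. The feature vectors here are synthetic stand-ins for the hidden and softmax activations of a pre-trained network, built on the assumption that erroneous inputs produce weaker hidden activations and flatter softmax outputs. All function names (`make_features`, `train_linear_svm`, `auroc`, `run_demo`) are hypothetical, and the linear SVM is fitted with plain stochastic subgradient descent on the hinge loss so that the example needs only the Python standard library. The score it produces says nothing about the AUROC values reported above.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_features(rng, n, erroneous, n_hidden=8, n_classes=4):
    """Synthetic [hidden activations + softmax output] vectors (illustrative only)."""
    rows = []
    for _ in range(n):
        # Assumption: erroneous inputs give weaker ReLU activations and flatter logits.
        mean = 0.5 if erroneous else 2.0
        hidden = [max(0.0, rng.gauss(mean, 1.0)) for _ in range(n_hidden)]
        logits = [rng.gauss(0.0, 1.0) for _ in range(n_classes)]
        logits[rng.randrange(n_classes)] += 1.0 if erroneous else 5.0
        rows.append(hidden + softmax(logits))
    return rows


def decision(w, b, x):
    """Signed score of the linear classifier; larger means 'more likely erroneous'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lr=0.01, lam=0.001, epochs=20, seed=0):
    """Fit a linear SVM by SGD on the L2-regularised hinge loss; labels are -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * decision(w, b, X[i])
            # L2 shrinkage on every step; hinge update only when the margin is violated.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def run_demo(seed=0):
    """Train on one synthetic split, score a held-out split, return the AUROC."""
    rng = random.Random(seed)

    def dataset(n):
        X = make_features(rng, n, erroneous=False) + make_features(rng, n, erroneous=True)
        y = [-1] * n + [1] * n  # +1 marks an erroneous input
        return X, y

    X_train, y_train = dataset(200)
    X_test, y_test = dataset(200)
    w, b = train_linear_svm(X_train, y_train)
    return auroc([decision(w, b, x) for x in X_test], y_test)


if __name__ == "__main__":
    print(f"held-out AUROC on synthetic features: {run_demo():.3f}")
```

In a real setting the synthetic rows would be replaced by activations extracted from a pre-trained model, and a library SVM would likely be preferable to the hand-written optimiser used here.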
Description
2020 Fall.
Includes bibliographical references.