Binarized activations for post-hoc interpretability and representation analysis in deep neural networks
Files
Jamil_colostate_0053A_19298.pdf (2.03 MB), Access status: Embargo until 2028-01-07
Abstract
Interpretability and representation analysis have advanced rapidly, but most methods rely on the continuous activations of convolutional, attention, or pooled penultimate layers. While informative, activation magnitudes can blur interpretation by conflating whether a unit participates at all with how strongly it fires, and they are sensitive to rescaling. Moreover, analyses at scale can become computationally and memory intensive. To address these issues, I propose an alternative lens: bit vectors, the binarized activations of ReLU layers. Although this representation achieves a 60× compression, I show that it can perform on par with established attribution methods by developing Quant-CAM. Quant-CAM performs competitively with gradient-based CAMs, outperforms the perturbation-based Score-CAM on ResNet-50 by 8% in deletion AUC and 2% in insertion AUC, is over two orders of magnitude faster to compute (minutes versus hours), and extends naturally to Transformer architectures. Building on this lens, I investigate three aspects of bit vector representations. First, their ability to probe representation efficiency: bit vectors can rank channel importance and reveal differences in feature economy, robustness, and redundancy across supervised and self-supervised paradigms. Second, their comparability to continuous activations: bit vectors preserve interpretability quality while maintaining the 60× compression advantage. Third, their role in attribution: Quant-CAM demonstrates that binary activations are not merely a compressed signal but a foundation for effective and generalizable explanation techniques. Taken together, this work establishes bit vectors as a lightweight lens for analyzing and explaining deep networks, yielding insights into feature economy, robustness, and redundancy, and leading to Quant-CAM, an attribution method that demonstrates the strength of this binary perspective.
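To make the idea of a bit vector concrete, the following is a minimal sketch, not the dissertation's implementation. It assumes that binarization means thresholding ReLU outputs at zero, uses random data in place of real network activations, and introduces the hypothetical helper names `binarize_relu` and `pack_bits`. It illustrates two properties named in the abstract: the binary pattern is unchanged by positive rescaling, and packed bits need far less storage than floats. Packing 32-bit floats into single bits gives a 32× reduction in this toy setting; the sketch does not attempt to reproduce the reported 60× figure or Quant-CAM itself.

```python
import numpy as np


def binarize_relu(activations: np.ndarray) -> np.ndarray:
    """Mark which units are active (output > 0) as a 0/1 uint8 array.

    Assumption: a unit "participates" when its ReLU output is positive.
    """
    return (activations > 0).astype(np.uint8)


def pack_bits(bits: np.ndarray) -> np.ndarray:
    """Pack a 0/1 array into bytes, eight units per byte, along the last axis."""
    return np.packbits(bits, axis=-1)


# Stand-in for penultimate-layer ReLU outputs: 4 inputs, 2048 units each.
rng = np.random.default_rng(0)
pre_activations = rng.standard_normal((4, 2048)).astype(np.float32)
relu_out = np.maximum(pre_activations, 0.0)

bits = binarize_relu(relu_out)
packed = pack_bits(bits)

# Positive rescaling changes magnitudes but not which units are active.
scale_invariant = bool(np.array_equal(bits, binarize_relu(relu_out * 3.0)))

# Storage ratio of float32 activations to packed bits in this toy setting.
storage_ratio = relu_out.nbytes / packed.nbytes

# Bit vectors can be compared directly, for example by Hamming distance.
hamming_0_1 = int(np.count_nonzero(bits[0] != bits[1]))

print(scale_invariant, storage_ratio, hamming_0_1)
```

In practice the activations would come from a trained network rather than a random generator, and the unpacked bits can be recovered with `np.unpackbits` when an analysis needs per-unit access.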
Rights Access
Embargo expires: 01/07/2028.
