AI Model Utilization Measurements For Finding Class Encoding Patterns
Peter Bajcsy, Antonio Cardone, Chenyi Ling, Philippe Dessauw, Michael Majurski, Tim Blattner, Derek Juba, and Walid Keyrouz

TL;DR
This paper introduces theoretical methods to measure and explain how AI models encode class information, focusing on traffic sign classification and detection of poisoned models with triggers.
Contribution
It develops utilization measurement techniques for AI models and analyzes class encoding patterns, including those of poisoned models, at the graph, subgraph, and node levels.
Findings
Utilization varies across computation graph nodes in AI models.
Poisoned models show distinct class encoding patterns compared to clean models.
Implications for trojan detection and model security are discussed.
Abstract
This work addresses the problems of (a) designing utilization measurements of trained artificial intelligence (AI) models and (b) explaining how training data are encoded in AI models based on those measurements. The problems are motivated by the lack of explainability of AI models in security and safety critical applications, such as the use of AI models for classification of traffic signs in self-driving cars. We approach the problems by introducing theoretical underpinnings of AI model utilization measurement and understanding patterns in utilization-based class encodings of traffic signs at the level of computation graphs (AI models), subgraphs, and graph nodes. Conceptually, utilization is defined at each graph node (computation unit) of an AI model based on the number and distribution of unique outputs in the space of all possible outputs (tensor-states). In this work, utilization…
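The conceptual definition above (utilization from the number and distribution of unique outputs among all possible tensor-states) can be illustrated with a small sketch. This is not the paper's measurement: the abstract is truncated before it gives the actual definition, so everything here is an assumption made for illustration. In particular, the thresholding of outputs into binary states, the two summary numbers (fraction of states visited and normalized entropy), and the names `binarize` and `node_utilization` are all hypothetical.

```python
from collections import Counter
from math import log2


def binarize(output, threshold=0.0):
    """Map one real-valued node output to a binary tensor-state.

    Assumption: thresholding at 0 is an illustrative choice, not the paper's.
    """
    return tuple(1 if v > threshold else 0 for v in output)


def node_utilization(outputs, threshold=0.0):
    """Return (state_fraction, normalized_entropy) for a single graph node.

    outputs: list of equal-length lists of floats, one entry per input sample.
    state_fraction: unique states observed / all possible binary states.
    normalized_entropy: Shannon entropy of the state histogram divided by
    its upper bound of n_bits bits.
    """
    states = Counter(binarize(o, threshold) for o in outputs)
    n_bits = len(outputs[0])
    possible = 2 ** n_bits
    state_fraction = len(states) / possible

    total = sum(states.values())
    entropy = -sum((c / total) * log2(c / total) for c in states.values())
    return state_fraction, entropy / n_bits


# Four samples passing through a hypothetical node with two output values.
outputs = [[0.5, -1.0], [0.2, -0.3], [-0.7, 0.9], [0.1, -2.0]]
fraction, entropy = node_utilization(outputs)
print(fraction, round(entropy, 4))
```

In this toy case the node visits two of its four possible states, and three of the four samples land in the same state, so both numbers indicate a node that uses only part of its available output space. Computing such values per class, rather than over all samples, is one way to picture a class encoding at the node level.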
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
