Concept-based interpretability tools help artificial intelligence researchers and engineers design, develop, and debug AI models. They also shed light on how AI models work, helping businesses assess whether models deliver accurate results and reflect their values. One such interpretability tool is Facebook’s Captum.
Captum is a powerful and flexible interpretability library for PyTorch that makes interpretability algorithms easily accessible to the PyTorch community. Captum supports model interpretability across modalities such as vision and text, and allows researchers to add new algorithms and benchmark their work against the algorithms already available in the library. Finally, it offers tools to help developers discover vulnerabilities using adversarial attacks and robustness metrics.
Recently, Facebook released its latest version of Captum – Captum 0.4, which has new features for understanding models. Facebook AI has added tools to assess model robustness, improvements to its existing attribution models, and new attribution methods in the latest version.
Removal of statistical bias
Captum 0.4 adds Testing with Concept Activation Vectors (TCAV), which allows researchers and engineers to assess how different user-defined concepts affect a model’s prediction. It can also be used to check for algorithmic and label biases that may be baked into networks.
Additionally, TCAV’s capabilities extend beyond currently available attribution methods: rather than only quantifying the importance of individual input features, it can quantify the impact of high-level concepts such as gender and race on a model’s prediction.
Captum 0.4 comes with a generic TCAV implementation, allowing users to define custom concepts with sample inputs for different modalities, such as vision and text.
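The core TCAV computation can be sketched without Captum itself: fit a linear separator between activations of concept examples and random examples, take its normal vector as the Concept Activation Vector (CAV), then score the fraction of test inputs whose directional derivative of the target score is positive along that vector. The sketch below uses randomly generated toy activations and gradients (all data and dimensions are invented for illustration; Captum's real `TCAV` class works on live PyTorch layers instead):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations at some hidden layer: concept examples vs. random examples.
concept_acts = rng.normal(loc=1.0, size=(50, 8))   # e.g. "positive adjective" inputs
random_acts = rng.normal(loc=0.0, size=(50, 8))    # neutral / random inputs

# 1. Fit a linear separator; its weight vector (normal to the decision
#    hyperplane) is the Concept Activation Vector (CAV).
X = np.vstack([concept_acts, random_acts])
y = np.array([1.0] * 50 + [-1.0] * 50)
w = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares stand-in for a real classifier
cav = w / np.linalg.norm(w)

# 2. Gradients of the target class score w.r.t. the same layer's activations,
#    one row per test input (random stand-ins for real backprop gradients).
grads = rng.normal(size=(100, 8)) + 0.5 * cav

# 3. TCAV score: fraction of test inputs whose directional derivative
#    along the CAV is positive.
tcav_score = float(np.mean(grads @ cav > 0))
print(round(tcav_score, 2))
```

A score well above 0.5 indicates the concept direction consistently pushes the target score up, which is what the tutorial's positive-adjective concept shows for sentiment prediction.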
Source: Facebook AI
The graphs above show the visualized distributions of TCAV scores for the sentiment analysis model implemented in a Captum tutorial. As a dataset, Facebook AI researchers used movie reviews with positive sentiment. The graphs visualize TCAV scores for a positive-adjective concept alongside five sets of neutral-term concepts. For both convolutional layers, the positive-adjective concept scores higher than the five sets of neutral concepts, indicating the importance of positive adjectives in predicting positive sentiment.
Robust AI models
Deep learning models are often vulnerable to adversarial inputs that can trick an AI model while remaining imperceptible to humans. Captum 0.4 comes with tools to support a better understanding of a model’s limitations and vulnerabilities, so that an AI system’s reaction to unforeseen inputs can be analyzed and the necessary changes made to avoid harming or adversely affecting people.
Captum 0.4 also comes with tools for understanding model robustness, including adversarial attack implementations and robustness metrics to assess the impact of different attacks or perturbations on a model. The robustness tools included in the latest version are:
- Attack Comparator: allows users to quantify the impact of input perturbations, including text augmentations and torchvision transforms, as well as adversarial attacks on a model, and to compare the impact of different attacks.
- Minimal Perturbation: identifies the minimum perturbation required for a model to misclassify the perturbed input.
These new tools allow developers to better understand potential model vulnerabilities and analyze counterfactual examples to better understand a model’s decision boundary.
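The idea behind a minimal-perturbation metric can be illustrated with a small search: grow the perturbation magnitude along a fixed attack direction until the model’s prediction flips, and report the smallest magnitude that does. The sketch below uses a made-up two-feature linear classifier rather than Captum’s own implementation (model, data, and the FGSM-style sign direction are illustrative assumptions):

```python
import numpy as np

# Toy two-class linear model: predict class 1 when w.x + b > 0.
w = np.array([1.0, -2.0])
b = 0.1

def predict(x):
    return int(w @ x + b > 0)

def minimal_perturbation(x, step=0.01, max_eps=1.0):
    """Smallest FGSM-style epsilon that flips the prediction, or None."""
    original = predict(x)
    # For a linear model the score gradient w.r.t. x is w itself;
    # perturb in the sign direction that pushes the score toward the other class.
    direction = -np.sign(w) if original == 1 else np.sign(w)
    eps = step
    while eps <= max_eps:
        if predict(x + eps * direction) != original:
            return eps
        eps += step
    return None

x = np.array([0.5, 0.1])     # score = 0.5 - 0.2 + 0.1 = 0.4 > 0, so class 1
eps = minimal_perturbation(x)
print(eps)                   # smallest epsilon that flips the prediction
```

A small returned epsilon signals an input sitting close to the decision boundary, which is exactly the kind of vulnerability these metrics are meant to surface.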
Source: Facebook AI
Layer-wise relevance propagation and attribution
Facebook AI has implemented a new attribution algorithm, in collaboration with Technische Universität Berlin, to offer a new perspective for explaining model predictions. Captum 0.4 adds both LRP and a layer-attribution variant, Layer LRP.
The Layer-wise Relevance Propagation (LRP) algorithm is based on a backward propagation mechanism applied sequentially to all layers of the model. The model’s output score represents the initial relevance, which is then decomposed into values for each neuron of the underlying layers.
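For a small ReLU network, the basic LRP epsilon-rule fits in a few lines: the relevance arriving at a layer's output is redistributed to its inputs in proportion to each contribution z_ij = a_i * w_ij, so that total relevance is (approximately) conserved down to the input. The weights and activations below are invented for illustration; Captum’s `LRP` applies the same idea to real PyTorch modules:

```python
import numpy as np

def lrp_linear(a, w, relevance, eps=1e-6):
    """Epsilon-rule LRP through one linear layer.

    a: (n_in,) input activations, w: (n_in, n_out) weights,
    relevance: (n_out,) relevance arriving from the layer above.
    Returns the (n_in,) relevance redistributed onto the inputs.
    """
    z = a[:, None] * w                              # contributions z_ij
    col_sums = z.sum(axis=0)
    denom = col_sums + eps * np.sign(col_sums + 1e-12)  # stabilized denominator
    return (z / denom * relevance).sum(axis=1)

# Toy two-layer network: input -> hidden (ReLU) -> scalar output score.
x = np.array([1.0, 2.0])
w1 = np.array([[0.5, -1.0], [1.0, 0.5]])
w2 = np.array([[1.0], [2.0]])

h = np.maximum(0.0, x @ w1)     # hidden activations
score = float(h @ w2)           # output score = initial relevance

r_hidden = lrp_linear(h, w2, np.array([score]))  # propagate back one layer
r_input = lrp_linear(x, w1, r_hidden)            # ... and down to the input
print(r_input, round(float(r_input.sum()), 4))   # per-input relevance, ~= score
```

Note that `r_input.sum()` stays (up to the epsilon stabilizer) equal to the output score, reflecting LRP’s conservation property.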
Additionally, in Captum 0.4, the Facebook AI team added tutorials, improvements, and bug fixes to existing attribution methods.