As predictive models are increasingly deployed in critical domains, there is a growing emphasis on explaining their predictions to decision makers and other stakeholders, so that they can understand the rationale behind a prediction and determine if and when to rely on it. To this end, the eXplainable Artificial Intelligence (XAI) literature has proposed several algorithms that explain models in a post hoc manner. Despite this plethora of post hoc explanation methods, there is little to no work on systematically benchmarking them in an efficient and transparent manner. OpenXAI is a comprehensive and extensible open-source framework for evaluating and benchmarking post hoc explanation methods. OpenXAI is designed to support the development of novel explanation methods and evaluation metrics, and our publicly available leaderboards allow for easy and transparent comparison of explanation methods across diverse evaluation metrics.
OpenXAI comprises implementations of various state-of-the-art evaluation metrics that assess the faithfulness (both with and without ground truth), stability, and fairness of post hoc explanations. In addition, it provides XAI-ready datasets, trained models, and APIs for popular post hoc explanation methods, enabling researchers and practitioners to easily benchmark existing or new explanation methods.
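To make the ground-truth faithfulness idea concrete, here is a minimal, self-contained sketch (not OpenXAI's implementation) of a feature-agreement-style metric: the fraction of top-k features, ranked by absolute attribution, that an explanation shares with a known ground-truth attribution vector. The function name and example values are illustrative.

```python
import numpy as np

def feature_agreement(explanation, ground_truth, k):
    """Fraction of top-k features (by absolute attribution) shared
    between an explanation and a ground-truth attribution vector."""
    top_exp = set(np.argsort(-np.abs(explanation))[:k])
    top_gt = set(np.argsort(-np.abs(ground_truth))[:k])
    return len(top_exp & top_gt) / k

gt = np.array([0.9, 0.1, -0.7, 0.05])   # ground-truth importances (toy values)
exp = np.array([0.8, -0.6, 0.2, 0.01])  # explanation to evaluate (toy values)
print(feature_agreement(exp, gt, k=2))  # 0.5: only feature 0 is in both top-2 sets
```

A score of 1.0 means the explanation recovers exactly the top-k ground-truth features; 0.0 means no overlap.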
To install the OpenXAI package, clone OpenXAI's repo and install the package from the root directory using:
pip install -e .
Installing the OpenXAI package is hassle-free, with minimal dependencies on external packages.
The data loaders in OpenXAI are lightweight, and the framework provides a collection of functionalities with easy-to-use high-level APIs for benchmarking explanations. As an example, to obtain the German dataset from the data loader in OpenXAI:
from openxai.dataloader import return_loaders
loader_train, loader_test = return_loaders(data_name='german', download=True)
inputs, labels = next(iter(loader_test))
OpenXAI provides two classes of trained predictive models for transparent and reproducible benchmarking of explanation methods. The code snippet below shows how to load OpenXAI’s pre-trained models using our LoadModel class.
from openxai import LoadModel
model = LoadModel(data_name='german', ml_model='ann', pretrained=True)
Explanation methods included in OpenXAI are readily accessible through the Explainer class. Users specify the method name to invoke the appropriate method and generate explanations.
from openxai import Explainer
exp_method = Explainer(method='lime', model=model, dataset_tensor=inputs)
explanations = exp_method.get_explanation(inputs, labels)
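Explanations returned by such methods are per-feature attribution scores for each input. To illustrate what a gradient-based method computes, here is a minimal, self-contained sketch (not OpenXAI's code) of vanilla-gradient attributions for a toy logistic model; the weights and input values are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def vanilla_gradient(w, b, x):
    """Attribution for each feature of x under the model sigmoid(w.x + b):
    the gradient d sigmoid(z) / d x = sigmoid(z) * (1 - sigmoid(z)) * w."""
    p = sigmoid(w @ x + b)
    return p * (1.0 - p) * w

w = np.array([2.0, -1.0, 0.0])  # toy model weights (assumed)
x = np.array([0.5, 1.0, 3.0])   # one input instance (assumed)
attr = vanilla_gradient(w, b=0.0, x=x)
print(attr)  # feature 2 has weight 0, so its attribution is exactly 0
```

Methods like LIME arrive at similar per-feature scores by fitting a local surrogate model instead of differentiating the original model.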
Benchmarking an explanation method with an evaluation metric is straightforward; the code snippet below shows how to invoke the Relative Input Stability (RIS) metric. The input_dict is described in the Getting Started file.
from openxai import Evaluator
metric_evaluator = Evaluator(input_dict, inputs, labels, model, exp_method)
score = metric_evaluator.evaluate(metric='RIS')
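To convey the intuition behind a stability metric like RIS, here is a simplified, self-contained sketch (not OpenXAI's exact formula): the relative change in the explanation divided by the relative change in the input under a small perturbation, so that larger values indicate a less stable explanation. The function name, epsilon, and toy values are assumptions for illustration.

```python
import numpy as np

def relative_input_stability(x, x_pert, e, e_pert, eps=1e-6):
    """Simplified stability ratio: relative change in the explanation
    divided by the relative change in the input (larger = less stable)."""
    exp_change = np.linalg.norm((e_pert - e) / (e + eps))
    inp_change = np.linalg.norm((x_pert - x) / (x + eps))
    return exp_change / max(inp_change, eps)

x = np.array([1.0, 2.0, 4.0])
x_pert = x + 0.01                 # small input perturbation
e = np.array([0.5, 0.3, 0.2])     # explanation for x (toy values)
stable = relative_input_stability(x, x_pert, e, e * 1.001)
unstable = relative_input_stability(x, x_pert, e, e * 2.0)
print(stable < unstable)  # a larger explanation shift yields a higher score
```

A faithful-but-unstable explainer would score poorly here even if it ranks features correctly, which is why OpenXAI evaluates stability separately from faithfulness.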