Paper Reading: Explainable Machine Learning in Deployment (2020)
by Umang Bhatt, Alice Xiang, Shubham Sharma, Adrian Weller, Ankur Taly, Yunhan Jia, Joydeep Ghosh, Ruchir Puri, José M. F. Moura, Peter Eckersley
At work, I know people who set aside an hour daily just for reading and learning. I’m still trying to get into that routine. While I figure out my days, I thought I’d post about some papers I’ve read in the past year. This one is on Explainable Machine Learning, published in 2020 at the FAccT Conference. Below is what I found most interesting:
Abstract
Explainable machine learning offers the potential to provide stakeholders with insights into model behavior by using various methods such as feature importance scores, counterfactual explanations, or influential training data. Yet there is little understanding of how organizations use these methods in practice. This study explores how organizations view and use explainability for stakeholder consumption. We find that, currently, the majority of deployments are not for end users affected by the model but rather for machine learning engineers, who use explainability to debug the model itself. There is thus a gap between explainability in practice and the goal of transparency, since explanations primarily serve internal stakeholders rather than external ones. Our study synthesizes the limitations of current explainability techniques that hamper their use for end users. To facilitate end user interaction, we develop a framework for establishing clear goals for explainability. We end by discussing concerns raised regarding explainability.
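The abstract name-checks three families of explanation methods. To make one of them concrete, here is a minimal sketch of a counterfactual explanation search, written purely for illustration: `find_counterfactual`, its `predict_proba` argument, and every parameter are hypothetical names I chose, and the greedy random search is a toy stand-in for the constrained optimization real systems use.

```python
import numpy as np

def find_counterfactual(predict_proba, x, target, step=0.1, max_iters=1000, seed=0):
    """Toy counterfactual search: greedily nudge a single input `x`
    until `predict_proba` (1-D sample -> class-probability vector)
    assigns the `target` class. Hypothetical helper, not the paper's method."""
    rng = np.random.default_rng(seed)
    x_cf = x.astype(float).copy()
    for _ in range(max_iters):
        probs = predict_proba(x_cf)
        if int(np.argmax(probs)) == target:
            return x_cf  # decision flipped: x_cf is the counterfactual
        # Propose a small random perturbation; keep it only if it raises
        # the target-class probability, so the search stays near x.
        candidate = x_cf + rng.normal(scale=step, size=x_cf.shape)
        if predict_proba(candidate)[target] > probs[target]:
            x_cf = candidate
    return None  # no counterfactual found within the search budget
```

Production counterfactual tools additionally constrain which features may change (you can’t ask a loan applicant to lower their age) and penalize large perturbations, which hints at why deploying these explanations for end users is harder than it looks.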
Conclusion
We are the first, to our knowledge, to interview various organizations on how they deploy explainability in their ML workflows, concluding with salient directions for future research. We found that while ML engineers are increasingly using explainability techniques as sanity checks during the development process, there are still significant limitations to current techniques that prevent their use to directly inform end users. These limitations include the need for domain experts to evaluate explanations, the risk of spurious correlations reflected in model explanations, the lack of causal intuition, and the latency in computing and showing explanations in real-time. Future research should seek to address these limitations. We also highlighted the need for organizations to establish clear desiderata for their explanation techniques and to be cognizant of the concerns associated with explainability. Through this analysis, we take a step towards describing explainability deployment and hope that future research builds trustworthy explainability solutions.
Intro
Explanations can come in many forms: from telling patients which symptoms were indicative of a particular diagnosis [35] to helping factory workers analyze inefficiencies in a production pipeline [17].
One form of this would be to publish an algorithm’s code, though this type of transparency would not provide an intelligible explanation to most users. Another form would be to disclose properties of the training procedure and datasets used [39]. Users, however, are generally not equipped to understand how raw data and code translate into benefits or harms that might affect them individually. By providing an explanation for how the model made a decision, explainability techniques seek to provide transparency directly targeted to human users, often aiming to increase trustworthiness [44].
Methodology
• We interview around twenty data scientists, who are not currently using explainability tools, to understand their organizations’ needs for explainability.
• We interview around thirty different individuals on how their organizations have deployed explainability techniques, reporting case studies and takeaways for each technique.
• We suggest a framework for organizations to clarify their goals for deploying explainability.
Findings
Trustworthiness refers to the extent to which stakeholders can reasonably trust a model’s outputs.
Transparency refers to attempts to provide stakeholders (particularly external stakeholders) with relevant information about how the model works: this includes documentation of the training procedure, analysis of training data distribution, code releases, feature-level explanations, etc.
Explainability refers to attempts to provide insights into a model’s behavior.
Stakeholders are the people who want a model to be “explainable,” who will consume the model explanation, or who are affected by decisions made based on model output.
Practice refers to the real-world context in which the model has been deployed.
Local Explainability aims to explain the model’s behavior for a specific input.
Global Explainability attempts to understand the high-level concepts and reasoning used by a model.
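The paper keeps these two categories abstract, so here is a minimal sketch of the distinction, assuming scikit-learn on synthetic data; the occlusion-style local attribution is my own toy choice, not a method the paper prescribes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Global explainability: which features matter on average across the data?
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print("global importance:", result.importances_mean.round(3))

# Local explainability: which features drove the prediction for ONE input?
# Crude occlusion-style attribution: replace each feature with its mean
# and record how much the predicted probability of class 1 drops.
x = X[:1]
base = model.predict_proba(x)[0, 1]
local = []
for j in range(X.shape[1]):
    x_occ = x.copy()
    x_occ[0, j] = X[:, j].mean()
    local.append(base - model.predict_proba(x_occ)[0, 1])
print("local attribution for X[0]:", np.round(local, 3))
```

Permutation importance answers the global question (“which features does the model rely on overall?”), while the per-input occlusion scores answer the local one (“which features drove this particular prediction?”).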