Martins is an Applied Scientist at Onfido.
Most artificial intelligence (AI) systems are functional black boxes. They receive an input and provide an output. But often they don't explain how the input led to the output.
Their behavior remains inscrutable, even when we can look at the system's source code, when the data used to train it is accessible, and when we did the training ourselves.
There is a veil of unknowing surrounding AI systems, especially those based on deep neural networks. But in many applications—translation, object detection, face recognition—these systems often perform better than any alternatives, sometimes including humans. And so we face the following dilemma: we have well-performing AI systems that we want to use, but don't fully understand.
When do we need explainable AI?
Explainable AI is one attempt to deal with this dilemma. It’s a field of research and a collection of methods that are trying to pierce the veil of unknowing.
Our reasons for seeking insight into the working of an AI system vary depending on how well the AI system performs.
Systems with below human-level performance: These are unlikely to be employed in critical applications. Usually, they form part of a larger system, or are used in research and development. The main purpose of explainability is to understand why the system is making mistakes, so it can be improved.
Systems that perform at, or near, the level of humans: These are more likely to be integrated into products and services. Most users will be non-technical people and the purpose of explainability is to create trust and increase adoption of the AI system.
Systems that perform beyond human level: Among the more prominent examples is DeepMind's AlphaGo, which beat Go world champion Lee Sedol in 2016. There's also its chess-playing cousin, AlphaZero. For these systems, we want to understand how they work in order to improve our own performance. Chess players are already learning from AlphaZero's published games.
It turns out that the middle category is the difficult one, because "trust" is a high bar to clear. Google Translate is an incredibly useful AI-powered service that allows us to read webpages in languages we don't understand. But would we trust it to correctly translate a legal document?
Some AI mishaps have happened in public. For some time, Google's photo-tagging algorithm was unable to distinguish gorillas from black people. Even three years later, Google's fix was to remove the category "gorilla" altogether. Similarly, back in 2013, in a case that lacked human oversight, Amazon was selling T-shirts with the slogan "Keep Calm and Punch Her".
Both are cases of automated systems giving useful results for some inputs, but failing catastrophically for others. In these scenarios, explainability could help us understand what went wrong, so similar circumstances can be avoided in the future.
When does explainability become essential?
The ability to explain decisions ceases to be an optional extra and becomes an essential requirement as we move into regulated areas such as finance or safety-critical areas such as medicine. In such cases, "move fast and break things" is not an appropriate motto. Explainability can also be a useful tool to ensure that algorithms are fair and unbiased.
In recent years, several tech companies have moved into the medical space. DeepMind has applied deep learning to patient data from the Department of Veterans Affairs in the US, to predict the onset of acute kidney injury. Microsoft is partnering with hospitals to use AI for assessing the risk of developing cardiovascular disease. And IBM tried to adapt its Jeopardy-winning Watson to help doctors diagnose patients. However, it seems that this goal was too ambitious and IBM has since made a tactical retreat.
The need for explainability in medical scenarios can be illustrated with the following example. Pneumonia is usually diagnosed via a physical exam and a chest X-ray. From a machine learning point of view, it's a straightforward exercise to train a deep learning model that diagnoses whether a chest X-ray shows signs of pneumonia or not. The lack of training data is not a problem here: researchers have collected more than 150,000 chest X-rays with corresponding labels. What, then, is the problem? To have confidence in the output of a model, we must be sure that the model is basing its decisions on clinically relevant information.
One study looked at how well a model trained on X-ray images taken at one hospital performs when evaluated on X-ray images taken at a different hospital. It turned out that the model doesn't generalize well at all. When trained on data from hospital A and evaluated on data from the same hospital, the model attained a diagnostic accuracy of 0.617. But the accuracy dropped to 0.184 when evaluated on data from hospital B. The exact reasons for the lack of generalization are not known, but the following observations provide some hints.
First, the disease prevalence in hospital A was 34% but only 1% in hospital B. Second, the authors were able to train a model to detect whether an X-ray was taken in hospital A or B with more than 95% accuracy. This suggests that the pneumonia classifier could learn to rely on subtle differences in image acquisition or preprocessing, and ignore the underlying pathology.
This is an important observation, because we don't have the data to train a separate model for each hospital. And even within a single hospital, we would expect the intensive care unit and the outpatient clinic to see different rates of disease prevalence. Without a thorough understanding of the trained model (an understanding that goes beyond statistical accuracy measurements), we cannot safely deploy such models in medical practice. This is where explainability becomes an essential ingredient.
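The second observation, that a classifier can identify the source hospital with high accuracy, is easy to reproduce in miniature. Below is a minimal sketch with entirely synthetic data (the `sample` function, the 0.05 intensity offset, and the threshold rule are invented for illustration): a one-feature "classifier" that separates two simulated hospitals almost perfectly, purely from a small preprocessing difference in mean image intensity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for X-rays from two hospitals: the "anatomy" is
# the same random noise, but hospital A's preprocessing leaves a
# slightly brighter mean intensity (an invented acquisition artefact).
def sample(hospital, n=200, size=64):
    offset = 0.05 if hospital == "A" else 0.0
    return rng.normal(0.5 + offset, 0.1, size=(n, size, size))

train_a, train_b = sample("A"), sample("B")
threshold = (train_a.mean() + train_b.mean()) / 2   # one-feature "model"

test_a, test_b = sample("A"), sample("B")
hits_a = (test_a.mean(axis=(1, 2)) > threshold).sum()
hits_b = (test_b.mean(axis=(1, 2)) <= threshold).sum()
accuracy = (hits_a + hits_b) / (len(test_a) + len(test_b))
```

If a single summary statistic can identify the hospital this reliably, a network with millions of parameters will find the same shortcut with ease, which is how a prevalence difference between sites can masquerade as pathology.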
How do we solve the problem of explainability?
Now that we have seen the problem, what do we do about it? Several companies have released explainability solutions: IBM has AI Explainability 360, an open-source toolkit implementing explainability methods for machine learning models; Facebook has Captum, a similar library focused on PyTorch; Google's offering, called simply Explainable AI, is part of its Cloud AI Platform; and Microsoft offers interpretability solutions as part of its Azure ML SDK.
But even though commercial products offering interpretability and explainability exist, there are no easy solutions. Or, as academics like to say: this is an area of active research. And it makes sense that there are no easy solutions.
After all, a state-of-the-art neural network is a highly nonlinear function. It has at least 10 million parameters that are chosen based on statistical patterns found in the training set. We are using deep learning to find the patterns in the training set automatically, presumably because it is too complicated to extract and code the patterns by hand. And 10 million parameters give the neural network a lot of freedom to behave unexpectedly.
Some progress has been made. We have some methods to pull back the curtain on convolutional neural networks. These methods have names such as Layerwise Relevance Propagation (LRP), Guided Backpropagation (GuidedBackProp), Deep Taylor Decomposition (DTD), Class Activation Mapping (CAM), Testing with Concept Activation Vectors (TCAV) and Randomized Input Sampling for Explanation (RISE). Each tries to illustrate some aspect of the neural network.
Some methods concentrate on attribution—which pixels of the image contributed to the classification? Others look at sensitivity—which pixels do we need to change to effect the biggest change in the classification score? And yet others look at how classification results change if we occlude parts of the image. As it's an active area of research, there is currently no consensus on which method to use or which interpretation to prefer. But each method provides an additional piece of the puzzle.
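The occlusion idea can be sketched in a few lines. This is an illustrative toy, not a production method: the `model` below is a stand-in linear scorer, and the patch size, weight mask and baseline value are invented for the example.

```python
import numpy as np

def occlusion_map(model, image, patch=4, baseline=0.0):
    """Slide an occluding patch over the image and record the score drop."""
    h, w = image.shape
    base_score = model(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = baseline
            heat[i // patch, j // patch] = base_score - model(occluded)
    return heat

# Toy stand-in for a trained classifier: the score depends only on a
# central 4x4 "lesion" region (the weight mask is invented).
weights = np.zeros((16, 16))
weights[6:10, 6:10] = 1.0
model = lambda img: float((img * weights).sum())

heat = occlusion_map(model, np.ones((16, 16)))
# Only the patches overlapping the lesion show a score drop.
```

The heatmap assigns the largest score drops to the patches covering the "lesion" region, which is exactly the attribution we would hope to see from a model that uses clinically relevant pixels.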
The above methods all look at single images and try to explain individual predictions. So let's zoom out and look at the whole dataset rather than individual predictions. In a recent Nature Communications paper, Lapuschkin et al. looked at a representative collection of relevance heatmaps for inputs of the same class: in this case, horses. They applied spectral clustering to group the samples according to how similar the network's explanations are. By manually inspecting the clusters, the researchers discovered that one model was using the copyright tag present in some of the images to classify an image as a horse. They verified this by adding the copyright tag to an image of a car. The network proceeded to classify the car as a horse, with high confidence.
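The clustering step can be illustrated with a small numpy sketch. The heatmaps here are synthetic, not the paper's data, and the corner "tag" and all sizes are invented: the point is only that samples whose explanations focus on an artefact end up in a cluster of their own.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy relevance heatmaps for one class. Most focus on the object
# region; a few focus on a corner "tag", a hypothetical stand-in
# for the copyright-tag artefact found by Lapuschkin et al.
def heatmap(kind, size=8):
    h = rng.random((size, size)) * 0.1   # background noise
    if kind == "object":
        h[2:6, 2:6] += 1.0               # relevance on the object
    else:
        h[6:, :2] += 1.0                 # relevance on the tag
    return h.ravel()

maps = np.array([heatmap("object") for _ in range(10)]
                + [heatmap("tag") for _ in range(3)])

# Minimal 2-way spectral clustering: Gaussian affinity, normalized
# Laplacian, then split on the sign of the second-smallest eigenvector.
dists = np.linalg.norm(maps[:, None] - maps[None, :], axis=2)
affinity = np.exp(-(dists / dists.mean()) ** 2)
degree = affinity.sum(axis=1)
laplacian = np.eye(len(maps)) - affinity / np.sqrt(np.outer(degree, degree))
eigvals, eigvecs = np.linalg.eigh(laplacian)
labels = (eigvecs[:, 1] > 0).astype(int)  # cluster assignment per sample
```

Manually inspecting the smaller cluster would immediately show that those explanations concentrate on the corner "tag" rather than the object, which is the kind of shortcut the paper's analysis uncovered.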
The importance of explainable AI for Onfido
So why is explainable AI so important for Onfido?
Well firstly, we want our AI models to be as fair as possible. For example, we’ve taken steps to ensure our FaceMatch algorithm has achieved market-leading accuracy while being the fairest it has ever been across ethnicities. This is important in satisfying regulatory requirements.
For our machine learning scientists, explainable AI means they know that their models correctly learnt what they were meant to learn during training. And finally, our customers need to know that we are catching fraud, and that our models aren't simply overfitting or data mining.
We achieve all this by following an established process. We ask our models what they learnt, how they learnt it, and how they made a decision. We then develop the methods to answer these questions. Each model we put into production can have a different approach to how it answers these questions.
Overall, explainable AI allows us to meet the needs of the different stakeholders we interact with, including regulators, customers and their consumers.
You can read more about our approach to artificial intelligence in our Center of Applied AI pages.