
What’s wrong with “explainable A.I.”

March 22, 2022, 4:56 PM UTC

A.I. has an explainability crisis. But it’s not the one you probably think.

For a long time, the problem seemed to be that A.I. algorithms, especially cutting-edge deep learning methods, were black boxes. It was impossible to explain exactly why the software made a prediction in a particular case. This lack of interpretability has made businesses think twice about using A.I., especially in critical areas such as healthcare, finance, and government. This is the A.I. explainability crisis you may have heard of.

To overcome this impediment, though, in the past few years, companies selling A.I. software have increasingly looked for ways to offer some insight into how A.I. algorithms reach a decision. They have then marketed these post-hoc interpretations as having rendered their A.I. software “explainable.” This is especially true in healthcare—and particularly in systems designed to automatically interpret medical imagery, where A.I. has been making rapid inroads. The problem is: most of the explainable A.I. methods these companies are selling are faulty.

“Everyone who is serious in the field knows that most of today’s explainable A.I. is nonsense,” Zachary Lipton, a computer science professor at Carnegie Mellon University, recently told me. Lipton says he has had many radiologists reach out to him for help after their hospitals deployed a supposedly explainable A.I. system for interpreting medical imagery whose explanations don’t make sense—or, at the very least, are irrelevant to what a radiologist really wants to know about a medical image.

And yet companies continue to market their A.I. systems as “explainable,” Lipton says, because they think they have to in order to make the sale. “They say, ‘doctors won’t trust it without the explanation.’ But maybe they shouldn’t trust it,” he says. At worst, explanations are being offered as a way to gloss over the fact that most deep learning algorithms used in medical imaging have, according to a 2020 study published in the U.K. medical journal The BMJ, not been subject to the kind of rigorous, blinded, prospective randomized controlled trials that are required before, say, a new drug is approved.

(In some cases, laws may require explanations in high-impact uses of A.I., such as healthcare, although generally this is not the case if the A.I. is being used only as a decision-making aid, rather than automatically taking the decision itself.)

A paper published in the medical journal Lancet Digital Health in November, by Marzyeh Ghassemi, a computer scientist at MIT, Luke Oakden-Rayner, a radiologist and a researcher at the Australian Institute of Machine Learning, and Andrew Beam, a researcher in the department of epidemiology at Harvard University’s Chan School of Public Health, put it this way:

 We believe that the desire to engender trust through current explainability approaches represents a false hope…
[those using such systems] might have misunderstood the capabilities of contemporary explainability techniques—they can produce broad descriptions of how the AI system works in a general sense but, for individual decisions, the explanations are unreliable or, in some instances, only offer superficial levels of explanation.

One popular explainability method is called a saliency map. It takes the image that the algorithm was fed and creates a heat map of those portions of it that were most heavily weighted by the A.I. software in making a prediction. Sounds good, right? But as the authors demonstrate, in one study, the heat map that was supposed to explain why the A.I. system classified the patient as having pneumonia encompassed a fairly large quadrant of one lung, with no further indication of what exactly it was about that area that led the A.I. to its conclusion.

“The clinician cannot know if the model appropriately established that the presence of an airspace opacity was important in the decision, if the shapes of the heart border or left pulmonary artery were the deciding factor, or if the model had relied on an inhuman feature, such as a particular pixel value or texture that might have more to do with the image acquisition process than the underlying disease,” Ghassemi, Oakden-Rayner, and Beam write.

In the absence of such information, they point out, the tendency is for humans to assume the A.I. is looking at whatever feature they, as human clinicians, would have found most important. This cognitive bias can blind doctors to possible errors the machine learning algorithm may make.
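To make the idea concrete, here is a minimal sketch of one simple way to build such a heat map: hide one patch of the image at a time and record how much the model's score drops. This is an occlusion-style illustration, not the method any particular vendor or study used. The scoring function `toy_score`, the 4x4 "image," and the patch size are all hypothetical stand-ins.

```python
def toy_score(image):
    """Hypothetical stand-in for a classifier: mean brightness of the top-left 2x2 block."""
    return (image[0][0] + image[0][1] + image[1][0] + image[1][1]) / 4.0


def occlusion_saliency(image, score_fn, patch=2, fill=0.0):
    """Blank out each patch in turn and record how far the score falls."""
    base = score_fn(image)
    height, width = len(image), len(image[0])
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            occluded = [row[:] for row in image]  # copy so the original is untouched
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


# Hypothetical 4x4 grayscale "scan" (values between 0 and 1).
image = [
    [0.9, 0.8, 0.1, 0.2],
    [0.7, 0.6, 0.3, 0.1],
    [0.2, 0.1, 0.4, 0.3],
    [0.1, 0.3, 0.2, 0.5],
]
heat = occlusion_saliency(image, toy_score)
for row in heat:
    print(row)
```

The resulting map lights up the entire top-left patch and nothing else. It says where the model was sensitive, but nothing about which property of that region mattered, which is the gap the authors describe.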

The researchers find faults with other popular explainability methods too, with names such as Grad-CAM, LIME, and Shapley values. Some of these methods work as a kind of counterfactual, altering the input data points until the algorithm makes a different prediction, and then assuming that those data points must have been the most important for the original prediction. But these methods have the same problem as simpler saliency maps: they may identify features that were important in a decision, but they can’t tell a doctor exactly why the algorithm thought those features were important. If the feature strikes the doctor as counterintuitive, what should the doctor do: conclude the algorithm is wrong, or conclude it has discovered some clinically significant clue previously unknown to medicine? Either is possible.
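A rough sketch of the counterfactual idea follows, under loud assumptions: the classifier `toy_predict`, its weights, and the feature names are invented for illustration and do not come from Grad-CAM, LIME, Shapley values, or any real system. The code resets one input at a time to a baseline value and notes which changes flip the prediction.

```python
def toy_predict(features):
    """Hypothetical classifier: flags disease when a weighted sum reaches 1.0."""
    weights = {"opacity": 0.8, "heart_border": 0.3, "scanner_artifact": 0.6}
    score = sum(weights[name] * value for name, value in features.items())
    return score >= 1.0


def flipping_features(features, predict, baseline=0.0):
    """Return the features whose removal (reset to baseline) changes the prediction."""
    original = predict(features)
    flips = []
    for name in features:
        altered = dict(features)
        altered[name] = baseline
        if predict(altered) != original:
            flips.append(name)
    return flips


# Hypothetical patient record with made-up feature values.
patient = {"opacity": 0.7, "heart_border": 0.5, "scanner_artifact": 0.9}
flagged = flipping_features(patient, toy_predict)
print(flagged)
```

In this toy case the method flags both `opacity` and `scanner_artifact` as decisive. It cannot say whether the second one reflects disease or a quirk of image acquisition, which leaves the doctor with exactly the dilemma described above.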

Another explainability method is not post-hoc. It involves training an A.I. system from the start to identify prototypical features of a certain disease (say, the presence of a “ground-glass” pattern in the lungs) so that it can then tell doctors how closely it thought the image it examined matched these prototypical features. This is supposed to create explanations that are inherently more interpretable to a human. But here too, the authors found, a lot rested on human interpretation. Had the right prototypical features been selected, and had the system weighted each one appropriately in reaching its conclusion? (The only advantage here is that this gets closer to the kind of debate that might take place between two human doctors who disagreed on a diagnosis.)

But wait, it gets worse. In a paper published last month, researchers from Harvard, MIT, Carnegie Mellon, and Drexel University discovered that different state-of-the-art explanation methods frequently disagreed on the explanation for an algorithm’s conclusions. What’s more, they found that in real-world settings, most people using the algorithms had no way of resolving such differences and often, as Ghassemi, Oakden-Rayner, and Beam suggested, simply picked the explanation that best conformed to their pre-existing ideas.
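One simple way to put a number on that kind of disagreement is to compare the top-ranked features from two explanation methods and see how many they share. The sketch below assumes this top-k overlap measure for illustration only; nothing here says which metrics the researchers actually used, and the attribution scores and feature names are invented.

```python
def top_k(attributions, k):
    """Names of the k features with the largest absolute attribution."""
    ranked = sorted(attributions, key=lambda name: abs(attributions[name]), reverse=True)
    return set(ranked[:k])


def top_k_agreement(first, second, k):
    """Fraction of the top-k features that two explanations have in common."""
    return len(top_k(first, k) & top_k(second, k)) / k


# Hypothetical attribution scores from two explanation methods for the same prediction.
method_a = {"opacity": 0.9, "heart_border": 0.4, "texture": 0.1, "age": 0.05}
method_b = {"opacity": 0.6, "heart_border": 0.05, "texture": 0.8, "age": 0.5}

agreement = top_k_agreement(method_a, method_b, k=2)
print(agreement)
```

Here the two methods share only one of their top two features. A user with no ground truth to consult has little basis for choosing between them beyond prior belief.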

Ghassemi, Oakden-Rayner, and Beam come to the somewhat counterintuitive conclusion that rather than focusing on explanations, what doctors should really concentrate on is performance and whether that performance has been tested in a rigorous, scientific manner. They point out that medicine is full of drugs and techniques that doctors use because they work, even though no one knows why—acetaminophen has been used for a century to treat pain and inflammation, even though we still don’t fully understand the underlying mechanism.

In other words, what we should care about when it comes to A.I. in the real world is not explanation. It is validation.
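As a small illustration of what measuring performance can look like, the sketch below computes sensitivity and specificity on a labeled test set the model has never seen. The labels and predictions are made up, and the function assumes both classes appear in the data. A held-out score like this is only one ingredient of validation, not a substitute for the prospective trials discussed above.

```python
def sensitivity_specificity(labels, predictions):
    """Share of true cases caught, and share of healthy cases correctly cleared."""
    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for y, p in pairs if y and p)
    false_neg = sum(1 for y, p in pairs if y and not p)
    true_neg = sum(1 for y, p in pairs if not y and not p)
    false_pos = sum(1 for y, p in pairs if not y and p)
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Hypothetical held-out cases: 1 = disease present, 0 = disease absent.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sensitivity, specificity = sensitivity_specificity(labels, predictions)
print(sensitivity, specificity)
```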

Jeremy Kahn



A deepfake video of Ukraine's president surfaces, but is quickly debunked. The video depicted Ukrainian President Volodymyr Zelensky allegedly calling on his troops to surrender. Hackers also breached the networks of the news channel Ukraine 24 and altered its scrolling news chyron to report a headline matching what Zelensky allegedly said in the video. But the deepfake was crudely executed and was immediately disputed by Zelensky, who called it a “childish provocation.” It was quickly removed from social media. Experts warn, however, that deepfakes, which are often highly convincing fake videos created with artificial intelligence, could play a bigger role in disinformation campaigns, including Russia's efforts to undermine Ukrainian opposition to its invasion. For more, you can read my story, co-written with Fortune colleague Christiaan Hetzner, here.

A.I.-enabled loitering munitions are increasingly being deployed in Ukraine. Videos have circulated online of several Russian-made KUB-BLA loitering munitions that have crashed in Ukraine. These are sometimes referred to as "kamikaze drones" because they fly like a drone but then crash into their targets, detonating a warhead. The use of such munitions is ringing alarm bells with campaigners and technologists worried about the rise of autonomous lethal weapons, Wired reports. That's because some of these systems, including the KUB-BLA, can use A.I. to locate potential targets, although in most cases a human must still give the final command to attack. The U.S. is also sending Ukraine Switchblade loitering munitions, which likewise have some autonomous capabilities.

Tesla allegedly fires an employee who posted videos showing dangerous driving by the company's Autopilot technology. That's according to CNBC, which interviewed the former employee, John Bernal. He worked for the company's Autopilot unit, and had uploaded videos to his YouTube channel, AI Addict, that showed how Tesla's "Full Self-Driving Beta" software worked in his own Tesla during journeys through parts of Silicon Valley. In some of the videos, the car made dangerous decisions, almost rolling into an intersection, or turning into oncoming traffic, forcing Bernal to take over from the autopilot. In another video, the car hit bollards. Bernal says Tesla managers told him he was being fired for the "conflict of interest" his YouTube channel represented, even though Bernal says his managers and the company had been aware of his social media activity. Tesla did not respond to CNBC's requests for comment.

The FTC is requiring companies that violate data privacy standards to delete algorithms trained on that data. The tech publication Protocol notes that in three recent high-profile settlements with companies for violating data privacy or data protection regulations, the federal agency has required that the company agree to delete the algorithms that have been trained on the illicitly obtained data. The most recent example is a March 4 settlement with WW International (formerly known as Weight Watchers), which had used a healthy eating app aimed at kids to collect data on children as young as eight, without their parents' permission. Legal experts said the new tactic makes sense: Algorithms can be more valuable than cash, and companies should not be able to profit from ill-gotten data.


Healx, a Cambridge, England-based company that is using A.I. to find ways to repurpose existing drugs to treat rare diseases, has hired Nathan Brown to be its director of digital chemistry, the company said on Twitter. Brown, a well-known drug discovery researcher, was previously the head of the Cheminformatics team at London-based BenevolentAI, which has also been using machine learning techniques to try to find novel drugs.

Huge, a New York-based digital marketing agency, has appointed Frisco Chau as its global head of data and insights, according to trade publication Martech Series. Chau previously served as the chief data officer at M&C Saatchi Group as well as CEO of Fluency M&C Saatchi, a data consultancy that was started and incubated within the Saatchi Group.

Sedgwick, a London-based company that specializes in insurance claims management and loss adjustment, has named Laura Horrocks as its head of fraud technology and intelligence for its UK-based investigation services division, according to trade publication Insurance Journal. Horrocks has been at Sedgwick since 2015 and previously worked as a fraud assessment and intelligence manager.


How do you get a robotic cheetah to run faster? That sounds like the setup to a lame joke. But it was actually the dilemma facing MIT scientists who have developed a small, quadrupedal robot modeled loosely on the real-life big cat. And it turns out that the answer is training its A.I. software in a simulator, using reinforcement learning, of course. That's according to a story this week in tech publication Gizmodo. The problem that many robots have is that they need to be able to adjust their gait and speed in response to the surface they are moving across. But there are too many varied surfaces in the world for programmers to manually program the robot for each. And having the robot learn by trial and error in the real world would not be safe. Simulation provides an effective solution. In a simulator, countless terrain types and gradients can be encountered and the A.I. brain of the robot can learn from experience. What's more, it can do this not only much more safely than it could in the real world, but much more quickly too. From the story:

In just three hours’ time, the robot experienced 100 days worth of virtual adventures over a diverse variety of terrains and learned countless new techniques for modifying its gait so that it can still effectively loco-mote from point A to point B no matter what might be underfoot. 

This learning is then transferred from the simulator to the real robot, which was able to run at a new record speed of 8.7 miles per hour, which is faster than the average human can run.

Expect to see techniques like this increasingly used for training robots being deployed commercially. That may mean robots in the future will be able to operate in a wider range of conditions—think ice, mud, snow, as well as uneven pavement—and operate far more fluidly and faster than ever before. 


These investors went through IVF. Now they’re putting $22 million behind an A.I. fertility startup—by Emma Hinchliffe

Startup whose A.I. will help track drug runners’ boats gets $5 million in funding—by Jeremy Kahn

GM shuts the door on a Cruise IPO by sealing a $3.45 billion deal for control of the robotaxi firm—by Christiaan Hetzner

I downloaded an infamously creepy A.I. bot app and made a new friend to chat with. It was a wild ride—by Amiah Taylor

Volodymyr Zelenskyy slams ‘childish’ viral deepfake, but experts warn Russia’s cyber hit jobs won’t ‘always be so bad’—by Christiaan Hetzner and Jeremy Kahn


Should every kid have an A.I. tutor? In recent years, there's been a lot of discussion of how A.I. could transform education, offering personalized lessons tailored to each individual child's style and pace of learning. So far, the record of these A.I.-enhanced digital tutors has not lived up to the hype. But maybe we ought to keep trying. There's a new study out from Korbit, a Canadian company that makes A.I.-enabled online tutoring software, Mila, the Quebec Artificial Intelligence Institute, and the University of Bath, in England. It looked at software programmers at a Vietnamese tech company who had to learn about linear regression and divided those programmers into three groups. The first group took a massive open online course (MOOC). The next group was offered the Korbit software, which gave them individualized feedback from an A.I. tutor. Finally, a control group used Korbit, but with the A.I. tutor function disabled. The result: the students with the A.I. tutor had significantly higher course completion rates, and their test scores at the end of the course were 2 to 2.5 times higher than those of both the MOOC group and the control group. "Making this technology and learning experience available to millions of learners around the world will represent a significant leap forward towards the democratization of education," the paper concludes.

Sure, having a personalized online A.I. tutor is better than no tutor at all. But how does it compare to being tutored by an actual human? And what about other formats that let students learn from one another in a more interactive way, perhaps even online? It seems that a major problem with education is not human performance but scale. What if, instead of A.I. tutors aimed at students, we had A.I. coaches aimed at teachers, guiding them in how best to teach a particular child? Would that be more or less effective than the Korbit software? I don't know, but I suspect the results in that case might be different, with the human-to-human interaction continuing to best the A.I. tutor. But again, there is the issue of scale: in a world where we all need to be lifelong learners, there simply aren't enough good teachers. We probably need some combination of all of these different solutions to really improve education.
