Do We Really Want Explainable AI? - Edward Ashford Lee (EECS, UC Berkeley)

Abstract: “Rationality” is the principle that humans make decisions on the basis of step-by-step (algorithmic) reasoning using systematic rules of logic. An ideal “explanation” for a decision is a chronicle of the steps used to arrive at the decision. Herb Simon’s “bounded rationality” is the observation that the ability of a human brain to handle algorithmic complexity and data is limited. As a consequence, human decision making in complex cases mixes some rationality with a great deal of intuition, relying more on Daniel Kahneman’s “System 1” than “System 2.” A DNN-based AI, similarly, does not arrive at a decision through a rational process in this sense. An understanding of the mechanisms of the DNN yields little or no insight into any rational explanation for its decisions. The DNN is operating in a manner more like System 1 than System 2. Humans, however, are quite good at constructing post-facto rationalizations of their intuitive decisions. If we demand rational explanations for AI decisions, engineers will inevitably develop AIs that are very effective at constructing such post-facto rationalizations. With their ability to handle vast amounts of data, the AIs will learn to build rationalizations using many more precedents than any human could, thereby constructing rationalizations for ANY decision that will become very hard to refute. The demand for explanations, therefore, could backfire, resulting in effectively ceding much more power to the AIs. In this talk, I will discuss similarities and differences between human and AI decision making and will speculate on how, as a society, we might be able to proceed to leverage AIs in ways that benefit humans.


  • Explaining DNN decisions is challenging.
  • Demanding explanations may backfire.
  • DNNs may be more like intuitive than rational thinking.
  • Intuitive thinking may not be ultimately algorithmic.
  • Architected compositions of DNNs may offer some explainability.

Playing fast and loose with the words “algorithmic” and “complexity” gets one into trouble very quickly. It is inescapable that Turing-complete, recursive “rules” (“algorithms” that embody the data one has about the world in an approximation of Kolmogorov complexity) are the best we can do in terms of predictive models. Indeed, you don’t get “reasoning” without them. Also, “rationality” has the root word “ratio,” which in its original meaning referred to quantitative comparisons (putting things in perspective with arithmetic) rather than yes/no rules.

Having said all that, Cogent Confabulation was Robert Hecht-Nielsen’s approach to ridding us of the rather nonsensical boundary between Aristotelian logic rules and what folks in the old days called “fuzzy-logic” rules. Where that boundary seems to have merit is in what we might call formalizing scientific theories: We identify what looks like a general rule from statistics but then go on to identify a recursive rule for dynamical models of causality – and in so doing attempt to compress the data by setting aside, but not discarding, the residuals. We then test that causal model experimentally/experientially.
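The “compress the data by setting aside, but not discarding, the residuals” move above can be sketched concretely. This is a toy two-part (MDL-style) description, with data and the candidate rule invented for illustration: a simple recursive rule plus its residuals, from which the original data remains exactly recoverable.

```python
# Toy data: noisy observations of a candidate dynamical rule x[t+1] ~= 2 * x[t].
# (Data and rule are hypothetical, chosen only to illustrate the idea.)
data = [1.0, 2.1, 4.0, 8.2, 16.1, 32.3]

def rule(x):
    # The recursive rule identified from the statistics.
    return 2.0 * x

# Residuals: what the rule fails to explain. Set aside, not discarded.
residuals = [data[t + 1] - rule(data[t]) for t in range(len(data) - 1)]

# The data is exactly recoverable from (first value, rule, residuals),
# so nothing is lost; the residuals are merely small, hence compressible.
reconstructed = [data[0]]
for r in residuals:
    reconstructed.append(rule(reconstructed[-1]) + r)

assert all(abs(a - b) < 1e-9 for a, b in zip(data, reconstructed))
```

Testing the causal model then amounts to checking whether the residuals stay small on new data.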


A new model of vertebrate cognition is introduced: maximization of cogency:
p(αβγδ | ε).
This model is shown to be a direct generalization of Aristotelian logic, and to be rigorously related to a calculable quantity. A key aspect of this model is that in Aristotelian logic information environments it functions logically. However, in non-Aristotelian environments, instead of finding the conclusion with the highest probability of being true (a popular past model of cognition), this model functions in the manner of the “duck test”: it finds that conclusion which is most supportive of the truth of the assumed facts.
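The duck-test contrast described in the quoted abstract can be illustrated with a toy example (the conclusions and probability values below are my own invention, not from the paper): the usual Bayesian choice maximizes p(conclusion | facts), while cogency maximization picks the conclusion ε that maximizes p(facts | ε), i.e., the one that best supports the truth of the assumed facts.

```python
# Assumed facts: "walks like a duck" and "quacks like a duck".
# Two candidate conclusions with hypothetical priors and likelihoods.
prior = {"duck": 0.01, "person imitating a duck": 0.99}
likelihood = {                      # p(facts | conclusion)
    "duck": 0.9,
    "person imitating a duck": 0.2,
}

# Bayesian choice: maximize the (unnormalized) posterior p(conclusion | facts).
posterior_unnorm = {e: likelihood[e] * prior[e] for e in prior}
bayes_choice = max(posterior_unnorm, key=posterior_unnorm.get)

# Cogency maximization: maximize p(facts | conclusion) directly.
cogency_choice = max(likelihood, key=likelihood.get)

print(bayes_choice)    # the prior dominates: "person imitating a duck"
print(cogency_choice)  # the duck test: "duck"
```

With these made-up numbers the two rules disagree, which is exactly the point: cogency ignores the prior and rewards the conclusion that most strongly implies the assumed facts.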

RHN was, BTW, not only a prominent figure in the second connectionist summer, but the company he founded ended up being bought by the premier credit-scoring company FICO, where he operated as chief scientist when he essentially declared Cogent Confabulation to be the most likely model for the neocortex. It was shortly thereafter that RHN basically disappeared from the neural network scene. Some attribute this to his losing his marbles over Cogent Confabulation. However, I find it rather intriguing that he made a presentation of Cogent Confabulation to Sandia on an “Intelligence Extraction System” (I snatched a copy of it at the time but it’s now gone from the web) with an estimated budget of $300B/year and then the son of his partner committed mass murder.


About a decade ago, the folks at FICO were talking about RHN’s antics, and it’s great to hear your explanation about what went on. Thank you. It would be great to read that copy, perhaps even here?

It’s a PDF so I can’t upload it here, but here’s a link to a Google Drive copy.

A quote from a few pages in indicates this may not have been intended to be on the open web:

Collectors and Analysts have no need to know how extraction system works (this knowledge should be highly restricted) – users need only know extraction system’s capabilities and how to use it.


Thanks! By now, this information bottleneck approach has become widely known and quite mainstream: it’s the basis for DALL·E image synthesis, for example.
