The Nature article, in combination with the ignorance of how to compare data analysis results fairly (something that has been obvious since the 1960s), brings to mind this scene from Twin Peaks: The Return:
Kinda weird coincidence with your linking to Andrew Gelman’s (important) paper.
Last week I started reading “The Origin of Consciousness in the Breakdown of the Bicameral Mind” by Julian Jaynes and noticed how heavily he leans on the notions of metaphor and analogy*. These are central to Bertrand Russell’s relation arithmetic**, one of the landmark language model achievements was Peter Turney’s matching of college-bound seniors’ performance on verbal analogy tests, and Turney named his blog “apperception”. So I decided to search for what Turney thought of Jaynes’s approach to cognition.
Lo and behold, what should come up but Gelman’s blog:
Moreover, quite independent of this, I’d been trying to come up with a way of describing what is wrong with the current language models to a potential underwriter of the Hutter Prize and had decided that formalizing the concept of “critical thinking” with examples structured in a manner not unlike that which Gelman’s blog entry on Jaynes set forth regarding “model checking” might get the idea across.
* analogy, for example, being “relational similarity”
** Relational similarity forms the basis for Russell’s notion of relation numbers, and thence a mathematical handle on the structure of the empirical world, including numbers with dimensionality, which is absent from the foundations of arithmetic (hence programming languages’ risible inability to deal correctly with units and dimensions). I basically sacrificed my career status by threatening to resign from HP’s “Internet Chapter 2” project if they forced me to hire H-1bs from India rather than another co-“colonizer”, Tom Etter, to revive relation arithmetic.
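To make the units-and-dimensions complaint concrete, here is a minimal sketch of the usual workaround: tagging each number with dimension exponents and checking them at run time. This is not relation arithmetic, only the bolt-on that languages force on you; the `Quantity` class and the `metres`/`seconds` helpers are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with exponents of (length, mass, time)."""
    value: float
    dims: tuple

    def __add__(self, other):
        # Addition only makes sense between like dimensions.
        if self.dims != other.dims:
            raise TypeError(f"cannot add dims {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


def metres(v):
    return Quantity(v, (1, 0, 0))


def seconds(v):
    return Quantity(v, (0, 0, 1))


speed = metres(10.0) / seconds(2.0)   # length / time

try:
    metres(1.0) + seconds(1.0)        # dimensionally meaningless
    mismatch_caught = False
except TypeError:
    mismatch_caught = True
```

The plain `float` type would happily add metres to seconds; the check here is something the programmer has to build, which is the point of the complaint.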
I decided to go looking for what passes for “critical thinking” out there nowadays and ended up damning GPT4 with faint praise:
I ran GPT4 through some standard “critical thinking” tests given to employees; the following test question showed that, compared to purple and green haired Human Resource Department “studies” major critters, GPT4 came off as less woke.
Should banks and financial institutions be obligated to engage in socially-responsible investing?
Yes, engaging in socially responsible investing leads to a happier and more fulfilled workforce compared to banks which do not engage in socially-responsible investing.
Correct answer: STRONG
The argument addresses the question directly, and offers two clear advantages of the premise described in the statement, directly addressing the question and presenting a strong argument.
GPT4: This argument is considered weak because it makes an assumption without offering substantial evidence or clear linkages. The happiness and fulfillment of a workforce are influenced by numerous factors, not just the bank’s investment strategies. Additionally, the argument does not explain how socially responsible investing leads directly to workforce happiness and fulfillment. Thus, the argument is not entirely convincing in obligating banks for socially-responsible investing solely based on employee satisfaction.
Looking into Cosma Shalizi’s work might also be interesting.
It’s always nice to see folks taking “causality” seriously enough to at least devote effort to a recommended reading list.
Unfortunately, despite Shalizi having invested a great deal of care and effort in the topic, and despite his site, bactra.org, having numerous hits regarding algorithmic information, he just doesn’t seem to see the forest for the trees. His numerous references to Judea Pearl on that page are why I reserve my vitriol for Turing Awardees (and other hood ornaments) that have no excuse, as they misled and are misleading generations with “forest for the trees” pedantry.
It really is as simple as understanding that any pretense of natural science presupposes calculation and this is what Ray Solomonoff’s 1960s proof assumed in concluding that you can’t do better than the algorithmic information criterion for model selection* – well, except for what Tom Etter and Pierre Noyes were trying to tell people regarding their radical approach to “the quantum core” as I previously posted here:
But that’s a whole other fork in the road in which “the arrow of time” is more analogous to the arrow of gravitational force emergent from a new science with process, information and structure as the primary categories rather than time, space and matter. See Outline of a New Science by Tom Etter.
* “model selection” is, of course, only one stage of scientific activity. But it is worth focusing on because you can’t even begin decision-making until you have selected a model upon which to base your decision tree with all its “what if” nodes. All the noise about “causality” leading to the relatively sophisticated understanding of “p-hacking” is 2 levels of intellectual rigor behind Algorithmic Information approximation as causal model selection. We should, long ago, have abandoned attempting to predict things on the basis of Pearl’s DAGs (Directed Acyclic Graphs) and gone to DCGs (Directed CYCLIC Graphs), which immediately and obviously take you out of the dead end of trying to focus on only one dependent variable at a time, hence “p-hacking”.
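A toy sketch of what I mean by approximating algorithmic information for model selection, under loud assumptions: true algorithmic information is uncomputable, so `zlib` stands in as a crude compressor, and the two-variable system, the coefficient matrices and the one-byte-per-nonzero-coefficient model cost are all invented for this illustration. The data come from a system in which x and y feed back on each other (a cycle); the score is a two-part code length, i.e. bytes to state the model plus compressed bytes of its one-step prediction errors on both variables at once.

```python
import zlib


def simulate(matrix, state, steps):
    """Iterate (x, y) <- matrix applied to (x, y), mod 256."""
    (a, b), (c, d) = matrix
    x, y = state
    trajectory = [(x, y)]
    for _ in range(steps):
        x, y = (a * x + b * y) % 256, (c * x + d * y) % 256
        trajectory.append((x, y))
    return trajectory


def description_length(matrix, trajectory):
    """Two-part code: model cost plus compressed one-step residuals."""
    (a, b), (c, d) = matrix
    # Assumed model cost: one byte per nonzero coefficient (edge in the graph).
    model_bytes = sum(1 for coeff in (a, b, c, d) if coeff != 0)
    res_x, res_y = bytearray(), bytearray()
    for (x, y), (nx, ny) in zip(trajectory, trajectory[1:]):
        res_x.append((nx - (a * x + b * y)) % 256)
        res_y.append((ny - (c * x + d * y)) % 256)
    # zlib is only a rough upper bound on the information in the residuals.
    return model_bytes + len(zlib.compress(bytes(res_x + res_y), 9))


CYCLIC = ((1, 1), (1, 2))    # x depends on y AND y depends on x
ACYCLIC = ((1, 0), (1, 2))   # edge y -> x removed: no feedback loop

data = simulate(CYCLIC, (1, 0), 2000)
scores = {
    "cyclic": description_length(CYCLIC, data),
    "acyclic": description_length(ACYCLIC, data),
}
best = min(scores, key=scores.get)
print(scores, "selected:", best)
```

The acyclic candidate is cheaper to state by one coefficient but pays far more in residuals, so the shorter total description selects the cyclic model. Nothing here required picking a single dependent variable: the whole trajectory is scored at once.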
By “long ago” I mean at least as far back as when John Tukey’s student, Charlie Smith, found DAGs inadequate to model the energy economy of the US while founding the DoE’s EIA, and thence departed to start the second neural network summer in the 1980s at the System Development Foundation, where he financed the guys who were trying to do dynamical systems modeling of nervous systems (i.e., DCGs/RNNs).