Saturday, February 7, 2026

No, AI does not have human-level intelligence

In an article at Nature, Eddy Keming Chen, Mikhail Belkin, Leon Bergen, and David Danks ask “Does AI already have human-level intelligence?” and claim that “the evidence is clear” that the answer is Yes.  (Though the article is partially pay-walled, a read-only PDF is available here.)  But as is typical with bold claims about AI, their arguments are underwhelming, riddled with begged questions and other fallacies.

Defining “intelligence”

Naturally, before we can establish that AI has genuine intelligence, we need to make clear what it would be for it to have intelligence, and how we could go about determining that it has it.  The first is a metaphysical question, the second an epistemological question.  Our authors make no serious attempt to answer either one.

Explaining what they mean by “general intelligence,” they write: “A common informal definition of general intelligence, and the starting point of our discussions, is a system that can do almost all cognitive tasks that a human can do.”  Expanding on this, they say:

General intelligence is about having sufficient breadth and depth of cognitive abilities, with ‘sufficient’ anchored by paradigm cases.  Breadth means abilities across multiple domains – mathematics, language, science, practical reasoning, creative tasks – in contrast to ‘narrow’ intelligences, such as a calculator or a chess-playing program. Depth means strong performance within those domains, not merely superficial engagement.

There are three serious problems here.  The first is that it is not illuminating to say that general intelligence entails the ability to carry out cognitive tasks unless our authors have already given us (as they have not) an explanation of what they mean by “cognitive.”  Now on one common usage, “cognitive” tasks are activities of the kind that require intelligence.  So, if that is what they mean, then our authors’ definition is circular – they are defining intelligence in terms of cognition, where cognition is (implicitly) defined in terms of intelligence.  And as anyone who has taken a logic course knows, one of the fundamental rules of a good definition is that it should not be circular.

Second, another basic rule for definitions familiar from logic is that a good definition should not be too broad.  For example, if I defined “gold” as a yellowish metal, this would violate the rule, because it would fail to make clear what distinguishes gold from pyrite.  Now, our authors violate this rule as well, because their characterization of intelligence is not precise enough to distinguish genuine intelligence from mere mimicry.  They matter-of-factly refer to what calculators and chess-playing programs do as instances of “intelligence.”  But it is a commonplace among critics of AI that what calculators and chess-playing programs do is mere mimicry and not genuine intelligence at all, not even of the “narrow” kind.

Of course, our authors would no doubt disagree with the critics.  But the point is that a good definition of intelligence should provide us with an independent guide for determining who is right.  It should make it clear what it would be to have genuine intelligence as opposed to mere mimicry, so that we could then go on to establish on that basis whether or not calculators and chess-playing machines actually have it.  Imagine a gold miner trying to prove that what he has dug up really is gold rather than pyrite by saying “I define ‘gold’ as ‘a yellow metal.’  So, the evidence is clear that this is gold!”  Obviously, even if what he has really is gold, he cannot establish that it is with that particular definition, because it is so broad that even a mere simulation of gold would meet it.  Similarly, our authors cannot claim to have established that AI has genuine intelligence when they are working from a definition so broad that even a simulation would meet it.  Even if they were right, their argument could not show that they are right, because given their definition of “intelligence,” it simply begs the question.

A third problem with our authors’ definition is that it violates yet another standard rule for definitions, which is that they should capture the essence of the thing defined.  Part of what this involves is leaving out of a definition any reference to features that the thing defined needn’t actually possess.  For example, it would be a mistake to define “gold” as a metal used in making jewelry, because even though that is true of gold, it is not essential to gold that it be used in making jewelry.  Gold would still have the same nature it has even if human beings had never decided to make jewelry out of it. 

But another part of what this rule involves – and the part relevant to our concerns here – is sensitivity to the fact that even features that always exist in things of a certain type are not necessarily part of the essence of the thing.  They may instead flow from its essence as “proper accidents” (to use the traditional Scholastic jargon).  For example, water is clear and liquid at room temperature, but this isn’t plausibly the essence of water.  Rather, these features follow from water’s having the essence it has, which is (either in whole or in part, depending on different views about essence I won’t adjudicate here) a matter of having the chemical composition H2O.  And things with a different essence from water might have these same features (as heavy water does, for example).  (I discuss the distinction between essence and proper accidents in my book Scholastic Metaphysics, at pp. 230-35.)

Now, our authors violate the rule that a definition should capture a thing’s essence, when they define intelligence in terms of a capacity for tasks involving “mathematics, language, science” and the like.  Capacities for mathematics, language, and science certainly follow from our having intelligence, but they are not themselves the essence of intelligence.  (That is why they don’t always manifest – though they “flow” from our having intelligence, the flow can be “blocked,” as it were, by immaturity, brain damage, or what have you.) 

What would be the essence of intelligence?  I would say that it has to do with the interconnected capacities for forming abstract concepts, putting them together into propositions, and reasoning logically from one proposition to another.  (See chapter 3 of my book Immortal Souls: A Treatise on Human Nature for a detailed exposition and defense of this traditional conception of intelligence.)  It is because we have intelligence in this sense that we are capable of mathematics, language, science, and so on.  Those capacities flow or follow from intelligence in this sense.

Of course, our authors may well disagree with this account.  But the point for present purposes is that their attempted definition doesn’t reflect any awareness of the need to distinguish essence from proper accidents, and to define intelligence in terms of the former rather than the latter.  As in the other ways noted, their account is simply conceptually sloppy.

This sloppiness manifests itself also in what they say intelligence does not involve.  They write:

Intelligence is a functional property that can be realized in different substrates – a point Turing embraced in 1950 by setting aside human biology. Systems demonstrating general intelligence need not replicate human cognitive architecture or understand human cultural references.

This is not wrong, but, without saying more, it is also not very helpful.  Suppose I suggested to you that a stone is intelligent and you replied that that seems like an absurd claim given that stones can't speak or reason logically, lack basic knowledge, and so on.  And suppose I responded: “True, but remember that systems demonstrating general intelligence need not replicate human cognitive architecture or understand human cultural references.”  Presumably you would not be impressed with this response. 

The problem, obviously, is that while genuine intelligence need not always look exactly the way it does in us, it nevertheless is not true that just anything goes.  We need some criteria for determining when something departs too far from how intelligence manifests in us to count as genuinely intelligent.  And our authors offer no such criteria.  They simply assume that, wherever we draw this line, AI will fall on the “genuine intelligence” side of it.  But since they merely assume this rather than argue for it, their position once again begs the question.

Detecting intelligence

Having failed to provide a serious definition of intelligence, our authors unsurprisingly also fail to provide a serious account of how to go about detecting it (since the latter task presupposes the former).  There is a lot of hand-waving about “a cascade of evidence,” and gee-whiz references to what LLMs can do.  But all of this ultimately boils down to nothing more than a stale appeal to the Turing test.  And the problem with the Turing test is that, of its very nature, it cannot distinguish genuine intelligence from a mere clever simulation.  Indeed, it deliberately ignores the difference and focuses narrowly on the question of what would lead us to judge a machine to be intelligent, rather than the question of what would make it the case that a machine actually is intelligent. 

Since it is the latter question that is at issue here, the Turing test is simply irrelevant.  (And as I have argued elsewhere, to make it relevant, the defender of the view that AI is genuinely intelligent will have to appeal to either verificationism or scientism, and the resulting position will be either self-defeating or question-begging.)

But the problem with our authors’ argument is worse than that.  It’s not just that they haven’t shown that AI has genuine intelligence.  It’s that we already know that it does not have it, and that it amounts to nothing more than a simulation of intelligence.  And we know that because mere mimicry is precisely all that computer architectures are designed to do.

Here’s an analogy (which I have developed in more detail elsewhere).  The methods employed by entertainers such as David Copperfield, David Blaine, and Penn and Teller are designed to produce effects that merely simulate magic.  And no matter how well they work, we know that mere simulation is all they can achieve, because the means they use are in no way preternatural but entirely mundane – sleight of hand, illusions, and so on.  Of course, genuine magic is not real in the first place, but that is irrelevant.  The point is that even if it were real, it would not be what Copperfield, Blaine, and Penn and Teller are doing, precisely because their methods are not of the type that could produce more than mimicry.

Now, AI operates on an analogous principle.  It is designed to produce effects that simulate intelligence, by means that don’t require any actual intelligence on the part of the machines themselves.  And this is as true of machine learning and related approaches that now dominate AI research as it was of the Turing machine model that dominated the earlier history of AI.  To borrow some terminology from John Searle, AI algorithms of whatever kind are sensitive only to the “syntax” of the representations they process rather than their “semantics.”  That is to say, they process representations in a way that is sensitive only to their physical properties rather than the meanings that we associate with those physical properties.

Because there is a general correlation between syntax and semantics – for example, the word “love” on a printed page is typically going to be used to express the concept LOVE – the output of a sophisticated algorithm can be made to simulate the sort of thing a human being might write.  The correlation is not perfect.  For instance, there might be cases where the string of letters “love” is not actually being used to express the concept LOVE, and there might be cases where the concept LOVE is being conveyed but without using the string of letters “love.”  This is why AI programs can often be “tripped up” and reveal, through a failure to reflect such nuances, that they don’t actually understand what they are “saying.”  Exposure to further data or refinements to an algorithm might work around such problems.  But all this yields is a more convincing simulation (just as Penn and Teller and company may come up with new ways of producing ever more convincing illusions).  It doesn’t somehow generate a grasp of meaning or semantics on the part of the machine, because its basic structure is in no way sensitive to that in the first place.
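As a crude illustration of how that correlation can break down (again a toy example of my own, far simpler than any real system), consider a string-matching “sentiment” rule that treats every occurrence of the letters “love” as expressing the concept LOVE:

# Naive rule: a sentence is "positive" if it contains a string from this set.
# The test operates on the letters alone, not on what the sentence means.
POSITIVE_STRINGS = {"love", "wonderful", "delighted"}

def naive_sentiment(sentence: str) -> str:
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    return "positive" if any(w in POSITIVE_STRINGS for w in words) else "neutral"

print(naive_sentiment("I love my dog."))             # "positive" (correct, by luck)
print(naive_sentiment("The score was forty love."))  # "positive" (tripped up: in tennis,
                                                     #  "love" just means a score of zero)
print(naive_sentiment("I adore my dog."))            # "neutral"  (misses the concept when
                                                     #  the usual string is absent)

Adding more strings or more rules can patch particular failures, but that only tightens the correlation between the program’s outputs and what a human speaker would say.  It does not give the program access to the meanings.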

Our authors speak as if the dispute over whether AI models are genuinely intelligent or merely simulating intelligence is a matter of “inference to the best explanation” – as if their position and that of the AI skeptic are alternative hypotheses meant to account for the same evidence, and the controversy has to do with which view is more empirically adequate, more parsimonious, and so forth. 

But this is as silly as suggesting that the view that Penn and Teller are merely simulating magic rather than producing the real thing is being proposed as the “best explanation” of the “evidence.”  It is, of course, nothing of the kind.  It is just a simple and straightforward conceptual point about the nature of their methods.  And to note that AI produces only a simulation of intelligence rather than the real McCoy is an equally simple and straightforward conceptual point about the nature of its methods.

Interested readers will find a more detailed and academic treatment of these issues in chapter 9 of Immortal Souls.

Related posts:

Computer pseudoscience

Artificial intelligence and magical thinking

Accept no imitations [on the Turing test]

Kripke contra computationalism

Do machines compute functions?

Can machines beg the question?
