The Problem of Induction, and Does Science Have Presuppositions?

In a recent post I addressed the claim that science needs a philosophical or metaphysical grounding. Today I’d like to address the related subject of the “problem of induction”, and the following claims that are sometimes based on it:
– Science presupposes that “nature is uniform” or that “the future will resemble the past”.
– Science needs faith.

I’ll assume that you are already somewhat familiar with the problem of induction. If not, here are links to Wikipedia and SEP articles on the subject. For present purposes, I’ll take the conclusion of the argument to be as follows:

(1) We cannot give a justification for our most basic inductive practices.

Throughout this post, what I will mean by a “justification” is a justificatory argument. I limit the conclusion to our most basic inductive practices, because there is nothing circular about using more basic inductive practices to justify more advanced ones. For example, there is nothing wrong, necessarily, with the justificatory arguments that have been made in support of particular statistical methods, including Bayesian inference. (If there happens to be an error in some such justifications, it is not usually an error of circular reasoning.)

My response to (1) is to say that we don’t need any such justificatory argument. Why would anyone think that we do? The idea that we need such a justification, and that it’s a “problem” that we can’t have one, seems to arise because we’ve persuaded ourselves that we need justifications for everything we believe. But who persuaded us of that? If someone claimed, “We need a justification for every proposition we accept”, we should point out that that proposition seems self-defeating, as any justification of it will have premises that require further justifications, and so on ad infinitum. In any case, self-defeating or not, why should we accept it? If you ask me to accept the principle on the basis of an induction (e.g. “it’s always been useful to insist on justifications before”), then it’s self-defeating to use that principle to cast doubt on induction. If you accept that all our reasoning is rooted in induction (as the problem of induction seems to do), then it’s self-defeating to engage in any reasoning that casts doubt on induction. Just as it’s circular to argue for induction, it’s self-defeating to argue against it.

How did we get ourselves into this pickle? We’ve developed a habit of demanding and offering justifications, and that habit has become so inculcated that we’ve come to see justifications as an absolute requirement for appropriate belief (even a “metaphysical” requirement), when in fact they are just a tool that’s sometimes useful. After all, animals have appropriate beliefs, based on the evidence of their senses, despite never giving themselves any justifications. We humans too, throughout most of our waking hours, are forming appropriate beliefs about our immediate environment, without giving ourselves any justifications. Such beliefs are usually appropriate, because our automatic cognitive processes work reasonably well and give us reasonable beliefs based on the evidence of our senses. These processes mostly work automatically, without any discursive reasoning (verbal reasoning from proposition to proposition). And our basic inductive practices were themselves mostly acquired without discursive reasoning, through natural selection, experience and training. From birth we learn automatically from the school of hard knocks, and through natural selection we’ve learned from the hard knocks of our evolutionary ancestors.

Of course, we humans supplement our automatic inductive practices with discursively reasoned inductions. But note that even our discursive inductions make use of automatic induction: how else would we get from premises to a conclusion that isn’t deductively entailed? In any case, we tend to over-emphasise discursive reasoning, because that’s what we consciously observe ourselves using. We aren’t directly aware of the operation of our automatic non-conscious cognitive processes, and I suspect that until a couple of hundred years ago people had little inkling of the existence of such processes. Consequently, philosophers have traditionally tended to see discursive reasoning (or arguments) as the primary source of knowledge, and have attempted to root our knowledge in foundational arguments and foundational premises.

I claimed above that the practice of giving justifications is just a sometimes useful tool. Let me elaborate on that. One use of justifications, of course, is to persuade other people round to my way of thinking. Another is to make myself feel more satisfied about the beliefs I hold. Of more interest than those are what we might call truth-seeking uses, uses in which I genuinely attempt to improve the accuracy of my belief set, by critically scrutinising an existing belief. It can be useful to ask myself what justification I can give for a belief I hold, but I shouldn’t take the inability to find one as a decisive reason to reject the belief. Instead I should ask whether I have better reasons to reject a belief than to hold it. Being able to give a good justification is some reason for accepting or continuing to hold the belief, but it’s not necessarily decisive, since not all justifications are deductive proofs. (And even in the case of a mathematical proof, I could have made a mistake.) If I have no reason either to accept or to reject, then I have no reason to change my belief. And I have no reason to reject my basic inductive practices. I cannot argue against those practices in any way that does not depend on those practices. It’s just as self-defeating to doubt basic induction as it is circular to justify it.

My response to the problem of induction has similarities to Hume’s. According to the SEP:

Hume’s argument is often credited with raising the problem of induction in its modern form. For Hume himself the conclusion of the argument is not so much a problem as a principle of his account of induction: Inductive inference is not and could not be reasoning, either deductive or probabilistic, from premises to conclusion, so we must look elsewhere to understand it. Hume’s positive account does much to alleviate the epistemological problem—how to distinguish good inductions from bad ones—without treating the metaphysical problem. His account is based on the principle that inductive inference is the work of association which forms a “habit of the mind” to anticipate the consequence, or effect, upon witnessing the premise, or cause. He provides illuminating examples of such inferential habits in sections I.III.XI and I.III.XII of the Treatise (THN). The latter accounts for frequency-to-probability inferences in a comprehensive way. It shows that and how inductive inference is “a kind of cause, of which truth is the natural effect.”

I agree with this passage, except for the phrase “without treating the metaphysical problem”. First, the author seems to imply–on his own account–that there actually is a metaphysical problem. I deny that. What metaphysical problem? The idea that there’s a metaphysical problem seems to arise from the misguided ways of thinking that I’ve addressed above. Second, the author could be taken as implying that Hume believed there remained a metaphysical problem to be addressed. That may be so, though I’m not convinced. To be sure, Hume described his own solution to the problem of induction as a “skeptical solution”, but it’s not clear to me why he used the word “skeptical”. Presumably it indicates some sort of dissatisfaction with his solution. But what sort?

My statement (1) of the problem of induction referred to inductive practices. But sometimes the problem is cast in terms of beliefs, stated as propositions, on the grounds that induction involves reasoning from some such premise as “nature is uniform” or “the future will resemble the past”. In that case the problem may be stated in one of these forms:

(2) We cannot give a justification for our belief that nature is uniform.

(3) We cannot give a justification for our belief that the future will resemble the past.

As you might guess from what I’ve written above, my response to this is that we don’t need to reason from any such premise. Induction is not rooted in discursive reasoning, and in fact we rarely if ever observe anyone arguing from such a premise.

Do we in fact have such beliefs, even if we don’t use them as premises? I think what we (and animals) have is a tendency to learn from past experience by the basic inductive practices that I’ve discussed above. At a stretch, we could describe that state of affairs as our having a belief that nature is somewhat uniform, or a belief that the future will somewhat resemble the past, as long as we don’t take this to mean that we have some such proposition stored in our heads. (Do you think animals have those propositions stored in their heads?)

This brings me on to the claim that science presupposes that “nature is uniform” or that “the future will resemble the past”. The trouble is that such language is ambiguous. If we take it to mean that scientists must adopt such propositions as premises in their reasoning, then I reject that claim, for the same reasons I’ve given already. The inductive processes that scientists employ are not rooted in discursive reasoning, any more than are the inductive processes of the rest of us, or of animals. That’s why we don’t see scientists reasoning from that sort of foundational premise. But we could take the claim to mean only that scientists must proceed as if nature is (somewhat) regular, or as if the future will (somewhat) resemble the past. The claim is reasonable provided we interpret it that way, and accept that scientists proceed in that way because they have the inductive habits that they have, not because they are reasoning from a foundational premise.

Does any of this mean that scientists “need faith” to do science? Not really. Since we cannot sensibly question our most basic inductive habits, it makes little sense to say that we must have faith in them. We simply use them because we can do nothing else. We couldn’t stop using them even if we tried. But let me set aside the irrelevant problem of induction, and make a more general point. The expression “taking something on faith” is a rather fuzzy one, which is best seen as a matter of degree. We use it when we think someone has good reasons to subject their beliefs to skeptical scrutiny but fails to do so, or to do so sufficiently. In this matter of degree, science is just about as far towards the not-faith end of the spectrum as it’s possible to get. (Perhaps mathematics is even further.) Sure, individual scientists are fallible human beings, who sometimes fail to subject their scientific claims to sufficient skeptical scrutiny. But in general a very high priority is given to skeptical scrutiny in science, including scrutiny by testing against evidence. Contrast this with, say, religion, which lies far towards the opposite end of the spectrum, and where skeptical scrutiny of beliefs is often positively discouraged. Some believers do make some attempt to subject their religious beliefs to skeptical scrutiny. Again, I don’t want to make this an absolute distinction; it’s a matter of degree, and varies from person to person. But it’s also important to distinguish between the giving of justifications for one’s beliefs and skeptical scrutiny of those beliefs. Religious apologists may do plenty of the former while doing little of the latter. Indeed, that’s the normal human tendency. But the practices of science are aimed at minimising such truth-unfriendly tendencies. The practices of religions are not.

Why science doesn’t need a philosophical or metaphysical grounding

Here’s a comment I recently posted at Jason Rosenhouse’s blog. I think it’s useful enough to be worth reposting here.


I’ve just been reading this interview with Feser: http://www.strangenotions.com/scholasticism-vs-scientism-an-interview-with-dr-edward-feser/

It clearly shows the fundamental problem with his approach to philosophy: he thinks we can start from metaphysics.

“The trouble is that this gets things precisely backwards.”

Well, at least he and I agree that someone is getting things backwards. We just disagree on who that is.

The best we can do is start with what we know most securely, and proceed by the methods that have shown themselves to be most effective. We humans started with the knowledge that arose from the unreflective use of our evolution-given cognitive faculties, primarily knowledge of our immediate environment. Over time we gradually started to develop more reflective methods, but those always built on what we knew most securely. We adopted new methods because we found they worked. Over time our epistemic methods improved, building on what worked, until we arrived where we are now. We have good reason to trust modern science (on the whole) because it has proven so successful. Science doesn’t need any philosophical underpinning, which is just as well since philosophy hasn’t provided one.

The idea that we need some ultimate grounding or underpinning for our knowledge is hopeless. That misguided way of thinking leads to the problems of infinite regress and induction. It arises from an over-emphasis on arguments, or reasoning from premises to conclusion. Argumentative reasoning is a useful tool, but it’s not the primary basis of knowledge. Our knowledge starts with the working of our automatic (non-conscious) cognitive faculties, which work as well as they do thanks to natural selection (another sort of trial and error, building on what works best). Even when we engage in reasoning, our justifications must end somewhere (on pain of infinite regress), and where justification ends we must rely on our automatic cognitive faculties to work properly. In other words, we must rely on judgements which are not further justified. This should be particularly obvious in the case of non-deductive inference (which is most of our inference), where the premises are not sufficient to deliver the conclusion.

To say that we must rely on our cognitive faculties working successfully is not to deny that we should do our best to improve them, and do our best to critically scrutinise our existing beliefs. But those processes also involve using our cognitive faculties. In the end, if our cognitive faculties fail us, that’s just tough. Nothing can give me a guarantee that my faculties are working properly. (And I don’t mean “properly” in some absolute, ideal sense. I just mean working well enough.)

I think the traditional, scholastic way of philosophical thinking has appealed to philosophers primarily because argumentative reasoning is the epistemic method that we observe ourselves using. We aren’t directly aware of the operation of our non-conscious cognitive processes, and until a couple of hundred years ago we knew nothing of such processes. I guess before that our intuitive judgements must have seemed to arise from nowhere, or perhaps from a dualistic soul. Now we know better, or at least those of us who take the lessons of science seriously.

Contrast this reasonable, scientifically-informed account of knowledge building on success with Feser’s claim that metaphysics is “prior” to all other knowledge. What reason do we have to take metaphysical thinking seriously? Where are its demonstrated successes? How can a metaphysical account avoid the problem of infinite regress, or starting from some premises that just seem intuitively obvious (in which case it isn’t prior to all other knowledge)?

It isn’t hard to see that metaphysics is neither necessary nor useful. That’s enough reason to ignore it. On top of that, however, careful examination can reveal more specifically the ways in which metaphysics goes wrong. I won’t elaborate here, except to say that Wittgenstein showed us the ways in which much of traditional philosophy (including metaphysics) is bewitched by language.

More Responses to Searle on Strong AI

In two earlier posts I addressed Searle’s two best-known arguments against Strong AI, the Chinese Room Argument and the Syntax-and-Semantics Argument. Here I’d like to address some additional arguments and assertions that he’s made.

1. Appeals to intuition about stuff

According to Strong AI, any computer executing the right sort of program would have a mind, regardless of the physical materials (or “substrate”) that the computer is made from. In principle this could include a non-electronic computer. Searle attempts to ridicule Strong AI by asking readers to imagine non-electronic computers constructed from unlikely materials, and then appealing to their intuition that a system made from those materials couldn’t have a mind.

First, the distinction between program and realization has the consequence that the same program could have all sorts of crazy realizations that had no form of intentionality. Weizenbaum (1976, Ch.2), for example, shows in detail how to construct a computer using a roll of toilet paper and a pile of small stones. Similarly, the Chinese story understanding program can be programmed into a sequence of water pipes, a set of wind machines, or a monolingual English speaker, none of which thereby acquires an understanding of Chinese. Stones, toilet paper, wind, and water pipes are the wrong kind of stuff to have intentionality in the first place–only something that has the same causal powers as brains can have intentionality–and though the English speaker has the right kind of stuff for intentionality you can easily see that he doesn’t get any extra intentionality by memorizing the program, since memorizing it won’t teach him Chinese. [“Minds, Brains and Programs”, 1980]

(Here Searle talks about “intentionality”, but I’ll stick to the more familiar terms “mind” and “consciousness”, which he uses elsewhere. The “Chinese story understanding program” he refers to is a contemporary AI program written by Roger Schank, but I doubt anyone would claim that that program was conscious, and we should think instead about a hypothetical human-level AI program.)

Searle’s argument about the English speaker is a rehash of the CRA, which I’ve addressed in a previous post, so I’ll ignore that example. Apart from that, all he has is an appeal to the intuition that the other systems are not made of the right kind of stuff for mind. Not only is this just an appeal to intuition, but Searle is biasing readers’ intuitions by presenting his examples in a misleading way. First, he fails to explain even minimally how a computer could be constructed from such materials, and second, he gives no idea of the issues of scale that are involved.

It’s very hard to imagine how a computer could realistically be constructed from a roll of toilet paper and stones. In lieu of an explanation, Searle refers the reader to Weizenbaum. But, on reading Weizenbaum for myself, I find that no such computer is described. Searle has misinterpreted the text. What Weizenbaum presents is a pretty standard explanation of a Turing machine, of the sort that has a paper tape on which binary symbols are marked and erased. All Weizenbaum has done is replace the paper tape with toilet paper, and the symbols with black and white pebbles. This does not describe an entire computer, because a mechanism is still needed to move the pebbles around in accordance with the transformational rules. In Weizenbaum’s account, this role is played by a human being. In effect, Searle has left out the computer’s processor and mentioned only its memory. Without a processor nothing will happen. (I haven’t seen Searle repeat this example in subsequent writing, so perhaps he discovered his error shortly after the publication of this paper.)
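To make the missing-processor point concrete, here is a minimal sketch of a Turing machine, written in Python purely as my own illustration (it isn’t Weizenbaum’s presentation). The dictionary plays the part of the toilet-paper tape with pebbles; the rules table and the run() loop play the part of the rule-following mechanism that Searle’s description leaves out. Take away the loop and nothing happens.

# A minimal Turing machine, sketched in Python. The dictionary below plays
# the part of Weizenbaum's tape (pebbles on toilet paper); the rules table
# and the run() loop play the part of the rule-following mechanism that
# Searle's description leaves out. This toy machine just appends a 1 to a
# block of 1s (a unary increment).

rules = {
    # (state, symbol) -> (symbol to write, head movement, next state)
    ("scan", 1): (1, +1, "scan"),   # keep moving right over the 1s
    ("scan", 0): (1, 0, "halt"),    # at the first blank, write a 1 and stop
}

def run(tape, head=0, state="scan"):
    # Without this loop (the "processor"), the tape is an inert pile of pebbles.
    while state != "halt":
        symbol = tape.get(head, 0)
        new_symbol, move, state = rules[(state, symbol)]
        tape[head] = new_symbol
        head += move
    return tape

tape = {0: 1, 1: 1, 2: 1}          # unary "3"
print(sorted(run(tape).items()))   # [(0, 1), (1, 1), (2, 1), (3, 1)]: unary "4"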

Searle’s “sequence of water pipes” is more plausible, though he’s failed to mention anything about the arrangement of the pipes, or the valves that would be required at the junctions. I don’t know much about electronics, but I guess a junction of pipes could be fitted with a valve of some sort and made to control flows of water in some way that is analogous to the operation of a transistor. A computer could then be built, in principle, with the same logical design as a digital computer, replacing electric circuits with water pipes, and transistors with valve-equipped junctions. Of course, this would be quite impractical on any significant scale, and a computer capable of executing a human-level AI would have to be vast in size, perhaps larger than a cubic mile. It would also be many times slower than a human brain.
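For what it’s worth, here is a toy sketch of my own (nothing like it appears in Searle) of what “the same logical design” means: a half-adder specified purely in terms of NAND gates. Nothing in this description cares whether each NAND is realised by a transistor or by a valve-equipped junction of water pipes; only the pattern of causal connections matters.

# A half-adder specified purely in terms of NAND gates. Nothing in this
# description cares whether each NAND is realised by transistors or by a
# valve-equipped junction of water pipes; only the pattern of connections
# (the logical design) matters.

def nand(a, b):
    return 0 if (a and b) else 1

def half_adder(a, b):
    # Standard construction of XOR (the sum bit) and AND (the carry bit) from NANDs.
    n1 = nand(a, b)
    sum_bit = nand(nand(a, n1), nand(b, n1))   # a XOR b
    carry = nand(n1, n1)                       # a AND b
    return sum_bit, carry

for a in (0, 1):
    for b in (0, 1):
        print((a, b), half_adder(a, b))
# (0, 0) (0, 0)   (0, 1) (1, 0)   (1, 0) (1, 0)   (1, 1) (0, 1)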

What does Searle gain by appealing to our intuitions about flows of water through pipes and valves, instead of to our intuitions about flows of electricity through metal circuits and transistors? I suspect the main reason for his switch from electricity to water is that we’re familiar with electronic computers, and impressed by their abilities. We haven’t seen equally impressive water-based computers, and Searle describes them in simplistic terms that make the idea seem very unimpressive indeed. Searle is appealing to our intuitions based on our ordinary experience of water pipes, doing nothing much more than carrying water.

Whatever system we look at, if we zoom in on the very small details, we don’t see anything that looks like it could be conscious. That doesn’t just apply to electronic and water-based computers. It applies to brains too. Individual neurons don’t look conscious, and if we looked at even lower levels of abstraction, e.g. at biomolecules, those would look even less conscious. In that case too, we might wonder, “Where’s the consciousness?” If we follow such intuitions consistently, it seems that no mere “matter in motion” could be sufficient for consciousness. Searle is turning a blind eye to such problems in the case of brains, because he knows that he himself is conscious. Since he’s already convinced that what makes a system conscious is being made of the right kind of stuff, it follows that brain stuff is one of the right kinds. So he’s appealing to two intuitions, not just one: (1) it’s the kind of stuff that matters; (2) the stuff of water-based computers is not the right kind. And since intuition (1) directly contradicts Strong AI, this amounts to little more than saying that my intuition tells me you’re wrong.

2. Simulation is not duplication

Supporters of Strong AI say that a sufficiently detailed simulation of a human brain would have a mind, in the full sense that the simulated brain has a mind. Searle has responded to this claim by asserting that “simulation is not duplication”, but it’s unclear just why he thinks this truism is relevant. He has also made such remarks as the following:

No one supposes that computer simulations of a five-alarm fire will burn the neighborhood down or that a computer simulation of a rainstorm will leave us all drenched. Why on earth would anyone suppose that a computer simulation of understanding actually understood anything? [“Minds, Brains and Programs”, 1980]

There seems to be an implied argument here, but it’s not quite clear what it is. I can think of two possible interpretations.

(A) It can be interpreted as a refutation of an argument that is implicitly being attributed to Strong AI supporters: simulation is duplication; therefore a simulation duplicates all the properties of the simulated system, including mental properties. On this interpretation, I say that Searle is refuting a straw man. Strong AI supporters are not making any such argument. No one thinks that all the properties of a simulated system are present in the simulation. That would indeed be just as absurd as Searle suggests.

(B) It can be interpreted as an argument against Strong AI: based on some examples of properties that are not duplicated in simulations (burning and wetness), we are apparently to infer by induction that no other properties are duplicated in simulations. But significantly all the properties Searle mentions are physical properties. He fails to mention properties that might be called functional, computational or informational, which are duplicated by an appropriate simulation. Consider a calculator (electronic or mechanical) and an appropriate computer simulation of that calculator. Let’s say, for simplicity, that the calculator and the simulation are running in parallel, and taking the same inputs. When the calculator has a certain number stored in its memory, so does the simulation. If the calculator is adding, so is the simulation. And so on. Strong AI says that consciousness is broadly this sort of property, and not a physical property. For Searle to assume the contrary is begging the question.
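Here is a schematic illustration of my own (deliberately simplified) of that distinction. At the functional level of description, the calculator and its simulation pass through the same states, even though they share no physical properties.

# A sketch of the functional-property point. A mechanical calculator and a
# program simulating it share functional states (what total is stored) even
# though they share no physical properties (gears, heat, weight).

class MechanicalCalculator:
    # Stands in for the physical device: it has physical properties
    # (here, a temperature) as well as functional ones (the stored total).
    def __init__(self):
        self.total = 0
        self.temperature_c = 21.5   # a physical property, for illustration
    def add(self, n):
        self.total += n
        self.temperature_c += 0.01  # the gears generate a little heat

class SimulatedCalculator:
    # A simulation of the device: it duplicates the functional states
    # (the stored total) but has no temperature of its own.
    def __init__(self):
        self.total = 0
    def add(self, n):
        self.total += n

real, sim = MechanicalCalculator(), SimulatedCalculator()
for n in (2, 3, 7):          # same inputs to both
    real.add(n)
    sim.add(n)

print(real.total == sim.total)        # True: the functional property is duplicated
print(hasattr(sim, "temperature_c"))  # False: the physical property is not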

So it seems that Searle is either refuting a straw man or begging the question. But, without a clear statement of his argument, it’s unclear which response to make. I think Searle’s lack of clarity contributes to his own confusion.

3. Syntax is not intrinsic to physics

Worse yet, syntax is not intrinsic to physics. The ascription of syntactical properties is always relative to an agent or observer who treats certain physical phenomena as syntactical. [“Is the Brain a Digital Computer?”, 1990]

I’ve already addressed this issue in the appendices of my post on the Syntax-and-Semantics Argument. I explained there why the ways that we can sensibly interpret the states of a computer are constrained by the causal nature of the system. They cannot sensibly be interpreted however an observer wishes. In summary:

Searle is looking at the memory states in isolation, instead of taking them in the context of the processor [and program]. It’s that context that gives the states their meaning, and makes them symbols, not just physical states.

In the same paper, Searle writes:

Cognitivists cheerfully concede that the higher levels of computation, e.g. “multiply 6 times 8” are observer relative; there is nothing really there that corresponds directly to multiplication; it is all in the eye of the homunculus/beholder. But they want to stop this concession at the lower levels. [“Is the Brain a Digital Computer?”, 1990]

I doubt that Strong AI supporters generally make such a concession. I certainly don’t.

At times in the paper it’s unclear (as in #2 above) whether Searle is making an argument against the Strong AI position or responding to an argument for Strong AI which he supposes its supporters to be making. When he seems to be doing the latter, the implied arguments seem to be simplistic straw men. Given that he believes he can refute Strong AI by means of trivial yet decisive arguments (like the CRA and SSA), it’s perhaps not surprising that he attributes such trivial arguments to supporters of Strong AI too. Of course, it’s quite possible that some supporters of Strong AI actually have made such simplistic arguments, but I haven’t seen them from leading supporters, like Daniel Dennett. For the most part I think these simplistic arguments are the products of Searle’s misunderstandings of Strong AI.

4. The Turing test is behaviouristic

The Turing test is typical of the tradition in being unashamedly behavioristic and operationalistic [“Minds, Brains and Programs”, 1980]

We don’t need to adopt any sort of behaviourism in taking the Turing test as a test for mind. We only need to reject epiphenomenalism. In the philosophy of mind, epiphenomenalism is the view that the presence of mind has no effect on physical events. Searle seems to hold the closely related view that the presence of mind has no effect on external behaviour, since he thinks that a computer AI could have just the same behaviour as a brain, despite having no mind. If he were right, it would naturally follow that tests of behaviour cannot provide evidence of the presence of mind. If, on the other hand, the presence of mind does affect behaviour, then behaviour can provide evidence of the presence of mind, at least in principle.

The question of epiphenomenalism is a long-standing one in philosophy of mind, and I don’t propose to address it here. I just wanted to make the point that Searle is mistaking a rejection of epiphenomenalism for an acceptance of some sort of behaviourism. Needless to say, Searle and Strong AI supporters have very different views of the nature of mind (or consciousness). Unfortunately Searle has some unhelpful ways of characterising these differences.

5. Strong AI is dualistic

…this residual operationalism is joined to a residual form of dualism; indeed strong AI only makes sense given the dualistic assumption that, where the mind is concerned, the brain doesn’t matter. [“Minds, Brains and Programs”, 1980]

Strong AI is not dualistic at all. Searle appears either to have misunderstood the view, or to have confused himself by casting it in ambiguous language. Carefully put, what Strong AI says is this: there exists a set of possible programs such that any system that instantiates one of these programs has a mind (or more specifically, a mind with certain mental states). This is no more dualistic than saying: there exists a set of possible programs such that any system that instantiates one of these programs is a word-processor (or more specifically, is running Microsoft Word).

Because of the way that we talk about hardware and software, and because a computer’s software can be so easily changed, it may be tempting to see software as non-physical. But two otherwise identical computers with different programs in RAM are in different physical states, and it’s those different physical states that cause the computers to behave differently. In that sense, instantiated software is physical. So, when we say that what matters is the program, we are not denying that physical, causal states matter. (Remember that the program must be considered in the context of a particular processor. A different processor might execute the program differently.)

6. Strong AI pays no attention to causal powers

On the Strong AI view, it’s precisely the causal powers of a particular computer, running particular software, that would make it the right sort of system for having a mind. But those causal powers are not restricted to any particular types of materials, except insofar as the materials must be suitable for implementing a computational system with the right sort of algorithm.

The Moral Landscape Challenge

Now that I’ve got a place to post things, I may as well make the most of it. Here’s something I wrote a while back but didn’t use. It was a response to Sam Harris’s Moral Landscape Challenge. Unfortunately he didn’t accept submissions until six months after announcing the Challenge, and when the time to submit entries came around, I missed it. Pity, because I thought my argument was quite compelling, and I can’t help feeling I might have had a modest chance of winning.

This piece follows the same kind of plan as my refutations of John Searle’s arguments: home in on the core error, and address that as clearly as possible, trying to ignore any distractions. It probably helped that entries were limited to 1000 words.


Hello Sam,

I intend to refute just your argument from worst possible misery. I’ll first make my refutation in a brief form, and then in more detail.

(P1) The worst possible misery for everyone is bad.

You claim that P1 is so tautological as to need no further argument. But the word “bad” is ambiguous–it can be taken in both moral and non-moral senses. P1 is only tautological if we take it in a non-moral sense, meaning “bad for people”, not “morally bad”.

(P1a) Producing the worst possible misery for everyone is morally bad (or morally wrong).

When P1 is disambiguated in such a way that we’re forced to take it in a moral sense (=P1a), it no longer appears tautological. It is now a moral judgement, not a tautology.

If we take P1 in a non-moral sense then you’ve failed to argue for any moral fact. If we take P1 in a moral sense (=P1a) then it’s not the tautology you claimed it to be, and you haven’t given us any other reason to accept it. By conflating the two senses you’ve created the false impression of having found a moral fact that’s as undeniable as a tautology. This is a fallacy of equivocation.

Now for the detailed version. The logic of your argument is not spelled out clearly, but it seems to be little more than an appeal to intuition: here’s a moral claim that is so self-evidently true that surely no reasonable person can deny it; therefore at least one moral claim is true.

At first glance, this doesn’t seem like a very constructive way to proceed. Those philosophers who deny objective moral truth (or all moral truth) think they have good general reasons for denying all such truths. And one can hardly expect them to abandon such reasons on the basis of no more than an appeal to intuition about one case.

The appeal makes a little more sense if it’s seen as an appeal to our linguistic intuition, our instinctive competence with language. After all, we have to assume some shared competence with language, and specifically with moral language. In fact, the following footnote (no. 22) does suggest that you are appealing to our linguistic intuition:

“And I don’t think we can intelligibly ask questions like, ‘What if the worst possible misery for everyone is actually good?’ Such questions seem analytically confused. We can also pose questions like, ‘What if the most perfect circle is also a square?’ or ‘What if all true statements are actually false?’ But if someone persists in speaking this way, I see no obligation to take his views seriously.”

I see analytical facts as those which follow from the meanings of the words alone. For a circle to be a square is inconsistent with the meanings of those words, regardless of any contingent facts about reality.

(P1) The worst possible misery for everyone is bad.

I take you to be claiming not just that P1 is analytic, but that (like your examples) it’s a tautology, by which I mean a fact that follows so directly from the meanings of the words that one can hardly insert any significant intermediate logical steps. In these terms, mathematical theorems are analytic but not tautological, since they follow by proofs from more basic mathematical facts or axioms. If you were claiming P1 to be analytic but not tautological, then we should expect you to have provided some connecting steps.

Though appeals to linguistic intuition have their place, it’s implausible that such a simplistic and unexplained appeal will suffice in the case of moral language, given that the meaning of moral language is so disputed among philosophers. So we should be very suspicious of this argument. In fact, you’ve just created the false appearance of an effective argument by committing a fallacy of equivocation.

P1 is (near enough) a tautology if we interpret “bad” as “bad for people”. You define “bad” to mean “reduces well-being”, and that’s a reasonable definition for this non-moral sense of “bad”. But this sense of “bad” doesn’t mean the same as “morally bad”. The word “bad” clearly has non-moral senses, since we can say that a hammer is a bad one without making any moral judgement–we just mean that it’s not fit for purpose. To say that smoking is bad for people (or reduces their well-being) can be just a neutral statement about the likely outcome of the behaviour. It doesn’t in itself tell us that smoking is inappropriate, though that might be a further conclusion we would draw. On the other hand, to say that genocide is morally bad (or morally wrong) is in itself to say that genocide is inappropriate.

As long as you continue to use “bad” in the non-moral sense, you haven’t established anything that helps support your case for moral truth. But even though you didn’t state the conclusion of this argument unequivocally, it’s clear that it was intended to support that case. So at some point you must have switched from talking about non-moral good/bad to talking about moral good/bad. At no point did you do this explicitly. If you had done, you would have seen the need to justify the move from non-moral good/bad to moral good/bad. Instead, you appear to have made this move without even noticing, through conflating the two senses.

Here’s a place where you seem to have switched to a moral sense of “good”:

(P2) “…it is good to avoid behaving in such a way as to produce the worst possible misery for everyone.”

We’re much more likely to read this as a moral judgement than we are with P1, because here you’re applying the word “good” to a behaviour, rather than a state of affairs. We usually make moral judgements about behaviours, not states of affairs. But you can read P2 as a non-moral tautology if you try hard enough to take “good” in a non-moral sense. If you substitute “conducive to well-being” for “good” in P2 you get:

(P3) …it is conducive to well-being to avoid behaving in such a way as to produce the worst possible misery for everyone.

This is a tautology, and not a moral judgement. To ensure that P2 is read as a moral judgement we can insert the word “morally”:

(P4) …it is morally good to avoid behaving in such a way as to produce the worst possible misery for everyone.

I hope it’s obvious that this isn’t just a tautology, equivalent to P3. It’s a moral judgement. But let me make some more adjustments to make the point even clearer:

(P5) Reducing everyone’s well-being is morally wrong. [Note 1]

If P5 seems to you like a moral judgement, and not just a tautology (reducing everyone’s well-being reduces well-being), that shows that “morally wrong” means more to you than just “reduces well-being”. Unfortunately, if you’re committed to saying that “morally wrong” just means “reduces well-being”, then you might mentally substitute this definition into P5 and get another tautology. That would be begging the question. You need to resist that temptation and read P5 with an open mind, being guided by your instinctive competence with the words “morally wrong”. (This is the right sort of appeal to linguistic intuition.)

So, P2 is ambiguous. On the more natural reading it’s a moral judgement, not a tautology. On the other reading it’s a tautology, not a moral judgement. By conflating the two meanings you can make it seem that you’ve found a moral judgement that’s as undeniable as a tautology. [Note 2]

Given your conflation, I can understand why you seem to find some people’s views on the subject baffling. Sometimes you’re interpreting the word “good” as meaning non-moral good (conducive to well-being) when the other person means it as moral good. So you’re talking past each other.

You may say that the whole purpose of the book is _only_ to argue that science can tell us what’s conducive to well-being, and nothing more. But when you argue for moral truth you _are_ arguing for something more. If that weren’t the case, you could just as well have omitted the word “moral” from your book altogether, and talked about well-being alone. You could even have called the book “The Well-Being Landscape”. That would have been clearer, more direct, and avoided many of the objections the book has received. But the reason that alternative didn’t suit your purpose is precisely because talk of moral rightness and talk of well-being are not equivalent.

Incidentally, you also conflate different senses of the word “value”. Are you talking about people’s values (what they actually value)? Or are you talking about moral values in the sense of moral facts (i.e. facts about what it’s morally good to do)? In your argument about values you seem to conflate the two, jumping from the first sort of value to the second. This ambiguity even affects the subtitle of your book. You presumably mean it to say that science can determine the moral facts. But it can also (and I think more naturally) be read as saying that science can determine what people value.

In general, I feel you haven’t been sufficiently sensitive to the ambiguity and context-dependency of key words, like “good”, “value” and “ought”.


Notes

1. I’ve switched from ‘morally bad’ to ‘morally wrong’, as I think that’s even more clearly a term of moral judgement. I think you have no grounds to complain about the switch, since the book frequently uses the words ‘right’ and ‘wrong’ in a moral sense, and you never attempt to establish that these are significantly different from ‘good’ and ‘bad’.

2. Russell Blackford made a broadly similar point about tautology (circularity) in his review of your book. In your reply you didn’t respond to those remarks, and I would urge you to read that section of his review again.

Searle’s Argument from Syntax and Semantics

[Edited November 9, 2015]

My previous post addressed John Searle’s Chinese Room argument. There I remarked that Searle has two related arguments: the Chinese Room argument (CRA) and the Syntax-and-Semantics argument (SSA). This post will address the SSA. In summary, my conclusion is that the SSA, like the CRA, is vacuous, doing no genuine work, but only creating the illusion of an argument through the use of misleading language. There are several misleading elements to clear up, making the refutation quite long. The post is made even longer by some appendices, but you can skip those if you like.

As before, I’ll take as my main source for Searle’s argument his 2009 article at Scholarpedia. The first paragraph of the section “Statement of the Argument” is what I’m calling the CRA. The remainder of that section down to the “Conclusion” is what I’m calling the SSA. The SSA is stated in the form of a “deductive proof”:
(P1) Implemented programs are syntactical processes.
(P2) Minds have semantic contents.
(P3) Syntax by itself is neither sufficient for nor constitutive of semantics.
(C) Therefore, the implemented programs are not by themselves constitutive of, nor sufficient for, minds. In short, Strong Artificial Intelligence is false.

First let me say something about the relationship between the two arguments, since Searle has left that unclear. The CRA and SSA are both arguments against Strong AI. (I’ll adopt without further comment Searle’s sense of the term “Strong AI”.) The SSA states this explicitly. The CRA concludes that no computer can understand Chinese solely “on the basis of implementing the appropriate program for understanding Chinese”, which clearly contradicts Strong AI. In Searle’s 1980 paper (“Minds, Brains and Programs”), his first on the subject, this was his main argument against Strong AI, and it is what most people interested in the subject seem to understand by the term “Chinese Room Argument”. In that first paper, Searle mentioned the subject of syntax and semantics, but it was only later that he developed these thoughts into the 3-premise argument that I’m calling the SSA.

Given that the CRA and SSA both argue for the same conclusion, we might think that they are independent arguments, requiring separate refutations. Indeed, it’s only because I think it’s possible to interpret them that way that I am addressing them in separate posts. However, the Scholarpedia version of the SSA makes them interdependent, though it’s ambiguous as to the nature and direction of the dependency. On the one hand, Searle claims that the CRA is “underlain by” the SSA, which suggests that the SSA supports the CRA. Also, in his response to the Systems Reply, he attempts to use P3 of the SSA (“The principle that the syntax is not sufficient for the semantics”) to justify the CRA’s assumption that nothing in the CR understands Chinese. But on the other hand, the SSA proceeds in the opposite direction, using the alleged absence of understanding of Chinese to support P3.  Having it both ways round means that he is arguing in a circle.

Let’s have a closer look at the SSA’s argument in support of P3:

The purpose of the Chinese Room thought experiment was to dramatically illustrate this point [P3]. It is obvious in the thought experiment that the man has all the syntax necessary to answer questions in Chinese, but he still does not understand a word of Chinese.

In the past I’ve questioned whether Searle’s word “illustrate” should be interpreted as claiming support, but now I think no other interpretation makes sense. The SEP entry on this subject also interprets this as a supporting relationship: “The Chinese Room thought experiment itself is the support for the third premise.” Note that there is no theoretical argument here as to why syntax is insufficient for semantics. It relies purely on the alleged absence of understanding of Chinese in the CR. Setting aside any other problems with this argument, it commits exactly the same fallacy as does the CRA. It relies on the same unstated assumption: Since Searle doesn’t understand any Chinese, nothing in the CR understands any Chinese. In my refutation of the CRA, I explained how this move depends on an equivocation over the ambiguous first phrase  (“Searle doesn’t understand any Chinese”) followed by an illegitimate jump to the conclusion (“nothing in the CR understands any Chinese”). I pointed out that its conclusion already contradicts Strong AI, and so the remainder of the argument does no significant work. Insofar as the SSA depends on the above-stated argument in support of P3, it is committing just the same fallacy as the CRA, and I consider it already refuted by my earlier post. If the SSA is interpreted that way, I can rest my case here.

Alternatively, we can ignore the SSA’s appeal to the CR, and interpret the SSA as an independent argument from more general principles, not drawing on the CR. I suspect Searle originally intended it that way, and only added the appeal to the CR when he realised he needed some support for the SSA’s major premise, P3. For the remainder of this post, I will proceed on the basis of this interpretation. It leaves P3 with no support, but it seems to me that Searle has contrived to make P3 seem undeniable anyway, as we’ll see later. He seems to consider all his premises so self-evident that he has previously referred to them as “axioms” (“Is the Brain’s Mind a Computer Program?”, Scientific American, January 1990). In that article, he didn’t make the above argument for P3, claiming instead that, “At one level this principle is true by definition.”

Searle describes the SSA as a “deductive proof”. But philosophy is not mathematics. And even in mathematics, little of interest can be proved by a one-step deduction. A philosophical argument may sometimes be clarified by a statement of premises and conclusion. But all the real work still remains to be done in justifying those premises. Any controversial elements in the conclusion are merely transferred to one or more of the premises. We should be able to look below the headline premises and see the substantive argument underneath. But, when we do that with the SSA, we find nothing of substance. I will argue that the SSA relies on vague and misleading use of the words “syntax” and “syntactical”. This vagueness makes it hard to see how the work of the argument is being divided between P1 and P3, and helps to obscure the fact that no real work is being done at all.

Proceeding to address the argument in detail, my first step is to slightly clarify some of the wording of the premises and conclusion. First, the expression “implemented programs” is potentially ambiguous. It could be taken as referring just to the program code in memory, but it makes more sense to take it as referring to the process of program execution, which makes it consistent with Searle’s use of the word “processes” in P1. So I’ll replace “implemented programs” by “execution of programs”. Second, I think the expression “nor constitutive of” is redundant, so I’ll delete it for the sake of brevity. As far as I’m concerned, Searle only needs to show that the execution of programs is not “sufficient for” minds. Finally, I’ll make an insignificant but convenient change from plural to singular. The argument then becomes:

(P1a) The execution of a program is a syntactical process.
(P2a) Minds have semantic contents.
(P3a) Syntax is not sufficient for semantics.
(Ca) The execution of a program is not sufficient for a mind.

Instead of making an argument directly about minds, Searle chooses to make an argument about “semantics”, and then trivially derive a conclusion about minds. Though I have some reservations about his use of the terms “semantic contents” and “semantics”, P2a is relatively uncontroversial, and here I’ll accept it unchallenged. We can then reduce Searle’s argument to the following shorter one:

(P1a) The execution of a program is a syntactical process.
(P3a) Syntax is not sufficient for semantics.
(Cb) The execution of a program is not sufficient for semantics.

This is the core of the SSA, and it works in two steps: (1) it gets us to accept the expression “syntactical process” in place of “the execution of a program”, and then (2) it switches from “syntactical process” to “syntax”. Let’s look at step (2) first. You may think the switch from “syntactical process” to “syntax” is just a minor change of grammar. But it can have a big effect on how we read P3. If this switch hadn’t been made, the premise would have been:

(P3b) A syntactical process is not sufficient for semantics.

Given the premise in this form, a reader might be inclined to ask: “Just what is it to be a syntactical process, and what properties of such a process render it insufficient for semantics?” In other words, with P3b it’s easier for a reader to see that there is a substantive question to be addressed. Switching to P3a makes the premise seem so obviously undeniable that a reader might be inclined to accept it without further reflection. Why? Because P3a lacks any mention of a process, and instead opposes “semantics” directly to “syntax”. The words “syntax” and “semantics” usually refer to two distinct properties of languages (or expressions in languages), and in that sense it’s incoherent (a category error) to talk of one being sufficient for the other. You may say this is not Searle’s intended sense of the words “syntax” and “semantics”. But his unnecessary switch to the words “syntax” and “semantics” (from “syntactical process” and “semantic contents”) encourages such a misreading. He goes even further in this direction later on, when he reformulates the premise as “Syntax is not semantics”. I suggest that Searle himself has sometimes unwittingly conflated different senses of these words, which would help explain his earlier claim that P3 is “true by definition”.

If we look at the full text of the argument, we see a gradual and unexplained slide towards the use of increasingly questionable and leading language to describe the process of program execution:

– “[process] defined purely formally”
– “[process defined purely] syntactically”
– “syntactical process”
– “purely syntactical operations”
– “syntax”

The latter terms are more leading, in that the terms alone might be taken as suggesting an inconsistency with semantics. Searle gives no justification or explanation for his use of any of these terms beyond the first, and even that one is not explained clearly. To say that the process of program execution is formally defined is to say no more than that at some level of abstraction (particularly the machine code level) it can be modelled by a precisely specifiable algorithm. Since the computer’s memory states are discrete (just two possible states at the level of binary flip-flops), the state of the computer is precisely specifiable. And the operation of the processor conforms so reliably to the rules of execution that, given the state of the computer at one time, its state after the execution of the next instruction can be reliably and precisely predicted. What more is there to be said than that? What is gained by calling such a process “syntactical”? Let’s call a spade a spade. If there’s nothing more to be said than “precisely specifiable”, let’s just say “precisely specifiable”.
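To illustrate what I mean by “precisely specifiable” (this is my own toy example, not anything taken from Searle): the machine’s whole state is a small bundle of discrete values, and the next state is a fixed function of the current one, so each step can be predicted exactly.

# A toy three-instruction machine. The whole state is discrete, and the next
# state is a fixed function of the current one, so execution is exactly
# predictable from the starting state.

def step(state):
    # One instruction cycle: state is (program counter, registers, memory).
    pc, regs, mem = state
    op, arg = mem[pc]
    regs = dict(regs)
    if op == "LOAD":
        regs["A"] = arg
    elif op == "ADD":
        regs["A"] += arg
    elif op == "STORE":
        mem = dict(mem)
        mem[arg] = regs["A"]
    return (pc + 1, regs, mem)

state = (0, {"A": 0}, {0: ("LOAD", 5), 1: ("ADD", 3), 2: ("STORE", 100)})
for _ in range(3):                   # run the three instructions
    state = step(state)
print(state[1]["A"], state[2][100])  # 8 8: fully determined by the starting state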

In the text following P1, Searle writes that “the notion same implemented program specifies an equivalence class defined purely in terms of syntactical manipulation”. But the word “syntactical” can perfectly well be replaced here by “precisely specifiable”. To say that two computers are executing the same program is to say no more than that at some level of abstraction they can be modelled by the same precisely specifiable algorithm.

It should be said that Searle is not the only writer to use the word “syntactical” (or “syntactic”) in relation to the execution of programs. It seems to be quite widespread. For example, Daniel Dennett, a staunch critic of the Chinese Room, speaks of “syntactic engines”. I suspect that this usage of these terms has arisen from a misguided association between precisely specified algorithms and mathematical formal systems. I address this point in Appendix 3. At least in Dennett’s case, I don’t think it’s leading him to make any mistakes, but is simply an unfortunate way of expressing himself, which works to his own rhetorical disadvantage. What really matters here is not what word Searle uses but whether he has made any substantive argument. We must ask, what do you mean by “syntactical” and why is such a process insufficient for semantics? He never answers that question.

Another unhelpful term he uses is “formal symbol manipulation”. Under his discussion of P1 he claims that “The computer operates purely by manipulating formal symbols…”. But he doesn’t explain why this is insufficient for semantics. Moreover, I will argue in Appendices 1 and 3 that this claim is misleading.

In the 1990 article cited above, Searle made a different argument, based on the idea that symbols can stand for anything an observer wants:

“The second point is that symbols are manipulated without reference to any meanings. The symbols of the program can stand for anything the programmer or user wants. In this sense the program has syntax but no semantics.” [“Is the Brain’s Mind a Computer Program?”, Scientific American, January 1990]

Again, no genuine argument has been made here, as there’s no explanation of why we should proceed from each sentence to the next, and it’s far from clear that they follow. Moreover, the first two sentences are both misleadingly ambiguous. I’ll address them at greater length in Appendix 2, but here’s a brief response to the second sentence. Can the user of a chess program interpret the on-screen symbols any way he likes? Can he interpret a knight symbol as a rook, or even as a Monopoly house? Not sensibly. The conventional interpretation is forced on him, not only by the appearance of the knight symbol, but also by the way it behaves. It follows the rules for a knight, and not for a rook. The interpretation is forced by the program. What’s true for on-screen states is equally true for memory states. Searle is probably thinking about memory states taken out of context. If we look at a byte of memory on its own, without the context of the program, then there’s a sense in which we can interpret it in any number of ways: as an integer, as an ASCII character, as any chess piece, etc. But why should such context-less interpretations be relevant? Searle doesn’t tell us.
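To make the context point vivid, here is a toy illustration of my own. Taken on its own, a single byte can be read in any number of ways; taken in the context of an imaginary chess program that only ever moves the piece it encodes according to the knight rule, the sensible interpretation is forced.

# One byte, out of context, can be interpreted in many ways:
b = 0x4E
print(b)           # 78, read as an integer
print(chr(b))      # 'N', read as an ASCII character
KNIGHT = 0x4E      # or, in this hypothetical chess program, the code for a knight

# But the program's behaviour constrains the interpretation. If the piece
# stored as 0x4E is only ever moved like this, reading it as a rook (or a
# Monopoly house) makes nonsense of what the system actually does:
def legal_knight_moves(square):
    # All (file, rank) destinations a knight can reach from square on an 8x8 board.
    f, r = square
    deltas = [(1, 2), (2, 1), (2, -1), (1, -2),
              (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
    return [(f + df, r + dr) for df, dr in deltas
            if 0 <= f + df < 8 and 0 <= r + dr < 8]

print(legal_knight_moves((0, 0)))   # [(1, 2), (2, 1)]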

I suggest that the SSA is primarily motivated by the intuition that you can’t get mind or meaning from the mere mindless execution of instructions. This is similar to the more common intuition that you can’t get mind or meaning from the mere mindless interactions of particles. Not satisfied with expressing such an intuition, Searle has tried to find ways of supporting it with an argument, but has failed to come up with anything of substance, let alone the sort of decisive “proof” that he claims. He has been misled by a common usage of terms like “formal symbol manipulation” and “syntactical” in relation to program execution, and wrongly jumped from such usage to the conclusion that program execution cannot give rise to “semantics”. There’s nothing more to his argument than that.

Of course, the passages I’ve quoted here have been only a small sample of Searle’s writing on the subject. I can’t possibly go through every line that he’s written in order to prove a negative, that none of it amounts to anything of substance. But the Scholarpedia article is relatively short, and I think you can verify for yourself that it contains nothing of substance. If you’re satisfied of that, then you have sufficient reason at least to be very skeptical about the CRA and SSA.

There ends my refutation of the SSA. My goal has not been to show that the execution of a program is sufficient for meaning or mind, but merely to show that Searle has given us no reason to accept the contrary. I’ve tried to avoid relying on any positive philosophical position of my own, as I don’t want any reader’s rejection of my own position to get in the way of seeing that Searle’s argument amounts to nothing. However, I’ve added some appendices, in which I express some more positive views.

——

APPENDIX 1. Levels of abstraction.

It should be understood that, when we talk about the world, we are modelling it. We model the world at various levels of abstraction, and with various types of abstraction (or we could say various types of model). We can talk about cities, about buildings in cities, about bricks in buildings, about molecules in bricks, and so on. It would be pointless (even meaningless) to ask which of these entities (cities, buildings, bricks and molecules) are the “real” ones. Do bricks really exist, or are there only molecules (or atoms, or quarks, or quantum fields)? All of these concepts are abstractions which play a useful role in our models. The abstractions I’ve mentioned so far are pretty much physical ones, though “city” is stretching that point. Other abstractions could be cautiously described as “less physical” or “more abstract”. That includes such things as beliefs and desires, or “intentional states” as they are sometimes called.

When we compare different levels of abstraction, we may say such things as “x happens at level X, but y happens at level Y”. For example, at the molecular level there are only chemical interactions, but at the macroscopic level organisms are going about their lives. This way of speaking can create the impression that there are two parallel processes, and lead to a misguided strong emergentism: the entities and properties at the higher level seem to appear rather miraculously from a lower-level process that lacks those properties. But this misguided impression can be avoided by remembering that there is only one process, looked at in different ways. That’s not to deny that it’s sometimes useful to talk of different processes, especially since the different models may be addressing different aspects and parts of the process. There’s nothing wrong with such talk, as long as we don’t let it mislead us.

As with other real-world objects and processes, computers can be modelled with various levels and types of abstraction. In thinking about a computer executing a program, we could think at a hardware or physical level, for example about transistors switching flows of electrons. But mostly when we think about the execution of a program we think at a software or computational level, abstracting out the physical details. One software level of abstraction is that of machine code. But we can also think about higher levels of abstraction. If we’re programming in an interpreted language, like interpreted BASIC, then it’s useful to think at the level of that language. Since the BASIC interpreter is itself a program, when the BASIC program is being executed we have two programs being run. But the programs are not being run side-by-side, as when we run two programs in different windows on a PC. There’s a sense in which execution of the BASIC program just is execution of the interpreter. Putting it more carefully, we are talking about the same process at two different levels of abstraction. We can also speak at different levels of abstraction when talking about a program that’s been written in a modular, hierarchical way, with one subroutine calling other subroutines, which call still others. When we talk about what a higher-level subroutine is doing, we’re talking at a higher level of abstraction. If we say that our high-level subroutine calls the SORT subroutine to sort some list, we are abstracting out all the detail of the sorting work that goes on within the SORT subroutine. Yet another type of high-level abstraction occurs when we talk about what’s going on at the levels at which we typically describe our interaction with the system. For example, we may say that the program (or the system) is playing chess, that it moved P-K4, that it made a good move, etc. All these statements are modelling the program’s execution at a very high level of abstraction.
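
To make the subroutine example a little more concrete, here’s a minimal sketch in Python (the function names and data are my own inventions, purely for illustration, not drawn from any particular system). At the higher level we simply say that the program ranks the players; everything that goes on inside the sorting routine has been abstracted out.

    # A lower-level routine: the detail of how the sorting is done lives here.
    def sort_by_score(players):
        items = list(players)
        for i in range(len(items)):                 # simple selection sort
            best = i
            for j in range(i + 1, len(items)):
                if items[j][1] > items[best][1]:
                    best = j
            items[i], items[best] = items[best], items[i]
        return items

    # A higher-level routine: at this level we just say it "ranks the players",
    # abstracting out everything that goes on inside sort_by_score.
    def rank_players(players):
        return [name for name, score in sort_by_score(players)]

    print(rank_players([("Ann", 1200), ("Bo", 1850), ("Cy", 1500)]))
    # ['Bo', 'Cy', 'Ann']

At a still higher level we might simply say that the program produced a league table, without mentioning subroutines at all.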

In thinking about computation and AI it’s important to keep these matters in mind. It would, for example, be a mistake to take too literally a claim that a program only executes machine-code instructions, or only engages in “formal symbol manipulation”. That may be all we see when we choose to model the process at the machine code level. But at other levels of abstraction the program is doing more sophisticated things, like playing chess. There may also be a temptation to privilege the machine-code level of abstraction, and say that what’s happening at that level is what’s really happening. To say that would be to make the same mistake as saying that only atoms really exist, and bricks don’t. Or only neurons really exist, and beliefs don’t. There is no such privileged level or type of abstraction. When programming we typically focus on a level at which we can see the execution of precisely specifiable instructions, such as the machine-code level. And that may incline us to assume incorrectly that every computational level of abstraction must be precisely specifiable. The fact that an AI would follow a precisely specifiable algorithm at the machine code level is no more relevant than the fact that the human brain could (in principle) be very precisely simulated.

APPENDIX 2. Meaning

Let me return to a passage I quoted earlier, and take this as a way into a broader discussion of meaning.

“The second point is that symbols are manipulated without reference to any meanings. The symbols of the program can stand for anything the programmer or user wants. In this sense the program has syntax but no semantics.” [“Is the Brain’s Mind a Computer Program?”, Scientific American, January 1990]

First let’s note an ambiguity in Searle’s second sentence. “The symbols of the program” could refer either to the symbols representing instructions or to the symbols representing data. I assume he means the latter, but I’ll start by addressing the former, as I think that case is easier to understand.

Let’s think about machine code instructions, like the JUMP instruction, which tells the processor to continue execution of the program from an address other than the following one. The JUMP instruction is represented in RAM by a certain state of a memory cell. (Looking at the level of binary flip-flops, we could say that it’s represented by a certain sequence of flip-flops.) When the processor encounters a cell in that state, it JUMPs. We can think of that state as a symbol for the JUMP instruction. It’s no less a symbol than is the printed word “JUMP” in an assembly language listing of the program. One is more easily read by processors, the other is more easily read by humans. But, with the right equipment, a human could read the symbol in RAM. And in principle we could equip a computer with an optical character recognition system so that it could automatically read and execute the program from the printed assembly language listing. In principle a human could execute the program by reading the symbols (on paper or in RAM) and executing them, rather like Searle in the Chinese Room. Whether it’s a human or a processor executing the program, they are both doing the same thing: reading the symbol for JUMP and then JUMPing. Of course, JUMP here is not the ordinary English word “jump”. (What the processor does has some resemblance to “jumping” in the English sense, and that gives the symbol “JUMP” useful mnemonic value. We could give the machine instructions non-mnemonic names, like “ALPHA”, “BETA”, etc, but that would just make them harder to remember.) The meaning of the JUMP symbol lies just in what it tells the executor of the program (human or processor) to do, no more and no less. The symbol has the same meaning to either the human or the processor: it tells the executor to continue execution from a different address. The processor is just as capable of acting in accordance with this meaning as is the human, though of course it does it in an automatic, mindless way. There should be no problem accepting this, as long as we don’t fall into the trap of seeing meaning as a quasi-dualistic property which somehow gets added to symbols in a mysterious way, or of conflating meaning with the conscious appreciation of meaning. Our talk of meaning (in the semantic sense, relating to symbols) is a useful way of understanding the role that symbols play.
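
Here’s a minimal sketch in Python of a toy executor, with made-up opcodes (the bit patterns are my own, not those of any real processor). The meaning of the JUMP symbol is exhausted by what encountering it makes the executor do: continue execution from a different address. A human reading the same table of symbols and following the same rule of action would be executing the program in exactly the same sense.

    # Made-up opcodes (my own bit patterns): a memory state acts as a symbol
    # for an instruction.
    JUMP = 0b10010111   # "continue execution from the address in the next cell"
    PRINT = 0b00111001  # "print the value held in the next cell"
    HALT = 0b00000000   # "stop"

    def execute(memory):
        pc = 0                            # program counter
        while True:
            op = memory[pc]               # read the symbol...
            if op == JUMP:
                pc = memory[pc + 1]       # ...and JUMPing just is doing this
            elif op == PRINT:
                print(memory[pc + 1])
                pc += 2
            elif op == HALT:
                return

    # A tiny program: print 7, jump over the PRINT 99 instruction, print 42, halt.
    execute([PRINT, 7, JUMP, 6, PRINT, 99, PRINT, 42, HALT])   # prints 7 then 42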

We might be tempted to say that the meaning of the JUMP symbol is given by the formal specification of the instruction set that the engineers presumably had in hand when they designed the processor. But that’s past history. In the context of discussing the particular processor in front of us, the meaning of the JUMP symbol is given just by the fact that, when the processor encounters that symbol, its consistent behaviour is to continue execution from a different address, i.e. to do what’s usually called “JUMP”. Contrary to Searle’s second sentence (if we apply it to instructions), we can’t take this symbol as standing for anything we want. It would make no sense to take it as standing for ADD (unless we perversely used “ADD” to mean JUMP). ADD and JUMP are two very different operations, and there is a fact as to which is being executed at a given time.

Although the meaning of the JUMP symbol can be specified, it doesn’t have to be, in the sense that such specifications (or definitions) are just a useful way of describing the behaviour of the executor, or specifying what the executor should do. The executor doesn’t need them. A human executor might attend to such specifications while learning what to do, but eventually he could reach a state in which he is able to execute the machine instructions without attending to the specifications (rather the way that we may learn the grammar of a foreign language from reading rules in a book, but eventually learn to speak fluently without attending to any rules). He would then be in a similar situation to the processor, which does not use any specifications of the instructions it’s executing. Given that a computer doesn’t make use of any specifications of its machine instructions, why should the fact that such instructions can be precisely specified be of any relevance in considering the situation that the computer is in? (Strictly speaking, a processor may execute micro-code, and executing micro-code could be regarded as using a specification of its instructions, but we can ignore that complication here.)

We might be tempted to jump from the first thought in this sequence to the others:
1. The execution of an instruction can be precisely specified.
2. The execution of an instruction can be specified without reference to its meaning.
3. An instruction can be executed without reference to its meaning.
Thought #2 is confused. A specification of how to execute an instruction symbol gives the meaning of the symbol. There is nothing more to its meaning than that. Thought #3 is confused too. When the executor executes the instruction, it might attend to a specification (which gives the meaning), as in the case of a human who hasn’t yet learned to execute the instruction automatically. Even when there is no attending (or “reference”) to a specification, the executor is still executing the instruction in the way that is appropriate to the meaning of the symbol. We can call this “acting in accordance with” the meaning of the symbol, as long as we don’t interpret “acting in accordance with” as “attending to”. In other words, a processor doesn’t use any specification of the meaning of the symbol, and in that irrelevant sense it doesn’t make “reference to the symbol’s meaning”, but it does act in accordance with the meaning of the symbol. Searle’s use of the unclear expression “without reference to meaning” fails to observe this important distinction, and so is misleading.

Now let’s proceed to consider data symbols, starting at the machine code level. What should we make of Searle’s assertion that such “symbols are manipulated without reference to any meanings”? I say this is confused in a similar way to thought #3 above. The processor manipulates data symbols in a way that is in accordance with their meaning, i.e. appropriate to their meaning, but it doesn’t need to make any “reference” to their meaning. And I’ll show again that, at the machine code level, our assignment of meanings to symbols is constrained by the processor’s behaviour. Contrary to Searle’s assertion, the symbols can’t reasonably be taken as standing for anything we want.

Since we’re looking at the machine code level, let’s think about the interpretation of flip-flops as 0s and 1s. If we can interpret the states as standing for anything we want, we should be able to interpret both states of the flip-flops as 0s. But that would clearly be absurd. The whole point of our modelling the world is to help us make sense of it. We couldn’t make any sense of what the computer is doing at the level of flip-flops if we treated both states as the same! But though it’s less obvious, we also couldn’t make sense of computer operations at this level if we reversed our usual interpretation of the states, so that the state usually interpreted as 1 is now interpreted as 0, and vice versa. Consider an ADD instruction, which takes two numbers and returns the sum. Zero plus zero is zero: 0000 + 0000 = 0000. But 1111 + 1111 does not equal 1111 (the processor returns 1110, with a carry). Under the reversed interpretation, 1111 stands for zero, so an instruction that was still adding would have to return 1111 when given two 1111s, and it doesn’t. So, after reversing our usual interpretation, the ADD instruction is no longer adding. OK, you might say, we can change our interpretation of the instruction too, and call it something else. But, as far as I know, the operation the instruction is now doing is not a standard one. There may be no pre-existing name for it. Forcing us to reinterpret ADD as some other, strange operation is working against our goal of making sense of what the computer is doing. And that would be just the start of our problems. We would soon be tying ourselves in knots trying to make sense of the situation. The standard interpretation is fixed not just by the initial decision of the computer’s designers to interpret the binary states that way round, but by the fact that they designed the processor to work on the basis of that interpretation. Now that the processor works that way, the interpretation is locked in. Searle is looking at the memory states in isolation, instead of taking them in the context of the processor. It’s that context that gives the states their meaning, and makes them symbols, not just physical states.
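
Here’s that point as a small worked example in Python, using 4-bit values (a toy machine of my own, purely for illustration). The processor’s raw behaviour is fixed; only our reading of the states changes, and under the reversed reading the instruction no longer computes addition.

    WIDTH = 4
    MASK = (1 << WIDTH) - 1              # 0b1111

    def add_instruction(a, b):
        """The processor's fixed behaviour on raw 4-bit states."""
        return (a + b) & MASK            # the carry out is discarded here

    def reversed_reading(state):
        """Reversed interpretation: read each 1 as 0 and each 0 as 1."""
        return state ^ MASK

    result = add_instruction(0b1111, 0b1111)   # raw output: 0b1110

    # Under the reversed reading both inputs mean zero, so an instruction that
    # was still adding would have to output the state that means zero, 0b1111.
    print(bin(result))                   # 0b1110, not 0b1111
    print(reversed_reading(result))      # reads as 1, not 0

So whatever operation the reversed reading makes the instruction out to be performing, it isn’t addition.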

When we focus on the machine code level of abstraction, the only meanings we can see are the simple meanings that arise from the behaviour of the processor. At higher levels of abstraction, the meanings arise from the behaviour of the program (in combination with the processor), and, since there’s no limit in principle to the complexity and sophistication of programs, there’s no limit to the complexity and sophistication of the meanings. At this higher level too, it’s not true that we can interpret symbols however we like. Consider a computer running a chess program, displaying the (virtual) board on its screen. It will be convenient to think about the symbols on the computer’s screen, though we could also think about corresponding internal memory states. We can’t reasonably interpret the screen as showing a game of Monopoly, so we can’t reasonably interpret the knight symbols as Monopoly houses. Nor can we reasonably interpret the knight symbols as rooks, because they don’t behave like rooks. A knight symbol represents a knight because the computer has been programmed to treat it as a knight, and that interpretation is now locked into the system. It would make little sense to say that the knight symbol is manipulated “without reference to any meaning”. The system manipulates the knight symbol in accordance with its meaning, because it manipulates it in just the way that is appropriate for a knight (and not for a rook).
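
As a minimal sketch of that point (hypothetical code, not taken from any actual chess program), here’s how a program might enforce the knight’s move rule. Because the system treats the symbol “N” this way, interpreting it as a rook would immediately clash with how the system behaves.

    # Hypothetical move rules: how the system treats each piece symbol,
    # ignoring the rest of the board and looking only at the shape of the move.
    def legal_step(piece, dx, dy):
        if piece == "N":   # treated as a knight: L-shaped moves only
            return sorted((abs(dx), abs(dy))) == [1, 2]
        if piece == "R":   # treated as a rook: straight-line moves only
            return (dx == 0) != (dy == 0)
        return False

    print(legal_step("N", 1, 2))   # True:  the "N" symbol behaves like a knight
    print(legal_step("N", 0, 3))   # False: it does not behave like a rook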

Searle’s mistake is to consider states of memory (or screen) on their own, independently of any context. The relevant context here includes the rest of the program and the processor, which constitute the causal system that produces and/or interprets those memory states. Searle’s attitude is analogous to saying that we could interpret the words of a book as numbers in base-26 notation; or as numbers in binary notation, taking the letter “A” as 0, and other letters as 1; or any number of other pointless interpretations. These interpretations are pointless because they ignore the relevant context, which includes the process by which the book was produced. But note that a computer system is different from a book, in that it is itself a causal system, producing further states, so the meanings of its states can be fixed by its own behaviour, and not just by the process that produced the system in the first place.

Of course, there may be meanings to which a computer is indifferent. A text editor program is indifferent to the meanings of the words that the user is typing. The program does not manipulate the words in accordance with their meanings. An AI program of the sort we have today, such as Siri, uses words in accordance with their usual meanings to some extent, though nowhere near a fully human extent. But consider an English-speaking AI that could use language in just as sophisticated a way as a human. Searle allows that, in principle, an AI could have just the same behaviour as a human, by virtue of executing the right sort of program, e.g. a highly detailed simulation of a human brain. (He says that “I, in the Chinese Room, behave exactly as if I understood Chinese, but I do not.”) That means it would be just as creative as a human, capable of making up jokes, writing original poetry, and perhaps engaging in philosophical discussion. Let’s consider that scenario. It doesn’t matter for present purposes whether such an AI would be conscious or not. If you like, assume that it would be a non-conscious “zombie”. Since we could understand what it was saying, its words must have the same meanings as ours. That kind of program would be sufficient to fix the meanings of its words as the ordinary meanings. If the AI is a full-brain simulation of my brain, the meanings of its words could be considered “derived” from my meanings, but they would be “derived” in much the same sense that my meanings were in turn “derived” from those of the people from whom I’ve learned my linguistic habits. The AI would merely have acquired its linguistic habits in an unusual way. And once it started running, it would thereafter acquire new linguistic habits in the same ways I do, by picking them up from the speech it encounters, occasionally looking up words in dictionaries, and occasionally inventing new words and meanings of its own. The language of a community of such AIs would evolve over time, adding new words and meanings, in the same manner that human languages do. If an AI started using a new word correctly as a result of finding its meaning defined in a dictionary, would Searle still insist that the AI is manipulating the word “without reference to any meaning”?

I consider my view of meaning to be much the same as Wittgenstein’s. Roughly speaking, meaning lies in use. I would recommend reading Wittgenstein’s example of the builder and his assistant, at the start of “Philosophical Investigations”. I think that does a similar job to my discussion of executing machine code instructions, in helping us see how ordinary and unmysterious meaning is. In that example, the meanings of the four words of the builder and his assistant lie in nothing more than their habits of using those words. The meanings would be just the same if we replaced the humans with machines having the same habits.

I’ve made no attempt here to explain mind or consciousness. Consciousness is not called the “hard problem” for nothing, so I chose to say nothing about it here. But I’ve said a little about meaning, because that’s not such a hard problem, as long as we don’t conflate it with consciousness. Searle chose to make the SSA an argument about meaning/semantics. He tries to limit this to meaning/semantics “of the sort that is associated with human understanding”, by which he apparently means the understanding of a conscious system. He’s given us no more reason to accept his conclusion about that subset of semantics than to accept a similar conclusion about semantics more broadly. For the purposes of my positive account I’ve widened the discussion to meaning/semantics more broadly, because I think that’s the best way to demystify the concept, and I think Searle is drawing a misleading dichotomy. Meaning is meaning, whether we’re talking about conscious or non-conscious systems. The conscious appreciation of meaning is another matter. In my view, those who insist on conflating the subjects of meaning and consciousness will never understand either of them.

APPENDIX 3. Formal Systems

It seems to me that the tendency to use the terms “syntactical” and “formal symbol manipulation” in relation to computer program execution has arisen from making a misguided association between precisely specified algorithms and mathematical formal systems. It’s true that we can specify an algorithm in the format of a mathematical formal system, but doing so is of little benefit to these discussions. To see that, I’ll proceed by sketching such a formal system.

Let’s say we want to model the process of program execution at the machine code level. Let the well-formed formulas of the system be strings of binary digits representing the possible states of the computer’s memory. We’ll need to include the processor’s internal registers. For example we might let the first 64 bits of each formula correspond to the processor’s program counter, which points to the next instruction to be executed. Then our single “axiom” will correspond to the initial state of the computer, with our program and starting data in memory. Our “theorems” will correspond to subsequent states of the computer, after the execution of each instruction. Our single “rule of inference” will tell us how to execute one instruction, whichever one is currently pointed to by the program counter. This single rule could be broken down into sub-rules, one for each different instruction in the instruction set. But I call it one rule in order to emphasise that there is no choice of rules to be applied, as there is in the case of a mathematical formal system. In the mathematical formal system, it’s open to the mathematician to decide which rule to apply to produce the next theorem, and there are many possible theorems he could produce. That’s why we can’t think of the mathematical system as specifying an algorithm. But in the case of program execution it’s more natural to think in terms of an algorithm than of a set of rules.
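
Here’s a sketch in Python of what such a formal system amounts to (a toy machine of my own, just for illustration). The “axiom” is the initial state, the single “rule of inference” is the step function, and the “theorems” are simply the successive states it produces; at no point is there a choice about which rule to apply next.

    # Toy machine: a "formula" is a complete machine state,
    # represented as (program counter, tuple of memory cells).
    INC, JUMP, HALT = 1, 2, 0      # arbitrary opcodes for this sketch

    def step(state):
        """The single 'rule of inference': execute the instruction at pc."""
        pc, mem = state
        op = mem[pc]
        if op == INC:                          # increment the cell named next
            t = mem[pc + 1]
            mem = mem[:t] + (mem[t] + 1,) + mem[t + 1:]
            return (pc + 2, mem)
        if op == JUMP:                         # continue from the given address
            return (mem[pc + 1], mem)
        return None                            # HALT: no further theorems

    axiom = (0, (INC, 6, JUMP, 5, 0, HALT, 0))  # the initial state of the machine
    state = axiom
    while state is not None:                    # derive the "theorems" in order
        print(state)
        state = step(state)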

Mathematicians have sometimes formalised an area of mathematics (say number theory) by giving a set of axioms and precisely specified rules of inference. It is then possible, in principle, to derive theorems from the axioms purely by following these formal rules. The application of such rules is sometimes called “formal symbol manipulation”. Since the rules can be applied without using any prior knowledge of the meanings of the symbols, it has sometimes been said that the symbols of the formal system have no meaning, and consequently the word “formal” may have become associated with meaninglessness in some people’s eyes. But it’s not true to say that the symbols have no meaning at all. After all, the very fact that different rules are applicable to different symbols makes it useful to think of them as having different meanings. Different symbols mean different things to the reader, telling the reader what can be done with those symbols. So the axioms and rules confer a meaning on the symbols. And, because the axioms and rules have been chosen to make the symbols correspond to our mathematical practice with the familiar symbols, we can say that the meanings of the symbols are related to the familiar ones, e.g. the meaning of “+” in the formal system is related to the familiar meaning of “+”. I say “related” and not the same, because Godel showed us that the axioms and rules don’t confer the full meaning that the symbols ordinarily have. There are aspects of our normal mathematical practice with the symbols that are not captured by the axioms and rules. Consequently, there is no concept of truth within the formal system. Nevertheless, the formulas of the formal system have corresponding mathematical statements which may be true or false. It is this very correspondence that allowed Godel to say that there are formulas that are unprovable in the formal system but which nevertheless correspond to true mathematical statements.

On the other hand, in the case of our formal system specifying program execution, there is no corresponding concept of truth. There is no sense in which the state of a computer can be said to be true or false. (Consequently, Godel’s result cannot be applied to such formal systems, pace Roger Penrose.) This is the second major difference between the two types of formal system, and explains why it’s peculiar to use the terms “axiom”, “theorem” and “rule of inference” in this context. In short, nothing has been gained by talking about our algorithm in the language of mathematical formal systems, and the terms “formal symbol manipulation” and “syntactical” are unhelpful.

We can now more readily address a particular remark that Searle makes under P1 of the SSA:

“The computer operates purely by manipulating formal symbols, usually thought of as 0s and 1s, but they could be Chinese symbols or anything else, provided they are precisely specified formally.”

In the formal system I’ve described above, at the machine-code level, there is no explicit reference to any symbols apart from binary digits. Within our written specification we might represent these binary digits by the marks “X” and “Y”, or even use “1” and “0” to correspond to the digits 0 and 1 respectively. That would make no difference, except that we would have adopted a more confusing notation. As I explained in Appendix 2, our decision to interpret the flip-flops the usual way round is not arbitrary, and we cannot sensibly interpret them as anything we like. In other words, given that our formal system is intended to model a particular real-world process, we have no choice of symbol interpretations, only an insignificant choice of notation.

This is analogous to the fact that, when mathematicians axiomatise a pre-existing area of mathematics, their formal systems don’t usually use arbitrary symbols. They use the pre-existing mathematical symbols, and choose the axioms and rules of the system in such a way as to ensure an appropriate correspondence between the pre-existing use of those symbols and their use in the formal system. They could make the mark “-” in the formal system correspond to the mark “+” in our pre-existing mathematical practice, but that would only create a confusing notation. If the axioms and rules involving the mark are ones that are appropriate to addition, then the symbol corresponds to addition no matter what mark we choose. We cannot interpret the symbols however we like.

Of course, there needn’t exist any written specification of our computer system. A written specification is only a description of the system. So, when we say that the system is “formally specified” we must mean it in some more abstract sense. All we mean is that the system has the kind of regularity that could be modelled well by a formal specification. In this more abstract sense, the question of notation doesn’t even arise. If we say that the computer manipulates the “formal symbols” 0 and 1, all this really means is that, at a certain level of abstraction, it works with discrete binary states (e.g. flip-flops) which are manipulated in a very regular way, such that we can describe the process well with a precisely specified algorithm, and such that it’s meaningful to assign the states the values 0 and 1.

There is clearly no possibility of interpreting the binary symbols of the machine code model as Chinese characters, since there are only two different symbols! In a sense, strings of binary digits could be interpreted as Chinese characters. But this model doesn’t pick out any such strings. For comparison, note that it does pick out strings representing opcodes, i.e. strings of digits which tell the processor which instruction to execute: e.g. “10010111” for JUMP, and “00111001” for ADD. Our written formal system needn’t include the symbols “JUMP” and “ADD”. It could refer to the binary strings directly. But for the sake of a human reader it might be convenient to define “JUMP” to mean “10010111”, and thereafter write “JUMP” instead of “10010111”. Instead of a rule saying “if the next instruction is 10010111…”, it could then say “if the next instruction is JUMP…”. But there would be no point in defining Chinese characters in this way, because this formal system has no rules that pick out strings and manipulate them in a way that’s appropriate to Chinese characters. There is no useful sense in which Chinese characters are being manipulated in accordance with a formal system for the machine code level.

Suppose our program is one that answers Chinese questions. Then there must be some level of abstraction at which we can talk about what’s happening in terms of Chinese characters. For example, we could say that the system has just answered the question “ABC” with the answer “XYZ” (where for convenience I use “ABC” and “XYZ” to represent sequences of Chinese characters). But such statements don’t constitute an algorithm for answering Chinese questions. There needn’t be any level at which the process can be modelled by a formal system (i.e. a precisely specified algorithm) that picks out Chinese characters, in the way that our machine code model picks out machine code instructions. In other words, there need not exist any formal system that manipulates Chinese characters. There need not be any “formal manipulation” of Chinese characters. Supporters of classical or “symbolic” AI may be looking for a formal system at the level of such high-level symbols. But supporters of “sub-symbolic” AI are not restricting themselves in that way, and doubt that any such system could produce human-level verbal behaviour. Searle seems oblivious to sub-symbolic approaches to AI.

Yet Another Refutation of the Chinese Room Argument

[Edited November 9, 2015]

I recently listened to a discussion of AI between Massimo Pigliucci and Dan Kaufman, in which they both endorsed the Chinese Room Argument, if I remember correctly. (See here.) This led me to take another look at the Chinese Room Argument, and to write a detailed refutation of it. There have of course been a huge number of responses in the past, but many of them have not been to my satisfaction, and I think I can bring a couple of new insights to the subject. To summarise, I say that the argument has no genuine content, but just creates the illusion of an argument by means of a fallacy of equivocation. It conflates two different systems, and then jumps without argument from the fact that one doesn’t understand Chinese to the conclusion that the other doesn’t understand Chinese.

I’ll start by clarifying the argument, then show that it appears to be trying to get something for nothing, and then discuss the fallacy which creates the illusion of something when there’s nothing.

Here’s the most recent version of the argument that I can find:

Strong AI is answered by a simple thought experiment. If computation were sufficient for cognition, then any agent lacking a cognitive capacity could acquire that capacity simply by implementing the appropriate computer program for manifesting that capacity. Imagine a native speaker of English, me for example, who understands no Chinese. Imagine that I am locked in a room with boxes of Chinese symbols (the database) together with a book of instructions in English for manipulating the symbols (the program). Imagine that people outside the room send in small batches of Chinese symbols (questions) and these form the input. All I know is that I am receiving sets of symbols which to me are meaningless. Imagine that I follow the program which instructs me how to manipulate the symbols. Imagine that the programmers who design the program are so good at writing the program, and I get so good at manipulating the Chinese symbols, that I am able to give correct answers to the questions (the output). The program makes it possible for me, in the room, to pass the Turing Test for understanding Chinese, but all the same I do not understand a single word of Chinese. The point of the argument is that if I do not understand Chinese on the basis of implementing the appropriate program for understanding Chinese, then neither does any other digital computer solely on that basis because the computer, qua computer, has nothing that I do not have. [“Chinese Room Argument”, Scholarpedia, 2009]

This argument is the one that is widely known by the term “Chinese Room Argument” (CRA). In the Scholarpedia article it’s followed by a second argument, based on syntax and semantics, and which I’ll call the “Syntax-and-Semantics Argument” (SSA). In this post I will address only the CRA, leaving the SSA for a later post, where I will also discuss the relationship between them.

Here’s the earliest published version of the argument, omitting the description of the CR scenario, which was rather long:

As regards the first claim, it seems to me quite obvious in the example that I do not understand a word of the Chinese stories. I have inputs and outputs that are indistinguishable from those of the native Chinese speaker, and I can have any formal program you like, but I still understand nothing. For the same reasons, Schank’s computer understands nothing of any stories, whether in Chinese, English, or whatever, since in the Chinese case the computer is me, and in cases where the computer is not me, the computer has nothing more than I have in the case where I understand nothing. [“Minds, Brains and Programs”, 1980]

The CRA depends on the unstated assumption that, if Searle doesn’t understand any Chinese, then nothing in the CR understands any Chinese. This may seem so obvious as not to need stating, but I will argue that this move depends on an equivocation that renders it fallacious. So the crucial and fallacious move in the argument has gone unstated! For the time being, you don’t need to accept that the distinction I’m making is a significant one. But I do need to rewrite the argument in a way that makes the move explicit, so that I can address the point in due course. I will also break the argument into two parts, and clarify it in a couple of other ways. So here’s my version of the CRA, with the crucial move in bold:

1. According to Strong AI, there exist (at least in principle) certain programs, such that the execution of any such program would be sufficient to produce an understanding of Chinese. Let P be any such program. Let Searle take the role of a computer and execute P. Obviously, despite executing P, Searle doesn’t understand any Chinese. *Since Searle doesn’t understand any Chinese, nothing in the CR understands any Chinese.* Therefore, contrary to Strong AI, the execution of P by Searle was not sufficient to produce any understanding of Chinese.
2. Now consider any other computer executing P. That computer has nothing relevant that Searle does not have. So, since Searle’s execution of P was not sufficient to produce any understanding of Chinese, neither will be this computer’s execution of P. Since this argument works for any program P and any computer executing that program, it follows that no computer can produce an understanding of Chinese solely by virtue of executing an appropriate program.

Note that the crucial move has been made by the end of part #1, and the conclusion of that part already contradicts Strong AI. So I’ll treat part #1 as the argument to be addressed, and argue that it’s fallacious. I’ll ignore part #2, which depends on part #1.

First note how trivial the argument is. Apart from setting up the scenario, it just relies on appealing to the apparently obvious fact that Searle doesn’t understand any Chinese. If we accept the bolded move without question (or without noticing) it seems to follow immediately that this is a counter-example to Strong AI. Appealing to an obvious premise, from which the conclusion immediately follows, is not much of an argument. Is it really possible to establish a controversial philosophical point with such a trivial argument?

Perhaps all the real work has been done in the construction of a clever scenario, and there’s something about the CR scenario which enables us to see a significant fact that was previously hidden. So let’s look at the CR scenario. How was it constructed? All Searle did was take the familiar scenario of an electronic computer executing a program, and replace the electronic computer with a human computer, called Searle. Since the human computer is doing just the same thing as the electronic computer, this substitution seems irrelevant. (Searle himself seemed to treat the difference as irrelevant when he generalised from the human computer to other computers in part #2.) The argument remains trivial.

It would be instructive to see what Searle’s argument would have looked like if he hadn’t made the switch to a human computer. In other words, let’s try substituting an electronic computer for Searle (the human computer) in the argument. We then get the following assertion: Obviously, despite executing P, the electronic computer doesn’t understand any Chinese. This would have been a clear case of question-begging, since this is what Searle needs to establish, and he clearly wouldn’t have made any argument in support of it. No doubt Searle himself appreciates that, or he wouldn’t have switched to a human computer. He sees the switch to a human computer as an important move.

So Searle has taken a vacuous question-begging argument and tried to make it into a valid argument by replacing an electronic computer with a human computer. If the substitution makes no relevant difference, as I claim, then it can’t produce a valid argument from an invalid one. If you think the substitution does make a relevant difference, the onus is on you to make sure you understand just how it does so. Remember that the electronic and human computers are functionally equivalent. The only difference is in their internal operation. But why should that internal difference have any bearing on the argument? (Like Searle, I’m ignoring differences of speed, memory size and reliability, since these make no difference to the in-principle argument.)

If the substitution makes no relevant difference, how does it produce an argument that has enough appearance of validity to convince many readers? The answer is that it introduces a spurious complication that serves to distract readers’ attention from the question-begging, and creates an ambiguity around which a fallacy of equivocation can be constructed. In the electronic computer scenario, there is only one language-using system present, namely the Chinese-speaking system that arises from the execution of the AI program, P. In switching to a human computer, Searle has added a second language-using system. He’s given the computer its own English-speaking system. After the substitution we need to be careful to keep track of which system we’re thinking about, but Searle makes us attend to the wrong system.

Before proceeding, it might be useful to say something more about the Chinese-speaking AI program, P. Since Searle claims that his argument works for any P, I can stipulate any P that I like, and I’ll choose one that makes the issues clearer. Let P be a full-brain simulation of an actual Chinese person (call him Lee) down to whatever level of detail is needed to make sure that the behaviour of the program is near-enough equivalent to Lee’s behaviour. Since this is a thought experiment, we can even simulate every single atom in Lee’s brain. Searle accepts that an AI could be behaviourally equivalent to a real person, so he has no reason to deny that the simulation can do everything Lee can do, at least with regard to a text-based dialogue, as in the CR. The Lee-simulation (“Lee-sim”) will give responses that reflect all of the real Lee’s knowledge, abilities, memories, personality, etc. (I deliberately say “reflect”. I’m not begging the question by pre-supposing that Lee-sim will have any mental states of its own.) I invite you first of all to imagine Lee-sim running on an electronic computer. A Chinese interlocutor can submit Chinese questions to Lee-sim and get Chinese answers. (Assume the computer has a scanner for inputting questions in Chinese characters, and a printer for printing answers.) If the interlocutor asks, “What country do you live in?”, Lee-sim might answer in Chinese, “China”. If the interlocutor asks, “Do you understand Chinese?”, Lee-sim might answer, “Of course I understand Chinese. How else could I be answering your questions?”. It seems reasonable–and consistent with our usual way of speaking about computer systems–to talk about Lee-sim this way, treating Lee-sim as a system that we can refer to by a noun.

Now let’s return to the CR scenario, with its human computer, called Searle. Searle is executing Lee-sim, and since Searle is functionally equivalent to the electronic computer, there’s no reason we shouldn’t continue to refer to Lee-sim as a system, and talk about it just the way we’ve been doing so far. The Chinese interlocutor can submit the same questions to Lee-sim as before, and get similar answers. But now we can also submit English questions, and get answers from the English-speaking system that has all Searle’s “native” knowledge, abilities, memories, personality, etc. By “native” I mean to exclude any abilities or other traits that may arise from the execution of Lee-sim. In some ways those are Searle’s abilities, and in some ways they aren’t. In one sense Searle can speak Chinese, because he can execute Lee-sim and produce Chinese output, but in another sense it’s not really Searle who’s speaking Chinese, it’s Lee-sim. We can avoid such linguistic ambiguities by using the term “native-Searle” to refer to the system that includes only Searle’s native traits. Native-Searle can’t speak Chinese. We can direct questions to native-Searle (instead of Lee-sim) by asking them in English. If we ask in English “What country do you live in?”, native-Searle might answer, “the USA”. If we ask in English, “Do you understand Chinese?”, native-Searle will answer, “No”.

A major purpose of the last two paragraphs has been to justify talking about the scenario in terms of two systems. If you’ve heard the “Systems Reply” to the CRA before, you will have heard people talking of two systems, or two sub-systems of the whole system. Such talk may have seemed peculiar or even unacceptable to you. I feel it’s sometimes introduced without sufficient explanation. My talk of two sub-systems is not begging any questions about whether each sub-system has its own mind. I haven’t mentioned minds. Nor is it my goal to persuade you that there are two minds. I’m only refuting Searle’s argument, not arguing for a contrary position. In order to reveal Searle’s equivocation as clearly as possible, I need the vocabulary to refer to two different collections of traits that are present within the combined system, and the best way to do that is to talk in terms of two systems. Such a vocabulary is made useful and appropriate by the neat separation of the two sets of traits.

Now that I can talk in terms of two systems, I can say that of course native-Searle doesn’t understand Chinese: native-Searle can’t even speak Chinese, let alone understand it. If anything can understand Chinese, it’s Lee-sim. Again, it’s not my goal to persuade you that Lee-sim can understand Chinese. I just want to show you that Searle has failed to address the question of whether Lee-sim can understand Chinese. He’s only addressed the irrelevant question of whether native-Searle can understand Chinese, and used that to distract you from attending to the relevant question. If he’d stuck to an electronic computer scenario, it would have been obvious that he needed to address the question of whether Lee-sim can understand Chinese, because Lee-sim would have been the only system present. Introducing a second system (native-Searle) only served to distract your attention from that question.

More specifically, the distraction did its work by means of a fallacy of equivocation. Searle invited you to accept, without argument, the premise “I do not understand a single word of Chinese”. By creating a weird scenario involving two separate language-using systems in one body, he has made the word “I” ambiguous. It could refer just to native-Searle. Or it could refer to the whole system, which incorporates both native-Searle and Lee-sim. On the first reading, the premise is trivially true, but irrelevant. (Yes, native-Searle doesn’t understand any Chinese, but that’s not the relevant question.) On the second reading, the premise is question-begging. Searle is just asserting that the whole system doesn’t understand any Chinese, i.e. nothing understands any Chinese. But that’s what he needed to show, not just assert. (This is equivalent to the question-begging premise that the electronic computer doesn’t understand any Chinese, which he would have been using if he hadn’t switched to a human computer.) An unwary reader accepts the premise on the reading that makes it trivially true but irrelevant (native-Searle doesn’t understand any Chinese), and then follows along when Searle makes a non sequitur jump to the question-begging reading (the whole system doesn’t understand any Chinese). Searle has made no argument from the first of these propositions to the second. He has made no argument in support of the conclusion that nothing in the CR understands any Chinese.

In the Scholarpedia article, Searle responds to the Systems Reply as follows:

The Systems Reply can be answered as follows. Suppose one asks, Why is it that the man does not understand, even though he is running the program that Strong AI grants is sufficient for understanding Chinese? The answer is that the man has no way to get from the syntax to the semantics. But in exactly the same way, the whole system, the whole room in which the man is located, has no way to pass from the syntax of the implemented program to the actual semantics (or intentional content or meaning) of the Chinese symbols. The man has no way to understand the meanings of the Chinese symbols from the operations of the system, but neither does the whole system. In the original presentation of the Chinese Room Argument, I illustrated this by imagining that I get rid of the room and work outdoors by memorizing the database, the program, etc., and doing all the computations in my head. The principle that the syntax is not sufficient for the semantics applies both to the man and to the whole system.

The Systems Reply objects that the CRA conflates a sub-system with the whole system, and then illegitimately jumps from the fact that the sub-system doesn’t understand Chinese to the conclusion that the whole system doesn’t understand Chinese. Instead of addressing that objection, Searle has now appealed to a different argument (based on his syntax/semantics principle) in support of the claim that the whole system doesn’t understand Chinese. This isn’t defending the CRA; it’s invoking a different argument for the same conclusion.

Could we charitably assume that Searle has always expected us to take his syntax/semantics principle (or some similar general principle) as our basis for accepting that the whole system doesn’t understand Chinese, and that he therefore never committed the fallacy of equivocation that Systems Repliers have attributed to him? No. Not only is this interpretation inconsistent with the wording of the CRA texts, but it would make the Chinese Room scenario entirely redundant. If the argument was based on such a general principle, then that principle could just as well have been applied directly to electronic computers. The switch to a human computer (and back in part #2) would have been pointless.

There’s another major problem with Searle’s response to the Systems Reply. He’s appealing to his syntax/semantics principle to support his claim that there’s no understanding of Chinese in the CR. But, as we’ll see in my post on the SSA, he also appeals to that claim to support his syntax/semantics principle. So he’s arguing in a circle! I suggest that the reason he finds himself resorting to such desperate measures is because each of his arguments is vacuous.

Before I finish, I’ll briefly address a few points that I’ve omitted above, but which are often raised in connection with the CRA.

1. Appeal to intuition. The CRA has often been interpreted as just an appeal to intuition. If you don’t read the argument as committing a fallacy of equivocation, then it seems that Searle is just appealing to the intuition that the whole system doesn’t understand Chinese. Perhaps the best-known response to the CRA is by Dennett and Hofstadter (“The Mind’s I”, 1981). Though they defend the Systems Reply briefly, the bulk of their response addresses the CRA as an “intuition pump”. Even if the CRA isn’t strictly an appeal to intuition, there are clearly intuitions at work which are addressed well by Dennett and Hofstadter, and I recommend their response.

2. Location of the program. In his 1980 paper, Searle interpreted a crude version of the Systems Reply as a concern over whether his argument had taken into account all the physical stuff in the room, in particular the pieces of paper on which the program and working data were stored, and even “the room” itself. In response he modified his scenario, having himself memorise the program and data, and work outdoors. He alludes to that move in the more recent response that I’ve quoted above. The move is irrelevant to more careful versions of the Systems Reply, including my own. The fallacy I’ve described is the same regardless of the location or materials in which Lee-sim is implemented. However, locating all the materials inside Searle’s head makes his equivocation more effective, since it’s easier to read the word “I” as referring to the whole system when both sub-systems are entirely implemented inside his head.

3. Types of understanding. Searle and his critics have often differed over the meaning of the word “understanding”, which has led to some talking at cross-purposes. Searle uses “understanding” as a proxy for mind in his argument, and seems to think that the word must be limited to systems with minds, or perhaps to conscious systems. He says that the “understanding” we attribute to other systems is only “metaphorical” and not real. Many of his critics think that this distinction is misguided. While I agree with the critics, the point is irrelevant to my response. The fallacy I’ve described is the same whichever way we take the word “understanding”.

4. Other arguments. In addition to the CRA and the SSA, Searle makes a number of other arguments, including a use of the statement that “simulation is not duplication”. I will address these other arguments in a third post.