This article is inspired by my BICA 2018 paper "Towards Semiotic Artificial Intelligence".

Nobody has so far proposed the following solution to the Chinese Room Argument, which targets the claim that a program can be constitutive of understanding (a human who does not speak Chinese cannot come to understand Chinese just by running a given program, even if this program enables the human to have input/output interactions in Chinese).

My reply goes as follows: a program run by a human who does not speak Chinese may indeed teach that human Chinese. Humans learn Chinese all the time; it is just uncommon for them to learn it by running a program. Even if we are not aware of such a program (no existing program satisfies this requirement), we cannot exclude its existence a priori.

Before stating my reply, let me first steelman the Chinese Room Argument. If the human in the thought experiment of the Chinese Room is Searle, he may not know Chinese, but he may know a lot of things about Chinese: that it has ideograms and punctuation, which he may recognize; that it is a human language, which has a grammar; that it has the same expressive power as a language he knows, e.g. English; that it is very likely to have a symbol for “man” and a symbol for “earth”, and so on. Searle, unlike a computer processor, holds a lot of a priori knowledge about Chinese. He may be able to understand a lot of Chinese just because of this a priori knowledge.

Let us require the human in the Chinese Room to be a primitive, e.g. an Aboriginal, with absolutely no experience of written languages. Let us suppose that Chinese appears so remote to the Aboriginal that she would never link it to humans (to the way she communicates) and would always regard it as something alien. She would never use knowledge of her world, even if somebody told her to run a given program to manipulate Chinese symbols. In this respect, she would be exactly like the computer processor and have no prior linguistic knowledge. The Chinese Room Argument is then reformulated: can a program run by the Aboriginal teach her Chinese (or, as a matter of fact, any other language)?

I am going to reply that yes, a program to be run by the Aboriginal can teach her a language. I am going to call this reply the “semiosis reply”.

Semiosis is the performance element involving signs. During semiosis, a sign gets interpreted and related to an object. Signs can be symbols of Chinese text or of English text that a human may recognize. An object is anything available in the environment which may be related to a sign. It has been suggested that artificial systems can also perform (simulated) semiosis [Gomes et al., 2003]. Moreover, it has been suggested that objects can become available not only from the sensorimotor experience, but also from the symbolic experience of an artificial system [Wang, 2005].

A sign as recognizable by a machine can be related to a position in an input stream as perceived by the machine. For example, the interpretant of the symbol "z" is something much less frequent in English text than the interpretant of the symbol "e". Semiosis is an iterative process in which the interpretant can itself become a sign to be interpreted (for example, the symbol "a" can get interpreted as a letter, as a vowel, as a word, as an article, etc.). At any given time the machine may select as potential signs anything available to it, including previous interpretants such as paradigms and any representation it has created.

I suggest that the machine should also interpret its internal functions and structures through semiosis. These comprise "computation primitives", including conditional, application, continuation and sequence formation, but also high-level functions such as read/write. The meaning that the machine can give to the symbols it experiences as input then becomes increasingly complex. Such a meaning is not given by a human interpreter (parasitic meaning); rather, it is intrinsic to the machine. When a human executes the program on behalf of the machine, the human arrives at the same understanding, at the same meaning, i.e. simulating semiosis ultimately amounts to performing semiosis and the Aboriginal can actually learn from the program.

(Note how existing artificial neural networks, including deep learning for natural language processing, are ungrounded and devoid of meaning. A human, even when executing the training phase of an artificial neural network, cannot arrive at any understanding. This is because the artificial neural network, despite its evocative name, at no level simulates a human neural network. By contrast, semiotic artificial intelligence, despite having no representation of neurons, simulates the semiosis occurring in human brains.)
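The frequency contrast between "z" and "e" mentioned above can be made concrete with a few lines of Python. This is only a minimal sketch: the function name `symbol_frequencies` is an assumption of this example, and the sample is a single sentence from the input text rather than the full corpus.

```python
from collections import Counter


def symbol_frequencies(stream: str) -> dict[str, float]:
    """Return the relative frequency of every symbol in the input stream."""
    counts = Counter(stream)
    total = sum(counts.values())
    return {symbol: count / total for symbol, count in counts.items()}


# One sentence from the input text, not the full corpus.
excerpt = "How it happened that Mastro Cherry, carpenter, found a piece of wood that wept and laughed like a child."
frequencies = symbol_frequencies(excerpt)

# A symbol that never occurs in the excerpt is treated as having frequency 0.0.
print(frequencies.get("e", 0.0) > frequencies.get("z", 0.0))
```

The program never needs to know what "e" or "z" mean to a human reader; it only relates each symbol to how often it occupies a position in the stream.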

Let me tell you how an Aboriginal, called SCA, could learn English just by running a program. Let us suppose that SCA is given as an input the following text in English, the book "The adventures of Pinocchio" (represented as a sequence of characters with spaces and new lines replaced by special characters):

THE ADVENTURES OF PINOCCHIO§CHAPTER 1§How it happened that Mastro Cherry, carpenter, found a piece of wood that wept and laughed like a child.§Centuries ago there lived--§"A king!" my little readers will say immediately.§No, children, you are mistaken. Once upon a time there was a piece of wood...

This input contains 211,627 characters, which are all incomprehensible symbols for SCA. (This input represents a very small corpus compared to those used to train artificial neural networks.)

Let me tell you how SCA learns, through only seven reflection actions and via only three iterations of a semiotic algorithm, to output something very similar to the following:

[Image: representing semiosis]

(It is suggested that the first thing SCA could output is “I said”, while more processing would be needed to actually have her output “I write”. Yes, SCA prefers writing with a stick on the sand!)

SCA runs the following reflection actions (some of them at a level of meta-language) and iterations of the semiotic algorithm (characterized by a syntagmatic algorithm and by a paradigmatic algorithm).

Reflection action 1

SCA does not know anything about the text she reads, i.e. observes, but she knows that she can reproduce any of its symbols by writing with her stick on the sand. The actions of reading and writing can be put together such that the result of one action is input to the other action. Writing what is read is copying. Moreover, it is logical that when something is read, somebody wrote it.

Reflection action 2

SCA may of course know already about herself. However, let us make this explicit when she observes that an action of reading not corresponding to a previous action of writing indicates that there must be an agent in the world to which an “I” is opposed.

Iteration 1 semiotic algorithm

SCA discovers the paradigm of uppercase and lowercase letters, i.e. symbols occurring in sequences whose first symbol is capitalized after certain symbols and normally not capitalized after certain others (let us consider “.§Poor”, “.§”Poor”, “, poor” and “ poor”). Proper nouns, e.g. “Pinocchio”, are an exception as they obey their own capitalization rules. This does not prevent SCA from learning that “P” and “p” belong to the same paradigm.
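One way to picture this iteration is to pair symbols that can start otherwise identical sequences. The sketch below is an illustration under strong assumptions, not the semiotic algorithm itself: it presupposes that sequences are already segmented at spaces, "§" and a few punctuation marks, and the function name, the `min_shared` threshold and the sample text are all hypothetical.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations


def case_paradigm_candidates(text: str, min_shared: int = 2) -> dict:
    """Pair first symbols that are interchangeable in front of the same tails."""
    # Assumption: sequences are delimited by spaces, "§" and basic punctuation.
    words = {w for w in re.split(r"[ §.,!?]+", text) if w}
    first_symbols_by_tail = defaultdict(set)
    for word in words:
        head, tail = word[0], word[1:]
        if len(tail) >= 2:  # very short tails are too ambiguous to count
            first_symbols_by_tail[tail].add(head)
    votes = Counter()
    for heads in first_symbols_by_tail.values():
        for pair in combinations(sorted(heads), 2):
            votes[pair] += 1
    return {pair: n for pair, n in votes.items() if n >= min_shared}


# Hypothetical sample written for this sketch, not a passage from the book.
sample = (
    "Poor Pinocchio! The poor Marionette wept.§Pretty soon the pretty Fairy came."
    "§There he stood, and there he stayed.§Then the boy laughed, and then he ran."
)
print(case_paradigm_candidates(sample))
```

On this sample the only pairs supported by at least two shared tails are ("P", "p") and ("T", "t"). Note that the code never consults any built-in notion of letter case; the pairing emerges from the text alone.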

Reflection action 3

SCA looks for a way of applying the action of writing to itself. This amounts to a situation of “reported writing”, in which someone writes that someone (else) wrote something. She identifies possible candidates for the content of this action in words which stand out from other words due to capitalization. One candidate could be capitalized words, i.e. the words that begin each new sentence. Another candidate could be all the words in direct speech (such as "A king!" in the excerpt above). This reflection action takes advantage of the fact that “The adventures of Pinocchio” contains direct discourse. It would be more complicated for SCA to identify reported speech in a text not using direct discourse, i.e. using only indirect discourse.

Iteration 2 semiotic algorithm

SCA considers any sequence of two words and discovers several paradigms of words (words are considered in both their uppercase and lowercase versions). One paradigm comprises “said”, “cried” and “asked”. Another comprises “Pinocchio” and “Geppetto”. A third comprises “boy”, “man”, “voice”, “Marionette” and “Fairy”. These paradigms are discovered only because of side effects existing in the text, in particular word adjacency side effects.
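The role of word adjacency can be illustrated by grouping words that share neighbours. This is a rough sketch under stated assumptions: words are lowercased and extracted with a simple regular expression, two words are paradigm candidates when they share at least `min_shared` left or right neighbours, and the function name and sample sentences are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def word_paradigm_candidates(text: str, min_shared: int = 2) -> set:
    """Pair words that occur next to the same neighbouring words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    contexts = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        contexts[left].add(("R", right))  # right neighbour of the left word
        contexts[right].add(("L", left))  # left neighbour of the right word
    return {
        (a, b)
        for a, b in combinations(sorted(contexts), 2)
        if len(contexts[a] & contexts[b]) >= min_shared
    }


# Hypothetical sample written for this sketch, not a passage from the book.
sample = (
    '"A king!" said Pinocchio. "No!" cried Pinocchio. "Why?" asked Geppetto. '
    '"Yes," said Geppetto. "Oh," cried Geppetto.'
)
print(word_paradigm_candidates(sample))
```

On this sample the function pairs "cried" with "said" and "geppetto" with "pinocchio". With so little text "asked" shares only one neighbour with the other verbs and is left out, which hints at why a corpus of some size is needed.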

Iteration 3 semiotic algorithm

SCA considers syntagms made up of words and comprising paradigms of the previous iteration. A more refined paradigm is created to contain words which are in a similar relationship to quotations and direct discourse as “said”, “cried” and “asked”. Let us refer to this paradigm as p_{saying_verbs}.
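A possible reading of this refinement is that a position relative to quotations becomes characteristic of the paradigm, and every word found in that position joins it. The sketch below assumes straight double quotation marks that are never nested or directly adjacent; the helper names, the `min_overlap` threshold and the sample sentences are hypothetical.

```python
import re


def words_after_quotations(text: str) -> set:
    """Words that immediately follow a closing double quotation mark."""
    return set(re.findall(r'"[^"]*"\s*([A-Za-z]+)', text))


def refine_paradigm(seed: set, text: str, min_overlap: int = 2) -> set:
    """Extend the seed with words sharing its position after quotations."""
    position_words = words_after_quotations(text)
    # The position counts as characteristic only if enough seed words occur in it.
    if len(seed & position_words) >= min_overlap:
        return seed | position_words
    return set(seed)


# Hypothetical sample written for this sketch, not a passage from the book.
sample = (
    '"A king!" said the boy. "No," cried Pinocchio. "Why?" asked Geppetto. '
    '"Help!" shouted the man. The boy said nothing.'
)
p_saying_verbs = refine_paradigm({"said", "cried", "asked"}, sample)
print(sorted(p_saying_verbs))
```

Here "shouted" joins the paradigm because it occupies the same position after a quotation as the seed words.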

Reflection action 4

SCA compares occurrences of words in p_{saying_verbs} with occurrences of other words. She makes the hypothesis that words in paradigm p_{saying_verbs} correspond to the action of writing.

Reflection action 5

SCA makes the hypothesis that, when in combination with words corresponding to the action of writing, other words such as “Pinocchio” and “man” correspond to the agent of writing.

Reflection action 6

SCA considers the fact that there is an apparent proper noun which appears almost only in direct discourse, but does not seem to be writing-capable: the first person pronoun “I”, which occurs more than 500 times in direct discourse. SCA makes the hypothesis that, when she reads “I” in direct discourse, someone is self-referring.
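The observation that a word appears almost only in direct discourse rests on a simple count, which can be sketched as follows. The function name and the sample are hypothetical, and the sketch assumes straight double quotation marks without nesting, so it would need refinement before being run on the full book.

```python
import re


def count_word_by_discourse(text: str, word: str) -> tuple:
    """Count occurrences of a word inside and outside direct discourse."""
    quoted = re.findall(r'"([^"]*)"', text)
    narration = re.sub(r'"[^"]*"', " ", text)
    pattern = re.compile(rf"\b{re.escape(word)}\b")
    inside = sum(len(pattern.findall(passage)) for passage in quoted)
    outside = len(pattern.findall(narration))
    return inside, outside


# Hypothetical sample written for this sketch, not a passage from the book.
sample = '"I am hungry," said Pinocchio. "Then I will cook," said Geppetto. It was late.'
print(count_word_by_discourse(sample, "I"))
```

The same function can count longer sequences inside direct discourse by passing the whole sequence as `word`.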

Reflection action 7

SCA makes the hypothesis that when she performs an action of writing, she may refer to this action using a word in the paradigm p_{saying_verbs} and the word “I”. Therefore, she may write: “I said”.

From “I said” to “I write”

SCA does not know about verb tenses. She can, however, run another iteration of the semiotic algorithm to find the morphemes of verb conjugation “-s”, “-ed” and “-ing”. She then expands paradigm p_{saying_verbs} to also include “say” and “saying”. She uses Occam’s razor to select the simplest hypothesis for referring to an action of writing.
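As an illustration of how such morphemes could surface, one can count endings whose removal leaves another known word. This is a simplified sketch, not the semiotic algorithm: the vocabulary is a hypothetical hand-picked sample, and the function name and the `max_len` limit are assumptions.

```python
from collections import Counter


def suffix_counts(vocabulary: set, max_len: int = 3) -> Counter:
    """Count endings whose removal leaves another word of the vocabulary."""
    counts = Counter()
    for word in vocabulary:
        for k in range(1, max_len + 1):
            if len(word) > k and word[:-k] in vocabulary:
                counts[word[-k:]] += 1
    return counts


# Hypothetical vocabulary chosen for this sketch.
vocabulary = {
    "ask", "asks", "asked", "asking",
    "laugh", "laughs", "laughed", "laughing",
    "say", "says", "saying", "said",
    "cry", "cried", "wood",
}
print(suffix_counts(vocabulary))
```

On this vocabulary only "s", "ed" and "ing" are counted. Irregular forms such as "said" and "cried" are not linked to "say" and "cry" by this sketch, so relating them would require the paradigms discovered in earlier iterations.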

She considers:

the 6 occurrences in direct discourse of the sequence "I said" (always in the presence of quotations inside quotations);
the 6 occurrences in direct discourse of the sequence "I say" (mostly in the presence of exclamation marks);
the 2 occurrences in direct discourse of the sequence "I ask" (in the presence of question marks).

She finds that quotations inside quotations are more special than exclamation marks, which occur more than 600 times. Therefore, “I say” is the simplest explanation for referring to her action of writing. SCA can output this sequence.

Finally, interacting with SCA could make her learn to use the words “I write” instead. Let us suppose that we can send a message to SCA of the form "No, you write". Based on this message, SCA searches the corpus again and makes the hypothesis that "write" and "written" belong to the same paradigm ("written" is the irregular past participle of "write"). SCA retrieves the following 4 passages about written signs, which all contain the word "written":

"Oh, really? Then I'll read it to you. Know, then, that written in letters of fire I see the words: GREAT MARIONETTE THEATER.

on the walls of the houses, written with charcoal, were words like these: HURRAH FOR THE LAND OF TOYS! DOWN WITH ARITHMETIC! NO MORE SCHOOL!

The announcements, posted all around the town, and written in large letters, read thus:§GREAT SPECTACLE TONIGHT

As soon as he was dressed, he put his hands in his pockets and pulled out a little leather purse on which were written the following words:§The Fairy with Azure Hair returns§fifty pennies to her dear Pinocchio§with many thanks for his kind heart.

SCA then makes the hypothesis that the paradigm of "write" and the paradigm of "say" belong to the same paradigm, and that the former should be used when she reports on her own act of displaying symbols. Finally, SCA outputs the sequence "I write", which is nowhere to be found in the corpus.

We can give SCA instructions for each of the reflection actions and each of the semiotic algorithm iterations. We can write these instructions in such a way that they can be executed by a computer processor, i.e. as a computer program. I have suggested that the program should use only composable high-level functions (operations in Peano arithmetic instead of calls to a black-box arithmetic logic unit, see [Targon, 2016]) so that it operates only with cognitively grounded semiotic symbols, can automate reflective programming (see [Targon, 2018]) and can simulate semiosis. It follows that when the program is executed by a human, the human achieves semiosis ("semiosis reply" to the Chinese Room Argument).
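To give a feel for what operations in Peano arithmetic look like when no black-box arithmetic unit is called, here is a generic textbook-style sketch. It is not the implementation from [Targon, 2016]; the class and function names are assumptions, and `from_int` and `to_int` exist only so that a reader can inspect the results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    """The natural number zero."""


@dataclass(frozen=True)
class Succ:
    """The successor of a natural number."""
    pred: "Zero | Succ"


def add(a, b):
    # a + 0 = a ; a + S(b) = S(a + b)
    return a if isinstance(b, Zero) else Succ(add(a, b.pred))


def mul(a, b):
    # a * 0 = 0 ; a * S(b) = (a * b) + a
    return Zero() if isinstance(b, Zero) else add(mul(a, b.pred), a)


def from_int(n: int):
    """Convenience for the reader only: build a Peano numeral from an int."""
    return Zero() if n == 0 else Succ(from_int(n - 1))


def to_int(p) -> int:
    """Convenience for the reader only: count the successors."""
    return 0 if isinstance(p, Zero) else 1 + to_int(p.pred)


two, three = from_int(2), from_int(3)
print(to_int(add(two, three)), to_int(mul(two, three)))
```

Every step of `add` and `mul` is a composition of two elementary moves, taking a successor and taking a predecessor, which a human could carry out by hand with marks on the sand.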

Bibliography

Gomes, A., Gudwin, R., Queiroz, J. (2003), "On a computational model of Peircean semiosis", In: Hexmoor, H. (ed.) KIMAS 2003: 703-708.

Wang, P. (2005), "Experience-grounded semantics: a theory for intelligent systems", Cognitive Systems Research, 6: 282-302.

Targon, V. (2016), "Learning the Semantics of Notational Systems with a Semiotic Cognitive Automaton", Cognitive Computation 8(4): 555-576.

Targon, V. (2018), "Towards Semiotic Artificial Intelligence", BICA 2018.

The "semiosis reply" to the Chinese Room Argument