Artificial Intelligence Essay, Research Paper
Artificial Intelligence is based on the view that the only way to prove you know the mind’s causal properties is to build it. In its purest form, AI research seeks to create an automaton possessing human intellectual capabilities and, eventually, consciousness. There is no widely accepted theory of human consciousness, yet AI pioneers like Hans Moravec enthusiastically postulate that in the next century machines will either surpass human intelligence or human beings will become machines themselves (through a process of scanning the brain into a computer). Those such as Moravec, who see the eventual result as “the universe extending to a single thinking entity” as the post-biological human race expands to the stars, base their views on the idea that the key to human consciousness is contained entirely in the physical entity of the brain. While Moravec (who is head of Robotics at Carnegie Mellon University) often sounds like a New Age psychedelic guru professing the next stage of evolution, the view of most AI research (and the one that will concern this paper) is expressed by Roger Schank: “the question is not ‘can machines think?’ but rather, can people think well enough about how people think to be able to explain that process to machines?”
This paper will explore the relation of linguistics, specifically the views of Noam Chomsky, to the study of Artificial Intelligence. It will begin by showing the general implications of Chomsky’s linguistic breakthrough as they relate to machine understanding of natural language. It will then argue that the theory of syntax based on Chomsky’s own minimalist program, which takes semantics as a form of syntax, has potential implications for the field of AI. The goal is to show the interconnectedness of language with any attempt to model the mind, and in the process to explain Chomsky’s influence on the beginnings of the field and his potential influence on current or future research.
Chomsky essentially founded modern linguistics in seeking out a systematic, testable theory of natural language. He hypothesized the existence of a “language organ” within the brain, wired with a “deep structured” universal grammar that is transmitted genetically and underlies the superficial structures of all human languages. Chomsky asserted that underlying meaning was carried in the universal grammar of deep structures and transformed by a series of operations that he termed “transformational rules” into the less abstract “surface structures” that are the spoken forms of the various natural languages. He also showed that mental activities in general can and should be investigated independently of behavior and cognitive underpinnings. This “idealization” of the linguistic capability of a native speaker brought Chomsky to his nativist, internalist, and constructivist philosophical views of language and mind.
This concept of generative grammar could be seen as “a ‘machine’, in the abstract Turing sense, that can be used to generate all the grammatical sentences in a given language.” Chomsky was searching for a formal method of describing the possible grammatical sentences of a language, just as the Turing machine (more below) was used to specify what was possible in the language of mathematics. Chomsky’s transformational generative grammar (TGG) had its greatest influence on AI in that it was a specification for a machine that went beyond the syntax of a language to its semantics, or the ways that meanings are generated. An ambiguous sentence like “I like her cooking” or “flying planes can be dangerous” could have a single surface structure derived from multiple deep structures, just as semantically equivalent sentences, such as those related by a transformation from active to passive voice, could have different surface structures emerging from the same deep structure.
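The idea of a grammar as a “machine” that assigns structure to sentences can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not Chomsky’s transformational grammar: it is a toy context-free grammar whose category names (such as Ger and Pred), rules, lexicon, and function name count_parses are all invented here. It counts how many distinct structures the grammar assigns to a single string of words, which is a rough analogue of one surface form having more than one underlying analysis.

```python
from collections import defaultdict

# Toy grammar: binary rules plus a lexicon. All categories and rules are
# illustrative assumptions, not a real grammar of English.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Adj", "N"),     # "flying planes" = planes that fly
    ("NP", "Ger", "N"),     # "flying planes" = the act of flying them
    ("VP", "Aux", "Pred"),
    ("Pred", "Cop", "Adj"),
]
LEXICON = {
    "flying": {"Adj", "Ger"},
    "planes": {"N"},
    "can": {"Aux"},
    "be": {"Cop"},
    "dangerous": {"Adj"},
}


def count_parses(words, start="S"):
    """Count distinct parse trees for `words` with a CYK-style chart."""
    n = len(words)
    # chart[(i, j)][A] = number of trees deriving words[i:j] from category A
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON.get(word, ()):
            chart[(i, i + 1)][category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # every split point of the span
                for parent, left, right in BINARY_RULES:
                    ways = chart[(i, k)][left] * chart[(k, j)][right]
                    if ways:
                        chart[(i, j)][parent] += ways
    return chart[(0, n)][start]


print(count_parses("flying planes can be dangerous".split()))
```

Under this toy grammar the sentence receives two analyses, one for each reading of “flying,” even though the word string is identical. A realistic grammar would need far more rules, and nothing here models transformations between deep and surface structure.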
Computational linguists and AI researchers saw that these rules, once understood, could be applied, or mechanized, with a formal mathematical system. Here, “natural languages were strings of symbols constructed to different conventions, which needed to be converted to a universal human ‘machine code.’” From a computational viewpoint, language is an abstract system for manipulating symbols; the universal grammar could be purified in the mathematical sense, that is, made independent of physical reality. Semantics in this view would just be an application of the abstract syntax to the real world. Chomskyan linguistics, however, as we shall see further on, does not acknowledge any application of syntax outside the internal realm of mind, semantics being one of the components of syntax.
The primary difficulty in AI work, and that which binds it so closely to philosophy, cognitive science, psychology, and computational and natural linguistics, is that in order to build a mind, we must understand that which we are building. While we understand the external functions carried out by the brain/mind (the age-old mind/body problem), we do not understand the mind itself. Therefore we could (though this is exceedingly difficult and has not yet been done fully) imitate the mind (or language) but not simulate it. That is not to say that simulation is impossible in the future, but rather that the current paradigm must be transcended and an entirely new way of understanding the mind and machines must be put forth. A computer imitating intelligence would be like an actor who plays someone smarter than himself, whereas “simulation is only possible where there is a mathematical model, a virtual machine, representing the system being simulated.” Research with the goal of imitation is called “weak AI” and that with the goal of simulation is called “strong AI”.
And so, as set forth by Chomsky, it is the goal of computational linguistics to create a mathematical model of a native speaker’s understanding of his language, just as it is the goal of AI to create a mathematical model of the mind as a whole. This analogy is imbalanced in that computational linguistics is not a separate discipline, but rather could very well be the key to AI. In addition, the relationships between computational linguistics and linguistics, and between AI and cognitive psychology (or philosophy of mind), are not ones of dependence of one upon the other, but of interdependence. If AI researchers were to create a functional model of the human mind in a machine, this would provide (perhaps all-encompassing) insight into the nature of the human mind, just as a complete understanding of the human mind would allow for computational modeling. Understanding the interrelatedness of these fields is essential because in the end it will most likely be through a synthesis of work in the various fields that progress will be made.
To return to the specifics of computational linguistics, we see that while Chomsky’s work was largely responsible for spawning the modern field, the idea of natural language “understanding” (more on this below) has been intricately tied to AI since Alan Turing posed his “Turing Test” in 1950 (which, incidentally, he predicted would be passed by the year 2000). This test, which would supposedly determine that a machine had attained “intelligence,” is essentially that a computer be able to converse in a natural language well enough to convince an interrogator that he was talking to a human being. Yet, as discussed above, there is a great difference between a computer so extensively programmed as to be able to imitate linguistic ability (which in itself has thus far proven extremely difficult if not impossible) or another conscious cognitive function, and one which simulates it. For example, a computer voice recognition system (one far more perfected than those available in the present day) which has advanced pattern-recognition abilities and can respond to any natural language vocal command with the proper action still would not be said to understand language. The true sign of AI would be a computer that possessed a generative grammar, the ability to learn and to use language creatively. This may not actually be possible, and Chomsky would be the first to argue that it is not, yet an examination of his more recent work in the minimalist program shows some strands of thought whose implications lie far outside his rationalist heritage, and which could be important to AI in the future.
Attempts at language understanding in computers before Chomsky were limited to trials like the military-funded effort of Warren Weaver, who saw Russian as English coded in some “strange symbols.” His method of computer translation relied on automatic dictionary and grammar reference to rearrange the word equivalents. But, as Chomsky made very clear, language syntax is much more than lexicon and grammatical word order, and Weaver’s translations were profoundly inaccurate.
Contrary to researchers’ original speculations at the dawn of the AI age (the 1950s and 60s), the most complex human capabilities have proven simple for machines, while the simplest things human children do almost mindlessly, such as tying shoes, acquiring language, or learning itself, prove the most difficult (if not impossible). Numerous computer language modeling programs have been created, the details of which are not essential to the topic of this paper and will not be delved into, yet none as yet can approach the Turing Test. Much difficulty arises from linguistic anomalies like the ambiguities mentioned above, as in the old AI adage “time flies like an arrow; fruit flies like a banana.” The early language programs, like Joseph Weizenbaum’s ELIZA (which was able to convince adult human beings that they were receiving genuine psychotherapy through a cleverly designed Rogerian system of asking “leading questions” and rephrasing important bits of entered data), had nothing to do with the modeling of language. Rather, they were programmed to respond to input with a variable output of pre-designed speech, with no generative grammatical or lexical capability.
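The kind of program described here can be sketched in a few lines. The following is a hypothetical illustration in the style of ELIZA, not Weizenbaum’s actual script: the rules, the pronoun table, and the function names reflect and respond are all assumptions made for this example. It shows how canned templates and simple pattern matching can rephrase bits of the user’s input without any grammar or lexicon behind them.

```python
import re

# Pronoun swaps applied when echoing the user's own words back.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

# (pattern, response template) pairs. These rules are invented for
# illustration; they are not taken from the original ELIZA script.
RULES = [
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."


def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(utterance):
    """Return the template of the first matching rule, or a stock phrase."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT


print(respond("I feel ignored by my boss."))
print(respond("The weather is nice"))
```

The program never builds a representation of what was said. It only finds a surface pattern and fills a slot, which is why such systems fall under imitation rather than simulation in the sense used earlier in this paper.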
Early attempts at computational linguistics, under Chomsky’s influence, tried to model sentences by syntax alone, hoping that if this worked, the semantics could be worked out subsequently, and only once, for the deep structure. However, as Chomsky showed much later on, semantics is part of syntax (the most important part), and thereby could not be dealt with post-syntactically. Not surprisingly, the only linguistic area where computers thus far have shown considerable ability is the area that humans find the most difficult, whereas the simplest human linguistic abilities remain elusive. Recursive sentences, whether left-branching, right-branching, or center-embedded like “The monkey that the lion who had eaten the zebra wouldn’t eat ate the banana,” have an infinite capacity for embeddings, which allows the vastly superior memory of the computer to be more effective in parsing them.
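Why nested sentences favor a machine’s memory can be illustrated with a short sketch. This is a simplified assumption-laden example, not a parser for English: the function names center_embed and pair_subjects_with_verbs and the sample vocabulary are invented here. It builds a center-embedded sentence of arbitrary depth and then uses a stack to match each verb with its subject, the bookkeeping that human readers lose track of after two or three levels.

```python
def center_embed(subjects, verbs, obj):
    """Build a center-embedded sentence.

    subjects[i] is the subject of verbs[i]; subjects are listed outermost first.
    """
    if len(subjects) != len(verbs):
        raise ValueError("each subject needs exactly one verb")
    # Subjects surface outermost-first, so their verbs surface innermost-first.
    return " that ".join(subjects) + " " + " ".join(reversed(verbs)) + " " + obj


def pair_subjects_with_verbs(subjects, surface_verbs):
    """Match verbs to subjects with a stack: last subject opened, first closed."""
    stack = list(subjects)
    return [(stack.pop(), verb) for verb in surface_verbs]


subjects = ["the monkey", "the lion", "the hunter"]
verbs = ["ate", "chased", "saw"]
print(center_embed(subjects, verbs, "the banana"))
print(pair_subjects_with_verbs(subjects, list(reversed(verbs))))
```

The stack can grow to any depth without the pairing becoming harder for the program, whereas each extra level of embedding makes the sentence markedly harder for a human reader.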
Given that Chomsky’s original breakthroughs (those of Syntactic Structures and his 1960s work) had a profound impact on Artificial Intelligence, the remainder of this paper will speculate on the potential impact of his minimalist program and the nature of what I will call the “syntactic mind.” The premise of the argument is presented by SUNY Professor William Rapaport in his essay “How to Pass a Turing Test: Syntactic Semantics, Natural Language Understanding, and First Person Cognition,” as a rebuttal to John Searle’s Chinese Room argument, which Rapaport describes as: “1) Computer programs are purely syntactic. 2) Cognition is semantic. 3) Syntax alone is not sufficient for semantics. 4) Therefore, no purely syntactic computer program can exhibit semantic cognition.”
Rapaport responds by saying that syntax is sufficient for semantics, and if you accept that, then you discover that a purely syntactic computer program can exhibit semantic cognition; in other words, if semantics can be incorporated into syntax, then the computer program can simulate the cognitive mind. This is a bold statement, so let’s see how it is derived from Chomsky’s work.
Syntax is defined as the relations among a set of markers (Rapaport refrains from calling them symbols as “symbol” implies an inherent connection to an external object), and semantics is the relations between the system of markers and “other things,” (their meanings). His argument claims that if the set of markers is merged with the set of meanings, then the resulting set is a new set of markers, a sort of meta-syntax. The mechanism that the symbol-user (native speaker) uses to understand the relation between the old and new markers is a syntactic one. The simplest way to put all this would be that semantics must be understood syntactically, and is therefore a form of syntax.
The crux of the argument is that a word (for example tree) does not signify an actual external tree-object, but rather signifies the internal representation tree found in the mind. This idea goes back to Chomsky’s Lectures on Government and Binding, where he introduces “Relation R,” elucidated by James McGilvray as “reference, but without the idea that reference relates an LF [Logical Form, or SEM, semantic form] that stands between elements of an LF and these stipulated semantic values that serve to ‘interpret it’. This relation places both terms of Relation R, LF’s and their semantic values, entirely within the domain of syntax, broadly conceived;. . .They are in the head.” Chomsky’s internalism goes back to the Cartesian view that all sensory input is subjective and therefore nothing can be known outside of the mind. Language therefore cannot refer to external objects, but only to the mind’s internal representations of them based on sensory input, or to concepts (like unicorns) which have no external source to represent. So Chomsky’s internalism and nativism allow for the syntactic phrase in its semantic interface “an internally constituted perspective that can play a role in individuating, and even constructing the things of a world.” The implication for AI is that the purely syntactic symbol manipulation of a computational system’s knowledge base suffices for it to understand natural language.
The end-pursuit of “strong” AI is to model or simulate human consciousness. If syntax exists only inside a larger mental meta-syntax (rather than semantics), then human consciousness is a world of signifiers, and our mental reality suffers a permanent disengagement from the signified. “It is not really the world which is known but the idea or symbol. . ., while that which it symbolizes, the great wide world, gradually vanishes into Kant’s unknowable noumena.” If we take the Chomsky/McGilvray idea of “broad syntax” one step further, philosophically, we find that the labyrinth of signifiers which is the syntactic mind exists in a world in which there is no concept outside the mechanisms of representation. Strangely, the post-structuralist Jacques Derrida, whom Chomsky despises, says the same thing. At the origin of language “in the absence of a center of origin, everything became discourse. . .that is to say, when everything became a system where the central signified, the original or transcendental signified, is never absolutely present outside a system of differences. The absence of the transcendental signified extends the domain and the interplay of signification ad infinitum.” What Derrida means by a transcendental signified is the semantic, external reality to which syntax refers. It is transcendental in that it transcends syntactic representation; it transcends the syntactic mind.
The internalist view does not deny the existence of the external world; rather, when McGilvray refers to “constructing the things of the world” through language, it is the world of human consciousness to which he refers. In this theory, it is through Chomsky’s I-language, through syntax, that we construct our world. This is the essence of Chomsky’s constructivism.
So we see that if we are to construct a thinking machine (or for that matter, representations in our mind of a thinking machine), this broad syntax does not significantly clarify how to go about designing a computer which can take discourse as input, remember, learn, and so on. If we realize, however, the syntactic nature of the minds which create the machine, we can see that it is possible for a machine to think syntactically, or at least that Searle’s Chinese Room argument does not stand up, because cognition is not dependent on semantics. Thus, a thinking machine would be “a purely syntactic system” of symbols (a neural network) and algorithms for manipulating them.
So we have seen that Chomsky (despite his own description of AI as “natural stupidity”) has had a profound influence upon linguistics, and thereby upon AI, as computational linguistics is central to past and future attempts to simulate the human mind.