Eric Fingerman
By a “superintelligence” we mean an intellect that is much smarter than the
best human brains in practically every field, including scientific creativity,
general wisdom and social skills. This definition leaves open how the
superintelligence is implemented: it could be a digital computer, an
ensemble of networked computers, cultured cortical tissue or what have
you. It also leaves open whether the superintelligence is conscious and has
subjective experiences.
Entities such as companies or the scientific community are not
superintelligences according to this definition. Although they can perform a
number of tasks of which no individual human is capable, they are not
intellects and there are many fields in which they perform much worse than
a human brain – for example, you can’t have a real-time conversation with
“the scientific community”.
Superintelligence requires software as well as hardware. There are several
approaches to the software problem, varying in the amount of top-down
direction they require. At one extreme we have systems like CYC, which is a very large, encyclopedia-like knowledge base and inference engine. It
has been spoon-fed facts, rules of thumb and heuristics for over a decade by
a team of human knowledge enterers. While systems like CYC might be
good for certain practical tasks, this hardly seems like an approach that will
convince AI-skeptics that superintelligence might well happen in the
foreseeable future. We have to look at paradigms that require less human
input, ones that make more use of bottom-up methods.
Given sufficient hardware and the right sort of programming, we could
make the machines learn in the same way a child does, i.e. by interacting
with human adults and other objects in the environment. The learning
mechanisms used by the brain are currently not completely understood.
Artificial neural networks in real-world applications today are usually
trained through some variant of the backpropagation algorithm (which is known to be biologically unrealistic). Backpropagation works fine for smallish networks (up to a few thousand neurons), but it doesn’t scale well: the time it takes to train a network tends to increase dramatically with the number of neurons it contains. Another limitation of backpropagation is that it is a form of supervised learning, requiring that signed error terms be specified for each output neuron during learning. It’s
not clear how such detailed performance feedback on the level of
individual neurons could be provided in real-world situations except for
certain well-defined specialized tasks.
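To make the supervised-learning point concrete, the following sketch (in Python, with purely illustrative layer sizes and learning rate) shows a single backpropagation step on a tiny two-layer network; note that the weight updates can only be computed once a signed error term is available for every output neuron.

```python
# A minimal sketch of one backpropagation step on a tiny two-layer network.
# The layer sizes and learning rate are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 2
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = rng.normal(size=n_in)          # one input pattern
target = np.array([1.0, 0.0])      # supervised learning: a target for every output neuron

# Forward pass
h = sigmoid(W1 @ x)
y = sigmoid(W2 @ h)

# Signed error terms for each output neuron: exactly the detailed
# performance feedback that backpropagation requires.
delta_out = (target - y) * y * (1 - y)
delta_hidden = (W2.T @ delta_out) * h * (1 - h)

# Gradient-descent weight updates
W2 += lr * np.outer(delta_out, h)
W1 += lr * np.outer(delta_hidden, x)
```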
A biologically more realistic learning mode is the Hebbian algorithm.
Hebbian learning is unsupervised and it might also have better scaling
properties than backpropagation. However, it has yet to be explained how Hebbian learning by itself could produce all the forms of learning and adaptation of which the human brain is capable (such as the storage of structured representations in long-term memory; Bostrom 1996). Presumably, Hebb’s rule would at least need to be supplemented with reward-induced learning (Morillo 1992) and maybe with other learning modes that are yet to be discovered. It does seem plausible, though, to
assume that only a very limited set of different learning rules (maybe as few
as two or three) are operating in the human brain. And we are not very far
from knowing what these rules are.
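By contrast, a Hebbian update needs no target signal at all: a connection is strengthened whenever the units it joins are active together. A minimal sketch follows, using Oja's normalized variant of Hebb's rule purely as an illustrative choice to keep the weights bounded.

```python
# A minimal sketch of unsupervised Hebbian learning for a single unit.
# Oja's normalized variant is used here purely as an illustrative choice
# to keep the weights bounded; the input is random stand-in data.
import numpy as np

rng = np.random.default_rng(0)
n_in = 16
w = rng.normal(scale=0.1, size=n_in)   # incoming weights of one postsynaptic unit
eta = 0.01                             # learning rate (illustrative)

for _ in range(1000):
    x = rng.normal(size=n_in)          # presynaptic activity (stand-in for sensory input)
    y = float(w @ x)                   # postsynaptic activity
    # Hebb's rule: connections between co-active units are strengthened.
    # No target or error signal is needed, only locally available activity.
    w += eta * y * (x - y * w)
```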
Creating superintelligence through imitating the functioning of the human
brain requires two more things in addition to appropriate learning rules
(and sufficiently powerful hardware): it requires having an adequate initial
architecture and providing a rich flux of sensory input.
The latter prerequisite is easily provided even with present technology.
Using video cameras, microphones and tactile sensors, it is possible to
ensure a steady flow of real-world information to the artificial neural
network. An interactive element could be arranged by connecting the system
to robot limbs and a speaker.
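As a rough illustration of the kind of sensorimotor loop being described, consider the sketch below; the sensor and actuator classes are stand-in stubs producing random data, not any particular robotics API, and the "network" is reduced to a single linear map.

```python
# An illustrative sensorimotor loop of the kind described above. The sensor
# and actuator classes are stand-in stubs (random data and print statements),
# not any real device API, and the "network" is reduced to a linear map.
import numpy as np

class StubSensor:
    """Stand-in for a video camera, microphone or tactile array."""
    def __init__(self, size):
        self.size = size
    def read(self):
        return np.random.rand(self.size)

class StubActuator:
    """Stand-in for robot limbs or a speaker."""
    def __init__(self, name):
        self.name = name
    def act(self, command):
        print(f"{self.name} received a command vector of length {len(command)}")

camera = StubSensor(64)        # flattened video frame (illustrative size)
microphone = StubSensor(16)    # audio chunk
touch = StubSensor(8)          # tactile readings
arm = StubActuator("robot arm")
speaker = StubActuator("speaker")

weights = np.random.rand(10, 64 + 16 + 8)   # placeholder "network"

for _ in range(3):             # a few time-steps of the loop
    observation = np.concatenate([camera.read(), microphone.read(), touch.read()])
    command = weights @ observation
    arm.act(command[:5])       # interactive element: act back on the world
    speaker.act(command[5:])
```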
Developing an adequate initial network structure is a more serious problem.
It might turn out to be necessary to do a considerable amount of hand-coding
in order to get the cortical architecture right. In biological organisms, the
brain does not start out at birth as a homogeneous tabula rasa; it has an
initial structure that is coded genetically. Neuroscience cannot, at its present
stage, say exactly what this structure is or how much of it needs to be
preserved in a simulation that is eventually to match the cognitive
competencies of a human adult. One way for it to be unexpectedly difficult
to achieve human-level AI through the neural network approach would be if
it turned out that the human brain relies on a colossal amount of genetic
hardwiring, so that each cognitive function depends on a unique and
hopelessly complicated inborn architecture, acquired over aeons in the
evolutionary learning process of our species.
Is this the case? A number of considerations suggest otherwise. We will have to content ourselves with a very brief review here. For a more
comprehensive discussion, the reader may consult Phillips & Singer
(1997).
Quartz & Sejnowski (1997) argue from recent neurobiological data that the
developing human cortex is largely free of domain-specific structures. The
representational properties of the specialized circuits that we find in the
mature cortex are not generally genetically prespecified. Rather, they are
developed through interaction with the problem domains on which the
circuits operate. There are genetically coded tendencies for certain brain areas to specialize in certain tasks (for example, primary visual processing is usually performed in the primary visual cortex), but this does not mean that other cortical areas couldn’t have learnt to perform the same function. In fact, the human neocortex seems to start out as a fairly flexible and general-purpose mechanism; specific modules arise later through self-organization and interaction with the environment.
Strongly supporting this view is the fact that cortical lesions, even sizeable
ones, can often be compensated for if they occur at an early age. Other
cortical areas take over the functions that would normally have been
developed in the destroyed region. In one study, sensitivity to visual
features was developed in the auditory cortex of neonatal ferrets, after that
region’s normal auditory input channel had been replaced by visual
projections (Sur et al. 1988). Similarly, it has been shown that the visual
cortex can take over functions normally performed by the somatosensory
cortex (Schlaggar & O’Leary 1991). A recent experiment (Cohen et al.
1997) showed that people who have been blind from an early age can use
their visual cortex to process tactile stimulation when reading Braille.
There are some more primitive regions of the brain whose functions cannot
be taken over by any other area. For example, people who have their hippocampus removed lose the ability to learn new episodic or semantic
facts. But the neocortex tends to be highly plastic and that is where most of
the high-level processing is executed that makes us intellectually superior to
other animals. (It would be interesting to examine in more detail to what
extent this holds true for all of neocortex. Are there small neocortical
regions such that, if excised at birth, the subject will never obtain certain
high-level competencies, not even to a limited degree?)
Another consideration that seems to indicate that innate architectural
differentiation plays a relatively small part in accounting for the
performance of the mature brain is that the neocortical architecture,
especially in infants, is remarkably homogeneous over different cortical
regions and even over different species:
Laminations and vertical connections between lamina are
hallmarks of all cortical systems, the morphological and
physiological characteristics of cortical neurons are equivalent in
different species, as are the kinds of synaptic interactions
involving cortical neurons. This similarity in the organization of
the cerebral cortex extends even to the specific details of cortical
circuitry. (White 1989, p. 179).
In the seventies and eighties the AI field suffered some stagnation as the
exaggerated expectations from the early heyday failed to materialize and
progress nearly ground to a halt. The lesson to draw from this episode is not
that strong AI is dead and that superintelligent machines will never be built.
It shows that AI is more difficult than some of the early pioneers might have
thought, but it goes no way towards showing that AI will forever remain
unfeasible.
In retrospect we know that the AI project couldn’t possibly have succeeded
at that stage. The hardware was simply not powerful enough. It seems that at
least about 100 Tops (10^14 operations per second) is required for human-like performance, and possibly as much as 10^17 ops is needed. The computers in the seventies had a
computing power comparable to that of insects. They also achieved
approximately insect-level intelligence. Now, on the other hand, we can
foresee the arrival of human-equivalent hardware, so the cause of AI’s past
failure will then no longer be present.
There is also an explanation for the relative absence even of noticeable
progress during this period. As Hans Moravec points out:
[F]or several decades the computing power found in advanced
Artificial Intelligence and Robotics systems has been stuck at
insect brain power of 1 MIPS. While computer power per dollar
fell [should be: rose] rapidly during this period, the money
available fell just as fast. The earliest days of AI, in the mid
1960s, were fuelled by lavish post-Sputnik defence funding,
which gave access to $10,000,000 supercomputers of the time. In
the post Vietnam war days of the 1970s, funding declined and
only $1,000,000 machines were available. By the early 1980s, AI
research had to settle for $100,000 minicomputers. In the late
1980s, the available machines were $10,000 workstations. By the
1990s, much work was done on personal computers costing only
a few thousand dollars. Since then AI and robot brain power has
risen with improvements in computer efficiency. By 1993
personal computers provided 10 MIPS, by 1995 it was 30 MIPS,
and in 1997 it is over 100 MIPS. Suddenly machines are reading
text, recognizing speech, and robots are driving themselves cross
country. (Moravec 1997)
In general, there seems to be a new-found sense of optimism and excitement
among people working in AI, especially among those taking a bottom-up
approach, such as researchers in genetic algorithms, neuromorphic
engineering and in hardware implementations of neural networks. Many experts who have been around for a while, though, are wary of once again underestimating the difficulties ahead.
Once artificial intelligence reaches human level, there will be a positive
feedback loop that will give the development a further boost. AIs would help construct better AIs, which in turn would help build better AIs, and so forth.
Even if no further software development took place and the AIs did not
accumulate new skills through self-learning, the AIs would still get smarter
if processor speed continued to increase. If after 18 months the hardware
were upgraded to double the speed, we would have an AI that could think
twice as fast as its original implementation. After a few more doublings this
would directly lead to what has been called “weak superintelligence”, i.e.
an intellect that has about the same abilities as a human brain but is much
faster.
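A back-of-the-envelope version of this argument, assuming (purely for illustration) that processor speed doubles every 18 months and that nothing else changes:

```python
# Back-of-the-envelope speed-up from hardware improvements alone, assuming
# (purely for illustration) that processor speed doubles every 18 months.
doubling_period_years = 1.5

for years in (1.5, 3, 6, 9, 15):
    doublings = years / doubling_period_years
    speedup = 2 ** doublings
    print(f"after {years:>4} years: {speedup:>5.0f}x faster than the original AI")
```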
Moreover, the marginal utility of improvements in AI would seem to skyrocket once AI reaches human level, causing funding to increase. We
can therefore make the prediction that once there is human-level artificial
intelligence then it will not be long before superintelligence is
technologically feasible.
A further point can be made in support of this prediction. In contrast to
what’s possible for biological intellects, it might be possible to copy skills
or cognitive modules from one artificial intellect to another. If one AI has
achieved eminence in some field, then subsequent AIs can upload the
pioneer’s program or synaptic weight-matrix and immediately achieve the
same level of performance. It would not be necessary to again go through
the training process. Whether it will also be possible to copy the best parts
of several AIs and combine them into one will depend on details of
implementation and the degree to which the AIs are modularized in a
standardized fashion. But as a general rule, the intellectual achievements of artificial intellects are additive in a way that human achievements are not, or only to a much lesser degree.
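If the AIs in question are implemented as artificial neural networks of the kind discussed earlier, the copying step is trivial in principle. A hedged sketch, in which the shared architecture and array sizes are illustrative assumptions:

```python
# Sketch of transferring a trained "skill" between two AIs implemented as
# neural networks sharing the same architecture: the pioneer's synaptic
# weight matrix is simply copied, so the recipient skips training entirely.
# The architecture and array sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# The pioneer AI: assume this weight matrix is the product of a long and
# expensive training process.
pioneer_weights = rng.normal(size=(128, 64))

# A newly instantiated AI with the same architecture but blank weights.
new_ai_weights = np.zeros((128, 64))

# Copying confers the same level of performance immediately, something
# with no analogue for biological brains.
new_ai_weights[:] = pioneer_weights
assert np.array_equal(new_ai_weights, pioneer_weights)
```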
Given that superintelligence will one day be technologically feasible, will
people choose to develop it? This question can pretty confidently be
answered in the affirmative. Associated with every step along the road to
superintelligence are enormous economic payoffs. The computer industry
invests huge sums in the next generation of hardware and software, and it
will continue doing so as long as there is a competitive pressure and profits
to be made. People want better computers and smarter software, and they
want the benefits these machines can help produce. Better medical drugs;
relief for humans from the need to perform boring or dangerous jobs;
entertainment — there is no end to the list of consumer-benefits. There is
also a strong military motive to develop artificial intelligence. And nowhere on the path is there any natural stopping point where technophobes could plausibly argue “hither but not further”.
It therefore seems that up to human-equivalence, the driving-forces behind
improvements in AI will easily overpower whatever resistance might be
present. When the question is about human-level or greater intelligence then
it is conceivable that there might be strong political forces opposing further
development. Superintelligence might be seen to pose a threat to the
supremacy, and even to the survival, of the human species. Whether by
suitable programming we can arrange the motivation systems of the
superintelligences in such a way as to guarantee perpetual obedience and
subservience, or at least non-harmfulness, to humans is a contentious topic.
If future policy-makers can be sure that AIs would not endanger human
interests then the development of artificial intelligence will continue. If they
can’t be sure that there would be no danger, then the development might
well continue anyway, either because people don’t regard the gradual
displacement of biological humans with machines as necessarily a bad
outcome, or because such strong forces (motivated by short-term profit, curiosity, ideology, or desire for the capabilities that superintelligences might bring to their creators) are active that a collective decision to ban new research in this field cannot be reached and successfully implemented.
Depending on the degree of optimization assumed, human-level intelligence
probably requires between 10^14 and 10^17 ops. It seems quite possible
that very advanced optimization could reduce this figure further, but the
entrance level would probably not be less than about 10^14 ops. If Moore’s
law continues to hold then the lower bound will be reached sometime
between 2004 and 2008, and the upper bound between 2015 and 2024. The
past success of Moore’s law gives some inductive reason to believe that it
will hold for another ten to fifteen years or so; and this prediction is supported by
the fact that there are many promising new technologies currently under
development which hold great potential to increase procurable computing
power. There is no direct reason to suppose that Moore’s law will not hold
longer than 15 years. It thus seems likely that the requisite hardware for
human-level artificial intelligence will be assembled in the first quarter of
the next century, possibly within the first few years.
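The dates above can be reproduced with a simple extrapolation. In the sketch below, the baseline of roughly 10^12 ops of procurable computing power in 1998 and the 12-to-18-month doubling times are illustrative assumptions chosen to roughly bracket the estimates in the text, not figures taken from it:

```python
# Extrapolating when a given level of computing power becomes procurable
# under Moore's law. The 1998 baseline of about 10^12 ops and the 12-month
# and 18-month doubling times are illustrative assumptions chosen to
# roughly bracket the estimates given in the text.
import math

baseline_ops = 1e12
baseline_year = 1998

def year_reached(target_ops, doubling_time_years):
    doublings_needed = math.log2(target_ops / baseline_ops)
    return baseline_year + doublings_needed * doubling_time_years

for target in (1e14, 1e17):
    early = year_reached(target, 1.0)   # optimistic: doubling every 12 months
    late = year_reached(target, 1.5)    # conservative: doubling every 18 months
    print(f"{target:.0e} ops: reached between {early:.0f} and {late:.0f}")
```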
There are several approaches to developing the software. One is to emulate
the basic principles of biological brains. It is not implausible to suppose
that these principles will be well enough known within 15 years for this