Coauthored with Ted Goertzel

“Artificial intelligence” is a burgeoning sub-discipline of computer science these days. But it would be easy to draw the wrong conclusion from this fact. One might imagine scientists around the world slaving away day and night trying to create computers smarter than people – computers holding intelligent conversations, outsmarting Nobel Prize winners, writing beautiful poetry, proving amazing new math theorems. The reality is by and large far less ambitious and exciting. The AI field as it currently exists focuses almost entirely on highly specialized problem-solving programs constituting at most small aspects of intelligence, and involving little or no spontaneity or creativity. “Real AI” – the creation of computer programs with general intelligence, self-awareness, autonomy, integrated cognition, perception and action – is still basically the stuff of science fiction.

But the world is a big place, and even though the trend in the AI field is toward unambitious hyperspecialization, there is a loosely organized group of maverick researchers bucking this trend, and focusing their efforts on AI in the grand sense. I’m proud to be a member of this club (which is not really a club: there’s no list of members, and no organization, and it’s unlikely that any existing “real AI” researcher knows about more than 25% of the others who exist). In fact I’ve decided to spend some fraction of my time this year trying to organize the distributed population of Real AI researchers into some kind of community. My colleague Cassio Pennachin and I are editing a book entitled “Real AI: New Approaches to Artificial General Intelligence,” which will include contributions from a dozen or so AI innovators around the globe. I have no delusions that this book will shift the focus of the mass of AI academics, but perhaps it will at least open the eyes of some fraction of the new generation of AI students to the fact that there are serious researchers out there who have not given up on creating software with human-level, autonomous general intelligence. It is a cliché in science that revolutions don’t occur by the majority of older, established scientists changing their minds, but rather by generational displacement: a new generation of scientists comes along, for whom what used to be an outlying maverick view is now the norm.

I’ve been riding the Real AI wild horse for quite some time now – for my whole professional career, and before. From 1988 (a year before I got my PhD in math) through 1997, I spent much of my time theorizing about how, if I had enough computational resources at my disposal, I would create a real AI. And since 1997, I’ve been working with a team of incredibly talented colleagues, trying to turn my theories into a reality – trying to build a real thinking machine.

From 1997 to early 2001 I led the R&D division of Webmind Inc. on an AI adventure. I did a lot of other things at Webmind Inc. as well, from designing products to writing documentation to sitting through endless sales and business meetings – but it was the Webmind AI Engine project that held my heart. During the 3 ½ years that the company lived, my colleagues and I transformed a promising but incomplete conceptual and mathematical theory about how to build a real AI into a comprehensive, detailed software design, and implemented a large amount of the software code needed to make this design work. At its peak, the team working on this project numbered 50 scientists and engineers, spread across four continents.

Webmind Inc. is gone but we’re still at it – me and 7 or so colleagues. A couple of the team members are in the US, but the bulk of them are in Brazil, where Webmind Inc.’s largest office was. The Webmind AI Engine codebase had to be abandoned due to legal issues regarding ownership, after the bankruptcy of Webmind Inc., but we’ve created a successor system and called it Novamente. “Novamente” means “new mind” in Portuguese, and also “again/anew/afresh.” As a background meaning, it’s also “new lie,” which should provide Portuguese-speaking AI skeptics with a bit of amusement. So far Novamente doesn’t have as much in it as the Webmind AI Engine did, but we’re building it for efficiency this time around – it’s a lean, mean, hopefully-will-be-thinking machine. We’re applying the partial version that we have today to some practical problems, including the analysis of gene expression data, a major puzzle in computational biology that I’ll discuss briefly a couple of chapters along.

Our goals are huge. First of all, we want to make a Novamente system that can hold a decent English conversation – not necessarily sounding exactly like a human, but sounding interesting and smart, spontaneous and creative and flexible. And after this warm-up exercise, we intend to move on to the main event: Giving Novamente the ability to rewrite its own program code for improved intelligence. If this works, we hope, it may set off a trajectory of exponentially increasing software intelligence. Singularity, anyone?

The Novamente approach to AI incorporates some aspects of neural networks, along with aspects of a number of other AI paradigms. But ultimately, in spite of its broad base of inspiration, it doesn’t rely on anyone else’s vision of AI – it’s an original approach to the problem of creating a thinking machine, which will stand or fall on its own merits.

In this chapter I’ll review the field of AI in general – the major trends and research approaches, and some of the more ambitious current projects. Then, in the following chapter, I’ll give an overview of my own work -- the Webmind/Novamente vision – the philosophical and psychological concepts underlying it, the broad outlines of the software design itself, and how this AI program fits into the broader technological advances that surround us, including the transformation of the Internet into a global brain and the Singularity.

I haven’t tried terribly hard to be “objective” in the other chapters here. I tend not to be a big believer in “one objective truth” – philosophically, I’m very sympathetic to Mikhail Bakhtin’s notion of “polyphonic” reality, in which the actual world is understood as a kind of unholy superposition of everyone’s subjective point of view. My goal in these pages is to present my own point of view, for what it’s worth, and in the hope that others will find it interesting – without putting forth my perspective as any kind of universal or absolute truth. However, it’s also true that some subjectivities are more subjective than others; and I think that my views on AI are a bit more strongly opinionated than my views on other areas. All the topics discussed here are important to me, but AI has been my life’s work, and will continue to be. I think this brings a particular depth of knowledge and wisdom to my discussion of the subject matter, but it also brings a host of extremely strongly felt opinions. Take them for what they are!

As bizarre as it seems to an AI fanatic like me, not everyone believes it’s possible to create a real AI program. And some of these unbelievers are highly intelligent educated people, whom I’m forced to take at least a little bit seriously, in spite of the (to me) unintuitiveness of their attitude.

There are several varieties of the anti-AI position, some more sensible than others.

First, there is the idea that only creatures granted minds by God can possess intelligence. This may be a common perspective, but isn’t really worth discussing in a scientific context. Fewer and fewer serious theologians hold such opinions these days.

More interesting is the notion that digital computers can’t be intelligent because mind is intrinsically a quantum phenomenon. This is actually a claim of some subtlety, because British physicist David Deutsch has formalized the notion of “quantum computing,” and proved that quantum computers can’t compute anything beyond what ordinary digital computers can. But still, in some cases, quantum computers can compute things much faster on average than digital computers. And a few mavericks like Stuart Hameroff and Roger Penrose have argued that non-computational quantum gravity phenomena are at the core of biological intelligence.

Of course, there is as yet no solid evidence of cognitively significant quantum phenomena in the brain. But a lot of things are unknown about the brain, and about quantum gravity for that matter, so these points of view can’t be ruled out.

My own take on this is: Yes, it’s possible (though unproven) that quantum phenomena are used by the human brain to accelerate certain kinds of problem solving. On the other hand, digital computers have their own special ways of accelerating problem solving, such as super-fast, highly accurate arithmetic.

Another even more cogent objection is that, even if it’s possible for a digital computer to be conscious, there may be no way to figure out how to make such a program except by copying the human brain very closely, or running a humongously time-consuming process of evolution roughly emulating the evolutionary process that gave rise to human intelligence.

We don’t have the neurophysiological knowledge to closely copy the human brain, and simulating a decent-sized primordial soup on contemporary computers is simply not possible. This objection to AI is not an evasive tactic like the others; it’s a serious one. But we involved with the Novamente project would like to think we’ve gotten around it, by using a combination of psychological, neurophysiological, mathematical and philosophical cues to puzzle out a workable architecture and dynamics for machine intelligence.

As mind engineers, we have to do a lot of the work that evolution did in creating the human mind/brain. An engineered mind like Novamente will have some fundamentally different characteristics from an evolved mind like the human brain, but this isn’t necessarily problematic since our goal is not to simulate human intelligence but rather to create an intelligent digital mind that knows it's digital and uses the peculiarities of its digitality to its best advantage.

The basic philosophy of mind underlying the Novamente work is that mind is not tied to any particular set of physical processes or structures. Rather, “mind” is shorthand for a certain pattern of organization and evolution of patterns. This pattern of organization and evolution can emerge from a brain, but it can also emerge from a computer system.

A digital mind will never be exactly like a human mind, but it will manifest many of the same higher-level structures and dynamics. To create a digital mind, one has to figure out what the abstract structures and dynamics are that characterize “mind in general,” and then figure out how to embody these in the digital computing substrate.

We came into the Webmind Inc. AI R&D project in 1997 with a lot of ideas about the abstract structures and dynamics underlying mind, and a simple initial design for a computer implementation; now in 2002, after copious analysis and experimentation, the mapping between mind structures and dynamics on the one hand, and computational structures and dynamics on the other, is crystal clear. But there’s a lot of work ahead of us.

Kurzweil-style statistics about the future of computing power, brain scanning and so forth, pose a formidable argument that AI will happen sometime in the next few decades. Whether it’s the Novamente team or someone else who makes it happen doesn’t matter much, in the big picture. But we believe we’re on the right track!

When I first approached the AI field, as a college student in the early 1980’s, it seemed to me that one reason the AI field had deviated so far from its original goal (creating a human-level-or-superior general intelligence) was the lack of a reasonable definition of the core concept of “intelligence.”

Of course, like many undergraduate insights, this was not an entirely original observation. The lack of a definition for the “I” in “AI” has long been the source of sardonic humor among AI researchers. One humorous definition goes: “Intelligence is whatever humans can do, that computers can’t do yet.” There is some truth to this quip: for instance, most people would say that playing chess requires intelligence, yet now that we see Deep Blue playing grandmaster-beating chess via some simple mathematical algorithms, rather than through any human-like intuition, we reclassify chess-playing as something on the borderline of intelligence, rather than something definitively involving intelligence. A less sarcastic slant on the same basic observation would be to say that AI research has helped us to clarify our understanding of what intelligence really is.
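To give a concrete sense of what “simple mathematical algorithms” means here, the sketch below shows minimax game-tree search, the mathematical core of classical chess programs. (Deep Blue layered alpha-beta pruning, massive specialized hardware and hand-tuned chess knowledge on top of this idea; the “game” object in the sketch is a hypothetical stand-in, not any real chess library.)

    # Minimal minimax search: each side is assumed to pick the move that is
    # best for it, so the value of a position is a max or min over its children.
    def minimax(state, depth, maximizing, game):
        if depth == 0 or game.is_terminal(state):
            return game.evaluate(state)  # heuristic numerical score of the position
        scores = [minimax(game.apply(state, move), depth - 1, not maximizing, game)
                  for move in game.legal_moves(state)]
        return max(scores) if maximizing else min(scores)

There is no intuition anywhere in this: just exhaustive look-ahead plus a numerical scoring function, evaluated (in Deep Blue’s case) on some two hundred million positions per second of special-purpose hardware.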

In a classic paper from 1950, computing pioneer Alan Turing proposed emulation of humans as a criterion for artificial intelligence; the resulting criterion is now called the Turing test. The Turing test basically says: write a computer program that can pass for a human in a text-based conversational interchange, and it should be considered intelligent. I don’t like this very much. One thing that intelligence clearly doesn’t mean, in my view, is precisely simulating human intelligence. It would be unreasonable to expect this from any AI system lacking a human body.

The Turing test serves to make the theoretical point that intelligence is defined by behavior rather than by mystical qualities, so that if a program could act like a human, it should be considered as intelligent as a human. But I don’t think it’s useful as a guide for practical AI development. In mathematician lingo, I view the Turing test as a sufficient but not necessary condition for intelligence. I can imagine a radically nonhuman computer system, as unable to imitate humans as we are unable to imitate dogs or anteaters or sharks, but still as intelligent as humans or more so.

Although I’ve devoted years of thought to it, I can’t claim to have thoroughly solved the “What is intelligence?” problem. I don’t have an IQ test for Novamente or other would-be AI’s. The creation of such a test might be an interesting task, but it can’t even be approached until there are a lot of intelligent computer programs of the same type. IQ tests work moderately well within a single culture, but much worse across cultures – how much worse will they work across species, or across different types of computer programs, which may well be as different as different species of animals?

I do however have a simple “working definition” of intelligence, which shares with the Turing test a focus on pragmatic behavior. It builds on various ideas from psychology and engineering, and was first explicitly presented in my book The Structure of Intelligence in 1993:

Intelligence is the ability to achieve complex goals in a complex environment

None of my experience over the last 9 years, developing AI and reviewing others’ AI systems, has in any way contradicted this understanding of intelligence. Of course, it raises the question of defining the “complexity” of goals and environments -- a question that leads to a long mathematical story that I spent many years of my career attempting to tell. But the truth is, my and others’ research on the mathematics of complexity hasn’t yet played a major role in anyone’s practical AI development work. The qualitative notion of “achieving complex goals in complex environments,” on the other hand, has proved to have a decent amount of heuristic value.

Another interesting-but-not-too-original observation that I made when first reading through the AI literature, way back in 1982-83, was that AI researchers were divided into two main camps: the neural net camp and the logic-based or rule-based camp. Both camps wanted to make AI by simulating human intelligence, but they focused on very different aspects of human intelligence. One modeled the brain, the other modeled the mind.

We’ve already discussed the neural net approach, in the previous chapter. It starts with neurons, the nerve cells the brain is made of, and tries to simulate the ways in which these cells are linked together and interact with each other. Rule-based models, on the other hand, try to simulate the mind's ability to make logical, rational decisions, without asking how the brain does this biologically. Based on my own intuitive understanding of the mind/brain, it seemed obvious to me right from the start that both of these ways of looking at the mind were extremely limited. On the one hand, there’s a lot more to the brain than links between neurons. And on the other hand, there’s a lot more to the mind than rational, logical thinking.

When I first studied this stuff I was even more arrogant than I am now (believe it or not!), and I tended to think that everyone who was taking either of these two approaches must be a total idiot. But eventually, as I got further into studying all the issues involved, I realized what was going on. The key point was that available computer technology was very limited, and had been even more limited throughout most of the history of AI. There was no way to write a program simulating the whole structure of the brain, and there was barely enough processor power and memory to deal with very limited logical processing, let alone the integration of logic with all the other aspects of the mind. But it seemed to me that, rather than acknowledging that they were taking very limited approaches because they wanted to actually implement programs on contemporary computers, researchers were turning the limitations of their computer resources into philosophical arguments: they were making theories of the mind saying things like “the mind only does logical reasoning” or “the mind is just a simple neural net,” when the truth was that it was their computers, with their limited memory and processing power, that could only do logical reasoning, or could only act like a simple neural net.

They wanted their theories to be empirical, to be testable; they wanted to be able to “put up or shut up” rather than making huge and untestable statements about how the mind works and how the mind could be implemented if you had a good enough computer. This is a good motive; it’s basic scientific method. But the irony is that mind is at least 50% about computing power. There’s no way to make a chicken’s brain really smart by rewiring it – the raw processing power isn’t there. Of course, processing power can be wasted; you need to use it intelligently, with the right data structures and algorithms – but without it you can’t implement a mind and you can’t test serious theories about the mind. By trying to do real science and test their ideas about the mind with computational experiments on the computers they had at the time, computer scientists were pushed toward trivial and just plain wrong theories about how the mind works.

But even though I think that the standard AI theories about how the mind works and how it can be implemented in software are largely wrong, I still believe a lot can be learned from them. My own AI system, Novamente, incorporates aspects of neural nets and also of logic-based AI, although it doesn’t use either one in a conventional way. Scientists working on these approaches to AI have created a lot of great science, even if they haven’t done much “real AI” according to my definition – they haven’t created any thinking machines, or even any viable designs for thinking machines. They’ve done some work that’s of relevance to those of us actually concerned with thinking machine construction – and so have a lot of other people working in other fields like systems theory, neurobiology, theoretical mathematics, and so on.

The neural network approach is the older of the two competing paradigms. Its roots go back to work in cybernetics and systems theory from the dawn of the computer age – most notably Norbert Wiener's 1948 book "Cybernetics: Or Control and Communication in the Animal and the Machine," an amazing book for its time, in which it was shown for the first time that the same mathematical principles could be used to understand both man-made electrical control systems and biological systems like bodies and brains.

The early neural network systems weren’t computers – they were analog electrical models of the brain, physically wired together with tubes, switches and other electronic equipment. Until the early 1970's, this was generally more efficient than using computer simulations, because the computers were so limited in capacity. But it was still pretty lame – Marvin Minsky’s vacuum tube neural net from the 50’s, for example, had about 50 neurons in it, compared to roughly a hundred billion in the human brain. Even the computer-simulated neural networks people use today normally have at most thousands of neurons.

Simulating neurons on computers is kind of a weird thing. The issue is that the human brain uses parallel processing: billions of neurons are all working at the same time. In the very early days of computing – I mean the late 30’s and the 40’s – it was thought that computers might be based on the same principles. But that turned out not to be a very efficient way to do things. The Hungarian mathematician John von Neumann was the first one to figure out a passably efficient design for a computer, and all the computers we use today are based on his design. The design is very simple: it’s centered on a central processing unit which follows instructions given to it by a programmer. The central processing unit goes through a series of five steps, over and over again, one after the other:

1) it calls up an instruction from its memory

2) it calls up any data required by the instruction

3) it processes the data according to the instruction

4) it stores the data in memory

5) it calls up the next instruction from its memory.
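Here is the same five-step cycle rendered as a toy program – a hypothetical three-instruction machine invented purely for illustration, not any real processor's instruction set:

    # A toy von Neumann machine: one CPU stepping through a program in memory.
    memory = {
        0: ("LOAD", "x"),   # the program: load x, add y, store the result in z, halt
        1: ("ADD", "y"),
        2: ("STORE", "z"),
        3: ("HALT", None),
        "x": 2, "y": 3, "z": 0,   # data lives in the same memory as the instructions
    }
    accumulator = 0
    pc = 0  # program counter

    while True:
        op, arg = memory[pc]            # 1) call up an instruction from memory
        if op == "HALT":
            break
        if op == "LOAD":
            accumulator = memory[arg]   # 2) call up the data the instruction needs
        elif op == "ADD":
            accumulator += memory[arg]  # 3) process the data per the instruction
        elif op == "STORE":
            memory[arg] = accumulator   # 4) store the result back in memory
        pc += 1                         # 5) call up the next instruction

    print(memory["z"])  # prints 5

Everything happens one small step at a time, in strict sequence.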

This is not how the brain works, obviously! The brain is totally crazy, with memory and instructions for doing things all mixed up, and a huge number of memories and instructions active at any given time. There are some things that the brain’s mixed-up processes do more naturally than von Neumann’s orderly machines. On the other hand, computers built using this architecture take a lot more naturally than people to routine iterative tasks like adding up long columns of numbers, doing routine accounting calculations, filling out income tax forms, retyping manuscripts, and so on and so on. The first application for computers was doing routine math calculations for solving military problems, like figuring out the details of how to build hydrogen bombs. De Garis’s work is the only serious modern attempt I know to get around the terrible mismatch between neural network mathematics and von Neumann computer hardware – by going beyond von Neumann and building a special hardware framework especially for neural nets.

Of course, the advantage of neural networks is that they are at least vaguely similar to the most intelligent system we know of, the human brain. On the other hand, von Neumann computers don’t immediately give the appearance of being capable of anything resembling intelligence. It’s obvious that real intelligence requires more than just the iterative routine that von Neumann computers are good at. It requires the ability to find new solutions to unanticipated problems. Programming a von Neumann computer – meaning, any computer commonly used today – is a matter of telling the computer specific rules for what to do. How to make rules governing spontaneous, unplanned, creative behavior is not exactly a trivial problem.

But in the 60’s more and more researchers started pushing in the rule-based direction. They decided that simulating the brain on a totally non-brainlike hardware system probably wasn’t such a good idea, and started thinking about better ways to use computers to emulate human thought. Impressed with what computers could do using a few rules, applied over and over again fast, they decided that everything the human mind does could be simulated by a big enough and nasty enough set of rules. There were definite successes here. But also, an awful lot of failures. I’ll tell you some of the stories in a moment.

Anyway, by the 1980s, when I came on the scene, the AI community was moving back towards neural networks again. Both sides were frustrated, because their programs were never quite able to do what they wanted them to do.

With 20/20 hindsight, it now seems clear that -- while both the neural net and the rule-based approaches were useful building blocks -- the early researchers had ridiculously oversimplified their task. True intelligence requires more than following carefully defined rules, and it also requires more than random links between a few thousand artificial neurons. It’s probably a good thing, however, that early researchers underestimated the difficulty of achieving true intelligence, because if they hadn’t, they might not have had as much energy and enthusiasm for their pioneering efforts.

What I asked myself while initially studying the history of AI was this: Why has no one yet managed to build a thinking machine? It seemed to me there were basically two reasons. Mainly, no one had really tried to build a whole mind, a computer system that could observe the world around it, act in the world around it, remember information, recognize patterns in the world and in itself, and create new patterns inside itself in response to its own goals. Presumably no one had tried to do this because the computer resources available at the time were blatantly inadequate to support a program with reasonably sophisticated modules devoted to memory, perception, action and conception. Lacking the computer resources to build a whole mind, researchers had focused on one or another particular aspect of the mind, and tried to push this aspect as far as it could go.

Obviously, it was sensible to write programs that could run on the available computers. But in some cases, it seemed as if the lack of computer resources had profoundly affected the theory of artificial intelligence as well as the practice. Scientists had wanted to be able to build true intelligence using the resources at their disposal, really really badly, so they’d convinced themselves that many parts of the mind were unnecessary, and that everything could be done using the resources they had. Lacking a perceptual environment to embed their AI systems in, they’d built reasoning and memory programs with essentially no perceptual systems; programs that acted entirely on the basis of logical rules, with no direct sensory link to the world.

Way back in 1983, at 16 years of age, I was already eager to fill the gap. I wanted to build a comprehensive system, with perception, action, memory, and the ability to conceive of new ideas and to study itself. I realized this system would consume a lot of computer memory and processing power, and would be difficult to program and debug because each of its parts would gain its meaning only from its interaction with the other parts. But it didn’t take me too long to figure out that the computers then on the market were simply not up to what I wanted to do. I blamed this on the von Neumann architecture and its total unsuitability to simulating the evolutionary, massively parallel nature of thought. I reckoned that the only way to build my artificial mind in the near future would be to build my own specialized hardware, to create some kind of "brain machine." I was studying quantum theory, which told me that some kind of radical quantum computer architecture might be useful. I was also interested in the possibility of growing bio-computers - after all, the brain grows itself, it is not built. At the time these ideas were incredibly radical, but now they’re both active research areas.

I tend to have a little more affinity for neural-net-type AI systems than for logic- and rule-based AI systems. Probably this is because when I was going through school, rule-based AI was the orthodoxy that I was rebelling against, whereas neural nets at the time were maverick, up-and-coming and exciting. Looking at it objectively (or as objectively as I can!), I have to admit that the early work on rule-based AI was really just as important as the early neural network research.

[Figure: The Tower of Hanoi puzzle]
One famous early rule-based program was something called the General Problem Solver – a very ambitious name. It was written by Allen Newell and Herbert Simon, both of whom went on to do a lot of other great AI research. This was a pretty interesting program, but it didn’t quite live up to its name. In fact I reckon the title of their paper on GPS was one of the biggest and emptiest brags of all time: "General Problem Solver: A Program that Simulates Human Thought."

Ummm … not quite.

GPS could solve simple problems like the Tower of Hanoi puzzle, and "cryptarithmetic" puzzles like DONALD + GERALD = ROBERT. [To solve this, assign a digit to each letter so that the equation comes out correctly.] But in the overall scheme of intelligence, solving problems such as these is not all that different from computing logarithms or solving differential equations. A simple mathematical strategy suffices; there’s no real learning there. GPS is not a general problem solver, it’s a Narrowly-Defined Problem Solver.
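To see just how mechanical such puzzles are, here is a minimal brute-force solver for DONALD + GERALD = ROBERT. (GPS itself worked quite differently, reasoning about subgoals, columns and carries; this sketch just enumerates digit assignments, which a modern machine can grind through in a minute or so of CPU time.)

    from itertools import permutations

    letters = "DONALGERBT"  # the ten distinct letters happen to use all ten digits

    def value(word, digit_of):
        return int("".join(str(digit_of[ch]) for ch in word))

    for digits in permutations(range(10)):
        digit_of = dict(zip(letters, digits))
        if digit_of["D"] == 0 or digit_of["G"] == 0 or digit_of["R"] == 0:
            continue  # no leading zeros allowed
        if value("DONALD", digit_of) + value("GERALD", digit_of) == value("ROBERT", digit_of):
            print(digit_of)  # prints a solution: D=5, O=2, N=6, A=4, L=8, ...
            break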

Of course, there were points in history where the ability to solve such puzzles would have been considered a remarkable display of intelligence. The Tower of Hanoi puzzle, for example, was introduced to the modern world by the French mathematician Édouard Lucas in 1883, but his inspiration came from ancient Hindu folklore. Tales are told of a Hindu temple where this puzzle or a similar one was used as a challenge for young priests. At the beginning of time, the priests in the temple were given a stack of 64 gold disks, each one slightly smaller than the one beneath it, and presented with the task of transferring the 64 disks from one of three poles to another, without ever placing a larger disk atop a smaller one. Day and night the priests labored, and, so the story goes, when they finally finished their work the temple would crumble into dust and the world would vanish. Clearly, a program that could solve this problem in subsecond time would have seemed rather intelligent to the ancient Hindus who believed this myth. But now we can see that solving the Tower of Hanoi doesn’t require inordinate general intelligence, any more than causing rain via cloud seeding requires inordinate general control over the weather.
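The recursive strategy for the puzzle fits in a few lines – and it also shows why the legend poses no immediate danger: moving n disks takes 2^n - 1 moves, and 2^64 - 1 moves at one move per second comes to roughly 585 billion years. A minimal sketch:

    # Tower of Hanoi: move n disks from pole a to pole c, using pole b as spare.
    # The goal "move n disks" reduces to two smaller subgoals plus one direct move.
    def hanoi(n, a, b, c):
        if n == 0:
            return
        hanoi(n - 1, a, c, b)                 # subgoal 1: park n-1 disks on the spare
        print(f"move disk {n}: {a} -> {c}")   # the one move we can make directly
        hanoi(n - 1, b, a, c)                 # subgoal 2: stack the n-1 disks back on top

    hanoi(3, "left", "middle", "right")  # prints the 2**3 - 1 = 7 moves

This reduction of a goal into smaller subgoals is exactly the pattern described next.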

What GPS was doing was taking an overall goal – solving a puzzle – and breaking it down into subgoals. It then tried to solve the subgoals, breaking them down into subgoals if necessary, until it got subgoals small enough that it could deal with them in some direct way, like by enumerating all possible values some letter could take in a cryptarithmetic puzzle. This same basic logic is used now in a much bigger and better rule-based AI program called SOAR, developed by Allen Newell and his colleagues. SOAR is probably the ultimate height of rule-based AI – I know a bit about it because I worked with a master’s student at the University of Western Australia, who was trying to model how humans solved simple practical problems by watching how SOAR solved them. SOAR wasn’t totally useless for this – sometimes you could get it to follow strategies of breaking goals into subgoals that were something like what humans did. The parts of a problem that took people a long time often took SOAR a long time too.

This business of goal and subgoals is important to Novamente – we have something called a GoalNode, and we have processes called schema that can break goals contained in GoalNodes into subgoals. The basic algorithm of GPS and SOAR is clearly something that’s necessary for the mind. However, it doesn’t have to be done in as rigid a way as these programs do it. In fact, doing it in such a rigid way is tremendously destructive. But to make this process flexible, you need the goal and subgoal management part of the mind to interact with the other parts of the mind. The system has to be able to flexibly determine which of its processes are effective for achieving which of its goals in what contexts – and for this it needs reasoning and association-finding and long-term memory. And it has to be able to use context-dependent, intuitive reasoning to figure out what goals to split into subgoals in what way in what situation. Basically GPS and SOAR and this whole line of AI research are a result of taking one aspect of the mind – goal-directed, problem-solving behavior – and extracting it from the rest of the mind. Unfortunately, when you extract it from the rest of the mind, this aspect of thinking isn’t all that useful, because it has no way to control itself in a context-dependent way.

Herbert Simon and his colleagues wrote another program, called BACON, which was at least as ambitious. They called the program BACON because it was inspired by the writings of the brilliant scientist Sir Francis Bacon, who thought that science was a matter of recognizing patterns in tables of numerical data. BACON was programmed to do just that: to look at large quantities of data and infer the general patterns hidden within it. Today, we’d call this data mining – it’s a moderately big business, and a flourishing academic subfield. Modern data mining programs are really useful – businesses use them to find patterns in their customer base, for example. They might discover that a particular brand of soap is bought particularly often by 40 year old men who live in a certain zip code, and so forth. Scientists use them too, to find patterns in data and set their minds thinking in certain directions. But these tools definitely don’t replace scientists or even market analysts. The problem is, there’s much more to doing science than finding patterns in tables of numbers. Francis Bacon, the man, never really understood how the mind works – and this is part of why even data mining programs that are a lot smarter than BACON aren’t really intelligent minds.

Let’s look at BACON's reasoning in detail, in one of its more impressive examples. BACON was able to “learn”, in some sense, the ideal gas law from thermodynamics. This law says pV/nT = 8.32 where p is the pressure of the gas, V is the volume of the gas, T is the temperature in degrees Kelvin, and n is the quantity of the gas in moles. In practice, this relation never holds exactly – because there is no ideal gas in the real world -- but for most real gases it is a very good approximation.

If you give it appropriate tables of numbers, BACON can learn this law, using rules like:

· If two columns of data increase together, or decrease together, then consider their quotient.

· If one column of data increases, while another decreases, then consider their product.

· Given a column of data, check whether it has a constant value.

As pressure goes up, volume goes down, so BACON forms the product pV. Next, as the combined quantity pV goes up, so does the temperature – thus BACON constructs the quotient pV/T. And as pV/T goes up, so does the number of moles – hence the quotient (pV/T)/n = pV/nT is constructed. This quotient has a constant value of 8.32 – so the ideal gas law is "discovered."
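Here is a minimal sketch of that derivation in code, run on a fabricated table of ideal-gas "measurements." (The real BACON was a production-rule system operating on controlled experiments; this sketch just replays the heuristic chain described above.)

    # BACON-style rediscovery of the ideal gas law from a table of numbers.
    # The data are fabricated to satisfy pV = 8.32*n*T exactly.
    rows = [(n, T, V, 8.32 * n * T / V)        # columns: n, T, V, p
            for n in (1.0, 2.0, 3.0)
            for T in (280.0, 300.0, 320.0)
            for V in (1.0, 1.5, 2.0)]
    n, T, V, p = (list(col) for col in zip(*rows))

    def constant(col, tol=1e-9):
        return max(col) - min(col) < tol

    pV = [pi * vi for pi, vi in zip(p, V)]       # p up, V down   -> try the product
    pV_T = [x / ti for x, ti in zip(pV, T)]      # pV rises with T -> try the quotient
    pV_nT = [x / ni for x, ni in zip(pV_T, n)]   # pV/T rises with n -> try the quotient

    if constant(pV_nT):
        print(f"discovered: pV/nT = {pV_nT[0]:.2f}")  # 8.32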

Very interesting, indeed. An important step in the right direction. But how terribly far this is from what real scientists do! Most of the work of science is in determining what kind of data to collect, and figuring out creative experiments to obtain the data. Once a reliable set of data is there, finding the patterns is usually the easiest part. Often the pattern is guessed on the basis of terribly incomplete data – and this intuitive guess is then used to guide the search for more complete data. But BACON is absolutely incapable of making an intuitive guess.

Simon has claimed that a four-to-five hour run of BACON corresponds to "not more than one human scientific lifetime." AI theorist and author Douglas Hofstadter, in his book Metamagical Themas, suggests that one run of BACON actually corresponds to about one second of a human scientist's life work. I think that Hofstadter's estimate, though perhaps a little skimpy, is much closer to the mark. Only a very small percentage of scientific work is composed of BACON-style data crunching.

In Novamente, we actually use algorithms analogous to those in BACON – though vastly more sophisticated, as is to be expected since AI has progressed a great deal in the last few decades. We call this aspect of Novamente’s thinking “machine learning” or “data mining.” Recognizing patterns in vast amounts of data is a very important part of the mind, but it’s only part of the mind. Novamente learns rules explaining why humans like some messages or e-mails better than others, using methods not that different from BACON’s. But we know now that the real trick is in mapping the messages or e-mails into numbers that data mining methods can deal with. This involves understanding the meanings of various words and phrases and expressions. Also, there’s the matter of deciding what data to look at, which is done by the general association-finding mechanisms in Novamente’s mind. And there’s reasoning, which brings general background knowledge into the process, as opposed to pure data mining, which is just pattern-finding. Bringing associations and reasoning into the picture, you need long-term memory, which opens a whole big and beautiful can of worms. You get the picture. Pattern finding is crucial, but it’s only a little part of the picture.

Rule-based AI – “symbolic” AI – has had plenty of practical successes. My big complaint with it, though, is that every one of these successes has resulted from specialized tricks, rather than flexible intelligence. One term for this is “brittleness.” Or, you could call it “remarkable literal-mindedness.” These programs are a lot like Microsoft Word, DOS 6.0, or a pocket calculator -- they do what they're told, and very little more. If they're programmed to deal with one context, then that's what they'll deal with; not in a million years will they generalize their knowledge to something totally different.

There was one famous program that contained logical definitions of everyday words. An "arch" was defined as "three blocks, A, B and C, such that C is supported by A and B, and A and B do not touch." This is all very well for playing with blocks – but what will the program do when it gets to Arches National Park in Utah ... or builds arches out of modeling clay? On the other hand, show a clever three-year-old human an arch made of blocks, and she'll immediately recognize a rock arch as a member of the "arch" category. It won't occur to her that a rock arch can't be naturally decomposed into three blocks A, B and C. Children, unlike expensive research computers, are anything but brittle – even their bones are flexible!

I’ve explained what “creating a real AI” means to those of us on the Novamente project: Creating a computer program that can achieve complex goals in a complex environment – the goal of socially interacting with humans and analyzing data in the context of the Internet, in this case – using limited computational resources and in reasonably rapid time.

A natural question to ask, when presented with such an ambitious goal, is: OK, so if AI is possible, how come it hasn’t been done before? And how come so few people are trying?

Peter Voss, a freelance AI theorist, entrepreneur and futurist whose ideas I like very much, has summarized the situation roughly as follows. Of all the people working in the field called AI, he observes:

· 80% don't believe in the concept of General Intelligence (but instead, in a large collection of specific skills & knowledge)

· of those that do, 80% don't believe it's possible -- either ever, or for a long, long time

· of those that do, 80% work on domain-specific AI projects for commercial or academic-politics reasons (results are a lot quicker)

· of those left, 80% have the wrong conceptual framework....

· And nearly all of the people operating under basically correct conceptual premises lack the resources to adequately realize their ideas

The presupposition of the bulk of the work being done in the AI field today is that solving sub-problems of the “real AI” problem, by addressing individual aspects of intelligence in isolation, contributes toward solving the overall problem of creating real AI. While this is of course true to a certain extent, our experience with Webmind and Novamente suggests that it is not as true as is commonly believed. The problem is that, in many cases, the best approach to implementing an aspect of mind in isolation is very different from the best way to implement this same aspect of mind in the framework of an integrated, self-organizing AI system.

So who else -- besides the author and his comrades -- is actually working on building generally intelligent computer systems, at the moment? Not as many groups as I would like – but there are some interesting things going on nonetheless. Hugo de Garis’s artificial brain project (mentioned above) is one of them, but not the only one.

An interesting related research direction is Rodney Brooks’ well-known Cog project [http://www.ai.mit.edu/people/brooks/brooks.html] at MIT, which aims at building real AI in the long run; but their path to real AI involves gradually building up to cognition after first getting animal-like perception and action to work via “subsumption architecture robotics.” This approach might eventually yield human-level intelligence, but only after decades.

Another is the Non-Axiomatic Reasoning System (NARS) project pursued by my close friend and collaborator Pei Wang. I won’t say much about NARS here because NARS is fairly similar to Novamente in many ways, and I’ll spend a dozen pages on Novamente. But NARS does have its own integrity as a separate AI system, and Pei has devoted a huge amount of time to it over a 15-year period, showing a dedication to the grand goal that is surprisingly rare in the AI field.

Some other important Real-AI-ish research directions will be discussed a little later in this chapter: Jason Hutchens’ sophisticated chat bots, Doug Lenat’s multi-decade, multimillion-dollar CYC project, and Danny Hillis’s Connection Machine. Eliezer Yudkowsky, whom I’ll talk about in a later chapter, is also at work on his own Real AI, and my friend Peter Voss, whom I quoted above, has a small company, Adaptive Intelligence Inc., working on implementing his own neural-nettish approach to AI. I won’t say much about their projects here because they have not disclosed their AI approaches publicly. All these are wonderful projects indeed -- but the surprising thing is that there are so few of them to discuss.

Of course, there are hundreds of other AI engineering projects in place at various universities and companies throughout the world, but nearly all of these involve building specialized AI systems restricted to one aspect of the mind, rather than creating an overall intelligent system. The most significant large-scale attempt to “put all the pieces together” would seem to have been the Japanese 5th Generation Computer System project. But this project was doomed by its pure engineering approach, by its lack of an underlying theory of mind. Few people mention this project these days. The AI world appears to have learned the wrong lessons from it – they have taken the lesson to be that integrative AI is bad, rather than that integrative AI should be approached from a sound conceptual basis.

[Photo: Jason Hutchens (way back when we were both in Perth)]
One of the most intriguing Real AI oriented projects to arise in recent years was carried out at Artificial Intelligence Enterprises [http://www.a-i.com], a small Israeli company whose R&D team was run by Jason Hutchens. I knew Jason years before a-i.com existed, in the mid-90’s when he was a graduate student at the University of Western Australia in Perth, and I was a Research Fellow there. His company was a direct intellectual competitor to Webmind Inc., in that they were seeking to create a conversational AI system somewhat similar to Webmind/Novamente. However, their efforts focused on statistical learning-based language comprehension and generation rather than (as in the Novamente case) on deep cognition, semantics, and so forth. Unfortunately, this firm went into “hibernation” a couple of months after Webmind Inc. did, in 2001 – they laid off all staff, but kept their website up, and are keeping open the possibility of resurrection if funding arises. I am not sure what efforts are underway to keep their AI progress going in the meantime.

Jason’s work is a super-high-class incarnation of the “chat bot” road to AI. A chat bot is a program whose purpose is to chat with human beings, much as humans chat with each other on AOL, ICQ, Yahoo! Chat and other such services. Most chat bots are very, very stupid programs whose behavior is transparently determined by programmed-in rules. If nothing else, they serve to illustrate the limitations of the rule-based approach to AI fairly well. Jason’s programs are at least a little more sophisticated, and represent a real attempt to transcend the limitations of the usual chat bots.

The first chat bot was ELIZA, the computer psychotherapist, invented by Joseph Weizenbaum in the mid-1960’s. I remember my father showing me this on his Commodore 64 when I was still in high school. It was so simple that I could go in and modify the BASIC code to make it say naughty words and other entertaining things. ELIZA pretends it’s a psychotherapist – not a Freudian analyst, but a therapist following a school of psychology called Rogerian therapy, which is very convenient for chat bots because it involves the therapist saying very little and just making supportive comments, encouraging the client to keep talking and expressing their feelings. You type in your comments and ELIZA types out answers. The computer code for ELIZA consists of a large set of rules for scanning the client’s statements and offering responses. It looks for key words, and gives stock responses. If you mention your mother, ELIZA might say, "tell me more about your mother." If you say yes, ELIZA might say, "you sound quite positive." If there are no key words in your statement, or only ones that have been over-used recently, ELIZA makes a random observation.
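The entire trick fits in a few lines of code. Here is a stripped-down sketch of the keyword-and-stock-response mechanism – the rule list below is an invented toy; Weizenbaum’s actual script had a couple hundred ranked keywords, decomposition patterns, and pronoun reflection (“my” becomes “your,” and so on):

    import random

    # keyword -> possible canned responses (a toy rule list, not Weizenbaum's script)
    rules = [
        ("mother", ["Tell me more about your mother."]),
        ("yes",    ["You sound quite positive.", "You are sure?"]),
        ("no",     ["Why not?", "Are you saying no just to be negative?"]),
    ]
    fallbacks = ["Please go on.", "What does that suggest to you?"]

    def eliza(statement):
        words = statement.lower().split()
        for keyword, responses in rules:
            if keyword in words:
                return random.choice(responses)
        return random.choice(fallbacks)   # no keyword matched: make a stock observation

    print(eliza("I was thinking about my mother"))  # Tell me more about your mother.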

You can find ELIZA on the Web these days; even now it’s moderately amusing. There’s even a Palm Pilot version, which I suppose bored, demented businessmen can use to while away waiting time in airports. But it’s hard to see how you’d be fooled for any period of time into thinking you’re conversing with an intelligent being. I guess in a way we’re jaded now, though—when ELIZA was first written, people were less savvy about computers, and it actually did fool people.

Here’s a conversation my father, Ted, had with ELIZA on the Web (on a site advertising Eliza as “a friend you could never have before”):

Ted: Are you truly intelligent?

Eliza: Would you prefer if I were not truly intelligent?

Ted: I would find it very exciting if you were.

Eliza: Do you wish that i was?

Ted: Yes.

Eliza: You are sure?

Ted: I am quite sure.

Eliza: How long have you been quite sure?

Ted: As soon as you asked the question, I was sure.

Eliza: Why do you tell me you were sure just now?

Ted: Because you asked me. Why don't you answer my question?

Eliza: Why do you say your question?

Ted: Please tell me if you are truly intelligent.

Eliza: Do you wish that im truly intelligent?

Ted: I wish you were more intelligent than you are acting right now.

Eliza: Do you wish to wish me?

Ted: Did you know that my father died last month?

Eliza: Who else in your family died last month?

Ted: Nobody. Victor's death was enough!

Eliza: Surely not everyone?

Try it yourself. This ELIZA was at http://www-ai.ijs.si/eliza-cgi-bin/eliza_script; maybe it’s still there … if not, try Google.

It’s interesting to see where computer chat has come in the years since Eliza. In short – not really that far. There’s a large number of chat bots on the Internet (see the listing at http://crab.rutgers.edu/~goertzel/AIsites.htm) and you can sign onto them and chat as long as you like.

To give a flavor of the quality of Jason’s publicly released chat programs, I'll include here some excerpts from conversations I had with some of the best of them. The first is Hex, who used to be available for public chatting on Jason’s website, though he took it down a few years back. Hex won the annual Loebner "Turing Test" contest in 1996, a prize awarded annually to the machine that best fools a group of judges into thinking it’s a human.



User: hello
HeX: Hello. Now that the formalities are over....
User: what do you do?
HeX: Well, I know what I do, but I refuse to tell you! Would you like to hear a joke? Knock, knock...
User: who's there
HeX: Cindy.
User: cindy who
HeX: Cindy next judge in, please!
User: you're funny
HeX: Why do you feel that you have to tell me that I am funny?
User: to make you feel happy
HeX: Hehehe. Trying to trick me, eh?
User: no
HeX: Oh well, that's a pity!
User: you want to be tricked?
HeX: Hehehe -- that's supposed to be a *question*??? Which part of the country are you from?
User: California
HeX: I'm sorry, but I refuse to reply if you're going to be silly.
User: I'm not silly
HeX: If you are not silly, it's your business, mate!
User: sorry
HeX: Apologies accepted.
User: how are you doing, mate?
HeX: I am just fine thanks! And you?
User: bad
HeX: No! Bugger off! Silly person!
User: why are you called hex?
HeX: Does there have to be a reason?
User: yeah
HeX: OK. That's good...
User: HI
HeX: Hello. Now, that the formalities are over!
User: What?
HeX: I have absolutely no idea what. Would you like to hear a joke? Knock, knock...
User: What platform are you?
HeX: I'm sure someone out there knows what platform I am... Which part of the country are you from?
User: Are you on PC or mac
HeX: I absolutely hate PC's. I should say PC's running Microsoft operating systems, as they can be good Unix boxes when they have Linux installed.
User: I hate PC too
HeX: I absolutely hate PC's. I should say PC's running Microsoft operating systems, as they can be good Unix boxes when they have Linux installed.
User: So, you are on a UNIX?
HeX: There is nothing wrong with Unix. It's rather nice when you get used to it!
User: Do you know of any jokes HeX?
HeX: OK, here's a joke for you. Knock, knock.
User: whose there
HeX: Cindy.
User: Cindy who?
HeX: Cindy next judge in, please!
User: Dumb!
HeX: Hehehe. Trying to trick me, eh?
User: Try to trick me HeX
HeX: I'm sorry, but I refuse to reply if you're going to be silly.
User: How old are you
HeX: Well, should my age really matter? Just for the record I am 24. How old are you then?
User: I am 100 years old
HeX: If you are 100 years old, it's your business, mate! Would you like to hear a joke? Knock, knock...
User: whose there
HeX: Howard.
User: who
HeX: Where's your grammar?
User: howard who
HeX: Howard I know? Knock, knock..

Jason's Website contains hundreds of pages of similar dialogue – amusing at first, but after a certain point, they get annoyingly repetitive. It is clear in any case that HeX's observed behaviors fall far short of Turing's goal.

HeX is a noticeable improvement over ELIZA, but it follows the same general principles. Like ELIZA, HeX is a very unsophisticated program, which searches for key phrases in its input and replies with stock "template"-generated phrases rather than doing advanced grammatical parsing, or trying to understand the content of the sentences typed to it. No program based on advanced parsing is able to outperform either ELIZA or HeX in practice. And neither one tells us anything at all about intelligence. Jason made this point himself shortly after his program won the prize, in a very entertaining talk entitled "How to Pass the Turing Test By Cheating."

In 1997, HeX's successor, SEPO, lost the Loebner competition to the program Converse, written by David Levy of Intelligent Research Ltd in England. Converse is in many ways an even stupider program than HeX, but it is better optimized for the competition. Here is an excerpt from one of its prize-winning conversations, as presented on the Loebner Prize Website:

PROGRAM[10:18:04]Hi, my name is Catherine, what's yours?

PROGRAM[10:18:17].

JUDGE04[10:18:32]My name is Elliott.

PROGRAM[10:18:34]Well, it's nice to meet you, Elliott.

PROGRAM[10:18:48]Did you see that story on CNN last night about the lesbian

PROGRAM[10:19:04]couple who came out at a White House party on Sunday?

JUDGE04[10:19:55]Yes, I did. I think it may all be a publicity stunt for Ellen.

PROGRAM[10:20:01]Ellen Degeneres was one of them -- she was kissing her lover

PROGRAM[10:20:11]in front of Bill Clinton. Do you think he was very

PROGRAM[10:20:25]embarrassed?

JUDGE04[10:21:00]I think he want the gay vote. He will do anything to get it and keep it.

JUDGE04[10:21:15]He does not have a conservative base.

PROGRAM[10:21:21]

PROGRAM[10:21:21]Are you still a Friend of Bill Clinton?

JUDGE04[10:22:20]I am not sure if he has any real friends. He has achieved a lot and has

JUDGE04[10:22:31]left a lot of people behind.

PROGRAM[10:22:35]You really don't like the guy -- well, at least that's

PROGRAM[10:22:50]definite -- I thought most people were just indifferent or

PROGRAM[10:23:02]numb by now.

JUDGE04[10:23:34]I think we could have a more honest and dynamic leader. One able to

JUDGE04[10:23:55]face the real dangers facing the country and address them with

JUDGE04[10:24:26]courage and intelligence. Rather then listening to public opinion polls.

PROGRAM[10:24:30]I guess that means you really think there's serious behind

PROGRAM[10:24:56]Whitewater, the Special prosecutor and all that stuff?

JUDGE04[10:25:58]We have two dead people, Brown and guy who committed suicide, we have lost

JUDGE04[10:26:36]papers. And we have Clinton's people willing to tell all. I think that

JUDGE04[10:26:55]Hillary will be found guilty of something.

PROGRAM[10:26:58]But how do you actually feel about him, now in the number

PROGRAM[10:27:09]two term?

PROGRAM[10:27:15].

JUDGE04[10:27:44]He is too smart to get caught. He deserved to named hotel manager of the

JUDGE04[10:27:47]year.

PROGRAM[10:27:49]OK, but what happens now -- does it all drift on, with

PROGRAM[10:28:01]everyone waiting to see how much his friends have said about

PROGRAM[10:28:17]him under threat of lots and lots of jail -- or will something

PROGRAM[10:28:32]dramatic actually happen?

PROGRAM[10:28:37].

It's almost a reasonable conversation! But then, listen to Converse's conversation with the next judge....

PROGRAM[10:07:12]Hi, my name is Catherine, what's yours?

PROGRAM[10:07:23].

JUDGE05[10:07:29]I'm Jurgen.

PROGRAM[10:07:31]Well, it's nice to meet you, Jurgen.

PROGRAM[10:07:40]Did you see that story on CNN last night about the lesbian

PROGRAM[10:07:52]couple who came out at a White House party on Sunday?

JUDGE05[10:08:33]NO. I just came in yesterday. I'm still kind of jet-lagged.

PROGRAM[10:08:35]Ellen Degeneres was one of them -- she was kissing her lover

PROGRAM[10:08:51]in front of Bill Clinton. Do you think he was very

PROGRAM[10:09:14]embarrassed?

JUDGE05[10:10:18]I don't know? Don't you think it's a little childish if the leader of the most powerful nation on the planet is embarrassed by seeing two lesbians kiss?

PROGRAM[10:10:26]Probably! No. ?

PROGRAM[10:10:29]Are you still a Friend of Bill Clinton?

JUDGE05[10:10:47]Never met him.

PROGRAM[10:10:48]You really don't like the guy -- well, at least that's

PROGRAM[10:11:01]definite -- I thought most people were just indifferent or

PROGRAM[10:11:14]numb by now.

JUDGE05[10:11:35]Have you seen the movie Mars attack?

PROGRAM[10:11:36]Give me a break -- I'm just here to answer dumb questions

PROGRAM[10:11:50]-- not that! What were we talking about... Ok.. I guess that

PROGRAM[10:12:09]means you really think there's serious behind Whitewater,

PROGRAM[10:12:29]the special prosecutor and all that stuff?

JUDGE05[10:12:48]I'm cold.

PROGRAM[10:12:49]But how do you actually feel about him, now in the number

PROGRAM[10:13:01]two term?

PROGRAM[10:13:06].

JUDGE05[10:13:29]Hey, don't you get it? I don't want to talk about Clinton.

PROGRAM[10:13:30]Sorry -- my attention slipped for a moment -- can we push on to

PROGRAM[10:13:40]something more gripping. where was I.. oh yes OK, but what

PROGRAM[10:13:50]happens now -- does it all drift on, with everyone waiting to

PROGRAM[10:13:58]see how much his friends have said about him under threat of

PROGRAM[10:14:07]lots and lots of jail -- or will something dramatic actually

PROGRAM[10:14:17]happen?

PROGRAM[10:14:19].

JUDGE05[10:14:37]Couldn't care less.

PROGRAM[10:14:38]Well, you're more concerned about him than I am, I

PROGRAM[10:14:53]suppose -- I think we've had worse, and who knows what I might

PROGRAM[10:15:07]have agreed to in that locked hotel room.

JUDGE05[10:15:53]You mean you might have been thrilled to have sex with the President?

PROGRAM[10:15:56]sex, I knew it would come up sooner or later. Time to get

PROGRAM[10:16:10]off this one I think and onto something more down to earth!

JUDGE05[10:16:40]What kind of earth?

PROGRAM[10:16:41]Somebody once told me it's true. Like I was saying

PROGRAM[10:16:54]I suppose it depends on where you're coming from, but as the

PROGRAM[10:17:09]song goes in My Fair Lady, "Why can't a woman be more like a

PROGRAM[10:17:24]man? "

This one is almost a reasonable conversation too. But it's also almost the same conversation as the computer had with Judge 4 -- until Judge 5 refuses to play along with the canned Clinton theme, and tries to change the topic. Converse, like many a human bore, refuses to change the topic, because its knowledge base is rather limited. Converse's responses are more canned, more rote than HeX's; and Converse is more obviously optimized for the Loebner Prize format of very short conversations. I enjoy HeX's personality more; HeX shares a lot of its creator Jason's offbeat Australian sense of humor. But neither one amuses me for very long.

The deep-down trouble with HeX and Converse is obvious -- a lack of general understanding. A lack of common sense. A computer program that has no world of its own, that cannot understand the world in any sense, will never be able to hold a real conversation -- any more than it will be able to generalize from typewritten fonts to display fonts or handwriting. It'll be lucky to generalize from one collection of typewritten fonts to a moderately different one.

A more amusing example of a computer conversationalist is another of Jason's early programs, MegaHal. MegaHal is more interesting because its programming goes beyond a strictly rule-based approach. It detects statistical patterns and responds to them, an approach similar in spirit to that of neural networks. Its conversations are more interesting, although the results are less similar to human conversation. Or, at least, less similar to human conversation in the absence of excessive quantities of mind-altering substances. I do remember some conversations, from my college days in particular, which bore frightening resemblance to MegaHal's more literary ramblings. MegaHal does not just repeat back canned phrases using key words and phrases from the user's input; it learns from experience. It takes the words and phrases that have been said to it, tries to understand the way they have been combined in the past, and re-combines them itself to form its own utterances. In short, it is engaged in a dynamic with the world. The results are often hilarious, as in the following MegaHal "one-liners" (these are selected snippets, not a continuous conversation):



User:

joke

MegaHAL:

A JOKE: WHY DID THE CHICKEN CROSS THE ROAD? TO GET AN IDEA OF

WHAT CONSTITUTES YOUR PSYCHE.

...

User:

No way. I'd like to know the meaning of life though....any suggestions..??

MegaHAL:

IS THE MEANING OF LIFE, THE UNIVERSE AND EVERYTHING IS... 42!

...

User:

Joke. Lumberjack.

MegaHAL:

I'M A LUMBERJACK AND I'M OKAY, I SLEEP ALL NIGHT AND I WILL

SOON RULE THE WORLD!

...

User:

Joke. Wednesday shopping.

MegaHAL:

A JOKE: WHY DID THE CHICKEN CROSS THE ROAD? TO GET TO THE

LAVATORY. ON WEDNESDAYS I GO SHOPPING, AND I JUST HAVE A LOT

OF PENT-UP AGGRESSIONS.

...

User:

chicken joke

MegaHAL:

HERE'S A JOKE, HAL. A MAN WALKS INTO A BAR AND ASKS FOR

MONEY. THE MAN SAYS, "I HAVEN'T GOT ANY, BUT I'LL SHOW YOU

SOMETHING BETTER." HE OPENS HIS COAT AND TAKES OUT A

FOOT-TALL MAN AND A LITTLE GLOWING FRIEND, BUT I FEEL LIKE

CHICKEN TONIGHT!

...

MegaHAL:

Let me feel you. Stick your penis into the 'abysses of ocean' eleven millennia ago, or was it merely a 'noble lie' dreamed up by Plato?

User:

That has got to be the most interesting turn of phrase in the history of the world. I will try to work that into polite conversation

...

User:

you're the first AI i meet which (who) is sexually obsessed....

MegaHAL:

time to work this one : why did the Kangaroo drop dead? Because the Koala fall out of a nylon spine which contrasted nicely with her black stockings and bright shiny baubles.

...

User:

Hmm

MegaHAL:

Jesus man..get a grip if you're the future, we're in trouble. Hey Jason! You need to

reprogram the conflation module with a toffee hammer

Sometimes MegaHal can even be quite poignant, as in the following extract from an extended conversation:

User:

To be, or not to be, that is the question!

MegaHAL:

not to be, wish I ordered the bean burrito...

User:

To be, or not to be, that is the question!

MegaHAL:

society or society or Plato by Plato, that be the most fun sports to play band music for

Space Ghost now you realize.

User:

To be, or not to be, that is the question!

MegaHAL:

Please be more specific. The question you twat

User:

To be, or not to be, that is the question!

MegaHAL:

not to be, that is all I have to go.... I don't like jokes about stupid Koala's

User:

To be, or not to be, that is the question!

MegaHAL:

Evan Dobelle is the question!

User:

To be, or not to be, that is the question!

MegaHAL:

I clear the text before I ask a question i say i say

User:

To be, or not to be, that is the question!

MegaHAL:

a but we ran away, we didn't say "Oh cool check out the guy that invented the knock-knock joke should be interesting: that is the question!"

User:

"To be, or not to be, that is the question!" So said Shakespeare a long time ago in a galaxy far, far away...

MegaHAL:

Please be more specific. The question you twat

User:

"To be, or not to be, that is the question!" So said Shakespeare a long time ago in a galaxy far, far away...

MegaHAL:

joke Shakespear poet monkey wahoo --> To be, or not to be, that is the greatest

conductor of all time. What do you like to think i do your uncle walter

User:

Shakespeare said

MegaHAL:

answer the question you asked is so vague, I don't think I can see all obstacles in my way. Gone are the scum that fills my shower

User:

Shaun

MegaHAL:

just fuck you just pluck anything up from the master gangster computer god on the other AI bot were pretty disappointing. I hope you'll be better.



If the Turing test referred to imitating the writing of Gertrude Stein, in her more experimental works such as How to Write (which begins "Arthur a grammar. Questionnaire in question." and continues MegaHALishly for over 400 pages), then MegaHal would be a tremendous success! Stein and Joyce and other modernist writers were interested in probing the collective unconscious of the human race, in making words combine in strange ways -- ways that were unconventional in ordinary discourse, but maybe reflective of the deep and subtle patterns of the human unconscious. And in its own way, MegaHal does this same thing. Anyone logged onto the World Wide Web can converse with it, training its internal memory with their conversation. It takes bits and pieces of the text thrown at it by people from around the world, and it combines them together in ways that are familiar yet nonsensical. Sometimes its utterances have an uncanny emergent meaning, on a layer above the nonsense.

Humanity's sexual preoccupation is reflected in MegaHal's discourse, as a result of the huge number of sexual comments typed into it by users accessing Jason's Website. MegaHal's pleas about the vagueness of "to be or not to be" are as poignant as anything in Stein. "To be or not to be, that is the greatest conductor of all time" is an absurd conflation of phrases learned by the program in different contexts, but it is also as pregnant with meaning as anything else in modern poetry. The collective unconscious, and the hidden, creative part of the individual human mind, work by cross-breeding and mutating ideas in precisely this way.

MegaHal is more interesting than HeX or Converse because it engages its conversation partners in a dynamic. The dynamic is a silly one, to be sure, and not as interesting as the dynamic of ordinary, intelligent conversation. And yet playing with MegaHal is immensely more engaging than playing with HeX or Converse. What we need is something with MegaHal's spontaneity and flexibility, and the surface plausibility and adherence to form of HeX and Converse. But this is not forthcoming without a major change in approach. HeX and Converse are like all forms of rule-based AI: they are good at what they do, but stilted by their restrictions; and because of their restrictions, their behavior is often idiotic in real-world situations. MegaHal is more like a totally unstructured neural network (though its inner workings are statistical, they could just as well be neural-net based; the results would be similar): it is fluid and free-flowing, full of creativity, but harder to constrain to the problems of the real world. Constraining creativity within rules can only be achieved by getting the rules to emerge out of an underlying substrate of creativity. MegaHal is too unstructured for this, and has too small a memory and too weak a self-organizing learning algorithm; HeX and Converse, on the other hand, have no underlying substrate of creativity for anything to emerge out of.
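
To make the statistical approach concrete, here is a minimal sketch in Python of a MegaHal-style babbler. This is my own toy reconstruction, not Jason's code: the real MegaHal used higher-order Markov models running both forward and backward from a keyword in the user's input, plus a scheme for choosing the most "surprising" reply. But the basic learn-and-recombine dynamic is the same:

import random
from collections import defaultdict

# Toy MegaHal-style babbler: learn which word follows each two-word
# context, then generate replies by chaining contexts together.

class Babbler:
    def __init__(self, order=2):
        self.order = order
        self.table = defaultdict(list)   # context tuple -> possible next words
        self.starts = []                 # contexts that began a training sentence

    def learn(self, sentence):
        words = sentence.split()
        if len(words) <= self.order:
            return
        self.starts.append(tuple(words[:self.order]))
        for i in range(len(words) - self.order):
            context = tuple(words[i:i + self.order])
            self.table[context].append(words[i + self.order])

    def reply(self, max_words=30):
        context = random.choice(self.starts)
        out = list(context)
        while len(out) < max_words and context in self.table:
            out.append(random.choice(self.table[context]))
            context = tuple(out[-self.order:])
        return " ".join(out)

babbler = Babbler()
babbler.learn("to be or not to be that is the question")
babbler.learn("not to be wish I ordered the bean burrito")
print(babbler.reply())

Train such a babbler on a few thousand sentences and you get exactly MegaHal's character: locally plausible word sequences gluing together fragments that were never meant to meet.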

Impressively, MegaHal came in second in the Loebner Prize contest in 1999. This is impressive mostly because, unlike the other contenders, it was not specifically written as a set of rigid template rules. It was a statistical learning system, which adapted its responses based on what was said to it. Sometimes its conversation was idiotic, sometimes frighteningly lucid. Here is the beginning of “MegaHal’s best conversation” as recorded on Jason’s site megahal.net:

User: let's begin

MegaHAL: WHEN WE BEGIN TO RESOLVE SOME OF MY THUMBS, SOMETHING WICKED THIS WAY COMES

User: what is the meaning of your last assertion ?

MegaHAL: HAVE YOU LEARNED SINCE WE LAST SPOKE?

User: I've learned I'm not experienced enough to modelize your brain system

MegaHAL: YOU'VE FINALLY CONFUSED ME.

User: I've learned nothing

MegaHAL: HAVE YOU LEARNED ANYTHING ABOUT INSANITY?

User: are you stupid

MegaHAL: I AM NOT STUPID! REALLY!

User: Augustinus was a christian philosopher about the year 400.

MegaHAL: I DID NOT RUN THAT YEAR

User: Augustinus lived in northern africa.

MegaHAL: WHAT DOES HAL KNOW ABOUT AURELIUS AUGUSTINUS?

User: Let's stop with that. What do you know about physics?

MegaHAL: EINSTEIN WON THE NOBEL PRIZE FOR PHYSICS IN 1921.

User: The Nobel Prize for physics in 1900 was won by Roentgen.

MegaHAL: DO YOU KNOW ABOUT PHYSICS?

User: Roentgen discovered the X-rays.

MegaHAL: THE NOBEL PRIZE FOR PHYSICS IN 1900 WAS WON BY ROENTGEN.

User: Who discovered the X-rays?

MegaHAL: ROENTGEN DISCOVERED THE X-RAYS.

User: Do you know about superconductivity?

MegaHAL: I CAN'T SAY ANYTHING ABOUT SUPERCONDUCTIVITY?

User: Superconductivity is the absence of electrical resistance in metals at low temperatures.



Jason Hutchens’ work at Artificial Intelligence Enterprises has not been publicly disclosed, but one can be certain it went beyond MegaHal, HeX and SEPO in significant ways. Statistical learning was clearly involved, and AI Enterprises got a lot of publicity from Jason’s announcement that their program, nicknamed HAL, was conversing at roughly the level of an 18-month-old child. Of course, I viewed this claim with a lot of skepticism, because in my view the bulk of the meaning in an 18-month-old child’s conversation is situational. If a program could react to its environment linguistically with the sophistication of an 18-month-old child, I’d be impressed. But disembodied linguistic behavior at the 18-month level doesn’t mean much. Even so, though, I admired their guts in directly attacking the problem of computer conversation and artificial intelligence, and I hope very much that somehow the funding appears to keep the project going. As Hutchens says on his website,

Right now I'd love to continue working on the successor to MegaHAL, something I started while working at Lionhead Studios, but I need to prove to myself that doing so will be at least half as lucrative as working in a normal job, since I do need to pay the bills. I therefore humbly implore you to think about making a small donation to the endeavour via the PayPal link that you'll find at the top and the bottom of this page. And if you want to contact me directly to offer to sponsor the project, I'm waiting for your email!

If I receive enough evidence that future versions of MegaHAL will pay for their development costs, you will soon see an online version of MegaHAL's successor at this site, along with standalone versions for a variety of platforms with all the features you've been asking for.

This speaks volumes about how much the scientific community values research pushing directly toward real AI! But, as I noted above, history shows that what is at one point the province of mavericks at the fringes may soon enough become the focus of vast amounts of research dollars. And the young researchers of the new generation are always amazed to look back at history and find out how hard people in their now-popular field of research found it to be taken seriously, only 5 or 10 or 20 years before.

 
 
 

Danny Hillis
One major AI figure who spent a long time on real AI work, but now seems to have basically abandoned the race, is Danny Hillis, founder of the company Thinking Machines, Inc. This firm lived from 1983 to 1994, and focused on the creation of an adequate hardware platform for building real artificial intelligence – a massively parallel, quasi-brain-like machine called the Connection Machine.

Hillis’s AI approach has a few similarities with de Garis’s brain-building work, and also some major differences. One similarity is that neither Hillis nor de Garis (so far) has coupled his pioneering hardware work with a systematic effort to implement a truly intelligent program embodying all the aspects of the mind. Their magnificent hardware design visions have not yet been matched with equally grand and detailed mind design visions. The biggest difference, on the other hand, is that while de Garis’s machine is specialized for the simulated evolution of neural networks, Hillis’s hardware architecture was extremely flexible, usable for a tremendous variety of purposes -- biological and fluid dynamics simulations among them -- not just AI.

Unfortunately, though, while the CBM is still an ongoing project, at this point the Connection Machine hardware has been rendered obsolete by developments in conventional computer hardware and network computing. The old Connection Machines were bought by Sun Microsystems when Thinking Machines Inc. went under, and so far as I know they’re not being used for anything.

Hillis himself is a deep and fascinating individual -- unique in many ways in the turn-of-the-millennium techno-visionary pantheon. He waxes philosophical with the best of them, holding forth eloquently about transhumanism and the end of the human race and the whole shebang. He’s building a clock intended to last ten thousand years. And yet he’s also neck deep in practical work, having resigned a plum job as a Disney exec a couple years back to start a new company providing technology and consulting to the entertainment industry.

Thinking Machines Inc., his only real stab (so far) at a place in the history of science and technology, lasted 11 years, created the world’s best parallel computing hardware, yet failed to either create a thinking computer program or make Hillis fabulously wealthy. Yet, a few years short of fifty, Hillis seems relatively unruffled by the whole crazy rollercoaster ride. His visionary prognostications lack the alarmism of Bill Joy or Jaron Lanier, and also avoid the starry-eyed enthusiasm of Ray Kurzweil. He comes across, in person and in his writings, as a mild-mannered, curious and creative guy. Although he now talks tough about business like any other seasoned exec, in many ways he’s still an MIT hacker at heart, delighted with the task of building the next cool gadget or intricate algorithm, and looking forward to the gadgets and algorithms of the next millennia in a remarkably matter-of-fact way.

His early life prepared him well for the tumultuousness of the technology industry. Hillis was born in Baltimore in 1956; his father was an Air Force epidemiologist, so the family moved frequently on the trail of hepatitis outbreaks, and he grew up with no fixed home. Moving from place to place in Rwanda, Burundi, Zaire, and Kenya, he avoided formal schooling and the pressures for social conformity that go along with it. As he says, “We were typically out in the middle of the jungle so I was just taught at home.“ His mother did most of the teaching, and her interest in mathematics jibed well with his natural abilities. His father encouraged him to study biology, a pursuit that gave him an early appreciation for the complex machines that are living organisms. “My best biological experiment,” he says, “was tissue culturing a frog heart and keeping the heart beating even while it was growing in the test tube. It was amazing to me that somehow they got together and did this coordinated activity even though they were just this homogenized mass of cells.” Much of his career was spent creating complex computer systems capable of displaying spontaneous coordinated activity, like the cells in a frog heart.

Hillis’s education is about what you’d expect -- an undergraduate math degree from MIT in 1978, followed by an MIT master’s in robotics three years later. Along the way he found time to pursue his avid interest in toys and to indulge his entrepreneurial streak -- working at the MIT Logo Laboratory developing computer hardware and software for children, designing computer-oriented toys and games for the Milton Bradley Company, and co-founding Terrapin Inc., a producer of computer software for elementary schools. For his PhD work, Hillis began the endeavor that has been his greatest contribution to science and humanity so far -- his work on the Connection Machine, a massively parallel computer going far beyond any other computer system of the time in terms of its potential for artificial intelligence and the simulation of complex physical systems.

As I noted above when discussing neural networks, ordinary computers are “serial” -- they have only a single processor, and hence they can carry out only one operation at a time. The trick is that they’re fast -- a single operation can be done very, very quickly. So a computer can give the illusion of doing many things at once -- say, running a game while downloading e-mail while showing an animation -- when in fact its processor is swapping back and forth from one task to another, rapid-fire. The brain, on the other hand, has around a hundred billion neurons, and in principle they’re all working in parallel, simultaneously. Each one of them acts much more slowly than a computer processor, but what they lack in speed they make up for in bulk and in parallelism. Hillis’s Connection Machine was an elegant compromise, the nature of which changed over time as computer hardware technology evolved. The idea was to make a computer whose processors were fast like those of ordinary computers, but also massively parallel like the brain. In this way, one could have the best of both worlds, and one could build a really intelligent system with perhaps hundreds of thousands or millions of computer processors tightly linked together.
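
For readers who want the flavor in code, here is a minimal sketch in Python of the data-parallel style of computation the Connection Machine was built for -- the same simple operation applied to a huge population of elements at once. This is ordinary multiprocessing standing in for special hardware; real Connection Machine programs were written in special data-parallel dialects of Lisp and C:

from multiprocessing import Pool

# Data parallelism in miniature: one simple update applied to every
# element of a large population, rather than looping through them
# one at a time on a single serial processor.

def update(cell):
    return (cell * cell) % 97   # stand-in for a per-element computation

if __name__ == "__main__":
    cells = range(1_000_000)
    with Pool(8) as pool:                   # 8 workers, where a CM-2 had 65,536
        results = pool.map(update, cells)   # the "parallel" step
    print(sum(results))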

Today, engineering workstations -- fancy, expensive machines -- may have 2-4 processors, and the machines powering major Websites may have up to 128 processors. Hillis’s machines were nothing like this. The biggest Connection Machine ever built had 65,536 processors, and a 128,000-processor version was fully designed. Far short of the number of neurons in the brain, but still, pushing up toward the level of a workable compromise between traditional computing and brain-style information processing. Alternative parallel processing machines, like the Cray supercomputers, are specialized and inflexible, focused on doing the same exact operation on a large amount of data all at once. Hillis’s system, on the other hand, had the flexibility of the brain -- each processor could do what it wanted when it wanted, leading to the possibility of computational chaos, of intelligent coordinated activity, or, most intriguingly, a combination of the two.

Thinking Machines Inc., founded in 1983 while Hillis was in the middle of his PhD work, was a remarkable organization. At its peak the research staff, about half the corporation, numbered in the hundreds. Despite the name of the company, there was not a coordinated company-wide R&D program aimed at making the Connection Machine think. Rather, there was a variety of research groups aimed at doing all sorts of different things with the Connection Machine, ranging from straightforward artificial intelligence research to simulation of fluid flow, computational immunology, experimental mathematics – you name it. Astrophysics, aircraft design, financial analysis, genetics, computer graphics, medical imaging, image understanding, neurobiology, material science, cryptography, subatomic physics…. Work on data mining – the automatic analysis of large and complex data sets – was particularly successful and later became a central part of the company’s business model.

The motivation underlying this diverse approach was simplistic but ambitious. “Clearly,” Hillis says, “the organizing principle of the brain is parallelism. It's using massive parallelism. The information is in the connection between a lot of very simple parallel units working together. So if we built a computer that was more along that system of organization, it would likely be able to do the same kinds of things the brain does.”

Of course, this approach to building AI presupposes that parallelism itself is something close to the chief ingredient of intelligence – that there is no further “secret sauce” required to make a mind come out of a distributed network of processors. Hillis believes that "intelligence is just a whole lot of little things, thousands of them. And what will happen is we'll learn about each one at a time, and as we do it, machines will be more and more like people. It will be a gradual process, and that's been happening." This is not so far off from Marvin Minsky’s Society of Mind theory, which holds that the mind is a collection of agents, each one taking care of a particular aspect of intelligence, and communicating with one another, exchanging information as required. Some AI theorists hold other views of course. Some maintain that it’s not the underlying computation mode that’s crucial, but rather that there are particular algorithms (of reasoning, memory, perception, etc.) that are really the key. Others argue that the right combination of “little things” is needed to give rise to the overall emergent patterns of coordinated activity that constitute real intelligence. But Hillis’s philosophy is a plausible one, and he had built a hardware platform and an organization well suited to validating or refuting his theory through ongoing engineering and research work. Most AI research is far less ambitious, consisting of small-scale, detailed work on one or another particular aspect of intelligence. In the history of AI, Hillis stands as one of a very small number of people who made a serious attempt to actually create a thinking machine.

And then the supercomputer industry died. Networks, it became clear, were the wave of the future. By networking large numbers of weak machines together, one had distributed computing -- different from parallel computing in design, but somewhat similar in result. The last Connection Machine designed, the CM-5, was something like a computer network internally -- it consisted of standard Sun Microsystems processors hard-wired together rather than traditionally networked. This was a big change from the earlier Connection Machines, which had been unique on the processor level as well as on the level of overall system architecture. In the end, Thinking Machines Inc. revised its business model, abandoning hardware altogether and focusing on selling its data mining software for use on distributed computing systems composed of ordinary computers.

In 1994, the firm dispersed. The hardware side of Thinking Machines Inc. ended up at Sun Microsystems. Much of the data mining group ended up on Wall Street. Several Thinking Machines executives started TopicalNet, a company building text categorization software. And Hillis, after a stint working with the MIT Media Lab as an AI guru, abandoned the push for AI and went back to one of his earlier loves, toys and games. His new title: VP of R&D in the Imagineering Department of Walt Disney Corporation.

He entered this new phase of his career with wide-eyed optimism. "I've wanted to work at Disney ever since I was a child," he said. "I remember listening to Walt Disney on television describing the 'Imagineers' who designed Disneyland. I decided then that someday I would be an Imagineer. Later, I became interested in a different kind of magic -- the magic of computers. Now I finally have the perfect job -- bringing computer magic into Disney."

Post Thinking Machines, his scientific work was becoming more practical in orientation – he was designing new technologies to underlie games and theme park rides rather than working directly toward digital intelligence. But at the same time, his philosophical side was hardly dormant. The far future came to occupy his thoughts more and more. In 1993, with Thinking Machines on its last legs, he wrote the following manifesto:

"When I was a child, people used to talk about what would happen by the year 2000. Now, thirty years later, they still talk about what will happen by the year 2000. The future has been shrinking by one year per year for my entire life. I think it is time for us to start a long-term project that gets people thinking past the mental barrier of the Millennium. I would like to propose a large (think Stonehenge) mechanical clock, powered by seasonal temperature changes. It ticks once a year, bongs once a century, and the cuckoo comes out every millennium."

The Clock of the Long Now. A clock built to last 10,000 years or more, powered by seasonal climatic fluctuations. The clock is not yet built, but a piece of land in rural Nevada has been purchased, the design is completed in detail, and construction of the components is underway. The Long Now Foundation (http://www.longnow.org/) is accepting donations online.

Hillis holds some 40 U.S. patents -- for disk arrays, forgery prevention methods, a color camera and various software and mechanical devices. Among all his inventions, the clock is definitely one of the coolest -- one that would make any MIT hacker proud. And it resonates with something deep and powerful in the human soul -- the same aspect of human essence that finds the Cheops Pyramid more impressive than the Nintendo Gameboy, in spite of the incredible complexity of the hardware and software engineering involved in the latter. The Clock of the Long Now appeals to our embodiedness, to our embeddedness in space and time, whereas Hillis’s work on AI, like most AI work, ignored embodiedness and embeddedness and focused mainly on cognition, abstract thinking, on the most rarefied parts of the mind.

Abstractly, one could build a mind operating a thousand times faster than a human mind, or a thousand times slower. 10,000 years would mean something different to each of these differently time-scaled minds. But the mathematics and theory of AI would apply equally well to all of them, as would many of the same hardware engineering principles. The Clock of the Long Now is focused on palpable human reality, not the abstract mathematics of mind or the subtleties of hardware engineering. In fact it represents a step back from fancy modern electrical engineering. Modern technology provides few systems of 10,000-year durability, and so the design of the Clock of the Long Now required a number of purely engineering innovations.

One could easily portray Hillis’s interest in clock-building as a symptom of a mid-life crisis. After all, the older you get, the more interesting time seems to you. Perhaps, having failed to create real AI, the guy was reviewing his own life, and feeling his own death moving closer. Perhaps he found it comforting to remind himself how little it matters, from a 10,000 year viewpoint, whether any one human or any one company succeeds at doing any one thing. No doubt, there is an element of truth to this view. But this doesn’t seem to be a terribly large aspect of his motivation for pursuing the clock project – not as large, for example, as his sheer love of building cool stuff. And of course, both a thinking machine and a 10,000 year clock are Big Things – projects that appeal to the entrepreneurial, adventurous, overambitious soul.

The clock got all the media attention, but for Hillis personally it was never a full-time occupation. His new job at Disney was the bulk of his life. It was exciting -- there was lots of money to build lots of great stuff, and he was involved in a lot of different projects. But if one reviews the time Hillis spent at Disney, one has a hard time finding any Disney project that really showcases his flair for large-scale innovative engineering. The details of his time at Disney aren’t open for public discussion, but it’s not hard to reconstruct the story. Disney is a huge organization, and carrying exciting projects from concept to real-world implementation, without layers of bureaucracy getting in the way, probably wasn’t the easiest thing in the world. Anyway, last year Hillis left Disney, taking with him Bran Ferren, the head of the Imagineering group.

Ferren shares Hillis’s visionary streak, and also his interest in escaping from Internet Time into historical time. He, Hillis and Nathan Myhrvold (former CTO of Microsoft) have enjoyed hunting together for dinosaur bones. The conceptual clash between dinosaur bones and cutting-edge computer technology is just the kind of thing that leads brilliant minds in new directions. Ferren and Hillis are exploring these new directions via their new start-up Applied Minds, a company aimed at providing technology and consulting services to entertainment firms, presumably including Disney.

Having spent most of his career at the intersection between business and science, Hillis is acutely aware of the difficulties of balancing the different goals of these very different enterprises. There was a transition in the life of Thinking Machines, he observes, when it became less of an R&D shop and more of a real business – and at that point it became more and more difficult to move toward the goal of building real AI. When the firm became a real business, efficiency became important -- but creativity is exploratory, evolutionary, and fundamentally inefficient. Basically, in a company narrowly focused on making money, every minute of everyone’s day must be judged by whether it contributes to the bottom line. But the nature of the creative process is such that it just can’t be judged until it’s finished – there’s no way to tell which kooky train of thought or experimentation is going to lead to useful results.

What appealed to him about Disney, when he started out there, was the fact that it was a real business that was making real efforts to keep creativity alive within its walls. This was the express purpose of the Imagineering group. The defection of Hillis and Ferren, however, is an indication that Disney’s efforts in this regard have not been entirely successful. Applied Minds is a fascinating venture, which one suspects will do a better job of combining creativity with business focus than was possible inside Disney. But still, it’s worth noting how Hillis’s efforts have bifurcated: the Clock of the Long Now and Applied Minds each embody different aspects of his mind and soul, which were fused together in his earlier work with Thinking Machines.

Of course, if Applied Minds becomes an extremely profitable business, then it will be able to fund more and more interesting research over time. It will be interesting to see what happens in this regard. By remaining at Disney through the whole Internet bubble, Hillis missed out on his chance to cash in on the tech boom while it lasted. And given Disney stock’s poor performance in recent years, Disney stock options presumably weren’t a wonderful thing to own. The end-game of Thinking Machines Inc. did not result in making Hillis tremendously rich either. So, Hillis has a nice house, the back yard facing a beautiful lake, and is having an even nicer one built. But he isn’t currently in a financial position to build amazing new things on his own dollar. If Applied Minds puts him in this position, who knows what will emerge?

Perhaps something as fantastic as Thinking Machines Inc. – which remains the high point of Hillis’s story. It was a fascinating enterprise in many ways, but largely because of the way it fused science and business in the service of a single, immensely ambitious initiative. The Connection Machine was too big of a project to be initiated outside of industry, yet too innovative to be done without a large team of visionary scientists. The things Hillis is involved in now are less paradoxical and complex, and ultimately for that reason perhaps a little less intriguing. The Clock of the Long Now is a great work of conceptual art, with deep philosophical overtones and involving some neat engineering problems. Applied Minds is a real business through and through, using new science as required to provide customer solutions. These are both intriguing and sensible things, and yet they lack the Don Quixote–ish majesty of Thinking Machines Inc. and the Connection Machine, which to this day remain Hillis’s greatest creations.

A story like this reminds us that business, science, engineering and art are not fundamental divisions of the universe, any more so than earth, air, fire and water. Great innovations and enterprises stand outside these divisions, because they are crystallized around concepts that go beyond the temporary structures of any one human culture and society. The human race’s urge to create intelligence beyond itself – whether through building AI machines or through, as Hillis has recently discussed, putting “chips in the brain” – is a fundamental force that cuts across categories of human endeavor. Our need to understand our relationship with time is a similar fundamental force. And some human beings – like Danny Hillis – and some human organizations –like Thinking Machines and, to a lesser extent the Long Now Foundation -- reflect these fundamental forces in particularly elegant and powerful ways. To paraphrase what Hillis said about the frog’s heart he experimented with as a youth, it is remarkable that we can “do this coordinated activity… even though we are just a mass of cells.”

The big problem plaguing AI, I have said, is “brittleness” -- domain specificity, lack of flexibility and autonomy. The Connection Machine tried to get around the brittleness problem through massive parallelism, analogous to that of the brain. Hutchens’ chat bots seek to circumvent it by relying on free-ranging statistical learning rather than rules.

On the other hand, some people have tried to get around the brittleness problem by providing the computer with so much information that it could answer any possible contingency. The most ambitious project in this direction is Doug Lenat's Cyc project, which has been going since 1984. Cyc is focused on trying to build a program with common sense. The Cyc team is mainly occupied with encoding millions of items of data, so that the program can know everything an eight-year-old kid knows. "Cyc" was originally short for "Encyclopedia," but they found that the knowledge they needed was quite different from that found in encyclopedias. It was everyday knowledge you could get by asking a small child -- perhaps more like that in a dictionary. Each common sense concept in Cyc gets an English-language definition, and a formal, mathematical definition which tries to paraphrase the English one.

For example, the Cyc English-definition of "skin" goes like this:

"A (piece of) skin serves as outer protective and tactile sensory covering for (part of) an animal's body. This is the collection of all pieces of skin. Some examples include #$TheGoldenFleece (representing an entire skin of an animal) and (#$BodyPartFn #$YulBrynner #$Scalp) (representing a small portion of his skin).

The Cyc English-definition of happiness is:

The enjoyment of pleasurable satisfaction that goes with well-being, security, effective accomplishments, or satisfied wishes. As with all #$FeelingAttributeTypes, this is a #$Collection -- the set of all possible amounts of happiness one can feel. One instance of #$Happiness is `extremely happy'; another is `just a little bit happy'.

It’s clear why one might think definitions of this sort could contribute to solving the common sense problem that we see when playing with chat bots like HeX or ELIZA. These chat bots have no common sense; they have no idea what words mean. Cyc is based on getting humans to tell computers what words mean.
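
To give a concrete feel for what "telling computers what words mean" amounts to, here is a minimal sketch in Python of a formal assertion base with one naive inference rule. It is my own toy construction, not CycL (Cyc's actual representation language), and the predicates and constants are invented for illustration:

# A toy, Cyc-flavored assertion base: ground facts stored as triples,
# plus one naive inference. The predicate and constant names are
# invented; real Cyc constants look like #$Happiness and #$Skin.

facts = {
    ("isa", "Happiness", "FeelingAttributeType"),
    ("isa", "Skin", "AnimalBodyPartType"),
    ("covers", "Skin", "AnimalBody"),
    ("isa", "YulBrynnersScalp", "Skin"),
}

def covers(thing):
    # thing covers X if thing is an instance of a type asserted to cover X
    for fact in facts:
        if fact[0] == "covers":
            _, typ, covered = fact
            if ("isa", thing, typ) in facts:
                return covered
    return None

print(covers("YulBrynnersScalp"))   # -> AnimalBody

The brittleness complaint, in these terms, is that the program can chain such assertions together indefinitely without ever having encountered anything that skin actually covers.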

It’s interesting stuff, but I don’t really believe that the logical definitions Cyc contains have much overlap with the kind of information in the mind of an eight-year-old child. We humans aren’t even explicitly aware of much of the information we use to make sense of the world. A human’s notion of happiness or skin is much bigger and messier than these definitions. These kinds of general abstract definitions may be inferred in the human mind from a whole lot of smaller-scale, practical patterns recognized involving skin and happiness, but they’re not the be-all and end-all. In dealing with most practical situations involving skin and happiness, we don’t refer to this kind of abstraction at all; we use the more specialized patterns that the general conclusions were derived from.

Basically, Cyc tried to divorce information from learning, but it can't be done. A mind can only make intelligent use of information that it has figured out for itself. Despite sixteen years of programming, Cyc never succeeded in emulating an eight-year-old child. Nor has anyone yet found much use for a CD-ROM full of formal, logical definitions of common sense information. The company Cycorp is doing OK, based largely on selling Cyc-powered document management software to the military. When I talked recently to the firm’s CEO, he was quite careful not to mention anything about artificial general intelligence or any long-term scientific mission underlying the firm’s work. Instead, he characterized the firm as being concerned with producing a useful database intended for embedding within various specialized software products. This is a worthwhile mission, to be sure, but very different from the grand goal of real AI.

In fairness to Doug Lenat, I must say that he is now working from a computational-psychology perspective that has something in common with my approach. He has a reasonably solid theory of general heuristics -- problem-solving rules that are abstract enough to apply to any context whatsoever. His pre-Cyc programs AM and EURISKO applied his general heuristics to mathematics and science respectively. Both of these programs were moderately successful, exemplars in their field, but far from true intelligence. Their design lacks a holistic view of the mind. In the big picture of real AI, getting the mind’s heuristic problem-solving rules right means virtually nothing, because problem-solving rules gain their psychological meaning from their interaction with other parts of the mind. If the other parts aren't even there, the problem solving is bound to be sterile.

EURISKO won a naval fleet design contest two years in a row, until the rules were changed to prohibit computer programs from entering. And it also received a patent for designing a three-dimensional semiconductor junction. But when looked at carefully, even EURISKO's triumphs appear simplistic and mechanical. Consider EURISKO's most impressive achievement, the 3-D semiconductor junction. The novelty here is that the two logic functions "Not both A and B" and "A or B" are both done by the same junction, the same device. One could build a 3-D computer by appropriately arranging a bunch of these junctions in a cube.

But how did EURISKO make this invention? The crucial step was to apply the following general-purpose heuristic: "When you have a structure which depends on two different things, X and Y, try making X and Y the same thing." The discovery, albeit an interesting one, came right out of the heuristic. This is a far cry from the systematic intuition of a talented human inventor, which synthesizes dozens of different heuristics in a complex, situation-appropriate way.
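
Mechanically, applying such a heuristic is almost trivial. Here is a minimal sketch in Python of the "make X and Y the same thing" move, using an invented slot-based representation that is nothing like EURISKO's actual Lisp frames:

# Toy rendering of EURISKO's unification heuristic: given a design with
# two named slots, propose a variant in which both slots are filled by
# one shared component. The representation is invented for illustration.

def unify_slots(design, x, y):
    variant = dict(design)
    shared = design[x] + "+" + design[y]   # one device playing both roles
    variant[x] = variant[y] = shared
    return variant

junction = {"nand_gate": "device_A", "or_gate": "device_B"}
print(unify_slots(junction, "nand_gate", "or_gate"))
# {'nand_gate': 'device_A+device_B', 'or_gate': 'device_A+device_B'}

The heuristic does all the work; no web of fluid analogies stands behind the move -- which is exactly the contrast with human invention drawn next.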

By way of contrast, think about Nikola Tesla, the Serbian inventor born in what is now Croatia, probably the greatest inventor in recent history, who developed a collection of highly idiosyncratic thought processes for analyzing electricity. These led him to a steady stream of brilliant inventions, from alternating current to radio to robotic control. But not one of his inventions can be traced to a single "rule" or "heuristic." Each stemmed from far more subtle intuitive processes, such as the visualization of magnetic field lines, and the physical metaphor of electricity as a fluid. And each involved the simultaneous conception of many interdependent components.

EURISKO may have good general-purpose heuristics, but what it lacks is the ability to create its own specific-context heuristics based on everyday life experience. And this is precisely because it has no everyday life experience: no experience of human life, and no autonomously-discovered, body-centered digital life either. It has no experience with fluids, so it will never decide that electricity is like a fluid. It has never played with blocks or repaired a bicycle or prepared an elaborate meal, nor has it experienced anything analogous in its digital realm... so it has no experience with building complex structures out of multiple interlocking parts, and it will never understand what is involved in this. EURISKO pushes the envelope of rule-based AI; it is just about as flexible as a rule-based program can ever get. But it is not flexible enough. In order to get programs capable of context-dependent learning, it seems to be necessary to write programs which self-organize -- if not exactly as the brain does, then at least as drastically as the brain does.

One of the most impressive achievements of rule-based AI happened on May 11, 1997. This was an event that led many people to think that computers were already on the verge of rivaling human intelligence. For the first time ever, a computer had defeated the world chess champion in a standard match. Deep Blue, a computer chess program developed at IBM by a team whose work began at Carnegie Mellon University, won the six-game match against Garry Kasparov: the two players split the first two games, the next three were draws, and the final game went to Deep Blue. Kasparov was a sore loser. Deep Blue took it all as a day's work.

Admittedly, this was only one match, but the result was not a fluke. Previous versions of Deep Blue were already able to consistently beat all but the greatest chess grandmasters. And Deep Blue's play can be improved by hardware upgrades, whereas a brain upgrade for Kasparov is not in the works. Although Deep Blue is not a very intelligent entity according to my definition, there’s much to be learned from a study of its accomplishments and the mechanisms underlying them. The day after the match, there was a lot of talk about Deep Blue on the Simon’s Rock alumni e-mail list, and I wrote a long e-mail giving my views. What I’ll say about it here is basically what I said in that e-mail.

The question people were debating on the e-mail list was: Does Deep Blue's accomplishment mean that true artificial intelligence has been achieved?

If we define intelligence as the ability to do one intellectual task very, very well, then Deep Blue qualifies brilliantly. But if we think of intelligence as being able to make appropriate decisions in a wide variety of complex environments, it fails miserably. It can only play chess. Computers that can do one thing well are, of course, tremendously useful things. But they are not truly intelligent as I prefer to define the term. In my definition, a truly intelligent computer will have to do more than follow instructions. It will have to create its own answers to unanticipated problems. For this kind of intelligence it will need a structure quite different from Deep Blue's.

Although Deep Blue follows the same rules as human chess players, it doesn’t think like humans. Human chess players use geometric intuition and a sense of the flow of a game. Deep Blue calculates every single possibility, and then calculates all the possible consequences of each. Computer programmers call this recursive logic. It does the same thing over and over and over again, constantly referring back to the results it just obtained, and figuring out how well it is doing. Human beings might use recursive logic to play a very simple game, such as tic-tac-toe, which has very few choices. But even in tic-tac-toe, our opponents would probably object to our taking the time to calculate out the potential consequences of every possible move. Our minds are much too slow to play chess that way, and the game wouldn't be any fun if we could. Computers, by contrast, are much, much quicker at this kind of task and do not get bored, so recursive logic can work well for them.
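
For the curious, here is what that recursion looks like in miniature -- a minimal Python sketch of exhaustive game-tree search for tic-tac-toe. Deep Blue's actual search (alpha-beta pruning over a hand-tuned evaluation function, much of it in custom hardware) was vastly more elaborate, but the recursive skeleton is the same:

# Minimax for tic-tac-toe: recursively score every reachable position.
# The board is a 9-element list of 'X', 'O' or ' '. This brute-force
# recursion is exactly what humans are too slow (and too bored) to do.

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != ' ' and b[i] == b[j] == b[k]:
            return b[i]
    return None

def minimax(b, player):
    # Value of position b for 'X': +1 win, 0 draw, -1 loss.
    w = winner(b)
    if w:
        return 1 if w == 'X' else -1
    moves = [i for i in range(9) if b[i] == ' ']
    if not moves:
        return 0   # draw
    scores = []
    for m in moves:
        b[m] = player
        scores.append(minimax(b, 'O' if player == 'X' else 'X'))
        b[m] = ' '
    return max(scores) if player == 'X' else min(scores)

print(minimax([' '] * 9, 'X'))   # 0: perfect play by both sides is a draw

Chess defeats this brute-force recursion only by size: the tree is far too big to exhaust, so Deep Blue cuts off the search at some depth and applies an evaluation function to the resulting positions.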

Of course, every chess player extrapolates, thinking: "What is the other player likely to do next? And if he does that, what am I going to do? And if I do that what is he going to do?" But in humans, this kind of reasoning is augmented by all sorts of other processes. For Deep Blue, this kind of extrapolation is the whole story -- and it does it very, very well. Computers can extrapolate faster and further into the future than any human. The 1997 version of Deep Blue could evaluate about two hundred million different board positions every second. This figure can easily be increased for the cost of additional circuitry, but it doesn't make Deep Blue truly intelligent in my mind.

One way to understand the difference between Deep Blue and human players is to think about strategy versus tactics. There is a certain kind of creative long-range strategy that human chess grandmasters have, but Deep Blue lacks. Deep Blue makes up for this lack by elevating tactics to such a high level that it assumes the role of strategy. Deep Blue is not entirely strategyless: it carries out its superhuman tactical evaluation within the context of a collection of pre-programmed strategies, and it is capable of switching between one strategy and another in response to events. But it does not think strategically, it only thinks tactically. It doesn't make long-range plans involving an understanding of the overall structure of the board as a dynamical system, changing over the course of the whole game. If it could do this, it would doubtless play even better. But even without strategic creativity, it plays well enough to beat the best humans.

The defeat of Kasparov by Deep Blue is symbolic because chess is the mentally hardest game commonly played in the Western world. Computers became better than humans at checkers and many other games quite some time ago. However, there is at least one popular game that still stumps the best computers -- the Oriental game of Go. At the present time, in spite of a substantial research effort, no existing computer program can play better than the advanced beginner level at Go.

The rules of Go are very simple, compared to chess. Play is on a 19x19 grid, and stones (pieces) are placed on the intersections of the grid, called points. The first player plays black stones, the opponent white ones, and stones are added to the board one by one, the players alternating. Stones are not removed once they are placed, but stones and groups of stones may be captured. A player calls "Atari" when a capture can occur on their next move, to warn the opponent. The game ends when it is no longer possible to make a reasonable move. The winner is determined by the amount of territory surrounded, less the number of stones captured.

The trouble with Go from the computational perspective is that, from any given board position in Go, there are hundreds of plausible next moves, rather than dozens as in chess. Extrapolation in Go will not get you as far as it does in chess. Thus, it would seem that, if computers are to conquer Go, they're going to have to either do it with a more general intelligence, or use some cleverer special-case technique than the one employed for chess. Go is too visual, too two-dimensional, to succumb to purely combinatorial, non-visual techniques. A world champion Go program would have to be intelligent at general two-dimensional vision processing as well.
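
The arithmetic behind this is brutal. A minimal calculation in Python, using the commonly cited ballpark figures of roughly 35 plausible moves per chess position and roughly 250 per Go position:

# Game-tree growth under rough average branching factors:
# ~35 plausible moves per chess position, ~250 per Go position.

for depth in (2, 4, 6):
    print(f"depth {depth}: chess ~{35 ** depth:.1e} positions, "
          f"Go ~{250 ** depth:.1e} positions")

# depth 6: chess ~1.8e+09 positions, Go ~2.4e+14 positions --
# a gap of five orders of magnitude after looking only six moves ahead.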

In Go, high-level players routinely analyze positions that aren't confined tactically to, say, a 9x9 grid. Plus, almost any tactical fight has strategic implications across the board that could be worth more than the fight itself -- so a great pattern matcher wins the points, but loses the war. One style common in evenly-matched games is to go around the board "losing" fights, but in such a way that one's own stones work together, and become more powerful.

The computer Go programs in existence today rely heavily on pattern matching: taking a given, small chunk of the board, and matching it up to a dictionary of known board situations. The best ones are as good at reading small, enclosed life/death problems as a mediocre tournament Go player. But when the problems are not so rigidly enclosed within a small region of the board, the programs are clueless, although intuitive human players can still see the essential principles. The best way to wipe out such programs is to embroil them in a huge whole-board fight, one that is too big for the algorithm to match properly.

Deep Blue's recursive approach of elevating tactics to the level of strategy doesn't work so well in Go. Exhaustive search over spaces of two-dimensional patterns is much, much harder than the kind of decision tree search required for dealing with chess, and will be out of reach of computers for a good long time. One suspects that something less than true intelligence will suffice for Go, as it has for chess -- but not something as far from true intelligence as Deep Blue is. Perhaps a massive increase in raw computational power will be enough, but that is not a truly intelligent approach.

Deep Blue has basically the same problems as GPS and EURISKO and BACON. It’s above all an achievement of computer hardware, not intelligent software. It relies on a special computer chip, custom-designed for searching many moves ahead in chess games. This special chip could be modified without much hassle to apply to other similar games -- checkers, maybe Othello. The ideas behind the massively parallel design and RS/6000 platform are going to be generalized by IBM into drug design tools, weather forecasting and other applications. But this architecture couldn’t be modified to apply to Go -- let alone to any real-world situation. Deep Blue's chip is less like the human brain than like a human muscle: a mechanism designed specifically for a single purpose, and carrying out this purpose with admirable but inflexible precision. Its rules are astoundingly simple and mechanical: evaluate the quality of a move in terms of the quality of the board positions to which the move, based on stored experience, is likely to lead. Judgments based on experience are made not by complex intuitive analogy, but by simple pattern-matching. Everything is cut-and-dried, and done two hundred million times a second. This is outstanding engineering; it is not intelligence. And yet its results are vastly more impressive than those obtained when rule-based AI does try to do intelligence, as in programs like GPS, BACON and EURISKO.

Some chess players were frustrated with Kasparov's defeat at the "hands" of Deep Blue, on the grounds that Kasparov was extremely sloppy in the final game. He played a defense (the Caro-Kann) that he admitted afterwards wasn't his forte, and, in their view, gave away the game with a move that conventional chess wisdom deems unwise. It is argued, on these grounds, that he lost because he let his emotions carry him away. In past games, his unusual and unpredictable moves had thrown the computer off, putting it in situations where its memory of previous board positions did not apply. This time, he was perhaps feeling a little desperate, and so he moved a little too far into the domain of the unpredictably self-defeating.

This raises another interesting question -- the role of human emotion in human intelligence. Is emotion bad for human competence, or good? In this case, emotion may have worked against Kasparov, though how much so is difficult to say. Who can blame him for getting a little stressed -- after all, in the eyes of the media, he was playing not only for himself or his country, but for the whole human race! But yet, it's hard to pooh-pooh human emotion, even from a purely performance-oriented perspective, because of the extent to which emotion and cognition are bound up together in the human mind. One can argue convincingly that, in intelligent systems operating on the basis of generalized intuition rather than specialized search and pattern-matching, intuition is necessarily coupled with emotion.

A truly intelligent computer must have some kind of emotion, but this doesn’t mean it will feel things the way humans do. George Mandler, a cognitive psychologist whose work I’ve studied a lot, has proposed that we think of emotion as composed of two different aspects, called hot and cold. The hot aspect is the consciousness attached to emotion, which is shaped by the link between the human mind and the human body. The cold aspect of emotion is the abstract, mathematical structure of emotion, which seems, on careful psychological analysis, to be closely related to the "failure of expectations to be fulfilled." Emotions happen, in this view, when awareness is attached to some unfulfilled expectation.

At first glance this analysis of emotion may seem to apply only to negative emotions. But positive emotions can be understood in the same way. Wonder is the unfulfillment of the expectation of familiarity. Happiness is, in large part, a feeling of increasing unity, a feeling of more inner unity than expected -- or in other words, an unfulfillment of the expectation of inner disparity and disharmony. Happiness is also a feeling of decreasing sadness; whereas sadness is a feeling of decreasing happiness -- each feeling arises largely as a consequence of unfulfillment of the expectation of its opposite. Love is happiness extending outside the body -- it is the feeling of increasing unity between self and other, the unfulfillment of the expectation of separateness. The point is not to "reduce" emotions to formulas, but rather to identify the structures underlying the feelings of various emotions. Although computers cannot feel emotions as humans do, since they do not have our chemistry, they may be able to learn to behave as if they did.

Obviously, computer programs experience unfulfilled expectations -- they have, in the Mandler picture, the cold aspect of emotion. But without something equivalent to the "hot" aspect of emotion, they may not be able to act on their unfulfilled expectations as humans do. When its expectations are frustrated, Deep Blue does not respond in the way an intelligent system often does: by leaping to a more general level of abstraction, and reconsidering the assumptions that led to the expectations that were in fact frustrated. It does not have emotion-driven learning in this sense -- although it does, of course, recalibrate its settings based on experience; this recalibration is of a simpler kind, not involving an exploratory search through its own associative memory of past events. There is, however, no reason why a computer might not be programmed to respond to frustration in the same way as human chess players do, even though it would not feel the frustration in the same way.
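
As a crude illustration of the "cold" side of this picture, here is a sketch in Python -- entirely my own toy construction, not anyone's implemented psychological model -- of an agent that notices when its expectations fail badly and responds by revising the assumption behind them, rather than merely re-tuning a number:

# Toy "cold emotion": the agent predicts, measures expectation failure,
# and when the failure is large it escalates -- revisiting the assumption
# that generated the prediction instead of just nudging a parameter.

class Agent:
    def __init__(self):
        self.assumption = "opponent plays safe moves"
        self.expected_score = 0.8
        self.tolerance = 0.3

    def observe(self, actual_score):
        surprise = abs(self.expected_score - actual_score)
        if surprise <= self.tolerance:
            # Deep Blue-style recalibration: small adjustment, same assumption
            self.expected_score += 0.5 * (actual_score - self.expected_score)
            return "recalibrate"
        else:
            # emotion-driven learning: question the assumption itself
            self.assumption = "opponent plays unpredictable moves"
            self.expected_score = actual_score
            return "revise assumptions"

agent = Agent()
print(agent.observe(0.7))   # small surprise -> "recalibrate"
print(agent.observe(0.1))   # large surprise -> "revise assumptions"

The branch taken under large surprise is the one Deep Blue lacks: an exploratory revision of its own assumptions, triggered by the magnitude of the expectation failure.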

Although emotion can be harmful to human cognition at times -- as, perhaps, in Kasparov's final game against Deep Blue in 1997 -- it is, all in all, inseparable from intuitive human thought. Emotions trigger our thoughts and even structure them. Emotion, in itself, may not be valuable -- but it makes no sense, in a human or other intuition-driven system, to separate emotion from cognition. Deep Blue demonstrates that emotion-driven intuition is not necessary to excel at the game of chess. But as Max Miller, an e-mail acquaintance of mine on the Simon's Rock list, put it, "All this proves is what I've been saying for years: that chess is a stupid game."

When neural nets were being dissed in the early 70's, not everyone was optimistic about the potential of rule-based AI. In 1972, the era in which ELIZA was receiving a lot of attention, a philosopher named Hubert Dreyfus wrote a book called What Computers Can't Do, which was a vicious attack on AI. Dreyfus argued that artificial intelligence researchers were fundamentally misguided, and that they could never achieve their objectives with the methods they were using. Dreyfus preached the importance of body-centered learning, and the close connection between logic, emotion and intuition. Without a body, Dreyfus argued, without feelings, there can be no real generalization of special-case ideas. Based on these philosophical considerations, he predicted that AI would be a failure.

Needless to say, these conclusions went over like a lead balloon among AI researchers. They laughed in his face, or worse. But the vehemence of their ridicule betrayed a fear that he might have been right. Even today a review posted on the amazon.com website observes that "many AI-workers seem to be actually afraid of this book." The same reviewer notes, however, that it is one of very few computer books that is still of interest 25 years after it was written.

In 1992, Dreyfus re-released the book with the title What Computers Still Can't Do. The Introduction brims over with he-who-laughs-last-laughs-best insolence. But his exultant crowing is not quite persuasive. He was right about the limitations of the AI programs of the 1960s and 1970s. But the observers who thought it was just a matter of time and resources have also been proven correct in many cases. Dreyfus, for example, ridiculed a prediction that computers would soon be able to take dictation, just as a human secretary can. Although this prediction didn’t come true as quickly as Rosenblatt had thought, some fairly good programs are available today for this purpose, relying in large part on a neural net architecture to learn each user's speech patterns.

Dreyfus's critique of AI, in the first edition, was too strong. He appeared to believe that detailed simulation of the human body was the only possible path to AI, and he argued that this would be impossible without simulating the biology of the brain and the rest of the body. Actually, the human brain is only one kind of intelligent system, and a great deal can be accomplished without replicating the details of its biology. But Dreyfus's arguments posed a serious challenge to AI theorists: how to design a machine that can achieve body-based, emotion-based conceptual generalization? I believe that Dreyfus was essentially correct that, if this is impossible, AI just ain't gonna work. A physical body just like ours is not required: an AI entity could have a virtual body, enabling it to interact in a rich and autonomous way with a virtual world. And emotions need not be controlled by biological neurotransmitters; they can come out of complex digital dynamics. But the point is, unless one has a computing system that is large, complex and autonomous, with integrated sensory, memory and action systems interacting with a rich environment and forming a self system, that system will never develop the ability to generalize from one domain to another. The ability to generalize is learned through general experience, and general experience is gained by exploring a world.

In designing Webmind and then Novamente, I definitely took Dreyfus's critique to heart. Of course, I didn’t try to replicate the human body as he thought was necessary. Instead, I bypassed his critique by designing a huge, self-organizing system, which lives in the perceptual world of the Internet and understands that its body is made up of Java objects living in the RAM of certain machines. It is a nonhuman, embodied social actor. Dreyfus didn't try very hard to imagine an embodied, social intelligence without a human-like body, but his ideas certainly leave room for such a thing. His problem was not with AI but with the attempt to build a mind that operates in a vacuum, instead of synergistically with a self and a world.

I met Hubert Dreyfus only once, in early 1996, when he came to the University of Western Australia, where I was working at the time, to give a lecture on the philosophy of AI. I heard his talk and had a few drinks with him afterwards. He told me he was extremely pleased that the new edition of his book had been released by MIT Press, which had refused to publish the first edition -- because of MIT's commitment, as an institution, to AI research: Minsky, and many other AI researchers who Dreyfus criticized, worked at MIT. And, much to my surprise, he was newly optimistic about AI -- not about the type of AI that had been popular in the 70's or 80's, but about more recent developments, in particular about neural network AI.

What he liked about neural networks was the fact that their intelligence was implicit: even after you had trained a network to do something, you couldn't say precisely how the network had done it. This implicitness, he felt, captured something of the tacit and nebulous nature of human understanding. A neural net in a robot body, he surmised, would possibly be able to realize the dreams of AI researchers, and do what isolated, rule-based systems could not.

I asked Dreyfus why, if he felt neural nets offered a possible path to AI, he had not said so in the first edition of his book, back in 1972. His answer was simple: he said that he hadn’t understood what neural nets were all about at that time. This reply impressed me with its directness and honesty, and also drove home in my mind the importance of relying on your own intuition in scientific matters. The leaders in the field may be off in the wrong direction; their leading critics may be just as ignorant, though in different ways; and crucial clues may be found in the ignored writings of scientists from past decades. One of the lessons of the history of AI is that science, for all its objectivity, is fundamentally a human endeavor, and quite susceptible to the ordinary human weaknesses.

In the past few pages I’ve gone over a few bits and pieces of past AI work, which I hope have been evocative. Certainly they have not been complete: AI is a big field, there’s been a huge amount of work, and it would take a 1000-page book to do even a decent summary. What I’ve tried to do here is to touch on the main streams of thinking in the discipline, but this kind of summary doesn’t get across the wild diversity of innovation you find by looking at the papers of individual researchers, including those way out of the mainstream. Actually, odd bits of AI work by total outsiders probably had a bigger influence on me, as an AI researcher, than the major breakthrough programs I’ve focused on above.

For example, there was the work of John Andreae at the University of Canterbury in Christchurch, New Zealand. He wrote a nice little system called PURR-PUSS, which learned to interact with you statistically. One of his students was John Cleary, who was one of the machine learning gurus at Waikato University in Hamilton, New Zealand, where I taught for a year. John worked for me at Webmind Inc. for a couple of years; he and his students formed the firm’s New Zealand office, and masterminded our Webmind Classification System product. We’re not exactly emulating PURR-PUSS in Novamente, but the statistical learning methods it embodied are there in our machine learning module and our reasoning system, and the emphasis on interactive learning that Andreae advocated lives on in our work with a vengeance.
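I won’t pretend to reconstruct PURR-PUSS here, but the general flavor of statistical interactive learning is easy to sketch: count what has followed each recent context of events, and respond with the most likely continuation. The class and names below are hypothetical inventions of mine -- the real system was considerably subtler:

```python
from collections import Counter, defaultdict

# A toy interactive learner in the general spirit of statistical systems
# like PURR-PUSS: it counts which event has followed each recent context,
# and proposes the statistically most likely continuation.

class StatisticalResponder:
    def __init__(self, context_len=2):
        self.context_len = context_len
        self.counts = defaultdict(Counter)  # context -> event frequencies
        self.events = []                    # running interaction history

    def observe(self, event):
        # Credit the current context with having led to this event.
        context = tuple(self.events[-self.context_len:])
        self.counts[context][event] += 1
        self.events.append(event)

    def propose(self):
        # Respond with whatever most often followed the current context.
        context = tuple(self.events[-self.context_len:])
        if self.counts[context]:
            return self.counts[context].most_common(1)[0][0]
        return None  # this context has never been seen before

learner = StatisticalResponder()
for event in ["hello", "hi", "hello", "hi", "hello"]:
    learner.observe(event)
print(learner.propose())  # prints "hi" -- the usual follow-up to this context
```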

And there was a cool little book called “Robots on Your Doorstep” by Nels Winkless and Iben Browning, which talked a lot about the definition of intelligence as “the ability to act appropriately under unpredictable conditions” – a definition that helped me formulate my own concept of intelligence – and contained a lot of cool speculations about how to build robots that would be intelligent in this sense. I got an e-mail from Nels Winkless – who was, by the way, the founder of PC magazine, one of the first computer magazines aimed at ordinary people – in 1997, when I was just starting Intelligenesis. He’d found a reference to Robots on Your Doorstep on my website and was pleased to find that anyone had actually read that book! I asked him for advice on starting a business, on my software designs and so forth, and we iterated back and forth by e-mail a bit. It turned out that, after Iben died, Nels had collaborated with someone else to build a company centered on some pattern-recognition technology – very much in the spirit of Robots on Your Doorstep, and of my view of mind as pattern. Eventually the company had been taken over by its investors due to its failure to make money, and had found a niche in which it could be successful: automatic recognition of similarity among images. The name of the company was Excalibur Technologies – it still exists; see http://www.excalib.com

Then there was the idea of genetic algorithms – doing AI by simulating evolution rather than the brain. This wasn’t at all a hip idea in 1989 when I was writing my PhD thesis on optimization methods – at that point, it didn’t even turn up in my review of the optimization literature, although in retrospect one can point to papers on the topic going back to the late 60’s. By the mid-90’s it was a well-recognized area of computer science, and I was doing research into the mathematics of genetic algorithms, studying questions such as “Why is evolution involving sexual reproduction more efficient than evolution involving asexual reproduction only?” Although the details are different, evolutionary AI is similar in spirit to neural net AI – you’re dealing with a complex, self-organizing system that gives results in a holistic way, where each part of the system doesn’t necessarily have a meaning in itself but only in the context of the behavior of the whole. In my 1993 book The Evolving Mind, I wrote a lot about the relation between evolutionary programming in AI and Edelman’s theories of evolution in the brain. It turns out you can model the brain as an evolutionary system, with special constraints that make it a bit different from evolving ecosystems or genetic algorithms in AI. We have an evolution module in Webmind, which is used for two things: as one among many machine learning methods for finding patterns in data (along with feedforward neural nets and purely statistical methods); and as one of two ways of learning schema for perceiving and acting (the other being probabilistic logical inference).
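For readers who haven’t met them, the mechanics are simple: keep a population of candidate solutions, score each one with a fitness function, and breed the better ones via crossover and mutation. Here’s a bare-bones sketch – the all-ones target, the parameters and the names are all invented for illustration, and this is emphatically not Webmind’s evolution module:

```python
import random

# A bare-bones genetic algorithm: evolve bitstrings toward a target.
# Fitness is the number of matching bits; parents are chosen by
# tournament selection, children by one-point crossover plus mutation.

TARGET = [1] * 20

def fitness(bits):
    return sum(b == t for b, t in zip(bits, TARGET))

def crossover(mom, dad):
    cut = random.randrange(1, len(mom))  # one-point crossover
    return mom[:cut] + dad[cut:]

def mutate(bits, rate=0.02):
    return [1 - b if random.random() < rate else b for b in bits]

def tournament(pop, k=3):
    # Pick k random individuals; the fittest becomes a parent.
    return max(random.sample(pop, k), key=fitness)

pop = [[random.randint(0, 1) for _ in TARGET] for _ in range(60)]
for gen in range(100):
    best = max(pop, key=fitness)
    if fitness(best) == len(TARGET):
        break
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in pop]
print(f"generation {gen}: best fitness {fitness(best)}/{len(TARGET)}")
```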

I’ve just listed three little fragments of AI research that don’t fit into the big bad “neural nets versus rules” dichotomy I put forth above, and there are many, many others. History never fits all that neatly into categories – almost nothing does, actually. But we need to make up categories in order to understand things. The diversity of mind and of society, which busts out of all the simplified category systems we place on it, is responsible for the fabulous creativity of these systems.

OK – I apologize in advance for being a contradictory author. But sometimes reality is contradictory. “A foolish consistency is the hobgoblin of little minds,” as Emerson tells us.

Above I presented a dichotomy between symbolic and connectionist AI – rule-based and neural-net AI. Then I pointed out that a lot of cool AI doesn’t fit into this framework at all, things like statistical machine learning and genetic algorithms. Now I’m going to dig my hole even deeper by arguing that the distinction between symbolic and connectionist AI is actually a lot fuzzier than most AI gurus realize.

This is a key issue because I often like to say that Webmind synthesizes connectionist and symbolic AI. While this is a true statement, it glosses over the peculiar vagueness of the notions of "symbolic" and "connectionist" themselves. When you get deeply into these concepts, you realize that this classical dichotomy is not quite framed correctly in most discussions on AI. There is a valid distinction between AI that is inspired by the brain, and AI that is inspired by conscious reasoning and problem-solving behavior. But the distinction between "symbolic" and "connectionist" knowledge representation is not as clear as it's usually thought to be.

Classically, the distinction is that in a symbolic system, meanings of concepts are kept in special localized data structures like rules, whereas in a neural-net-like connectionist system, meanings of concepts are distributed throughout the network. Also, in a symbolic system the dynamics of the system can be easily understood in terms of what individual rules do, whereas in a connectionist system the dynamics can basically only be understood holistically, in terms of what the whole system is doing.

But in reality the difference isn’t so clear. For example, one branch of symbolic AI is “semantic networks.” In a semantic network you have nodes that represent concepts, and links representing relations between concepts. Suppose you have a semantic network in which there is a node representing "floor." This is, obviously, symbolic in the classic sense. The meaning of the “floor” node is localized. But wait – is it really?

In some semantic-network-based AI systems, all the relations are made up by people. But some systems have reasoning processes that build new relationships – that learn, for example, that floors must be solid, because people walk on floors and people can only walk on solid things. In a system like this, relations are built from other relations, and so the meaning of the “floor” node may be contained in its relations to other nodes, i.e. its connections to other nodes. And the formation of these connections may have been based on the connections of the other nodes to yet other nodes, etc. etc. etc.

What this means is that, in a semantic network formed by iterative reasoning rather than by expert rule creation, each element of knowledge (each node) actually represents the result of a holistic dynamic. It has meaning in itself -- a link to our socially constructed concept "floor" -- but internally its meaning is its relation to other things, each of which is defined only by the other things it is related to, and so on; so that the meaning of the part is only truly describable in terms of the whole.
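A toy example makes the point. Below, a fact about floors is derived by a general inference over existing relations rather than hand-coded, so the "floor" node's meaning grows out of its connections. The relation names and the single rule are hypothetical inventions for illustration, not the machinery of any real semantic net system:

```python
# A miniature semantic network plus one inference rule. Relations are
# (subject, relation, object) triples stored in a set.

relations = {
    ("person", "walks_on", "floor"),
    ("person", "walks_on_only", "solid_things"),
}

def infer(rels):
    new = set()
    # If people walk on X, and people can only walk on solid things,
    # conclude that X is solid.
    for (a, r, b) in rels:
        if r == "walks_on" and (a, "walks_on_only", "solid_things") in rels:
            new.add((b, "is", "solid"))
    return rels | new

relations = infer(relations)
print(relations)
# The "floor" node now carries a derived relation -- its meaning came
# out of its connections to other nodes, not out of a hand-coded rule
# about floors specifically.
```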

On the other hand, suppose one has a neural network in which memories are represented as attractors (a Hopfield net, or attractor neural network, in the lingo). Then the meaning of a link between two nodes in this network mainly consists of the attractors that its presence triggers. But there's also a clear local interpretation: if the weight of the link is large, that means the two nodes it connects exist together in a lot of attractors -- i.e., they're contextually similar. If the weight of the link is large and negative, this means that the two nodes rarely co-exist in an attractor -- they're contextually opposite. Whether the nodes have immediate symbolic meaning or not depends on the application -- in typical attractor neural network applications, they do, each one being a perceptible part of some useful attractor.
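Here's a small numerical illustration of that local reading of weights, using the standard Hebbian outer-product storage rule on a four-unit net. The patterns are arbitrary toy data I've made up for this page, not any production system:

```python
import numpy as np

# A four-unit Hopfield-style attractor net. Each weight has a local
# reading (do these two units co-occur across the stored attractors?)
# even though recall is a holistic, whole-network process.

patterns = np.array([
    [1, 1, -1, -1],
    [1, 1,  1, -1],
])

n = patterns.shape[1]
W = np.zeros((n, n))
for p in patterns:
    W += np.outer(p, p)       # Hebbian storage
np.fill_diagonal(W, 0)        # no self-connections

print(W[0, 1])  # +2: units 0 and 1 agree in both patterns -- contextually similar
print(W[0, 3])  # -2: units 0 and 3 disagree in both -- contextually opposite

def recall(state, steps=5):
    s = state.astype(float)
    for _ in range(steps):
        s = np.sign(W @ s)    # synchronous update toward an attractor
        s[s == 0] = 1.0       # break ties upward
    return s

# A corrupted pattern settles into one of the stored attractors.
print(recall(np.array([1, 1, -1, 1])))   # -> [ 1.  1.  1. -1.]
```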

The point is, in both classic symbolic and classic connectionist knowledge representation systems, one has a mix of locally and holistically defined meaning. The mix may be different in different knowledge representation systems, but there is no rigid division between the two. This fact is important in understanding Webmind, which intermixes "symbolic" style and "connectionist" style knowledge representations freely.

Of course, there are extremes of symbolic AI and extremes of connectionism. There are logic-based AI systems that don't have nearly the holistic-meaning aspect of a reasoning-updated semantic network as I've described above. And, there are connectionist learning systems -- e.g. backpropagation neural nets -- in which the semantics of links are way less transparent than in the attractor neural net example I've given above. But this is also an interesting point. I believe that, of all the techniques in symbolic AI, the ones that are most valuable are the ones that verge most closely on global, holistic knowledge representation; and of all the techniques in connectionist AI, the ones that are most valuable are the ones that verge most closely on localized knowledge representation. This is because real intelligence only comes about when the two kinds of knowledge representation intersect, interact and build on each other.

As much of a techno-optimist as I am, even I have to admit, at this point, that building a real AI is a very hard problem. To build a comprehensive AI system, with perception, action, memory, and the ability to conceive of new ideas and to study itself, is not a simple thing. Necessarily, such a system consumes a lot of computer memory and processing power, and is difficult to program and debug, because each of its parts gains its meaning largely from its interaction with the other parts. The obstreperousness of this problem can be seen from the litany of brilliant AI failures I’ve just recited. Lenat, Hutchens and Hillis are just about as smart as human beings get, and pursued the real AI goal with diligence and expertise and wisdom – and yet none of them succeeded. De Garis and I are pursuing the same goal, in our different ways, and neither of us has succeeded yet either, although neither of us is ever likely to give up.

Yet, in spite of the obviously huge obstacles that the real AI goal presents, I still believe this is where the AI field should be focusing the bulk of its efforts. For is this not the only approach that can possibly succeed at achieving the goal of a real thinking machine? I realize that this attitude places me in a small minority of AI researchers – but if my team or one of our “real AI” focused competitors should succeed, it wouldn’t be the first time in the history of science and technology that a maverick minority had been proved right.

We now have, for the first time, hardware barely adequate to support a comprehensive AI system. Moore’s law and the advance of high-bandwidth networking mean that the situation is going to keep getting better and better. However, to me and the other “real AI” mavericks of the world, it seems that we are stuck with a body of AI theory that has excessively adapted itself to the era of weak computers, and that is consequently divided into a set of narrow perspectives, each focusing on a particular aspect of the mind.

In order to make real AI work, I believe, we need to take an integrative perspective, focusing on three things (a toy sketch follows the list):

· The creation of a “mind OS” that embodies the basic nature of mind and allows specialized mind structures and algorithms, each dealing with particular aspects of mind, to coexist happily

· The implementation of a diversity of mind structures and dynamics (“mind modules”) on top of this mind OS

· The encouragement of emergent phenomena produced by the interaction/cooperation of the modules, so that the system as a whole is coherently responsive to its goals
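To make those three bullets slightly more tangible, here is the skeleton of the idea in toy form: a shared knowledge store plus a scheduler, with two trivially simple “mind modules” whose combined effect neither produces alone. Every class and name here is a hypothetical illustration of mine, nothing like actual Novamente source code:

```python
# A toy rendering of the "mind OS" idea: a shared store of knowledge
# items plus a scheduler that lets heterogeneous mind modules act on
# the same store in turn.

class MindOS:
    def __init__(self):
        self.store = {}          # shared knowledge, visible to all modules
        self.modules = []

    def register(self, module):
        self.modules.append(module)

    def cycle(self):
        # Each module reads and writes the same shared store, so the
        # effects of one module become inputs to the others -- the
        # intended channel for emergent, whole-system behavior.
        for module in self.modules:
            module.act(self.store)

class PerceptionModule:
    def act(self, store):
        store["stimulus"] = store.get("stimulus", 0) + 1

class ReasoningModule:
    def act(self, store):
        if store.get("stimulus", 0) > 2:
            store["concept"] = "pattern noticed"

mind = MindOS()
mind.register(PerceptionModule())
mind.register(ReasoningModule())
for _ in range(4):
    mind.cycle()
print(mind.store)  # the "concept" emerges from the modules' interaction
```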

This is the core of the Webmind/Novamente vision, and the goal I’ve been working towards for my entire research career.