Novamente:

An Artificial General Intelligence in the Making

Ben Goertzel, Cassio Pennachin, Stephan Vladimir Bugaj

March 2002

1          Introduction

This article describes, at a fairly general and nontechnical level, an AI software program called Novamente, or the “Novamente AI Engine.”

This program is still under development, and our ambitions for it are large indeed.  We believe that one day (when completely implemented and tested and after a perhaps substantial teaching period), Novamente will be the world’s first real Artificial General Intelligence – the first piece of software with roughly human-level general thinking power ... and then after that, through consistent goal-directed self-modification, thinking power on a substantially and increasingly superhuman level. 

Defining “general intelligence” rigorously is a big job in itself, and one that has been addressed in the prior research work of one of the authors (Goertzel).  What we mean informally and intuitively, however, is not difficult to impart.  We do not mean the kind of intelligence demonstrated by existing AI programs, which display their capabilities only in very limited, specialized domains.  And we do not necessarily mean closely human-like general intelligence -- the Novamente system is not designed to emulate the human brain, and its early perception/action domains are not intended to emulate human experience.  Rather, what we mean is a system that can perceive and act in a variety of rich environments, that can control its own behavior, set its own goals, form its own ideas, and communicate its own desires.  The Novamente system is intended to ultimately function at the same level of generality and complexity as human intelligence, but it is not isomorphic to the human mind in either form or function.

We realize quite well that the claim of having a design for an artificial general intelligence is a big one; and that similar claims have been made before, which have proved overblown.   Because of this, we do not make this claim lightly: the ideas presented here are the result of years of effort on the part of dozens of researchers and engineers with diverse backgrounds and philosophies of AI, and most of these ideas have already been through several cycles of criticism at the hands of various informed individuals.  But even so, we do not of course require that the reader accept that our work has the potential we say it does.  All we ask is that the reader approach the text and the ideas in it with an open and an active mind.

We are often asked how long it will be until Novamente is “done.”  Hard experience has taught us to be conservative about answering such questions.  Any unprecedented act of creation is bound to come with unanticipated difficulties, and the Novamente team is presently dividing its time between short-term business-related AI projects and the implementation of the overall Novamente design for general intelligence.  With these caveats, our current estimate is that a first “complete” version of the system may be available sometime in 2004-2005.  This version will have many of the more basic parts of the system working in a well-tested and parameter-tuned way, but some of the subtler and more nondeterministic aspects will still require significant experimentation and tweaking.  Another year beyond that, if things go well, may bring us to a system that is being taught more than it’s being tweaked and parameter-tuned.  However, we’ve overcome enough previous obstacles that, if future obstacles delay these dates, we have every intention of continuing the work until success is achieved.

It may be a hackneyed metaphor, but we view the creation of real AI as in some ways analogous to the invention of aircraft.   Keeping a human up in the air was argued to be impossible by many learned people for many reasons; some even claimed mathematical proofs of impossibility based on the laws of physics.  Eventually a couple guys just did it, and although their first attempt was rather lousy compared to the airplanes, helicopters, and spacecraft we have now, it nevertheless demonstrated the same basic principles as the machines that have followed it.  

We don’t claim that Novamente is going to be the perfect and ultimate AI; but merely that, when completed, tested, tuned and taught, it will display genuine general intelligence.  Without a doubt this will be the beginning rather than the end of a huge scientific, engineering, human and ultimately transhuman adventure.

2          History and Prehistory

Novamente has a more colorful history than most pieces of AI software.  One of the authors (Ben) is currently working on a trade book (tentative title: Waking Up from the Economy of Dreams) that covers, among other topics, many aspects of this history.  For the present, however, a brief summary version will suffice. 

Conceptually, the  Novamente system is a successor to the Webmind AI Engine designed and constructed at Webmind Inc.  Webmind was the subject of intensive R&D work for several years by an interdisciplinary team numbering nearly 50 at its peak.   In spite of the conceptual similarity, however, Novamente is very different from Webmind both mathematically and in software architecture terms.  

The initial ideas for Webmind came out of Ben Goertzel’s theoretical work from the period 1987-1996, as recorded in the books The Structure of Intelligence (1993), The Evolving Mind (1994), Chaotic Logic (1994), and From Complexity to Creativity (1997) and a host of associated research papers.  What these books provided was a speculative complex-systems-theoretic mixture of philosophy, mathematics, computer science, psychology, linguistics, neurobiology, and other disciplines.  During the period 1994-97 Ben sought to transform the theoretical concepts he’d developed about mind and its physical basis, into a workable design for a general-purpose AI system.

In late 1997, the company Webmind Inc. (initially named Intelligenesis Corp.) was formed, with the dual aims of creating the Webmind AI Engine, and making a profit from software products formed from interim, non-generally intelligent Webmind versions.  When the company dissolved in April 2001, a casualty of the general tech stock crash, it had achieved neither of its two goals.  However, some very useful software products had been created and used in the real world (Webmind Market Predictor, Webmind Classification System); and, a huge prototype Webmind AI Engine had been coded in Java.  The Java Webmind AI Engine didn’t work very well as a software system, due mainly to unfortunate collisions between its design (as a generalized actor system) and the memory management properties of the Java Virtual Machine.  But a huge amount was learned via its creation, and via the experimentation that we did with it.

The Webmind AI Engine created at Webmind Inc. started out with Ben Goertzel’s ideas, but by early 2001 it had changed significantly from its initial form.  These changes did not alter the basic philosophical or conceptual nature of the system, but they certainly changed the underlying software structures, and significantly enhanced the underlying mathematical structures.  Very many individuals in the Webmind Inc. R&D group contributed to these changes, but chief among them were Cassio Pennachin (Webmind Inc. VP of AI Development) and Pei Wang (Webmind Inc. Director of Research, whose NARS system is the authors’ second favorite AI system).   Among others who contributed very significantly were Webmind Inc. co-founder Ken Silverman, in the early days; Anton Kolonin, the Siberian Madmind; Jeff Pressing, always an insightful critic, and the co-inventor of the PTL inference approach used in Novamente; and others such as Cate Hartley, Andre Senna, Thiago Maia, and Stephan Vladimir Bugaj. 

After Webmind Inc. folded, the authors and a half dozen others determined to continue with the Webmind AI project full-time, in spite of the temporary lack of funding.  It was decided to re-code the AI Engine from scratch, using C++ this time.  And along with this reimplementation, it was resolved to simplify the system design as much as possible, based on all the lessons learned over the past few years.  This quest for simplification led to much greater revisions than had been anticipated, including a complete reformulation of the mathematical foundations of the system.

At that time, to reflect the new effort and the new architecture, as well as to avoid any confusion with the defunct company, we decided to rename the system Novamente.  The new name reflects the largely Brazilian composition of the science/engineering team, and it has two fitting Portuguese meanings.  Novamente as a single word means “one more time”.  As the joining of the words “nova” and “mente” it is Portuguese for “new mind”.

Occasionally in this article we will wish to contrast Novamente with the Webmind AI Engine.  When we do so we will use the term Webmind 2000 to refer to the Webmind Inc. Java Webmind AI Engine, in its final form as of March 2001.  This system wasn’t really finished in 2000, and it probably never will be finished, unless some avid historian comes along sometime in the future.

3          The Underlying Philosophy of Mind

Unlike many other approaches to AI, Novamente originally emerged from a “philosophy of mind” perspective rather than an algorithmic or mathematical perspective.  The algorithms and mathematics have been devised to fit the philosophy of mind, along with the realities of implementation on contemporary computer hardware. 

Much of the philosophy underlying Novamente can be found in the pre-Webmind work of Ben Goertzel and Pei Wang (which of course relies heavily on a large amount of prior thinking by others).   These pre-existing philosophical concepts were significantly deepened and enriched during the course of Novamente AI development – by the authors, by Pei Wang, and by the collective mind of the Webmind Inc. AI Development division.   And a few critical conceptual insights were born only out of the destruction of Webmind Inc. and the remaining team’s subsequent soul-searching about how to simplify the Webmind system as much as possible to enable plausible implementation with their greatly reduced numbers.

In this section, we will briefly give our reactions to the standard objections to “Strong AI.”   We will then review the basic ideas on key concepts like “intelligence” and “mind” that underlie the Novamente system.  The psynet model of mind, a complex-systems philosophy of mind critical in the design and development of the Novamente system, will be briefly reviewed, and the key aspects of the Novamente design will be surveyed from a psynet model perspective.  The critical notion of “experiential interactive learning” will be presented, and finally, some of the more futuristic speculations that have been inspirational to Novamente design in various ways will be discussed.

3.1         Defending “Strong AI”

Part of what is normally covered in discussions of “AI theory” is the foundational question of whether AI is possible at all.  Obviously, the reader already knows what the authors believe on this topic.   We are avid advocates of what is called “Strong AI.”  We think the AI skeptics are wrong, for one reason or another, and that human-level artificial general intelligence is achievable using current technology (though not necessarily human-analogue AGI, a subtle but very important distinction).  We would like to see the skeptics explain to us in detail why they think Novamente or another similar system can’t possibly work. 

We won’t delve into this too deeply here, but it seems worthwhile to give our point of view on some of the more common objections.  We will consider, in the next few paragraphs, several of the more common anti-AI philosophies.   Perhaps our specific analysis of the common objections to the possibility of AI will be somewhat revealing of the philosophy underlying the Novamente AI Engine. 

Firstly, there are various theological arguments against AI, which center around one or another form of the idea that only creatures granted minds by God can possess intelligence.  This may be a common perspective, but it isn’t really possible to discuss it in a scientific context.  This is a purely theological discussion, and even in theological circles, there is no consensus about this particular dogma.  Some of the more liberal churches might be willing to accept that the pursuit and creation of intelligence by humans is not in opposition to their concepts of God and his power and works.

A different take is the notion that digital computers can’t be intelligent because mind is intrinsically embodied in neural quantum phenomena.  This is actually a claim of some subtlety, because David Deutsch (Deutsch, 1985) has proved that quantum computers can’t compute anything beyond what ordinary digital computers can.  But still, in some cases, quantum computers can compute things much faster on average than digital computers.  And a few mavericks like Stuart Hameroff (Hameroff, 1998) and Roger Penrose (Penrose, 1994) have argued that non-computational quantum gravity phenomena are at the core of biological intelligence.

Of course, there is as yet no solid evidence of cognitively significant quantum phenomena in the brain.  But a lot of things are unknown about the brain, and about quantum gravity for that matter, so these points of view can’t be ruled out.  However, no work has yet been presented  which gives a compelling argument that the capacity for general intelligence, or even human-style intelligence, is restricted to systems that  rely upon quantum computing. 

Our own take on this is: it’s possible (though unproven) that quantum phenomena are used by the human brain to accelerate certain kinds of problem solving; however, digital computers have their own special ways of accelerating problem solving, such as super-fast, highly accurate arithmetic.  Thus, the appeal to quantum theory has a higher probability of validity if one views Artificial Intelligence as the attempt to emulate, or replicate, human minds.  We don’t currently pursue that goal, and are perfectly happy with the notion that our Artificial Intelligence will differ from humans in numerous aspects, some of which may possibly be a consequence of the lack of quantum phenomena.  As it is not even our goal to replicate the mind of a human, let alone the brain of a human, refutations based on specific physical elements of the human brain are not directly relevant to our quest, unless they are explicitly augmented with arguments that high-level cognitive intelligence is unimplementable in a non-brainlike system.  So far, we have seen no work in this direction which has the specifics needed to dissuade us from pursuing strong AI on networks of von Neumann computers.

Now, on to what we consider the most interesting objection to human-level artificial general intelligence.  Some thinkers have argued that, even if it’s possible for a digital computer to be intelligent, there may be no way to figure out how to make such a program except by copying the human brain very closely, or by evolving one by running a prohibitively time-consuming process of evolution that roughly emulates the evolutionary process that gave rise to human intelligence.  Whether this simulation would be any more likely to create human-like AI than our approach, both being necessarily far from perfect analogues of the way brains and minds evolved in humans, is questionable.  We don’t currently have the neurophysiological knowledge to closely copy the human brain, and simulating a decent-sized primordial soup on contemporary computers is simply not possible.  So, by these arguments, AI may not be possible for a long while.

Out of the whole field of skeptical arguments, this might be the most plausible one (at least until more is understood about the relationship between quantum computing, neurophysiology, and general principles of intelligence).  However, we believe one can get around the “hardness of AI design” problem by using a combination of psychological, neurophysiological, mathematical and philosophical cues to puzzle out a workable architecture and dynamics for machine intelligence. 

However, this objection does bring up an interesting point: as mind engineers, we have to do a lot of the work that evolution did in creating the human mind/brain.  An engineered mind like the Novamente AI Engine will have some fundamentally different characteristics from an evolved mind like the human brain, but this isn’t necessarily problematic, since our goal is not to simulate human intelligence but rather to create a human-level intelligent digital mind that knows it is digital and uses the peculiarities of its digital nature to its best advantage.  (Though Novamente’s mind will become less “engineered” over time, through its evolutionary/emergent and deliberate/reasoned self-modification properties.)

A related objection is that posed by Hubert Dreyfus in his excellent book What Computers Can’t Do   (Dreyfus, 1992).  Coming from a Continental philosophy perspective, Dreyfus argues that intelligence isn’t just about minds, it’s about bodies, and computer programs without bodies can never be intelligent.  This isn’t an anti-robotic-AI argument, just an anti-disembodied-AI argument.  We believe there is a lot of truth to this perspective; however, we also believe that Dreyfus assesses the nature of embodiment a little too narrowly.   We have sought to analyze very carefully what a mind obtains from being attached to a body; and this has led to what we call the Experiential Interactive Learning approach, a crucial part of the Novamente AI philosophy.  In short, we believe that a mind does need to be embedded in an information-rich perception/action domain, in which it can interact with other minds.  But we believe there are many ways to achieve this other than close emulation of the human body.  

Part of our work will be to determine how similar embodiment must be for an AI to appear intelligent to humans, though this is as much an issue of human perception of intelligence as of the nature of intelligence itself.  It is possible, using a non-anthropomorphic approach to AI, to create an AI which even its creators may not recognize as truly intelligent.  However, since we do not consider a successful Turing Test to be a necessary measure of general machine intelligence – because we neither consider human-analogue intelligence to be the only feasible kind of high-level cognition nor feel such a subjective test is adequate – we would need to develop our own methodically testable criteria for general intelligence.  In the (likely very, very long!) time prior to developing such rigorous, testable criteria we are left, like Turing, with our own subjective measure of observable intelligence, which we try to make as rigorous as possible.  This criterion, and how we formalize it for use in our experimentation, will be discussed in more detail below, but it can be summed up as: “the ability to achieve complex goals in a complex environment.”  Or, a bit more cumbersomely but also more precisely, as: “the ability, without being given task-specific modifications by a human programmer, to perform complex tasks of its own volition, which have not necessarily been performed previously, in a complex environment, which has not necessarily been previously experienced.”

Finally, there are objections such as Searle’s Chinese Room argument (Searle, 1980), which argue that the notion of an engineered mind is somehow ill-founded.  Searle argues that a program translating Chinese into English via a huge lookup table could act as if it knew Chinese without really knowing Chinese.  Similarly, a program carrying out conversations via a huge lookup table could act as if it were intelligent without really being intelligent.  This argument effectively questions the meaningfulness of an interspecies definition of intelligence.  It asks us, among other things: is intelligence just about manifested behavior, or does it have to do with achieving certain behaviors using limited resources?  What is the intelligence in Searle’s thought-experiment: is it the program or is it the committee of humans who created the posited lookup table?  We will return to this sort of foundational question of the nature of intelligence a little later.  But our overall reaction to this kind of anti-AI argument is that, while it surely stimulates the mind, it doesn’t prove much of anything except that defining concepts like “intelligence” and “mind” in a conceptually bulletproof way is pretty tricky.  We can’t define “beauty” in a fully philosophically sound way, and yet we can engineer beautiful things; the same, we believe, will be true of “intelligence.”

In essence, Searle’s argument merely states that computer scientists may be able to create something which behaves exactly like an intelligence, but isn’t one.  While this view would have ramifications for what political rights and freedoms were granted such a machine in the human world, it has no real bearing on whether or not any particular engineering approach could create this intelligence-like thing he precludes calling intelligence.  We will call it an AI because we will use a pragmatic rather than theological or ontological definition of intelligence.  In terms of formal philosophy, our point of view is aligned with Charles S. Peirce’s pragmatism, which holds that the nature of an entity is contained entirely in its measurable properties.

3.1.1   What about Consciousness?

We’ve saved the most frustrating anti-AI argument for last.  Perhaps the most common objection to strong AI is the objection by reference to consciousness.   This of course has a good bit of overlap with some of the other objections mentioned above, especially the ontological complexity of defining consciousness.  Various skeptics will posit that consciousness is granted by God, or emerges only from quantum or quantum gravity phenomena, or is tied with the physical body and its chemical processes, etc.

Our chief reaction to the “consciousness objection” is the same as our objection to Searle’s Chinese Room argument.  Yes, “consciousness” is a very thorny concept.  But it’s pretty thorny in human life too.  The problem of solipsism has never been solved satisfactorily.  How can I know that anyone else, even my own wife or children, is really aware, and not just an automaton placed there by an evil scientist to fool me?  Consciousness essentially hinges upon introspection, and the unspoken contract that each human will gregariously allow for the assumed consciousness of the other (I think, therefore so do you).  If we can’t solve the problem of whether other humans are conscious in a philosophically bulletproof way, why should we expect to be able to solve the problem of consciousness for intelligent computer programs? 

It could be argued that ultimately brain science will allow us to resolve the mystery of how brains become conscious, and that this answer will help us to understand computer programs and whether or not they can be conscious.  Surely, there is some truth to this: unraveling the neural roots of attention and awareness will be fascinating and informative.  But we suspect that no “magic ingredient” is going to be found in this way.  Rather, we posit that consciousness in the brain will be found to be a complex emergent process tied in to many different subsystems, and the link between subjective experienced consciousness and its neural roots will remain subjectively quite mysterious.  Furthermore, what is found in human brains will apply specifically to human brains.  While these are the only examples of cognitive systems we humans are willing to call intelligent and conscious (the jury is still out on other primates, whales, elephants, extraterrestrial intelligences, etc.), we can not simply assume that no other system could ever possibly embody high-level intelligence just because we’ve not seen (or believed) the example.

Beyond this judicious sidestepping of the problem of consciousness, one perspective that we have found useful is the distinction between hot and cold aspects to consciousness, as emphasized in the work of the cognitive psychologist George Mandler (Mandler, 1985).

The cold or cognitive aspect to consciousness refers to a certain set of mind structures relating perception, action and memory – this is what psychologists currently call attention.   Attention can be empirically studied in humans, and the structures underlying human attention can be embodied in software, at various levels of abstraction.   We have carefully studied what is known about the structures and dynamics underlying human attention and are applying these notions at various functional levels of the system.

On the other hand, there is the hot or experiential aspect of consciousness, which is addressed extensively in Chinese, Indian, and modern European philosophy, and is ultimately not a scientific notion, but more of a human, subjective notion.  Whether to consider another human or another computer program conscious in this sense is perhaps more of an existential decision than a scientific one.  Stone Age man habitually made the existential decision to consider trees, non-human animals, and the Earth itself conscious.  Today few of us consider trees conscious, largely because we relate to them in a different sort of way than our predecessors did, and intense debate rages regarding the nature of animal minds and the changing nature of their relationship to humans.  In the future, as computers display greater intelligence and more complex forms of attention, we will likely come to relate to them in ways that will induce us to subjectively consider them as “conscious.”

Finally, it should be mentioned that the psynet model of mind to be presented in Section 4 below -- one of the primary conceptual foundations of Novamente -- bears a significant resemblance to the statements of various mystical philosophers about the way the mind feels.   This is not particularly relevant from the point of view of the present article, but for the interested reader, a discussion of the psynet model of mind from the perspective of consciousness and spirituality can be found in Goertzel’s rough-draft online manuscript The Unification of Science and Spirit (Goertzel, 1996).

3.2         Intelligence and Mind

Although a complete, philosophically unassailable definition of “intelligence” and “mind” is a difficult and probably unsolvable problem, nevertheless, as AI engineers we need a practical working definition.  This section reviews our practical working definitions of intelligence and mind.

3.2.1   Goertzel and Wang on Intelligence

As you have already seen, we do not believe that an intelligent computer must precisely simulate human intelligence.  The Novamente AI Engine won’t ever do that, and it would be unreasonable to expect it to, given that it lacks a human body.  The Turing Test -- “write a computer program that can simulate a human in a text-based conversational interchange” (Turing, 1950) -- serves to make the theoretical point that intelligence is defined by behavior rather than by mystical qualities, so that if a program could act like a human, it should be considered as intelligent as a human (the Chinese Room argument attempts to refute this claim).  However, Turing’s test, as useful as it may have been to establish this theoretical notion, is not useful as a guide for practical AI development.

Nor do we have either an IQ test or a rigorous test of basic general intelligence for the Novamente AI Engine.  The creation of such a test might be an interesting task, but it can’t even be approached until there are a lot of intelligent computer programs of the same type.  IQ tests work fairly well within a single culture, and much worse across cultures – how much worse will they work across species, or across different types of computer programs, which may well be as different as different species of animals?  But fortunately, we don’t need an IQ test yet.  What we need is just a simple working definition to guide practical progress.  Eventually we would like to have a formal test of general intelligence (both because it is theoretically fascinating and as a further answer to skeptics), and perhaps an AI IQ test as well; but should we fail to create such tests, we believe our working definition will suffice for engineering, and the proof will come if we create a system which exhibits enough general, flexible intelligence to “hold its own” amongst humans in its embodiment context.

In Goertzel (1993), a simple working definition of intelligence was given, building on various ideas from psychology and engineering.  It was formalized mathematically, but verbally, it was simply as follows:

Intelligence is the ability to achieve complex goals in a complex environment
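In Goertzel (1993) this verbal statement was given a precise mathematical form using the pattern-theoretic notion of complexity.  We will not reproduce that formalization here, but a schematic version (a shorthand for this article, not the exact published formula) conveys the flavor: weight a system S’s success on each goal/environment pair by the complexity of that pair,

    I(S) = \sum_{(g,E)} c(g,E) \cdot \Pr[\, S \text{ achieves goal } g \text{ in environment } E \,]

where c(g,E) is a complexity measure over goals and environments.  On such an account, a system counts as more intelligent the more complex the goals and environments in which it reliably succeeds.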

The Novamente AI Engine work was also motivated by a closely related vision of intelligence provided by Pei Wang (Wang, 1995).   Wang posits, basically, that

Intelligence is the ability to work and adapt to the environment
 with insufficient knowledge and resources.

More concretely, he believes that an intelligent system is one that works under the Assumption of Insufficient Knowledge and Resources (AIKR), meaning that the system must be, at the same time,

  • a finite system --- the system's computing power, as well as its working and storage space, is limited;
  • a real-time system --- the tasks that the system has to process, including the assimilation of new knowledge and the making of decisions, can arrive at any time, and all have deadlines attached to them;
  • an ampliative system --- the system not only can retrieve available knowledge and derive sound conclusions from it, but also can make refutable hypotheses and guesses based on it when no certain conclusion can be drawn;
  • an open system --- no restriction is imposed on the relationship between old knowledge and new knowledge, as long as they are representable in the system's interface language;
  • a self-organized system --- the system can accommodate itself to new knowledge, and adjust its memory structure and mechanism to improve its time and space efficiency, under the assumption that future situations will be similar to past situations.

Obviously, Wang’s and Goertzel’s definitions have a close relationship. 

Goertzel’s “complex goals in complex environments” definition is purely behavioral: it doesn’t specify any particular experiences or structures or processes as characteristic of intelligent systems. 

On the other hand, it may well be that certain structures and processes and experiences are necessary aspects of any sufficiently intelligent system.  In fact, our guess is that the science of 2050 will contain laws of the form: Any sufficiently intelligent system has got to have this list of structures and has got to manifest this list of processes.  

A full science along these lines is not (in our view) necessary for understanding how to design an intelligent system.  But we do need some ideas along these lines in order to proceed toward human-level artificial general intelligence today, and Wang’s definition of intelligence is a step in this direction.  Wang posits that, for a real physical system to achieve complex goals in complex environments, it has got to be finite, real-time, ampliative and self-organized – and we suspect that this hypothesis is true.  It might well be possible to prove this mathematically, but this is not the direction we have taken; instead we have taken this much to be clear and directed our efforts toward more concrete tasks.  Since Wang’s definition is not mathematically proved, and does not rigorously specify that some particular implementation structure is necessary to realize these principles, we have instead embarked upon a program of experimentation, through which we hope to further guide the theory by reflection upon our hoped-for engineering successes.

3.2.2   Novamente’s Particular Goals

What do we mean when we say that we intend the Novamente AI Engine to be a human-level artificial general intelligence, a “real AI”?   Roughly speaking, based on the concepts from the previous sections, what we mean is that it will be capable of achieving a variety of complex goals in the complex environment that is, in its first implementation, the Internet.  This will necessarily be done using finite resources and finite knowledge.    Of course, this kind of abstraction is not in itself very satisfying.  But to go beyond it and get concrete, one has to specify something about what kinds of goals and environments one is interested in. 

In the case of biological intelligence, the key goals are survival of the organism and its DNA (the latter represented by the organism’s offspring and its relatives). These lead to sub-goals like reproductive success, status amongst one’s peers, etc., which lead to refined cultural sub-goals like career success, intellectual advancement, and so forth.   The external goal of survival gets internalized into personal goals which are believed to enhance survivability, and in social animals, into societal goals which enhance the chances of group survival.  An intelligent computer program, however, will emerge into an environment very different from the Veldt. 

The evolution of machine intelligence will take place in an environment of machines, data, and interaction with humans – the forces of nature initially acting only indirectly on the AI, through the human world in which its world is embedded.  Furthermore, it is clear that another intelligence (humans) will play a significant role in shaping the goals of AIs, at least initially.  Thus, we can both plan the initial goals of AIs as we are building them and tune the initial parameters of the evolutionary processes which will guide their expansion.  Beyond that, evolutionary forces related to, but not equivalent to, those which shaped our present world will serve to shape the new world of humans and AI systems.

In our case, some of the goals that the Novamente AI Engine version 1 is expected to achieve are listed below.  Some of these goals are general cognition goals for the system, while others are related to the initial areas of application of the system.  By building, and enhancing through evolutionary and experiential-interactive-learning processes, the general cognition and specialist capabilities of the system in parallel, we are taking advantage of computers’ capacity to be given rigorous, task-specific procedural information, which serves both to meet immediate needs and to provide procedural and declarative domain knowledge that can be used as exemplars for training and testing general cognition.  The non-exhaustive list of primary, high-level goals is as follows:

1.       Conversing with humans in simple English, with the goal not of simulating human conversation, but of expressing its insights and inferences to humans, and gathering information and ideas from them.

2.       Learning the preferences of humans (and other AI systems), providing them with information in accordance with their preferences, clarifying those preferences by asking questions and responding to the answers, and guiding its own internal thought processes to focus sufficient attention on things important to the social context of its human and AI peers, so that it can succeed in its social tasks.

3.       Communicating with other Novamentes, in a manner similar to its conversations with humans, but using a Novamente-customized formal language called KNOW.

4.       Composing knowledge files containing its insights, inferences and discoveries, expressed in KNOW or in simple English.  (Similar to a human professional writing a white paper about her area of expertise, though written in a more formal language or restricted subset of English.)

5.       Reporting on its own state (in a contextually appropriate manner, not merely through diagnostic measurements), and modifying its parameters based on its self-analysis to optimize its achievement of its other goals.

The primary application goals of the Webmind AI Engine were:

1.       Predicting economic, financial, political and consumer data based on diverse numerical data and on concepts expressed in news, statistical surveys, and other natural language and analytical reports relevant to socioeconomic trends.

2.       Enabling intelligent Internet-wide information retrieval via search based on natural language conversation.

While this sort of application is still of general interest to us, at the moment we have shifted our application focus primarily to the biology domain, concerning ourselves with the analysis and prediction of functional and structural traits of organisms, including pathogens, based on numerical genomics and proteomics data and associated natural language and analytical reports found in journals and other publications.  We are presently doing some practical work, with a partial version of the Novamente system, analyzing gene expression data derived from gene chips and spotted microarrays, in the context of information derived from various biological databases.

    

In case these aren’t ambitious enough, subsequent versions of the system are expected to offer enhanced conversational fluency, enhanced abilities at knowledge creation, including theorem proving, scientific discovery and the composition of knowledge files consisting of complex discourses, and additional modalities and distribution of perception.  Ultimately we hope our system will achieve the “holy grail” of AI: progressive goal-directed self-modification, leading to exponentially accelerating artificial superintelligence! 

Our aim is to work toward these lofty goals step by step, beginning with a relatively simple “Baby Novamente” and teaching it about the world as its mind structures and dynamics are improved through further scientific study and engineering experimentation.

Are these goals complex enough that the AI Engine should be called intelligent?  Ultimately this is a subjective decision.  Our belief is, not surprisingly, yes.  Novamente will not be an expert system such as a chess-playing program or a medical diagnosis program, which is capable in one narrow area and ignorant of the world at large.  This is a design for general, autonomous intelligence – a program that studies itself and interacts with others, that ingests information from the world around it and thinks about this information, coming to its own conclusions and guiding its internal and external actions accordingly.

 How smart will a completed Novamente be, qualitatively?  Our intuitive sense is:

  • The first version will be significantly less intelligent than humans overall, though smarter in many particular domains – particularly those which play to the strengths of computers, such as numerical data analysis and formal logic.
  • Within a few years from the first version’s release there may be a version that is competitive with humans in terms of overall intelligence capacity, though not isomorphic with humans in terms of intelligence focus and capability.
  • Within a few more years there will probably be a version dramatically smarter than humans overall, with a much more refined self-optimized design running on much more powerful hardware (should our work in giving the system advanced self-organization and self-optimization capabilities succeed).  

But although we do present them here, we feel that such speculations are not really all that valuable.  The important thing is to do the work and see where it leads – though not to do so blind of potential consequences, which is why we remain active in the development of ideas such as Sasha Chislenko’s (1998) theory of Hypereconomic Fairness and Eliezer Yudkowsky’s (2001) work on “Friendly AI.”

3.3         Pattern, Intelligence and Mind

Part of the early inspiration for Webmind and Novamente was a formalization of the notions of pattern, intelligence and mind, presented in Goertzel’s research treatises referenced above.   Overall, the role of these formalizations in practical Novamente work has been critical, though not thoroughgoing.  Most parts of the system have been developed without explicit heed paid to such general formalizations (although they are always there in the background to guide thinking).  But there are a few very important places where these formal notions have played an explicit role, as will be seen as we discuss important elements of Novamente that deal with the system’s “feelings”, with complex logical relations, and with the control of the system’s dynamics.

Goertzel’s formal definition of pattern relies on concepts from algorithmic information theory, which become slightly intricate in theory, though they’re simple enough to implement in practice using approximate algorithms and heuristics.  But the intuitive essence of the definition of pattern is very simple:

A pattern is a representation as something simpler

In other words: a pattern in X is something that is judged to be roughly substitutable for X in some context, but is also judged simpler than X.   Of course, assessments of equivalence and simplicity are to some extent subjective, and hence so is the “pattern” concept.  But this seems unavoidable.  As will be seen, all concepts in finite algorithmic information theory are basically subjective anyway, in that they’re only meaningful relative to some reference Universal Turing Machine.  Throughout this manuscript, however, the subjectivity in our working definitions and criteria will be reduced to what is relevant within the context of the mathematical theory presented herein.  What ungrounded subjectivity remains, we will openly admit, and we hope to close the gap through the interplay between experiment and theory described above.
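As a concrete, if crude, illustration of the intuition, one can use lossless compression as a stand-in for the formal simplicity measure: data that can be regenerated from a much shorter representation contain a strong pattern, while random noise does not.  The small Python sketch below is purely illustrative and is not part of the Novamente codebase; as noted above, the system’s actual pattern-related machinery relies on approximate algorithms and heuristics.

    import os
    import zlib

    def simplicity_gap(x: bytes) -> float:
        # Crude proxy for "pattern intensity": how much shorter a lossless compressed
        # representation of x is than x itself (0 means no pattern was found).
        return max(0.0, 1.0 - len(zlib.compress(x)) / len(x))

    patterned = b"ab" * 5000      # obvious regularity: "repeat 'ab' 5000 times"
    noise = os.urandom(10000)     # statistically patternless data

    print(simplicity_gap(patterned))  # close to 1: a far simpler representation exists
    print(simplicity_gap(noise))      # close to 0: no simpler representation is found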

A related concept to “pattern” is that of “emergence.”  This is a concept that has existed in philosophy for millennia in various forms, but that has become explicitly important to science only in recent years with the advent of “complexity science.”  Scientists studying physical, chemical, biological, psychological and social systems have observed many examples of system properties that are not transparently reducible to properties of system parts, but appear rather to “emerge” from the system as a whole.  We believe this concept of emergence is as critical to the study of artificial minds as it is to other areas of science (including the study of biological minds). 

Emergence is a critical notion for Novamente, because one of the main ideas of the system is that important aspects of knowledge and behavior only exist as emergent patterns among the various components of the system.  This is the most important sense in which Novamente is both an engineered system which acts according to formal laws and design choices, and an “evolutionary” one which changes and adapts to (and emerges from) its environment – the use of explicit genetic programming techniques in the system is but one implementation element relevant to this general principle.

Roughly speaking, we may formalize the notion of emergence by saying that an emergent pattern is a pattern in the combination of A and B that is not a pattern in A or B individually.  And this allows us to create a formal conceptualization of the notion of “mind”:

A mind is the set of patterns in a system, or emergent between the system
 and its environment, that are related to the system’s intelligence

This is obviously a highly pragmatic, non-mystical definition of “mind.”  It doesn’t address issues of consciousness or experience.  But it does inherit the generality of algorithmic information theory: it’s not tied to humans, or Novamentes, or any other particular type of intelligent system.
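For readers who prefer their definitions compact, the notions of emergence and mind just given can be transcribed set-theoretically.  Writing Pat(X) for the set of patterns in X and Env(S) for the environment of system S (a notational shorthand introduced here, not a claim about the precise formalism of the earlier books), we have roughly:

    \mathrm{Em}(A, B) = \mathrm{Pat}(A \cup B) \setminus (\mathrm{Pat}(A) \cup \mathrm{Pat}(B))

    \mathrm{Mind}(S) = \{\, p \in \mathrm{Pat}(S \cup \mathrm{Env}(S)) : p \text{ is related to the intelligence of } S \,\}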

According to this definition, the pattern-building heuristics that are active in Novamente are explicitly involved in working to increase the amount of mind Novamente has.

4          From the Psynet Model of Mind to Novamente

One of the chief inspirations for the Novamente AI design is a complex-systems-oriented philosophy of mind developed by Ben Goertzel during the period 1987-96, and presented in (Goertzel, 1993, 1993a, 1994, 1997).   In (Goertzel, 1997) this philosophy was named the psynet model of mind.  

This section briefly reviews some of the key ideas of the psynet model, without any pretense of completeness, and with a distinct Novamente bias.  It then explains how Novamente’s key data structures and dynamics emerge fairly naturally from the psynet model.

4.1         Some Psynet Principles

The psynet model draws on numerous sources in the history of philosophy; but out of all of these, perhaps the most significant influence was that of the late 19th-century American thinker Charles S. Peirce.  Never one for timid formulations, Peirce declared that:

Logical analysis applied to mental phenomena shows that there is but one law of mind, namely, that ideas tend to spread continuously and to affect certain others which stand to them in a peculiar relation of affectability. In this spreading they lose intensity, and especially the power of affecting others, but gain generality and become welded with other ideas.

This is an archetypal vision of mind that we call "mind as relationship" or "mind as network."  In modern terminology, Peirce's "law of mind" might be rephrased as follows: "The mind is an associative memory network, and its dynamic dictates that each idea stored in the memory is an active agent, continually acting on those other ideas with which the memory associates it."  
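A toy sketch of this reading, which is purely our own illustration rather than Peirce’s formalism or Novamente’s actual dynamics, might look like the following: activation spreads along weighted associations and loses intensity as it goes.

    from collections import defaultdict

    # Hypothetical weighted associative links between ideas (illustrative data only).
    associations = {
        "dog": {"animal": 0.8, "bark": 0.6},
        "animal": {"alive": 0.7},
        "bark": {"sound": 0.5},
    }

    def spread(activation, decay=0.5, steps=2):
        # Each idea continually acts on the ideas associated with it, passing on
        # activation that is attenuated (loses intensity) as it spreads.
        for _ in range(steps):
            new = defaultdict(float, activation)
            for idea, level in activation.items():
                for neighbor, weight in associations.get(idea, {}).items():
                    new[neighbor] += level * weight * decay
            activation = dict(new)
        return activation

    print(spread({"dog": 1.0}))   # activation flows outward from "dog" toward related ideas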

Others since Peirce have also viewed the mind as a self-organizing network, of course.  Marvin Minsky has famously conceived it as a “society” and theorized about the various social mind-agents that cooperate to form intelligence (Minsky, 1988).   Some of the work done under the guise of “agent systems” (Weiss, 2000; Wooldridge, 2001) is relevant here, although most of this work pertains to agents interacting economically or in simple collective problem-solving contexts, rather than systems of agents cooperating to yield emergent intelligence. There is some current agent systems work, especially in the area of Ant Colony or Swarm intelligence research (Bonabeau, 1999), which seeks to create a higher level of emergent intelligence from simpler agents, but not a Novamente-level of intelligence. Unlike Minsky’s Society of Mind, however, the psynet model’s particular take on the self-organizing-agent-system view of the mind is strongly influenced by recent work in “complexity science.”   The focus is not primarily on agent interactions and the properties of individual agents, but on the emergent structures that arise from the interactions of particular types of agents – mental complexity, mental emergence, mental self-organization.  The psynet model is more Peircean than Minskian, but it adds an awful lot of details that Peirce overlooked.

The psynet model is not the kind of philosophy that can be easily summarized in a compact set of principles.  It comes from a philosophical tradition (inspired by Peirce, Nietzsche and others) that eschews this kind of systematization at the fundamental philosophical level.   But it is obviously possible to give an approximate summary of some key points of the psynet model, and we will do so here.  According to the psynet model of mind:

1.       A mind is a system of agents or "actors" (our currently preferred term) which are able to transform, create and destroy other actors

2.       Many of these actors act by recognizing patterns in the world, or in other actors; others operate directly upon aspects of their environment

3.       Actors pass attention ("active force") to other actors to which they are related

4.       Thoughts, feelings and other mental aspects are embodied in self-reinforcing, self-producing, systems of actors, which are to some extent useful for achieving the goals of the system

5.       These self-producing mental subsystems build up into a complex network of attractors, meta-attractors, etc.

6.       This network of subsystems and associated attractors is "dual network" in structure, i.e., it is structured according to at least two principles: associativity (similarity and generic association) and hierarchy (categorization and category-based control).

7.       Because of finite memory capacity, mind must contain actors able to deal with “weakly grounded” or “ungrounded” patterns, i.e., actors which were formed from now-forgotten actors, or which were learned from other minds rather than first hand – this is called “reasoning.” (Of course, forgetting is just one reason for abstract concepts to occur in the mind.  The other is generalization -- even if the direct (strong) grounding materials are still around, abstract concepts ignore the historical relations to them.)

8.       A mind possesses actors whose goal is to recognize the mind as a whole as a pattern and to make the non-autonomic decisions about the functioning of the mind (such as high-level focus) – these are "self."

According to the psynet model, then, the mind is at bottom a system of actors interacting with each other, transforming each other, recognizing patterns in each other, and creating new actors embodying relations between each other.   Individual actors may have some level of independent intelligence, but most of their intelligence lies in the way they create and use their relationships with other actors, and in the patterns that ensue from multi-actor interactions.  These emergent patterns are where we believe higher-level cognition will be primarily embodied in the Novamente system.

You will note that the concept of “pattern” comes up a lot in the psynet model.  It is thus instructive to apply the rough definition of pattern as “simplified representation” given above, in a psynet model context.  What it tells us is that the mind-actor system posited by the psynet model is not just a pool of actors playing around with each other, but a pool of actors that are looking at each other and trying to create new actors that are simplified versions of other actors or sets of actors, or that represent patterns among a large number of other actors (and, because perception in the system is carried out by, and the perceived signals are encoded as, actors, in the environment as well).   Simplicity and complexity have an interesting relationship here: The more the mind’s parts simplify each other, the more complex the mind as a whole can become.  This is because abstraction and generalization are performed by simplifying from the particulars, with varying degrees of rigor depending upon the methodology (and creativity / autonomous reasoning is, in part, abstraction of and application of abstract notions to particulars of procedural knowledge).

 

4.2         The Need for a Multilayered Novamente Implementation

It is obvious from even this capsule summary of psynet principles that a psynet-based AI system would best be implemented on a massively parallel hardware substrate.  The most natural psynet implementation is one in which each actor has its own memory and processing power.  However, given the state of contemporary computing hardware, this is simply not plausible.  This leads one to develop AI software architectures with at least two layers:

  • An actor system layer, providing a “Mind OS” allowing a system of actors to act and interact freely, “as if” they were on a massively parallel substrate
  • A collection of particular types of actors, running as “mind processes” atop the Mind OS
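To make this two-layer picture concrete, the following toy sketch (in Python, purely for illustration; the actual Novamente core is written in C++ and its scheduling is considerably more sophisticated) shows a “Mind OS” that simulates massive parallelism on serial hardware by handing out short time slices to actors, biased by their attention values:

    import random

    class Actor:
        """A generic 'mind actor': holds an attention value and does a small unit of
        work whenever the Mind OS layer grants it a time slice."""
        def __init__(self, name: str, attention: float = 1.0):
            self.name = name
            self.attention = attention

        def step(self) -> None:
            # Placeholder for real actor behavior (pattern recognition, inference, ...).
            print(f"{self.name} acting (attention={self.attention:.2f})")

    class MindOS:
        """Toy 'Mind OS': simulates a massively parallel actor system on serial
        hardware by repeatedly handing out short time slices, biased by attention."""
        def __init__(self, actors):
            self.actors = actors

        def run(self, n_slices: int) -> None:
            for _ in range(n_slices):
                weights = [a.attention for a in self.actors]
                chosen = random.choices(self.actors, weights=weights, k=1)[0]
                chosen.step()

    # Usage: two "mind processes" running "as if" in parallel atop the Mind OS layer.
    MindOS([Actor("inference", attention=2.0), Actor("perception", attention=1.0)]).run(5)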

Getting this layering “right” has been a major engineering challenge.  One is always trapped between the Scylla of making an overly general Mind OS that is inadequate performance-wise, and the Charybdis of making an overly specialized Mind OS that doesn’t allow the actors sufficient flexibility for self-organization and growth.  While this is only one of the many dilemmas faced, it is one of the most important high-level concerns we have kept in focus during our engineering and experimentation.

Webmind 2000 provided a very general Mind OS bottom layer, suitable for implementing any kind of software actor.  There was an intermediate layer, specializing the general Mind OS for “mind actors,” one particular abstract actor type out of the innumerable ones that could have been created atop the foundational layer.  Finally, at the highest Mind OS layer, there were the specialized actor types chosen to constitute the mind.  This was a very elegant way of proceeding, but it turned out to be prohibitively inefficient, partly because Java was the wrong language to choose for implementing such a system, and partly because of the inefficiency inherent in such generality.

The new C++ core has only two layers, and its Mind OS layer is fairly specialized to the particular kinds of “mind actors” being run on it, as it was designed to support our model of general intelligence, not a model of general agent systems.   Many of the details of Mind OS operation have also been changed.  Most notably, the approach to scheduling (determination of which actors get CPU time when) has been totally revised.   The main point for now, however, is the way that the broad strokes of the Novamente software architecture come directly from the basic principles of the psynet model of mind.

4.3         Novamente’s Mind-Actors

The psynet model of mind tells one a lot about what kind of multi-actor system one should build in order to create a mind - but it doesn’t go nearly far enough.  The big question is: what kinds of actors to throw into the soup?  This topic is really the meat of the Novamente AI design.

The conceptual work done by the Webmind Inc. AI Development team, and the Novamente team after it, has fallen mainly into two parts:

1.       Figuring out how to make the Mind OS reasonably efficient yet adequately powerful and expressive to embody and support the necessary mind actors

2.       Finding a minimal collection of actor types that, collectively, can do everything a Novamente needs to do (and with enough efficiency to allow the system to meet Pei Wang’s real-time criteria for intelligence)

The former area of work lies largely in the realm of implementation, and so it won’t be covered very thoroughly here, though some of the major lessons learned will certainly be discussed.    The latter area is really the crux of the “mind design” underlying Novamente.

One of the biggest subtleties of this work, in retrospect, has lain in the interrelationship between these two tasks.  Until the required collection of actor types was known, the optimal design of the Mind OS couldn’t be known; but until a reasonably efficient Mind OS existed, various combinations of actors couldn’t be experimented with.  Thus we proceeded iteratively, adjusting the Mind OS based on the currently hypothesized actor type set, then learning more about these actors, then adjusting the Mind OS again.  The result was that the Webmind Inc. Webmind AI Engine versions rarely functioned adequately as a whole, although they were often operated successfully in “restricted modes” involving only a handful of actor types.  On the other hand, the amount learned through this iterative process was incredible; with a project of this magnitude and originality, this iterative “trial and error” exploration process was unavoidable (and, like similarly ambitious projects, it resulted in a lot of good work in related areas along the way – such as Webmind’s various technical successes in the areas of search, categorization, genetic programming, financial prediction, and probabilistic NLP).

At this point, with all this experimentation behind us, we feel we have a good handle on the actor types required to make a working Novamente.  Here what will be presented is the summary of a summary, designed to get across the conceptual essence of the assemblage of Novamente actors without any of the details.

4.3.1   Nodes and Links

The actors in Novamente are called “atoms,” and there are two important types of atoms: nodes and links.  Links are also called relationships; we will use these terms interchangeably here.   Atoms are a basic conceptual and implementational unit in Novamente.

Roughly speaking, a relationship (link) is a typed n-tuple of other entities (which may be nodes or relationships). 

And, roughly speaking, a node is a typed entity that is involved in relationships.   Some nodes also contain data, such as numbers, strings, or small computer programs called “schema”; others don’t contain anything. 

Nodes and links come along with two important packages of numerical information:

  • Truth value information, which indicates the “frequency” with which the relationship actually holds, or with which the node is relevant to the world.  (This can be a single number, a set of numbers, or an approximation of a probability distribution.  Novamente does not make higher-level decisions in a Boolean mode; rather, it uses a probabilistic logic, and truth values can be represented as a probability distribution or an approximation thereof.)
  • Attention value information, telling the Mind OS how readily the atom should be allocated memory or CPU time (or whether it should be deleted altogether)

Within this framework the problem of finding “the right combination of actor types” is made more specialized.  One now wants to find the right combination of node and link types.  It is the replacement of generic “mind actors” with nodes and links as described above that is the key step in moving from the generic psynet model of mind to the Novamente AI design. 

The choice of nodes and links of this nature as an embodiment of “mind actors” was made very carefully (in 1996, by Ben Goertzel, prior to the formation of Webmind Inc.), and represents a sort of compromise between the two extreme poles of contemporary AI: neural networks and formal logic.   Neural networks are graphs made of very simple nodes and links, which have attention values (most simply, single-number “activations”).   Semantic networks, one manifestation of logic-based AI, are graphs made of typed nodes and links, which have truth values.  A facile fusion of neural and semantic networks based on the fact that they both share a graph data structure would not be very interesting.  However, fusing neural and semantic networks within the context of the psynet model of mind has proved significantly more productive.   In the psynet model, the neural-like activation values and semantic-like truth values are interrelated, grounded in perception and action (ultimately, though often indirectly), and modified by cognitive processes in which aspects of both types of networks are specifically intermingled. 
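
To make the atom abstraction a bit more concrete, the following C++ sketch shows one plausible shape for nodes and links carrying both a truth value and an attention value package.  The class and field names here (SimpleTruthValue, AttentionValue, and so on) are illustrative assumptions made for this article, not a description of the actual Novamente codebase.

    #include <memory>
    #include <string>
    #include <vector>

    // Illustrative sketch only: every atom (node or link) carries a truth value
    // (probabilistic evidence) and an attention value (resource priority).
    struct SimpleTruthValue {
        double frequency;    // estimated frequency with which the relationship holds
        double confidence;   // weight of evidence behind that estimate
    };

    struct AttentionValue {
        double shortTermImportance;  // how urgently the atom deserves CPU time now
        double longTermImportance;   // how strongly it deserves to be kept in memory
    };

    struct Atom {
        std::string type;            // e.g. "ConceptNode", "InheritanceLink"
        SimpleTruthValue truth{};
        AttentionValue attention{};
        virtual ~Atom() = default;
    };

    // A node may carry data: a number, a string, or a small program ("schema").
    struct Node : Atom {
        std::string name;            // e.g. the word "dog" for a WordNode
    };

    // A link is a typed n-tuple of other entities, which may be nodes or links.
    struct Link : Atom {
        std::vector<std::shared_ptr<Atom>> outgoing;
    };

    int main() {
        auto dog = std::make_shared<Node>();
        dog->type = "ConceptNode";  dog->name = "dog";
        dog->truth = {0.9, 0.7};    dog->attention = {0.5, 0.8};

        auto animal = std::make_shared<Node>();
        animal->type = "ConceptNode";  animal->name = "animal";

        Link inh;
        inh.type = "InheritanceLink";
        inh.truth = {0.95, 0.8};
        inh.outgoing = {dog, animal};   // "dog inherits from animal"
        return 0;
    }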

4.3.2   Node and Link Varieties

Novamente contains many node and link types, but there is a layer of abstraction between the concept of “nodes and links” and the specific node and link types that make up Novamente.   We call this layer “node and link varieties,” and each variety may contain many different specific types. 

The node varieties currently used in Novamente are:

  • Perceptual nodes, which are true and active at a given time to the extent that particular type of stimulus (i.e. a member of particular category of stimulus events) is present or not in the sensed environment.
  • Action nodes, which are true to the extent that they are acting at a given time (i.e. that a certain set of Novamente system actions corresponding to them is being carried out).  These nodes contain small programs called “schema,” and are called SchemaNodes.
  • Basic conceptual nodes, which represent categories of perceptual, action or conceptual nodes, and are true to the extent that the nodes in the categories they represent are true.  Some of these (ConceptNodes) are just semantic “tokens” to be used in relationships.  There are also some special kinds of conceptual nodes associated with particular learning processes, including SchemaConceptNodes, which represent categorical information about actions, and PredicateNodes, which evaluate to truth values.
  • Action-Concept Nodes.  These are ConceptNodes that are linked to particular SchemaNodes.  They are called SchemaConceptNodes.
  • Psyche nodes (GoalNodes and FeelingNodes), which play a special role in overall system control, monitoring system health and orienting overall system behavior.
  • Nodes embodying logical combinations of relationships.  These are CompoundRelationNodes, which are actually just SchemaConceptNodes interpretable as predicates (meaning that they refer to schema that map into truth value space)

The link varieties are:

  • Basic Relationships, which are typed tuples of nodes or relationships
  • Conceptual links, including AssociativeLinks representing association, a host of inferential links representing probabilistic relationships, and PredicateEvaluationLinks reflecting the input-output behavior of PredicateNodes
  • Action links, called ApplicationLinks, indicating input-output relationships between schema
  • Action-Concept links, called ExecutionLinks, forming a conceptual record of actions taken (each link has three arguments; the syntax is ExecutionLink Schema Input Output)

These varieties of atoms embody the current set of atomic Novamente types, and while we do not see this as likely, it is feasible that this list could be revised based on further experimentation.

4.3.3   A Rough Overview of Node and Link Types

The choice of node and link types is a subtle issue, involving a combination of mathematical, implementational and conceptual issues.  A complete enumeration of the key node and link types in the current Novamente system is a substantial task, and would require too much mathematical background for this sort of article.  Here, where the focus is on the big conceptual picture rather than the details, it will suffice to review the key functions that these specialized actors perform.   The breakdown of these functions into particular node and link types is usually straightforward, but sometimes (as with the definition of logical links) can get a bit thorny.

Perceptual nodes come in many different types depending on the objects being perceived.  Minimally, there are NumberInstanceNodes and CharacterInstanceNodes.  In practice we have never worked without WordInstanceNodes, and associated things like PunctuationInstanceNodes.  For bioinformatics applications we have introduced GeneticSequenceInstanceNodes, GeneInstanceNodes, and additional structures along these lines.   It is a subtle issue to determine what one wants to have the system experience as primary rather than secondary (especially with complex perceptive signals, such as audio and video, which we have not yet implemented and experimented with). 

Ultimately, the assemblage of perceptual node types depends on the nature of the data that a given Novamente is exposed to.  This is the most customizable, mutable aspect of the node and link type set, and will need to be experimented with extensively to determine the best perceptive-cognitive balance to ensure a system which can learn from observation.   Most of the perceptual node types correspond to conceptual node types, so that, for instance, one has conceptual WordNodes corresponding to perceptual WordInstanceNodes.   A particular occurrence of the word “dog” in a sentence is represented by a WordInstanceNode, whereas the word “dog” in general is represented by a WordNode.

Action nodes are implemented a bit differently, for technical reasons.  There is only one kind of action node, the SchemaInstanceNode.  A SchemaInstanceNode wraps up a small program called a “schema.”  There are also compound schema, which are graphs of elementary schema (basically, compound schema are functional programs whose base-level functions are elementary schema).

The set of simple schema is in effect an “internal Novamente programming language,” which bears some resemblance to functional programming languages like pure LISP or Haskell.  The “actions” carried out by SchemaInstanceNodes are not just external actions, they are also in some cases internal cognitive actions.  Ultimately, all the AI processes carried out inside Novamente could be formulated as compound schema, although in the current core implementation, this is not the case; the primary AI dynamics of the system are implemented as C++ objects called MindAgents, which are more efficient than compound schema. 
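
As a rough illustration of the “functional program” view of compound schema, here is a small C++ sketch in which elementary schema are modeled as functions and a compound schema is simply their composition.  The names and the use of std::function are assumptions made for this sketch; they are not the internal Novamente schema representation.

    #include <functional>
    #include <iostream>
    #include <vector>

    // Illustrative sketch: an elementary schema is modeled as a function from
    // a tuple of values to a value; a compound schema composes elementary ones.
    using Value  = double;
    using Schema = std::function<Value(const std::vector<Value>&)>;

    Value addAll(const std::vector<Value>& in) {
        Value s = 0;
        for (Value v : in) s += v;
        return s;
    }

    Value square(const std::vector<Value>& in) { return in[0] * in[0]; }

    // A compound schema as a simple pipeline: feed the output of one schema
    // into the next (a degenerate case of the general schema graph).
    Schema compose(Schema first, Schema second) {
        return [first, second](const std::vector<Value>& in) {
            return second({ first(in) });
        };
    }

    int main() {
        Schema sumThenSquare = compose(addAll, square);
        std::cout << sumThenSquare({1.0, 2.0, 3.0}) << "\n";  // prints 36
        return 0;
    }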

Schema come with two special relationship types: ApplicationLink, used for passing inputs and outputs to other schema; and OutputValue, which returns the output value of the function application referred to by an ApplicationLink.

Next, basic conceptual nodes serve as the source and target node types for a lot of very important link types.  These “conceptual” link types are actually quite generic in their usage: they can span simple ConceptNodes, or more advanced entities like SchemaNodes, or CompoundRelationNodes.  They can also span between relationships, not just between nodes – these are what we refer to informally as LinkLinks, links joining links. 

 SchemaNodes are like ConceptNodes that refer directly to SchemaInstanceNodes.  They are involved in SchemaExecutionLinks, which record information about what the inputs and outputs of action nodes were when they executed.  They are a separate entity from ConceptNodes because of the specific requirements of schema learning and execution.

FeelingNodes are “internal sensor” nodes that sense some aspect of the overall state of the system, such as free memory or the amount the system has learned lately.  Complex “feelings” are formed by combining FeelingNodes in CompoundRelationNodes (see below); they give the system a practical “sense of self,” allowing autonomic homeostasis to be performed and allowing the system to deliberately adjust its task orientation toward an increased sense of positive “feeling.”  

GoalNodes are internal sensors like FeelingNodes, but the condition that they sense may sometimes be less global; they represent narrow system goals as well as broad holistic ones.   The system is supplied with basic goals as it is with basic feelings, but complex and idiosyncratic goals may be built up over time.  GoalNodes are used in adjusting the system’s autonomic processes to support focus on goal-oriented thought processes, as well as in enabling the system to deliberately seek out and analyze information relevant to meeting these goals.   
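
To give a hedged flavor of how FeelingNodes and GoalNodes might work as internal sensors, the sketch below wraps a sensor function that maps a snapshot of system state to a truth value; the structure names and the particular sensed quantities are invented for illustration.

    #include <functional>
    #include <iostream>
    #include <string>

    // Illustrative sketch: a feeling or goal node periodically refreshes its truth
    // value by applying an internal sensor function to the current system state.
    struct SystemState {
        double freeMemoryFraction;   // fraction of memory still available
        double recentLearningRate;   // rough measure of new patterns formed lately
    };

    struct InternalSensorNode {
        std::string name;
        std::function<double(const SystemState&)> sensor;
        double truth = 0.0;

        void refresh(const SystemState& s) { truth = sensor(s); }
    };

    int main() {
        SystemState state{0.35, 0.8};

        InternalSensorNode freeMemoryFeeling{
            "FreeMemoryFeelingNode",
            [](const SystemState& s) { return s.freeMemoryFraction; }};

        InternalSensorNode learningGoal{
            "KeepLearningGoalNode",
            [](const SystemState& s) { return s.recentLearningRate; }};

        freeMemoryFeeling.refresh(state);
        learningGoal.refresh(state);

        std::cout << freeMemoryFeeling.name << " = " << freeMemoryFeeling.truth << "\n"
                  << learningGoal.name << " = " << learningGoal.truth << "\n";
        return 0;
    }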

The bulk of link types are conceptual link types.  Among these are AssociativeLinks, which record general associations among nodes and links, gathered by one of two methods to be described later.  Then there are links based on probabilistic reasoning, which fall into a number of different types.  The simplest here are Similarity and Inheritance links, representing symmetric and asymmetric sharing of qualities among nodes or relationships.  There are also Equivalence and Implication links, which are semantically similar to Similarity and Inheritance links, but join two relationships or two CompoundRelationNodes, rather than joining simple ConceptNodes.

Several link types exist to deal with SchemaNodes and PredicateNodes, including ExecutionLinks and PredicateEvaluationLinks, and also more advanced operators like SatisfyingSet and OutputValue; these will be discussed in detail when the relevant node types are reviewed.

Finally, perhaps the least conceptually transparent of all these node and link varieties is the CompoundRelationNode (CRN).  CRNs are implemented as SchemaConceptNodes whose embodied schema output truth values (rather than integers, strings, etc.).  But this description is deceptive; CRNs have a fundamental importance beyond ordinary compound schema, because of the special role that truth value plays in the system as a whole.  Put simply, CompoundRelationNodes are Novamente’s way of representing arbitrarily complex patterns in an encapsulated form that can easily be referenced and manipulated, rather than remaining spread psynet-wide. 

From an engineering point of view it’s key that the Mind OS can efficiently represent and manipulate all these varieties of nodes and links.   Webmind 2000’s Mind OS did not fully meet this criterion, because it was designed without a full appreciation for the complexity of schema.

 

4.4         Forgetting and Related Issues

AI theorists talk a lot about learning and memory, but forgetting is an almost equally important topic, which is frequently overlooked.  Without forgetting, a reasonably perceptive and creative mind will fill up unacceptably and use all available memory on the hardware (or wetware) that hosts it, leading to a state of total mental stagnation (and, most likely, death).  However, forgetting can’t be done willy-nilly; it has to be done intelligently, or the system will forget things that it needs to know (and forget to forget things it ought never have thought of in the first place).

Macro-level mind patterns like the dual network are built up by many different actors, according to the natural process of mind actor evolution, and they are also “sculpted” by the deletion of actors.  All these actors recognizing patterns and creating new actors that embody them creates a huge combinatorial explosion of actors.  Given the finite resources that any real system has at its disposal, forgetting is crucial to the mind: not every actor that’s created can be retained forever.   Forgetting means that, for example, a mind can retain the datum that birds fly, without retaining much of the specific evidence that led it to this conclusion. The generalization "birds fly" is a pattern A in a large collection of observations B.  A is retained, but the observations B are not.

Obviously, a mind's intelligence will be enhanced if it forgets strategically, i.e., forgets those items which are the least intense patterns.   And this ties in with the notion of mind as an evolutionary system.  A system which is creating new actors, and then forgetting actors based on relative uselessness, is evolving by natural selection. This evolution is the creative force opposing the conservative force of self-production, actor intercreation.
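
A minimal sketch of this idea, with invented names and numbers: atoms carry a long-term importance value standing in for their “intensity as patterns,” and a forgetting pass keeps only the most important ones within a fixed capacity, so that specific evidence tends to be dropped while useful generalizations survive.

    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>

    // Illustrative sketch of strategic forgetting: retain only the atoms whose
    // long-term importance (a proxy for pattern intensity) is highest.
    struct AtomRecord {
        std::string name;
        double longTermImportance;
    };

    void forgetLeastImportant(std::vector<AtomRecord>& atoms, std::size_t capacity) {
        if (atoms.size() <= capacity) return;
        std::sort(atoms.begin(), atoms.end(),
                  [](const AtomRecord& a, const AtomRecord& b) {
                      return a.longTermImportance > b.longTermImportance;
                  });
        atoms.resize(capacity);   // drop the least important tail
    }

    int main() {
        std::vector<AtomRecord> atoms = {
            {"birds fly", 0.9},             // the retained generalization (pattern A)
            {"sparrow #317 flew", 0.10},    // specific observations (collection B)
            {"pigeon #12 flew", 0.05},
            {"flying objects can fall", 0.7}};
        forgetLeastImportant(atoms, 2);
        for (const auto& a : atoms) std::cout << "kept: " << a.name << "\n";
        return 0;
    }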

4.4.1   Grounding

Forgetting ties in with the notion of grounding.  Roughly speaking, a pattern X is "grounded" to the extent that the mind contains entities in which X is in fact a pattern.  For instance, the pattern "birds fly" is grounded to the extent that the mind contains specific memories of birds flying. Few concepts are completely grounded in the mind, because of the need for drastic forgetting of particular experiences.  This leads us to the need for "reasoning," which is, among other things, a system of transformations specialized for producing incompletely grounded patterns from incompletely grounded patterns.

Consider, for example, the reasoning "Birds fly, flying objects can fall, so birds can fall." Given extremely complete groundings for the observations "birds fly" and "flying objects can fall", the reasoning would be unnecessary – because the mind would contain specific instances of birds falling, and could therefore get to the conclusion "birds can fall" directly without going through two ancillary observations. But, if specific memories of birds falling do not exist in the mind, because they have been forgotten or because they have never been observed in the mind's incomplete experience, then reasoning must be relied upon to yield the conclusion.

4.4.2   Short-term Memory

The necessity for forgetting is particularly intense at the lower levels of the system. In particular, most of the patterns picked up by the system’s perceptual schema are of ephemeral interest only and are not worthy of long-term retention in a resource-bounded system. The fact that most of the information coming into the system is going to be quickly discarded, however, means that the emergent information contained in perceptual input should be mined as rapidly as possible, which gives rise to the phenomenon of "short-term memory" or STM.

What is short-term memory?  A mind must contain actors specialized for rapidly mining information deemed highly important (information recently obtained via perception, or else identified by the rest of the mind as being highly essential).  This is "short term memory" -- a space within the mind devoted to looking at a small set of recently perceived things from as many different angles as possible, as quickly as possible.  

In Novamente, STM is implemented as a special case of a general phenomenon called AttentionalFocus.  An AttentionalFocus (AF) is a unit within Novamente (a “unit” being an isolated collection of functionally related atoms – this will be described in a later subsection) that is devoted to processing a small number of atoms as intensively and thoroughly as possible.  The things contained in an AF are grouped together in many different ways, experimentally and fluidly.  STM is a special AF devoted to recently perceived things and other things that relate closely to them.  

Because of the intensity of processing and combination-generation that it entails, short-term memory can’t be all that capacious: there is a potential “combinatorial explosion” problem in that N things can be grouped together in 2^N ways.   Human short-term memory is commonly said to have a capacity of 7 +/- 2 items.  There are many subtleties in interpreting this number, which we won’t go into here, but at the very least it should be taken as an indication that the percentage of the mind occupied by short-term memory is a very small one.  
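
The following toy sketch (all names and the tiny capacity are assumptions) shows the simplest way such a bound can be enforced: only the k currently most important atoms are admitted to the AttentionalFocus, so the set whose 2^N groupings are explored stays very small.

    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <vector>

    // Illustrative sketch: the AttentionalFocus admits only the k most important atoms.
    struct FocusAtom {
        std::string name;
        double shortTermImportance;
    };

    std::vector<FocusAtom> selectFocus(std::vector<FocusAtom> candidates, std::size_t k) {
        k = std::min(k, candidates.size());
        std::partial_sort(candidates.begin(),
                          candidates.begin() + static_cast<std::ptrdiff_t>(k),
                          candidates.end(),
                          [](const FocusAtom& a, const FocusAtom& b) {
                              return a.shortTermImportance > b.shortTermImportance;
                          });
        candidates.resize(k);
        return candidates;
    }

    int main() {
        std::vector<FocusAtom> percepts = {
            {"WordInstanceNode:dog", 0.9}, {"WordInstanceNode:the", 0.2},
            {"ConceptNode:dog", 0.8},      {"CharacterInstanceNode:g", 0.1}};
        for (const auto& a : selectFocus(percepts, 2))   // human STM is ~7 items; 2 here for brevity
            std::cout << "in focus: " << a.name << "\n";
        return 0;
    }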

From what we’ve said so far, the psynet model is a highly general theory of the nature of mind. Large aspects of the human mind, however, are not general at all, and deal only with specific things, such as recognizing visual forms, moving arms, etc.  That’s the topic of the next subsection.

4.5         Configuration

Finally, one aspect of Novamente that was not explicitly discussed in the pre-Webmind “psynet model” work, but only emerged as important through practical work at Webmind Inc., is what we call “Novamente configuration.”   What this means is that a large Novamente system  will generally be broken down into a number of functionally specialized units, each one with a fairly full complement of node and link types and dynamics.   The different units do communicate with each other fairly freely, but atoms in different units won’t interact as intensively as atoms in the same unit.

This breakdown into units is important in several different ways.  First, it can have a huge impact on system efficiency, because in many cases a functionally specialized unit can be localized on a single machine, minimizing distributed processing overhead.  Second, it has a clear effect on the emergent structures and dynamics of the system.  In effect, it imposes a “multilobed” structure on the system’s dynamical attractors and transients.  It also makes system testing easier, since for much of the testing process, one is not testing a huge all-purpose soup of nodes and links, but rather a collection of more narrowly-purposed soups of nodes and links.  Of course, ultimately one has to test the whole system, and there also needs to be at least one unit in the system that is integrative, doing deep thinking about the most important nodes and links in the whole system without heed to specialization.

How the breakdown into units is done is a subtle issue.  Initially we will work with simple Novamentes.  Our current version has one unit, and the next step will be a three-unit version, whose units include:

  • A primary cognitive unit
  • A background thinking unit, containing many more nodes with only very important relationships among them, existing only to supply the primary cognitive unit with things it judges to be relevant
  • An AttentionalFocus unit, containing a small number of atoms and doing very resource-intensive processing on them

Here the specialization has to do with the intensity of processing rather than with the contents of processing.

For a Novamente to interact intensively with the outside world, it should have two dedicated units for each “interaction channel”:

  •  One to contain the schema controlling the interaction
  •  One to store the “short-term memory” relating to the interaction

An “interaction channel” is a collection of sensory organs of some form, all perceiving roughly the same segment of external reality.  Each human has only one interaction channel.  But if you had one mind controlling two bodies in different places, you would want different components of your mind dealing with each body.  Novamentes can easily be in this situation, interacting separately with people in different places around the world.

Language processing, as we shall see, may also require some specialized units, dealing specifically with aspects of language learning.  And when a Novamente becomes advanced enough to modify its own basic AI dynamics, it will need experimental mind-units to play with, so that it can try out conjectural dynamic methods in an insulated environment without corrupting its primary thought processes.

This kind of specialization may seem to go against the “free-flowing, interacting actor soup” approach at the heart of the psynet model.  But it shouldn’t be viewed that way.  It should be viewed, rather, as a structure for directing the free flow of actor dynamics, just as the shape of a riverbed directs the flow of water, while leaving each water molecule a lot of freedom to bounce around. 

The human brain contains this kind of functional specialization to a large degree.  In fact we know more about the specialization of different parts of the brain than about how they actually carry out their specialized tasks.   Many AI systems contain a similar modular structure, but each module contains a lot of highly rigid, specialized code inside.  The approach here is very different.  One begins with a collection of actors emergently providing generic cognitive capability, and then sculpts the dynamical patterns of their interactions through functional specialization. 

 

5.  Emergent Mind in Novamente

In writing about Novamente, it is often convenient to focus on the particular nodes and links that exist as software objects in the Novamente codebase.   However, it must not be forgotten that the software object level is not the essence of Novamente as a mind.   The mind of Novamente is the set of patterns emergent among the bits in the RAM and registers of the set of computers running Novamente, and emergent between these bits and Novamente’s environment.

It is not possible to spell out here the details of the path from the software objects constituting a Novamente system to the complete mind of a Novamente system.  Indeed, these details will likely never be entirely transparent to anyone, not even to an advanced introspective Novamente system.  Novamente is a complex, immensely-high-dimensional dynamical system, involving various stable, periodic, chaotic and subtly structured emergent behaviors.  But it is possible to make some general remarks about the emergent mind structures and dynamics that we believe Novamente’s concretely implemented mind structures and dynamics will lead to.  Indeed, the motivation for Novamente’s software objects primarily consists of intuitions about the emergent mind structures and dynamics that they will lead to.  And we expect that a deeply introspective, self-modifying Novamente will be able to extract many subtle statistical and algorithmic patterns in the relationships between its CIM (concretely implemented mind) and emergent mind levels.  This kind of pattern recognition will be the basis for most of its intelligent goal-directed self-modification.

Generally speaking, other AI systems fall into one of two categories:

  1. Subsymbolic systems, consisting of basic units (“tokens”) that, taken individually, have no direct relationship to mind-level semantics (but that create mind-level semantics collectively)
  2. Symbolic systems, whose basic tokens have mind-level semantics – and whose overall semantics are largely directly comprehensible in terms of the semantics of the basic tokens

The archetypal subsymbolic AI system is the formal neural network.  Formal neurons aren’t “mind stuff” in any direct way – they don’t represent concepts, memories, percepts or actions.  They combine to form overall activation patterns that constitute mind-stuff.  The dynamic processes of the neural network aren’t directly thought processes, but rather processes for updating neural activations and synaptic weights.  These are intended to serve as the ultimate root of thought processes, but they act on a lower level than mental dynamics.

On the other hand, the archetypal symbolic AI system is the logic-based system; for instance a semantic network enhanced with formal logic axioms implemented as graph rewriting rules.  In this case, each node of the network has a meaning such as “fork” or “chair” or “say X to user” or “edge of length roughly 1 inch at an angle of 45 degrees in the visual field.”  The thought processes of the system are conceived as basically identical with the graph rewriting rules, and the explicitly encoded control mechanisms that drive the application of these rules.  The philosophy here is that the point of brain-level structures and dynamics is to give rise to mind-level structures and dynamics.   In software, it is posited, we may create these mind-level structures and dynamics in a different way, by writing high-level code rather than by wiring neurons or simulated neurons together.

Of course, the line between these two approaches blurs somewhat in practice.  For instance, a large multimodular neural network system may embody mind level structures on the architectural level, in its choice of modules and its sculpting of the interactions between modules.  And a complex logic-based system like the production system ACT-R may have somewhat complex dynamics, giving it mind patterns beyond those explicitly coded into it. 

However, there is no existing AI system that spans the two approaches as thoroughly as Novamente does.  This is an aspect of the design that may be confusing on first encounter.   Essentially, Novamente looks something like a logic-based system with some subsymbolic-style representation and control, but it acts more like a complex, subsymbolically based dynamical system with some logic-based substructure guiding its dynamics.   Or at least, this is the intention.  Our experimentation with Novamente to date has not been sufficiently sophisticated to fully validate our claims about the potential complexity of its dynamics.  Thus, the discussion of emergent mind in Novamente presented in this chapter is at present somewhat speculative.  At time of writing, we have not yet seen the emergent phenomena that we predict will arise in the system.  We did not see them in Webmind 2000 because the system was too inefficiently implemented to support anywhere near the scale of processing required to give rise to these types of emergence; and we have not yet seen them in Novamente because the Novamente implementation is at present too incomplete.

 

5.1  Maps and Map Dynamic Patterns

The basic ideas needed to understand Novamente’s “mind level” are those of a map and a map dynamic pattern.  A map is, roughly speaking, a fuzzy collection of Atoms that collectively play the lead role in a coherent dynamic pattern for Novamente dynamics, in the sense that they are frequently highly active all together, over a fairly short interval of time.   The dynamic pattern of activity that a map induces when it becomes active is the map dynamic pattern corresponding to the map.  We will explore here the map-level structures and dynamics that are implicit in the Atom-level structures and dynamics defined in the previous chapters.  Map-level structures and dynamics are not the sum total of emergent Novamente mind – they are just one of its many aspects.  But they are more palpable and compactly explicable than other more complex mind patterns, and we believe they are in a way the crux of Novamente emergent mind, in the sense that most other emergent Novamente mind patterns can likely be analyzed in terms of their impact on maps.

In the language of Chaotic Logic (Goertzel, 1994), a Novamente map is a kind of self-generating system – actually a self-generating subsystem of the self-generating system that is the whole Novamente Atomspace.   It is self-generating in the sense that, when active, a map tends to perpetuate itself for a period of time – effectively producing its own attention, and ensuring its own continued existence.  Chaotic Logic defines mind in terms of an abstract iteration called the cognitive equation, which states, in brief, that a mind is a self-generating system that recognizes patterns in itself and embodies these patterns as basic system components.  Here we will argue, intuitively and heuristically, that the collection of Novamente maps demonstrates the archetypal dynamic defined by the cognitive equation.

The notion of maps and map dynamic patterns allows us to introduce the critical notion of an emergent psynet: a mind network emergent from the Novamente network of nodes and links, whose “metanodes” are maps and map dynamic patterns, and whose “metalinks” are emergent relationships between these.  In building the lower-level psynet of Novamente nodes and links, we are implicitly sculpting the emergent psynet – which is where the bulk of Novamente mind resides.
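
As a crude, hedged illustration of how a map might be identified empirically, the sketch below treats a map as the fuzzy set of atoms that are frequently highly active in the same short time windows as a seed atom.  The data structures, the seed-based clustering and the membership measure are all assumptions made for this example.

    #include <algorithm>
    #include <iostream>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    // Illustrative sketch: approximate a map as the atoms whose co-activity with a
    // seed atom (fraction of the seed's active windows they share) is high.
    using ActivitySnapshot = std::set<std::string>;  // atoms highly active in one window

    std::map<std::string, double> coActivityWith(const std::string& seed,
                                                 const std::vector<ActivitySnapshot>& history) {
        std::map<std::string, double> membership;
        int seedWindows = 0;
        for (const auto& window : history) {
            if (!window.count(seed)) continue;
            ++seedWindows;
            for (const auto& atom : window)
                if (atom != seed) membership[atom] += 1.0;
        }
        for (auto& kv : membership) kv.second /= std::max(seedWindows, 1);
        return membership;   // fuzzy degree of membership in the seed's map
    }

    int main() {
        std::vector<ActivitySnapshot> history = {
            {"Pleasure", "Play", "Friend"},
            {"Pleasure", "Play", "Food"},
            {"Pleasure", "Friend"},
            {"Work"}};
        for (const auto& kv : coActivityWith("Pleasure", history))
            std::cout << kv.first << " membership " << kv.second << "\n";
        return 0;
    }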

 

5.2  Pleasure and Happiness

A simple but pertinent example of the relation between Atoms and maps in Novamente is given by looking at the implementation of pleasure and happiness. 

In Novamente we have FeelingNodes, which look like symbolic-AI-style representations of system feelings.  However, to pursue a human-mind analogy, these FeelingNodes are really more like basic limbic-system or otherwise chemically-induced brain stimuli than they are like richly textured high-level human feelings.  Novamente’s Pleasure FeelingNode is like the raw animal feeling of pleasure that we humans experience.  In the human mind, happiness is much more complex than pleasure.  It involves expectations of pleasure over various time scales, and it involves inferences about what may give pleasure, estimates of how happy others will be with a given course of action and thus how much pleasure one will derive from their happiness, etc.  Biological pleasure is in a sense the root of human happiness, but the relationship is not one of identity.  Changes in the biology of pleasure generally result in changes in the experience of happiness – witness the different texture of happiness in puberty as opposed to childhood, or maturity as opposed to early adulthood.  But the details of these changes are subtle and individually variant.

So, in this example, we have a parallel between

  • A concretely implemented mind structure, the Pleasure FeelingNode
  • An emergent mind map, a metanode, the feeling of happiness

There is a substantial similarity between these two parallel entities existing on different levels, but not an identity.  Happiness is embodied in:

  • A large, fuzzily defined collection of nodes and links (a “map”)
  • The dynamic patterns in the system that are induced when this collection becomes highly active (a “map dynamic pattern”)

The Pleasure FeelingNode is one element of the map associated with happiness.  And it is a particularly critical element of this map, meaning that it has many high-weight connections to other elements of the map.  This means that activation of pleasure is likely – but not guaranteed – to cause happiness. 

This illustrates the Structure-Dynamics Principle, a general complex-systems concept first introduced in From Complexity to Creativity (Goertzel, 1997).  The Structure-Dynamics Principle is a heuristic rule stating that often, in complex systems, the set of static patterns overlaps greatly with the set of dynamic patterns.   In other words, statics often encodes dynamics.  In a Novamente context, maps encoding mental entities exist statically, but their purpose is to induce certain map dynamic patterns (often mind-wide in scope) when highly activated by the importance updating function.  The concretely implemented psynet exists to spawn the emergent psynet.

 

5.3   A Crude Typology of Mind-Stuff

We have created an elaborate terminology for the nodes and links in Novamente.  Of course, these represent separate software objects, so that it is necessary to name them and clearly distinguish them.  It is also possible to create a typology for the various types of maps in the Novamente emergent psynet.  Here, however, the distinctions are fuzzier, and it is acceptable to leave them that way.  These are entities whose emergence is essential to the system’s functioning, but our skill at characterizing and typologizing them is probably not critical to the task of getting the system to work.  If the intuitions underlying the system design are correct, then building the node and link level psynet, and tuning its parameters for optimum performance, will lead to the spontaneous formation of appropriate structures and dynamics on the emergent mind level.

With these caveats, we believe that the following “map categories” may be useful for understanding Novamente on the map level:

  • Concept map: a map consisting primarily of conceptual nodes
  • Percept map: a map consisting primarily of perceptual nodes, which arises habitually when the system is presented with environmental stimuli of a certain sort
  • Schema map: a distributed schema
  • Predicate map: a distributed predicate
  • Memory map: a map consisting largely of nodes denoting specific entities (hence related via MemberLinks and their kin to more abstract nodes) and their relationships
  • Concept-percept map: a map consisting primarily of perceptual and conceptual nodes
  • Concept-schema map: a map consisting primarily of conceptual nodes and SchemaNodes
  • Percept-concept-schema map: a map consisting substantially of perceptual, conceptual and SchemaNodes
  • Event map: a map containing many links denoting temporal relationships
  • Feeling map: a map containing FeelingNodes as a significant component
  • Goal map:  a map containing GoalNodes as a significant component

Note that these are all fuzzy concepts: a given map may lie in different categories to different extents. 

Thinking about the emergent psynet level makes clear why, as already intimated in earlier chapters, some of the nomenclature used to describe Novamente objects may be misleading.  For instance, ConceptNode could be more accurately (but more verbosely and awkwardly) named LikelyConceptMapComponentNode, in that a ConceptNode is not a full mental representation of a concept; it is rather a likely ingredient of maps that represent concepts.  Some ConceptNodes may individually denote concepts – this is the case of maps that are structured with a single central node and a periphery.  But there is no reason to believe this will be the majority of concepts.  And concept-embodying maps may also involve nodes besides ConceptNodes, though they will generally contain mostly ConceptNodes.

Some of the discussion in previous sections refers for simplicity to individual Atoms representing complex concepts, but in reality, these Atoms should be considered as “central Atoms” in significant maps.  On detailed consideration, the relationship between the Atom level and the map level thus emerges as: a strong conceptual parallelism, with various detailed differences keeping things interesting.  Engineering a mind is a complex task, and the course we have chosen is to create a CIM level that has some cognitive depth on its own, but whose primary purpose is to give rise to an emergent psynet which parallels the lower-level built-in structures and dynamics, while also possessing its own critical emergent properties.

5.4  Emergent Maps

The root of the emergent psynet concept above is the notion that memories, concepts, percepts and actions in Novamente will generally correspond, not to individual Atoms, but to maps – fuzzy collections of nodes and links.  Some maps will be centered around individual Atoms, in the sense that when the “central” Atom is active, the map nearly always becomes active, and vice versa.  Other maps will not have central Atoms in this sense, or will have more than one central Atom.  And the meaning of a map does not consist exclusively of its members considered statically, but rather of its corresponding map dynamic pattern – the dynamical activity pattern that is observed in the system when the map is activated via the Importance Updating Function.   In fact, the existence of significant maps in the system is largely due to the nonlinear dynamics of the Importance Updating Function, the equation that regulates the importance values of Atoms. 

Generally speaking there are two kinds of map dynamic patterns: map attractors, and map transients.  Schema and predicate maps generally give rise to map transients, whereas concepts and percepts generally give rise to map attractors; but this is not a hard and fast rule.  Other kinds of maps have more intrinsic dynamic variety, for instance there will be some feeling maps associated with transient dynamics, and others associated with attractor dynamics.

It’s important to clarify the (somewhat peculiar) sense in which the term “attractor” is used here.  In dynamical systems theory, an attractor usually means a subset of a system’s state space which is

  • Invariant: when the system is in this subset of state space, it doesn’t leave it
  • Attractive: when the system is in a state near this subset of state space, it will voyage closer and closer to the attracting subspace

The subset of state space corresponding to a map is the set of system states in which that map is highly important.  However, in Novamente dynamics, these subsets of state space are almost never truly invariant.  They are attractive, because Novamente activation spreading dynamics behaves as in an attractor neural network: when most of a map is highly important, the rest of the map will get lots of activation, which will make it highly important.  But they are not invariant, because the specific nature of the Importance Updating Function, which regulates Novamente dynamics, guarantees that most of the time, after a map has been important for a while, it will become less important, because the percentage of new things learned about it will become less than the percentage of new things learned about something else.

This combination of attractiveness and temporary invariance that we see in connection with Novamente maps, has been explored by physicist Mikhail Zak, who has called subsets of state space with this property terminal attractors.  He has created simple mathematical dynamical systems with terminal attractors, by using iteration functions containing mathematical singularities.  He has built some interesting neural net models in this way.  The equations governing Novamente bear little resemblance to Zak’s equations, but intuitively speaking, they seem to share the property of leading to terminal attractors, in the loose sense of state space subsets that are attractive but are only invariant for a period of time.
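
To convey the qualitative picture, here is a deliberately toy numerical sketch: activation spreading pulls the members of an active map toward high importance (attractiveness), while a rent-like cost that grows with time spent important eventually pushes the map back down (loss of invariance).  The update rule and constants are invented for illustration and bear no relation to the actual Importance Updating Function or to Zak’s equations.

    #include <iostream>
    #include <vector>

    // Toy dynamics only: attractive but not invariant ("terminal attractor" flavor).
    struct MapAtom {
        double importance;      // current short-term importance
        double timeImportant;   // accumulated time spent highly important
    };

    void updateStep(std::vector<MapAtom>& mapAtoms) {
        double mean = 0;
        for (const auto& a : mapAtoms) mean += a.importance;
        mean /= mapAtoms.size();

        for (auto& a : mapAtoms) {
            // Mutual reinforcement: each member is pulled up by the rest of the map.
            a.importance += 0.5 * (mean - a.importance) + 0.1 * mean;
            if (a.importance > 0.8) a.timeImportant += 1.0;
            // Growing "rent": the longer an atom stays important, the harder it falls.
            a.importance -= 0.02 * a.timeImportant;
            if (a.importance < 0) a.importance = 0;
            if (a.importance > 1) a.importance = 1;
        }
    }

    int main() {
        std::vector<MapAtom> mapAtoms = {{0.9, 0}, {0.7, 0}, {0.6, 0}};
        for (int t = 0; t < 30; ++t) {
            updateStep(mapAtoms);
            if (t % 5 == 0)
                std::cout << "t=" << t << "  importance of atom 0: "
                          << mapAtoms[0].importance << "\n";
        }
        return 0;
    }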

Many concept maps will correspond to fixed point map attractors – meaning that they are sets of Atoms which, once they become important, will tend to stay important for a while due to mutual reinforcement.  On the other hand, some concept maps may correspond to more complex map dynamic patterns.  And event maps may sometimes manifest a dynamical pattern imitating the event they represent.   This kind of knowledge representation is well-known in the attractor neural networks literature.

Turning to schema, an individual SchemaNode does not necessarily represent an entire schema of any mental significance – it may do so, especially in the case of a large encapsulated schema; but more often it will be part of a distributed schema (meaning that SchemaNode might more accurately be labeled LikelySchemaMapComponentNode).  And of course, a distributed schema gathers its meaning from what it does when it executes.  A distributed schema is a kind of mind map, and its map dynamic pattern is simply the system behavior that ensues when it executes.  Note that this system behavior may go beyond the actions explicitly embodied in the SchemaNodes contained in the distributed schema.  Executing these SchemaNodes in a particular order may have rampant side-effects throughout the system, and these side-effects may have been taken into account when the schema was learned, constituting a key part of the “fitness” of the schema.

Next, percepts coming into the system are not necessarily represented by individual perceptual nodes.  For instance, a word instance that has come into the system during the reading process is going to be represented in multiple simultaneous ways.  There may be a WordInstanceNode, a ListLink of CharacterInstanceNodes … in a vision-endowed system, a representation of the image of the word will be stored.  These will be interlinked, and linked to other perceptual and conceptual nodes, and perhaps to SchemaNodes embodying processes for speaking the word or producing letters involved in the word.  In general, percepts are more likely to be embodied by maps that are centered on individual perceptual nodes (the WordInstanceNode in this case), but this is not going to be necessarily and universally the case.

Links also have their correlates on the map level, and in many cases are best considered as seeds that give rise to inter-map relationships.  For example, an InheritanceLink represents a frequency relationship between nodes or links, but inheritance relationships between maps also exist.  An inheritance relation between two maps A and B will not generally be embodied in a single link, it will be implicit in a set of InheritanceLinks spanning the Atoms belonging to A and the Atoms belonging to B.  And the same holds for all the other varieties of logical relationship.  Logical links on the map level are well handled by the straightforward statistical approach to intermap linkage presented in Section 1 above.

The first-order inference rules from Probabilistic Term Logic, Novamente’s reasoning system, carry over naturally to map-level logical links.  For instance, if there are many InheritanceLinks from map A to map B, and many InheritanceLinks from map B to map C, then inference will build many new InheritanceLinks from map A to map C, thus carrying out deduction on the map level.    More complex aspects of Novamente inference carry over to the map level somewhat similarly.
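
As a hedged sketch of how this can be read off the Atom level, the example below measures the density of InheritanceLinks running from the members of one map to the members of another, and applies a crude deduction step to estimate map-level inheritance from A to C.  The counting criterion, the data structures and the deduction formula are assumptions made for this illustration, not the actual Probabilistic Term Logic rules.

    #include <iostream>
    #include <set>
    #include <string>
    #include <vector>

    // Illustrative sketch: map-level inheritance as the density of Atom-level
    // InheritanceLinks crossing from one map's members to another's.
    struct InheritanceLink { std::string source, target; };
    using Map = std::set<std::string>;   // a map approximated as a set of member atoms

    double mapInheritanceStrength(const Map& a, const Map& b,
                                  const std::vector<InheritanceLink>& links) {
        if (a.empty() || b.empty()) return 0.0;
        int crossing = 0;
        for (const auto& l : links)
            if (a.count(l.source) && b.count(l.target)) ++crossing;
        return static_cast<double>(crossing) / (a.size() * b.size());
    }

    int main() {
        Map A = {"sparrow", "pigeon"}, B = {"bird", "flier"}, C = {"animal"};
        std::vector<InheritanceLink> links = {
            {"sparrow", "bird"}, {"pigeon", "bird"}, {"pigeon", "flier"},
            {"bird", "animal"},  {"flier", "animal"}};
        double ab = mapInheritanceStrength(A, B, links);
        double bc = mapInheritanceStrength(B, C, links);
        // Crude map-level deduction: inheritance from A to C is at least ab * bc.
        std::cout << "A->B " << ab << ", B->C " << bc
                  << ", deduced A->C >= " << ab * bc << "\n";
        return 0;
    }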

On the other hand, association between maps A and B is somewhat of a special case.  It is best understood as being given by the total mass of links and CompoundRelationNodes joining A’s member Atoms and B’s member Atoms.  AssociativeLinks between Atoms are part of map associations but not the only part.  Association formation on the map level is then a generic process that encompasses all link and node building processes in the system.  The system’s explicit association formation dynamics are only part of the story.

The detailed consideration of further Novamente processes on the map level becomes intricate, but bears out the general perspective indicated by these examples.  Map level structures and dynamics mirror Atom level structures and dynamics, with some moderately significant technical differences but no significant conceptual differences.

Finally, we have discussed extensively how Atom level dynamics feed up to influence map level dynamics; the other half of the story is how map level dynamics feed down and explicitly influence Atom level dynamics.  This happens to some extent implicitly, because what happens among maps affects all the Atoms involved in the maps.  But it also happens explicitly, via the general process of map encapsulation.  This refers to a collection of processes by which the system recognizes maps in its own dynamics and then embodies them as definite nodes and links. This is done by several separate processes, one oriented toward encapsulating schema and predicates, others toward encapsulating concepts.
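
A hedged sketch of the encapsulation step: once a fuzzy set of co-active atoms has been recognized as a map, the system creates a new ConceptNode and explicit MemberLinks binding the map’s members to it, so that the emergent pattern becomes a concrete actor that other processes can reference.  All names and the membership degrees below are invented for illustration.

    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    // Illustrative sketch of map encapsulation: embody a recognized map as a new
    // ConceptNode plus MemberLinks carrying fuzzy membership strengths.
    struct MemberLink {
        std::string member;        // an atom belonging to the map
        std::string conceptNode;   // the new node standing for the map
        double strength;           // fuzzy degree of membership
    };

    struct EncapsulatedMap {
        std::string conceptNode;
        std::vector<MemberLink> members;
    };

    EncapsulatedMap encapsulate(const std::string& newNodeName,
                                const std::vector<std::pair<std::string, double>>& fuzzyMap) {
        EncapsulatedMap result{newNodeName, {}};
        for (const auto& m : fuzzyMap)
            result.members.push_back({m.first, newNodeName, m.second});
        return result;
    }

    int main() {
        auto happiness = encapsulate("ConceptNode:Happiness",
                                     {{"FeelingNode:Pleasure", 0.9},
                                      {"ConceptNode:Play", 0.6},
                                      {"ConceptNode:Friend", 0.5}});
        for (const auto& link : happiness.members)
            std::cout << "MemberLink " << link.member << " -> "
                      << link.conceptNode << " (" << link.strength << ")\n";
        return 0;
    }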

We thus, finally, see the very strong sense in which the Novamente system may fulfill the “cognitive equation” identified in Chaotic Logic.  The cognitive equation says, loosely, that a mind is a collection of actors that constantly act on each other, transforming each other, and creating new actors based on patterns that they recognize amongst each other.  This happens in Novamente in many ways: each act of first-order inference is an example of a new actor (a new link) being created based on a pattern recognized among other actors (the premise links).  But map encapsulation is perhaps the most powerful example of pattern-recognition-based new actor creation.

5.5  High-Level Emergent Mind Structures

We’ve established that the dynamics of a Novamente system is largely controlled by the structure of the system, and that the structure of a Novamente system is strongly guided, and partly explicitly formed, by the dynamics of the system.  This structural-dynamical feedback can lead to all kinds of complex emergent structures – some existing in the system at a given time, some manifesting themselves as patterns over time, and some spatiotemporal in nature.   Maps are one manifestation of this feedback; but there is also a higher level of organization, in which the network of maps achieves certain emergent patterns.  Some of these patterns were seen, in crude forms, in simple experiments run with Webmind 2000.

5.5.1  The Dual Network

The foremost such large-scale emergent pattern is what Goertzel (1993a) calls the dual network.   The definition of the dual network given here is different in detail from that given in (Goertzel, 1993a), but the spirit is the same.  Here the precise nature of the concept has been modified a little to fit Novamente with optimal ease.

To understand what a dual network is, two preliminary terms must be defined: hierarchy and heterarchy, in a psynet context.  

An hierarchical mind network is defined as a directed graph in which:

  •  A --> B carries the approximate semantics “A is a special case of B” (“A inherits from B”)
  • When A --> B, this means that B tends to control A, in the sense that processes involving B tend to cause processes involving A, more so than processes involving A tend to cause processes involving B

An heterarchical mind network is defined as an undirected graph in which:

  • A <--> B carries the approximate semantics “A is similar to B”
  • When A <--> B, this means that A and B tend to mutually control each other, in the sense that processes involving A and processes involving B tend to cause each other with roughly equal frequency

Note that these definitions combine “logical” relatedness (similarity, inheritance) and “dynamical” relatedness (causal relations between processes). 

A dual network is a kind of mixed hierarchical/heterarchical network, i.e. a network of nodes that are involved in:

  • At least one hierarchical mind network
  • At least two heterarchical mind networks (on different hierarchical levels)

and for which the hierarchical and heterarchical mind networks are well aligned.  Well aligned may be defined as:

  • The similarity and inheritance relationships involved in the various subnetworks are reasonably logically consistent with each other
  • The processes involved in the asymmetric causal relations of the hierarchical network overlap significantly with the processes involved in the symmetric causal relations of the heterarchical network

This kind of balance between the hierarchical and heterarchical aspects of the emergent network of actor interrelations is crucial to the mind.

The emergence of a dual network structure in Novamente comes fairly naturally from the harmonious interaction of:

  • Inference building similarity and inheritance links
  • Importance updating, guiding the activation of atoms (and hence the application of built-in primary cognitive processes to atoms) based on the links between them
  • Schema learning, which extends a schema’s applicability from one node to another based on existing links between them (and based on observations of past schema successes and failures, as will be explained later)

The dual network structure is a static representation of the dynamic cooperation of these processes.

The static aspect of the dual network gives the mind a kind of “dynamic library card catalog” structure, in which topics are linked to other related topics heterarchically, and linked to more general or specific topics hierarchically.  The creation of new subtopics or supertopics has to make sense heterarchically, meaning that the things in each topic grouping should have a lot of associative, heterarchical relations with each other.   And the creation of new associations has to make sense hierarchically, in terms of the category structure.  This structure may be seen, at a very high level and to a rough degree of approximation, as  an extension of other paradigms in computer science, such as relational databases and object-oriented programming, which were themselves modeled after theories of mental organization.  However, from the ground-up the Novamente dual-network structure is optimized for general cognitive processing, not for specialized processing such as rapid querying or ease of decomposition into component tasks. 

On the map level, the dual network manifests itself as a pattern of organization whereby:

  • Maps are organized heterarchically, in that structurally similar maps are strongly associated with each other via system dynamics (i.e. SimilarityLinks and AssociativeLinks often exist between the same maps).
  • Maps are organized hierarchically, in that there is an approximate tree structure to the overall set of maps in the system, in which maps higher up in the tree exercise control over their children; and in which the children of a given map tend to be similar to & associated with each other

 

5.5.2  The Self

Another very critical emergent mind structure, a bit more abstract than the dual network, is the self.  We stress here that we are using a working definition of self, geared towards serving as a usable guideline for engineering what can be called an AI within the context of our theory, and deliberately avoid ontological-existential discussions of the universal nature of selfhood and its relation to consciousness.    

One might define the “raw material” for the self – the primary senses in which a Novamente can self-reflect – as the collection of:

  •  Emergent patterns that the system has observed in itself as a whole, that is, the structural and dynamical patterns within its internal dual-network.
  •  Patterns that it has observed in its own external actions, that is, that subnetwork of its dual network which involves tracking the procedure and consequences of running various schema.

The self itself is then a collection of emergent patterns recognized in this set.   Often the patterns recognized are very approximate ones, as the collection of data involved is huge and diverse – even a computer doesn’t have the resources to remember every detail of every thing it’s ever done.  Furthermore, the particular data items leading to the creation of the psynet-wide patterns that define the self will often be forgotten, so that the self is a poorly grounded pattern (tuning how poorly grounded it may be and still be useful will be a subtle and crucial part of giving Novamente a useful, nontrivial sense of self).

On the map level, we may say that the self consists of:

  • A set of self-image maps: maps that serve as an “internal images” of significant aspects of a Novamente system’s structure or dynamics
  • A larger map that incorporates various self-image maps along with other Atoms (this is the emergent self)

The really interesting thing about the self is the feedback between declarative, localized knowledge and distributed, procedural knowledge that it embodies.  The collection of high-level patterns that is the self operates according to the Structure-Dynamics Principle, and as these patterns become more or less active they automatically move the overall behavior of the system in appropriate directions.  That is to say, as the system observes and reasons upon its patterns of self, it can then adjust its behavior by controlling its various internal processes in such a way as to favor patterns which have been observed to contribute to coherent thought, “good feelings,” and satisfaction of goals. 

The self is useful for guiding the system’s perceptual/cognitive/active information-gathering loop in productive directions – a cognitively “natural” means by which the system will perform self-modification (indirectly, from the system-code-level perspective usually meant when discussing self-modifying AI).  Knowing its own holistic strengths and weaknesses, a mind can do better at recognizing patterns and using these to achieve goals.  Self-guided action by the system to enhance its strengths, try to ameliorate (or avoid) its weaknesses, achieve its goals, and lead to overall “feelings of satisfaction” can take the form of more focused thinking, deliberately seeking out information relevant to (internal and external) goals, paying closer attention to the details when perceiving important objects, etc.

Finally, a well-known fact of developmental psychology should not elude mention here: The presence of other similar beings is of inestimable use in growing an effective self.   It allows a system to extend the above-listed “raw material” for the self by incorporating two extra types of data:

  • evaluations of the self given by other entities
  • patterns perceived in other similar beings

by which the self can be usefully grounded with respect to a universe in which there exist other selves, and not just within its own mind.  So, while it would be possible to have self without society, society makes it vastly easier, by giving vastly more data for self-formation – and for a self to be able to function sufficiently in a world where there are other selves, society is indispensable.

 

6          Experiential Interactive Learning

We have explained, in very rough and high-level terms, how Novamente’s cognitive activity is supposed to proceed.   But cognition in a vacuum is not much use.  A mind is really only meaningful in connection with an environment.

This means that loading knowledge into an AI system from databases, or even from texts via natural language processing, is never going to be fully adequate.   Knowledge encoding is only useful as an augmentation to learning through direct interaction with the world and with other minds -- learning through experience.

Human infants learn through experience, and as we all know, this is a long and difficult process.  We’ve seen, in the previous sections of this article, the incredible number of specialized mind-actors that appear to be necessary in order to implement general cognition within practical computational constraints.  Given this diversity and complexity, it’s sobering to realize that an integrated Novamente AI Engine embodying all these actors will not, when initially completed, be a mature mind.  It will be an unformed infant – though perhaps an unformed infant with a peculiarly large amount of not-fully-comprehended knowledge fed directly into its brain.

The experience of this Novamente AI Engine infant will be diverse, including spidering the Web, answering user queries, analyzing data series, and so on.  But we believe that autonomous experiencing of the world will not be enough.  Our view is that, in order for the Baby Novamente to grow into a mature and properly self-aware system, it will need to be interacted with closely, and taught, much like a young human.  Acting, perceiving and planning intelligently must begin on the simple “baby” level and be learned via interaction with another intelligent mind; the intelligence of these processes will then increase exponentially with the system’s experience.

Thus, after the Novamente engineering project, comes the Novamente teaching project.   The initial result of this project, if all goes well, will be a Novamente AI Engine that can converse about what it’s doing, not with 100% English fluency nor with the ability to simulate a human, but with spontaneity and directness and intelligence – speaking from its own self like the real living being it is. 

Note that this is not the Turing Test: we are not seeking to fool anyone into believing that Novamente is a human.  Emulating human behavior is not our current priority. 

This teaching project is just the start of a lot of other more practical and interesting things.  Once one has a system that can be taught, the combination of this teachability with the innate problem-solving abilities of the Novamente cognition engine, should lead to a lot of highly valuable and exciting applications.

 

6.1         The NEIL UI

The basic ideas of experiential interactive learning are very general, and would apply to a Novamente with arbitrarily diverse sense organs – eyes, ears, nose, tongue….  However, we have worked out the ideas in detail only in the concrete context of the current system, whose inputs are textual and numerical only.  Extension of this framework to deal with music, sound, image or video input could be accomplished with the necessary effort.  In 2000 we created a specific prototype user interface (the “Baby Webmind UI”) intended for experimentation with experiential interactive learning with Webmind 2000, in the context of a “world” called FileWorld consisting of a directory of textual and numerical data files.  It was rather homely to look at, and was never used in any significant way, but the design process was extremely instructive.  At the time of writing we have not yet created such an interface for Novamente, but when the time comes we will create something conceptually similar, though surely visually rather different.

In a Novamente context we refer not to “Baby Webmind” but to NEIL = “Novamente Experiential Interactive Learning.”  The idea of the NEIL UI is to provide a simple yet flexible medium within which Novamente can interact with humans and learn from them.  The following components would seem to be critical (a rough sketch of the corresponding interaction channel is given just after the list):

  • A chat window
  • Reward and punishment buttons, which allow us to vary the amount of reward or punishment (a very hard smack as opposed to just a plain ordinary smack…)
  • A way to enter in our emotions, along several dimensions
  • A way for Novamente to show us its emotions (technically: the parameters of some of its FeelingNodes)
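
To make these components a little more concrete, here is a minimal sketch (in Python) of the kind of data such an interface would need to shuttle back and forth between teacher and system.  The class names and fields are our own illustrative inventions for this article, not part of any actual Novamente codebase:

```python
# Illustrative sketch only: a toy data model for the NEIL interaction
# channel described in the list above.  All names are hypothetical.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ChatUtterance:
    speaker: str               # "human" or "novamente"
    text: str

@dataclass
class RewardSignal:
    magnitude: float           # positive = reward, negative = punishment;
                               # size distinguishes a "hard smack" from a plain one

@dataclass
class EmotionReport:
    dimensions: Dict[str, float]   # the human teacher's emotions, along several dimensions

@dataclass
class FeelingNodeReading:
    values: Dict[str, float]       # selected FeelingNode parameters, exposed for display

@dataclass
class NEILSession:
    transcript: List[ChatUtterance] = field(default_factory=list)
    rewards: List[RewardSignal] = field(default_factory=list)

    def human_says(self, text: str) -> None:
        self.transcript.append(ChatUtterance("human", text))

    def reward(self, magnitude: float) -> None:
        self.rewards.append(RewardSignal(magnitude))
```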

A comment on the emotional aspect is probably appropriate here.  Inputting emotion values via GUI widgets is obviously a very crude way of getting across emotion, compared to human facial expressions.  The same is true of NEIL’s list of FeelingNode values: this is not really the essence of the system’s feelings, which are distributed across its internal network.   Ultimately we’d like more flexible emotional interchange, but for starters, it seems that this system gives at least as much emotional interchange as one gets through e-mail and emoticons.

This kind of experiential interactive learning UI can be used to mediate Novamente’s interactions with many different kinds of data.  What kind of “world” we give NEIL is going to depend on the types of problems that Novamente is being applied to at the time it is completed.  The world we conceived while building the 2000 Baby Webmind prototype was something called FileWorld: a database of files, with which Baby Webmind interacted via a series of operations representing its basic receptors and actuators.  In this context, it was given an automatic ability to perceive files, directories, and URLs, and to carry out several dozen basic file operations.
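
As a rough illustration of what a FileWorld-style receptor/actuator layer might look like, here is a toy Python sketch.  The particular operations shown are our own choices for exposition; the original Baby Webmind prototype exposed several dozen such operations and was not implemented this way:

```python
# Illustrative sketch, not the original FileWorld code: a minimal
# receptor/actuator interface onto a directory of textual/numerical files.

import os
from typing import List

class FileWorld:
    """A toy 'world' consisting of a directory tree of data files."""

    def __init__(self, root: str):
        self.root = root

    # --- receptors: basic perceptual operations ---------------------------
    def list_directory(self, path: str = ".") -> List[str]:
        return os.listdir(os.path.join(self.root, path))

    def read_file(self, path: str) -> str:
        with open(os.path.join(self.root, path), "r", encoding="utf-8") as f:
            return f.read()

    # --- actuators: basic file operations ---------------------------------
    def write_file(self, path: str, contents: str) -> None:
        with open(os.path.join(self.root, path), "w", encoding="utf-8") as f:
            f.write(contents)

    def move_file(self, src: str, dst: str) -> None:
        os.rename(os.path.join(self.root, src), os.path.join(self.root, dst))
```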

6.2         The Quest for Happiness

In FileWorld or any other environment, the system’s learning process is guided by a basic motivational structure, built into the initial fund of GoalNodes and FeelingNodes.  Novamente wants to achieve its goals, and its Number One goal is to be happy, i.e. on the crudest level, to feed its Pleasure FeelingNode.  Its initial motivation in making conversational and other acts in the NEIL interface is to make itself happy (as defined by its Pleasure FeelingNode, and the maps that arise around it).  

The actual learning, of course, is carried out by a combination of Novamente AI processes -- activation spreading, reasoning, evolutionary concept creation, and so forth.  These are powerful learning processes, but even so, they can only be used to create a functional digital mind if they are employed in the context of experiential interactive learning.

How is the Pleasure FeelingNode, which ultimately guides the system’s learning process, defined?  For starters, three determinants stand out (a toy combination of these is sketched just after the list):

1.       If the humans interacting with it are happy, this increases pleasure

2.       Discovering interesting things increases its pleasure

3.       Getting pleasure tokens (from the user clicking the UI’s reward button) increases its pleasure
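
To make the role of these determinants concrete, here is a toy sketch of one way they could be combined into a single pleasure value.  The weights and the simple clipped linear combination are purely illustrative assumptions; in the actual design the Pleasure FeelingNode’s value is meant to emerge from the maps surrounding it rather than from a fixed formula:

```python
# Toy illustration only: combining the three determinants listed above.
# The weights are hypothetical, not Novamente parameters.

def pleasure_level(user_happiness: float,
                   interestingness_of_discoveries: float,
                   reward_tokens: float,
                   weights=(0.4, 0.3, 0.3)) -> float:
    """Inputs are assumed normalized to [0, 1]; the result is clipped to [0, 1]."""
    w1, w2, w3 = weights
    raw = (w1 * user_happiness
           + w2 * interestingness_of_discoveries
           + w3 * reward_tokens)
    return max(0.0, min(1.0, raw))

# A moderately happy teacher, a mildly interesting discovery,
# and one click of the reward button:
print(round(pleasure_level(0.8, 0.3, 1.0), 2))   # -> 0.71
```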

The determinants of pleasure in humans change as the human becomes more mature, in ways that are evolutionarily programmed into the brain.  We will likely need to effect this in Novamente as well, manually modifying the grounding of happiness in the Pleasure FeelingNode as the system progresses through stages of maturity.   Eventually this process will be automated, once there are many Novamente instances being taught by many other people and Novamente instances; but the first time around, this will be a process of ongoing human experimentation.

All this should make clear that, in our view, building the thinking machine is only the start.  Teaching it, and embedding it in the world, is an equally big job.  And it will be very surprising if in the course of teaching and interacting with Baby Novamente, we don’t wind up making serious modifications to the system’s cognitive structures and dynamics.

But this is our answer to AI skeptics who posit that embodied experience is essential to the mind.  Yes, of course it is.  But it’s not the possession of a human body that’s critical; it’s the process of experiential interactive learning.  The first NEIL UI we build may end up being too narrow-bandwidth, but there are many obvious ways to enhance it, and the choice of which enhancements to effect will be best made in the course of practical experimentation.  Novamente’s “body” will evolve along with its mind.

7          Webworld

Novamente needs a lot of computing power.   This is probably our biggest worry at the moment: that, once Novamente is engineered and works properly, it won’t be able to achieve a reasonable level of intelligence without an amount of memory and processing power that is unreasonable by contemporary standards.

A lot of thought has gone into how to circumvent this problem, and our conclusion has been that the issue is probably resolvable via an intelligent harmonization of Novamente “unit configuration” with the underlying hardware configuration.  One aspect of this harmonization, however, has implications going beyond Novamente proper, leading to the proposal of a complementary software framework called Webworld.

One of the most critical aspects of the AI Engine – schema learning, i.e., the learning of procedures for perceiving, acting and thinking -- is also one of the most computationally intractable.  Based on our work so far, this is the one aspect of mental processing that seems to consume an inordinate amount of computing power.  Some aspects of computational language learning are also extremely resource-intensive, though not quite so much so.

Fortunately, though, neither of these supremely computation-intensive tasks (schema learning or language learning) needs to be done in real time.  What this means is that, in principle, they can be carried out through large-scale distributed processing, across thousands or even millions of machines loosely connected via the Net.   We have designed a system for accomplishing this kind of wide-scale “background processing,” called Webworld.

Webworld is an example of a peer-to-peer Internet system, more complex and powerful than the various peer-to-peer systems used today for file sharing (Napster, Gnutella) and distributed problem solving (distributed.net and so forth).  It is a sister software system to the AI Engine, sharing some of the same design principles and potentially some of the same codebase (there was a Webworld 2000 prototype sharing code with Webmind 2000), but serving a complementary function.

A “Webworld lobe” is a much lighter-weight version of a Novamente unit, which can live on a single-processor machine with a modest amount of RAM, and potentially a slow connection to other machines.  Webworld lobes host actors just like normal Novamente units, and they exchange actors and messages with other Webworld lobes and with Novamentes.   Novamentes can dispatch non-real-time, non-data-intensive “background thinking” processes (like schema learning and language learning problems) to Webworld, thus immensely enhancing the processing power at their disposal. 
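
As an illustration of the dispatch pattern just described, the following toy sketch shows a Novamente unit farming out self-contained background tasks to a pool of lobes.  The class names and the synchronous round-robin dispatch are simplifying assumptions made for this article, not the actual Webworld protocol:

```python
# Illustrative sketch of dispatching non-real-time "background thinking"
# tasks to lightweight Webworld lobes.  Names and mechanics are hypothetical.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class BackgroundTask:
    name: str                   # e.g. "schema-learning subproblem #17"
    run: Callable[[], object]   # self-contained, non-data-intensive work

class WebworldLobe:
    """A lightweight peer that executes whatever tasks it is handed."""
    def execute(self, task: BackgroundTask) -> object:
        return task.run()

class NovamenteUnit:
    def __init__(self, lobes: List[WebworldLobe]):
        self.lobes = lobes

    def dispatch(self, tasks: List[BackgroundTask]) -> List[object]:
        # Round-robin dispatch; a real system would handle latency,
        # node failure, and asynchronous collection of results.
        return [self.lobes[i % len(self.lobes)].execute(task)
                for i, task in enumerate(tasks)]

# Example: three toy tasks spread across two lobes.
unit = NovamenteUnit([WebworldLobe(), WebworldLobe()])
tasks = [BackgroundTask(f"toy-task-{i}", lambda i=i: i * i) for i in range(3)]
print(unit.dispatch(tasks))   # -> [0, 1, 4]
```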

Of course, Webworld has a lot of romance, in addition to its practical utility.  It allows Novamente’s intelligence to effectively colonize the entire Net, rather than remaining restricted to small clusters of sufficiently powerful machines.  As more and more machines host Webworld clients, more and more of the computer processing power in the world will be used in the service of Novamente intelligence.  This will provide users with many practical benefits, including

  • advanced AI-enhanced information services at a lower price than would be possible without an underlying cheap source of distributed computing power. 
  • personalized information filtering, spidering and data analysis, using personal data from the user’s local machine, and local processing power tied in with global Webworld processing

At the time of writing the Webworld project is on ice; the design exists but no engineering is proceeding.  However, we do believe that implementation of Webworld would be of very significant long-term value for Novamente AI, and we hope that the project can be resurrected in the not-too-distant future.

8          Futuristic Speculations beyond Artificial General Intelligence

Building artificial general intelligence at the roughly human level is, in itself, a highly ambitious goal that is not taken seriously by the majority of contemporary academic and industry computer scientists.   However, in the large view, artificial general intelligence is just part of a larger family of highly speculative and ambitious technological goals (some would say “crazy dreams”).   In fact, real AI is in many ways less ambitious and speculative than other long-term goals with which it is linked.  We are thinking of two things in particular here: the notions of the Global Brain and the Singularity. 

These exciting though somewhat science-fictional ideas do not play a major role in the detailed practical matters of Novamente design.  But they have been somewhat inspirational in setting us on the course we are on, and so they seem well worth mentioning here.  Ben Goertzel’s thoughts on these topics, particularly the Global Brain and the potential role of software like Novamente and Webworld in it, are presented at length in (Goertzel, 2001).

8.1         The Global Brain

The notion of the Global Brain is appealing but somewhat ambiguous.  Intuitively, what it means is that some sort of intelligence is going to arise out of global computer and communication networks – some kind of emergent intelligence, going beyond that which is implicit in the individual parts.  This notion has been promoted by a variety of different thinkers, most avidly by the Principia Cybernetica group, led by Francis Heylighen and initially founded by Heylighen, Valentin Turchin and Cliff Joslyn (Heylighen, 1991).  But even the thinkers within this group hold fairly different ideas about what the “global brain” concept actually means.

Turchin proposed in (Turchin, 1977) that the Internet and other communication technologies had the potential to lead to a new “metasystem transition”, in which humans would eventually become subordinate to an emergent “global brain.”    We find this to be a fascinating idea; however, our own scientific work has focused on shorter-term and more easily comprehensible interpretations of the “global brain” idea.

In (Goertzel, 2001), three possible phases of “global brain” are distinguished:

Phase 1: computer and communication technologies as enhancers of human interactions.  This is what we have today: science and culture progress in ways that would not be possible if not for the “digital nervous system” we’re spreading across the planet.  The network of idea and feeling sharing can become much richer and more productive than it is today, just through incremental development, without any metasystem transition.

Phase 2: the intelligent Internet.  At this point our computer and communication systems, through some combination of self-organizing evolution and human engineering, have become a coherent mind on their own, or a set of coherent minds living in their own digital environment.

Phase 3: the full-on Singularity.  This is what Turchin was talking about in 1977.  It may happen through superhuman AI programs that rewrite their own code until they’re more sophisticated than humans can imagine; it may happen through genetic engineering that allows us to breed human superbeings; or it may happen otherwise, in ways we are not prognosticators enough to foresee.  At this point our current psychological and cultural realities are no more relevant than the psyche of a chimp is to modern society.

In this language, our own thinking about the global brain has mainly focused on how to get from Phase 1 to Phase 2 – i.e. how to effect or encourage the transformation of the Internet into a coherent intelligent system. 

Currently the best way to explain what happens on the Net is to talk about the various parts of the Net: particular Websites, e-mail viruses, shopping bots, and so forth.   But there will come a point when this is no longer the case, when the Net has sufficient high-level dynamics of its own that the way to explain any one part of the Net will be by reference to the whole.   This, we believe, will come about largely through the interactions of AI systems – intelligent programs acting on behalf of various Web sites, Web users, corporations, and governments will interact with each other intensively, forming something halfway between a society of AI’s and an emergent mind whose lobes are various AI agents serving various goals. 

The figure below, from (Goertzel, 2001), is an attempt at an “architecture diagram” for the entire Net, in its early-Phase-2 form.  Naturally, any diagram with such a broad scope is going to skip over a lot of details.  The point is to get across a broad global vision:

What we have here is, first of all, a vast variety of “client computers,” some old, some new, some powerful, some weak.  Some of these access the intelligent Net through dumb client applications – they don’t directly contribute to Internet intelligence at all.  Others have smart clients such as Webworld clients, which carry out two kinds of operations: personalization operations intended to help the machines serve particular clients better, and general AI operations handed to them by sophisticated AI server systems or other smart clients.

Next there are “commercial servers,” computers that carry out various tasks to support various types of heavyweight processing – transaction processing for e-commerce applications, inventory management for warehousing of physical objects, and so forth.  Some of these commercial servers interact with client computers directly, others do so only via AI servers.  In nearly all cases, these commercial servers can benefit from intelligence supplied by AI servers.

Finally, there is what we view as the crux of the intelligent Internet: clusters of AI servers distributed across the Net, each cluster representing an individual computational mind.  Some of these will hopefully be Novamentes; others may be other types of AI systems.  These will be able to communicate via a common language, and will collectively “drive” the whole Net, by dispensing problems to client machines via Webworld or related client-side distributed processing frameworks, and by providing real-time AI feedback to commercial servers of various types.

Some AI servers will be general-purpose and will serve intelligence to commercial servers using an ASP (application service provider) model; others will be more specialized, tied particularly to a certain commercial server (e.g., a large information services business might have its own AI cluster to empower its portal services). Private versions might also exist, of course.

This is our concrete vision of what a “global brain” might look like, in the relatively near term, with Novamente AI Engines playing a critical role.  From an AI point of view, what all this would mean would be:

  • Large amounts of hardware devoted to Novamente processing
  • The Internet becoming an increasingly rich environment for Novamentes to live in
  • Humans having more and more practical purposes for interacting with Novamentes, thus teaching them more and more

 

8.2         The Singularity

The Global Brain is a fascinating aspect of the possible long-term future of Novamente and other “real AI” systems  – but the story doesn’t stop there.   Another part of the grand and fabulous future is the Singularity – a meme that seems to be on the rise these days.

The notion of “the Singularity” is not specifically tied to AI; it was proposed in the ’70s by science fiction writer Vernor Vinge, referring to the notion that the accelerating pace of technological change would ultimately reach a point of discontinuity (Vinge, 1993).  At this point, our predictions are pretty much useless – our technology has outgrown us in the same sense that we’ve outgrown ants and beavers.

Eliezer Yudkowsky and Brian Atkins have founded a non-profit organization called the Singularity Institute (Yudkowsky, 2001), devoted to helping to bring about the Singularity, and to making sure it’s a positive event for humanity rather than the instantaneous end of humankind.  Yudkowsky has put particular effort into understanding the AI aspects of the Singularity, discoursing extensively on the notion of Friendly AI – the creation of AI systems that, as they rewrite their own source code and achieve progressively greater and greater intelligence, leave invariant the portion of their code requiring them to be friendly to human beings (Yudkowsky, 2001a).  He has coined the term “hard takeoff” to refer to the point when an AI system begins increasing its intelligence via self-modification at a tremendous pace.

From our point of view, whether technological change really proceeds so rapidly as to reach a point of discontinuity is not all that critical.  But the notion of an AI system becoming progressively more and more intelligent by modifying its own source code is a fascinating one.  We don’t doubt that this is possible, and that it will ultimately lead to amazing technological advancements of various sorts.

From our Novamente-centric point of view, the following is the sequence of events that seems most likely to lead up to the Singularity or some approximation thereof:

1.       Someone creates a fairly intelligent AI, one that can be taught, conversed with, etc.

2.       This AI is taught about programming languages, is taught about algorithms and data structures, etc.

3.       It begins by being able to write and optimize and rewrite simple programs

4.       After it achieves a significant level of practical software engineering experience and mathematical and AI knowledge, it is able to begin improving itself ... at which point the hard takeoff begins.

In this vision, we conjecture that something like the Singularity may arise as a consequence of emergence-producing, dynamic feedback between the AI Engine and intelligent program analysis tools like, for instance, Turchin and Klimov’s Java supercompiler (Turchin, 1986; see also www.supercompilers.com).

In technical Novamente terms, self-modification is a special case of the kind of problem we call "schema learning."    The Novamente AI Engine itself is just a big procedure, a big program, a big schema.  The ultimate application of schema learning, therefore, is the application of the system to learn how to make itself better.  The complexity of the schema learning problem, with which we have some practical experience, suggests how hard the “self-modifying AI” problem really is. 

Sure, it’s easy enough to make a small, self-modifying program.  But, such a program is not intelligent.  It’s closer to being “artificial life” of a very primitive nature.  Intelligence within practical computational resources requires a lot of highly specialized structures.  These lead to a complicated program – a big, intricate mind-schema – which is difficult to understand, optimize and improve.  Creating a simple self-modifying program and expecting it to become intelligent through progressive environment-driven self-modification is an interesting research program, but it seems more like an attempt to emulate the evolution of life on Earth than an attempt to create a single intelligence within a reasonable time frame.

But just because the “learn my own schema” problem is hard doesn’t mean it’s unsolvable.  A program in C++ or Java or any other language can be represented as a schema in Novamente’s internal data structures, and hence it can be reasoned about, mutated and crossed over, and so forth.  This is what needs to be done, ultimately, to create a system that can understand itself and make itself smarter and smarter as time goes on – eliminating the need for human beings to write AI code and write articles like this one.
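
As a purely generic illustration of what “mutated and crossed over” means for a program represented as a tree, consider the following genetic-programming-style sketch.  It uses a trivial arithmetic expression language rather than Novamente’s actual schema data structures, and is meant only to make the concept tangible:

```python
# Generic genetic-programming illustration: programs as trees that can be
# randomly mutated and crossed over.  Not Novamente's schema representation.

import random

OPS = ["+", "-", "*"]
TERMS = ["x", "1", "2"]

def random_tree(depth=2):
    """Build a random expression tree of the form [op, left, right]."""
    if depth <= 0 or random.random() < 0.3:
        return random.choice(TERMS)
    return [random.choice(OPS), random_tree(depth - 1), random_tree(depth - 1)]

def mutate(tree, depth=2):
    """Replace a randomly chosen subtree with a fresh random subtree."""
    if not isinstance(tree, list) or random.random() < 0.3:
        return random_tree(depth)
    new = list(tree)
    i = random.randint(1, 2)
    new[i] = mutate(tree[i], depth - 1)
    return new

def crossover(a, b):
    """Swap a random subtree of `a` with a random subtree of `b`."""
    if not isinstance(a, list) or not isinstance(b, list):
        return b, a
    i, j = random.randint(1, 2), random.randint(1, 2)
    a2, b2 = list(a), list(b)
    a2[i], b2[j] = b[j], a[i]
    return a2, b2

parent1, parent2 = random_tree(), random_tree()
child1, _ = crossover(parent1, parent2)
print(parent1, "->", mutate(child1))
```

In Novamente proper, of course, the analogous operations would act on schema built from the system’s own nodes and links, and would be guided by reasoning rather than applied blindly.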

Reasoning about schema representing complex computer programs requires a lot of specialized intuition, and specialized preprocessing may well be useful here, such as the automated analysis and optimization of program execution flow being done in the Java supercompilation project mentioned above.  There is a lot of work here, and we acknowledge that we’re a long way from achieving this goal, but it’s a fascinating direction, and a necessary one, and we have put a fair amount of thought into how a Novamente should be specially architected to have the greatest chance of success at modifying its own source code for the better.

 

References

  • Bonabeau, E., Dorigo, M., Theraulaz, G.  (1999)  Swarm Intelligence.  Oxford University Press. 
  • Chislenko, Alexander and Madan Ramakrishnan (1998).  Hyper-Economy: Combining price and utility communication in multi-agent systems, Proceedings of ISAS 98, http://www.lucifer.com/~sasha/HEDG/ISAS98submission.html
  • Dreyfus, Hubert (1992).  What Computers Still Can’t Do. MIT Press, Cambridge, Mass.
  • Deutsch, David (1985). Quantum theory, the Church-Turing principle and the universal quantum computer.  Proceedings of the Royal Society of London A 400:97-117.
  • Goertzel, Ben (1993). The Structure of Intelligence: A New Mathematical Model of Mind. New York: Springer-Verlag.
  • Goertzel, Ben (1993a). The Evolving Mind. New York: Gordon and Breach.
  • Goertzel, Ben (1994). Chaotic Logic: Language, Thought and Reality from the Perspective of Complex Systems Science. New York: Plenum Press.
  • Goertzel, Ben (1996). The Unification of Mind and Spirit. Unpublished manuscript, available online.
  • Goertzel, Ben (1997). From Complexity to Creativity. New York: Plenum Press.
  • Goertzel, Ben (2001). Creating Internet Intelligence. New York: Kluwer Academic.
  • Hameroff, Stuart (1998). Funda-Mentality: Is the conscious mind subtly linked to a basic level of the universe? Trends in Cognitive Sciences 2(4):119-127.
  • Mandler, G. (1985). Cognitive psychology: An essay in cognitive science. Lawrence Erlbaum Associates, Hillsdale, NJ.
  • Minsky, Marvin (1988). The Society of Mind. Simon & Schuster, New York.
  • Penrose, Roger (1994). Shadows of the Mind. Oxford University Press, New York.
  • Searle, John R. (1980). Minds, brains, and programs. The Behavioral and Brain Sciences 3:417-457.
  • Turchin, V. (1977). The Phenomenon of Science: A Cybernetic Approach to Human Evolution. Columbia University Press, New York.
  • Turchin, V. (1986). The concept of a supercompiler. ACM Transactions on Programming Languages and Systems 8:292-325.
  • Turing, Alan (1950). Computing Machinery and Intelligence. In: Computers and Thought, E.A. Feigenbaum & J. Feldman (Eds.), McGraw-Hill, New York.
  • Vinge, Vernor (1993). Essay on the Singularity. Online essay.
  • Wang, Pei (1995). On the working definition of intelligence. CRCC Technical Report 94, Indiana University.
  • Wang, Pei (1995a). Problem-solving under insufficient resources. CRCC Technical Report 101, Indiana University.
  • Weiss, Gerhard, editor (2000). Multiagent Systems. The MIT Press, Cambridge, Mass.
  • Wooldridge, Michael (2001). An introduction to multiagent systems. John Wiley & Sons.
  • Yudkowsky, Eliezer (2001). An introduction to the singularity. Online manuscript.
  • Yudkowsky, Eliezer (2001a). Creating Friendly AI 1.0. Online manuscript.