Chapter 11

Novamente Questions and Answers

 

Excerpted from the current draft of the forthcoming book,

Design for a Digital Mind: the Novamente Artificial General Intelligence System,

 by Ben Goertzel and Cassio Pennachin

© Ben Goertzel, 2002

Please note, this is a rough draft, and needs a fair bit of elaboration before it’s truly publication-quality. 

It is posted at this stage, in this form, because of the interest expressed to the author in numerous e-mails from many individuals.

 

 

1          Why a Q&A List?

In the preceding chapters, we have reviewed the basic cognitive structures and dynamics of Novamente and their conceptual underpinnings; in this chapter we will give a different sort of overview, running through a series of questions regarding how Novamente approaches various aspects of intelligence and mind, and giving brief answers for each.

This chapter originated during some discussions that Ben Goertzel was having on the SL4 futurist e-mail list in early 2002, regarding Novamente as compared to some other AI systems (specifically: Cyc, Peter Voss’s A2I2 system, and Eliezer Yudkowsky’s DGI [1] theory of mind and related AI design ideas).   In the course of this discussion, it occurred to Ben that it would be desirable to have a list of questions to be asked of any AGI design.  So, Ben created the first version of this question list and posted it to the SL4 list.  The current (second) version incorporates additional questions suggested by Eliezer Yudkowsky and Stephen Reed (a veteran of the Cyc project). [2]  

Of course, full answers to the questions given here would require a complete description of Novamente.  But even sketchy partial answers may be useful in terms of giving the reader a conceptual picture.  This chapter has been positioned at this point in the book as a kind of bridge between the less rigorous conceptual overview of the previous chapters, and the technical details of the following chapters.

In proposing this question list, no pretense of completeness is made, nor is the systematization of questions into chapter sections and subsections intended to reflect any profound analysis of the structure of mind; the systematization is simply for convenience and could have been done in many different ways.

The reader might do well to re-read this chapter after plowing through the rest of the book; read at the end, it may serve as an overview, a reminder of how all the details come together to form a big picture of digital mind.

Finally, in addition to its use as a tool for explaining Novamente, we also intend the question list given in this chapter as a challenge to other AI theorists.   “Do you think you have a workable design for an AGI?  Then we’d like to see your answers to the questions on this list.”   Of course, being able to answer the questions reasonably clearly doesn’t prove that one has a workable AGI design – but it does prove that one has thought through a broad spectrum of relevant issues in the context of one’s AGI design.   And not being able to answer the questions doesn’t prove that one lacks a workable AGI design – but it does suggest that one’s AGI design work is at a sufficiently early stage that its plausibility cannot be rationally assessed.

[NOTE: This chapter could use a great number of examples, throughout.   Nearly every subsection could use an example.  This lack will be remedied in a subsequent revision.]

 

2          Knowledge Representation

 

How are the following represented?

 

2.1         Concepts

Concepts are represented as conceptual nodes or "concept maps" (a concept map is a set of nodes, largely conceptual nodes, that are commonly activated together).  Example conceptual node types are: 

·        ConceptNodes: e.g. cat, dog, beauty, and many unnamed concepts

·        PredicateNodes: e.g. eat, walk, die, and many unnamed predicates

·        PredicateNodes containing complex encapsulated schema, often convenient (though not necessary) for representing complex concepts like “dogs that eat ugly cats”, “cows that have at least one but not more than three broken legs”, etc.

2.2         Types (kinds) and individuals

The individual/group distinction is represented on the Atom level, using special link types rather than special node types (an option we also considered).  For example, a MemberLink links a node interpretable as an element of a set to a node interpretable as a set.  A SubsetLink joins two nodes interpretable as sets, with the obvious semantics.  These are different from InheritanceLinks: "Ben" has a MemberLink to "Goertzel family," whereas he has an InheritanceLink to "silly people."

These distinctions are reflected implicitly on the map level, in two main ways:

·        Map A is interpreted to have a relationship of type R to Map B if there is a high total-weight of links of type R joining Atoms in map A to Atoms in map B

·        Map A may be considered a subset of Map B if the Atoms in A are a subset of the Atoms in B
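
To make the Atom-level link types just described (MemberLink, SubsetLink, InheritanceLink) concrete, here is a purely illustrative Python sketch; the node and link type names follow the text, but the classes and fields are hypothetical and much simpler than the actual Atom objects, which also carry full truth values, importances, and so forth.

# Minimal illustrative sketch of Atom-level knowledge representation.
# Node/link type names follow the text; the Python classes themselves are hypothetical.

class Node:
    def __init__(self, node_type, name=None):
        self.node_type = node_type      # e.g. "ConceptNode", "PredicateNode"
        self.name = name                # many nodes are unnamed (name=None)

class Link:
    def __init__(self, link_type, targets, strength=1.0):
        self.link_type = link_type      # e.g. "MemberLink", "InheritanceLink", "SubsetLink"
        self.targets = targets          # ordered tuple of Atoms joined by the link
        self.strength = strength        # simplistic stand-in for a full TruthValue

ben       = Node("ConceptNode", "Ben")
goertzels = Node("ConceptNode", "Goertzel family")
silly     = Node("ConceptNode", "silly people")

atomspace = [
    Link("MemberLink", (ben, goertzels)),        # element-of-set semantics
    Link("InheritanceLink", (ben, silly), 0.9),  # mixed intensional/extensional inheritance
]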

2.3         Intension and Extension

On the Atom level this is represented using special link types.  The default logical link types, such as Inheritance, Implication, Equivalence and Similarity, implicitly represent combined intensional/extensional information.  To restrict attention to only intensional or extensional information requires a special link type, such as ExtensionalSimilarityLink, SubsetLink, ExtensionalImplicationLink, etc.

On the map level, this distinction is represented implicitly, with extensional and intensional “map links” corresponding to bundles of extensional and intensional links between Atoms.

2.4         Functional terms (e.g. “the sum of one and two.” “the flower of the oak tree.”)

These are represented as SchemaNodes or PredicateNodes, or as "schema maps" or "predicate maps."

2.5         Logical inheritance (or “subsumption”) between concepts

InheritanceLinks and related link types on the Atom level; implicit inheritance relationships on the map level.

2.6         Relationships between concepts

These are links between the Atoms representing concepts, or “implicit map-level links” between the maps representing concepts.

2.7         Argument constraints on relationships

In the current design these are represented in two different ways:

·        There are type argument constraints, which restrict arguments according to Atom type, e.g. NumberNode vs. conceptual node vs. SchemaNode

·        Argument constraints that do not pertain to Atom types are represented as relationships, just like any other relationships

We are not sure whether we’ll merge the first type of constraint into the second, in a later design revision.

2.8         Logical implication (or “subsumption”) between relationships

ImplicationLinks and related link types on the Atom level; implicit implication relationships on the map level.

2.9         Quantified relationships (e.g. “Every boy loves a girl”, “Every boy has a dog that every girl likes to call by some special pet name”)

These are either explicit relationships involving compound PredicateNodes containing complex internal predicate graphs, or implicit map-level relationships involving complex predicate maps.

There is no explicit representation of quantifiers; SchemaNodes embodying combinatory logic are used to represent quantified relationships as functional relationships (an idea similar to "lambda lifting").
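
To give the flavor of the combinator-based approach (this is generic combinatory logic written as Python higher-order functions, not Novamente's actual quantifier encoding), note that the elementary combinators let one build functions without ever naming a bound variable:

# Standard combinators, written as Python higher-order functions.
# This only illustrates variable-free composition, not the actual Novamente encoding.

S = lambda f: lambda g: lambda x: f(x)(g(x))   # S f g x = f x (g x)
K = lambda x: lambda y: x                      # K x y   = x
I = lambda x: x                                # I x     = x  (equivalently, S K K)
B = lambda f: lambda g: lambda x: f(g(x))      # composition
C = lambda f: lambda x: lambda y: f(y)(x)      # argument swap

# Example: the function h(x) = plus(x)(x), i.e. "double", written without
# mentioning the variable x:  double = S plus I
plus = lambda a: lambda b: a + b
double = S(plus)(I)
assert double(7) == 14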

2.10     Logical operators (e.g. “not”, “or”, “and”, “xor”, “equivalent”, “implies”)

Equivalence and Implication are basic link types, and also exist implicitly on the map level.

On the Atom level, Boolean operators are represented as specific atomic schema living inside SchemaNodes.   Boolean relationships can also exist emergently and implicitly between maps.

2.11     Rules (e.g. “if something is a dog, then it has fur.”)

A rule is generally just an implication relationship (ImplicationLink or related link type).

2.12     Events (is the event reified with roles for actors, participants, situation, etc., and/or is the event represented mainly by the actions that occur?)

There is no built-in data structure embodying the archetypal structure of an event.  However, this kind of schema is expected to emerge, both on the Atom level and on the map level.  Once such a schema proves useful in a number of instances, it will become important, and will hence be automatically invoked when event understanding is required.

If one wished to build in this kind of event-processing schema (not an unreasonable idea, as humans are likely explicitly genetically endowed with such schema), one would do so on the Atom level, by creating an appropriate compound SchemaNode (or a set thereof) embodying this kind of "template."  The danger is that one creates an overly rigid template, not capturing the diverse context-dependent meanings of "actor", "participant", etc.   For this reason it is anticipated that it will be better to teach the system event representation using carefully controlled EIL rather than to explicitly program in event templates.  The problem is that, although humans come with such templates built in, our built-in templates are extremely fuzzy and complex, and hard to capture in a brief list of rules.

2.13     Actions (preconditions and effects suitable for planning)

Atomic actions are represented as schema (built-in C++ functions drawn from a limited vocabulary).  Complex actions are represented as compound SchemaNodes, or as schema maps. 

Information about the preconditions and effects of actions is embodied in ExecutionLinks associated with the actions.

Future versions of the system with strong self-modificatory capabilities will be able to rewrite and augment their own “built-in schema libraries.”

2.14     Procedures

A procedure is treated as a kind of “complex action” as discussed immediately above.  The structures and dynamics are not different for actions that affect the physical world, versus actions that are purely internal in their direct ramifications.

2.15     Time (e.g. "Today is Saturday.", "at night", "Step A happens before step B.", "During the 1980's we feared the Soviet Union.")

On the Atom level, there are several specialized schema dealing with time, including atTime, which allows precise time-stamping, and before and after, which deal with temporal comparison.  These may be automatically extended by the system to yield more general concepts of before, after, and "atTime," which apply to events for which no precise time-stamp is available.

If one wished to supply the system with a kind of “built-in temporal logic”, one would do so by feeding it a set of relationships regarding these in-built temporal schema; however, it is not clear that this kind of wiring is going to be necessary.
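
As a rough illustration of how such built-in temporal schema behave, assuming events simply carry numeric time-stamps (the helper names and data structures below are ours, for illustration only):

# Toy temporal schema over time-stamped events; names and structures are illustrative.

events = {}   # event name -> time-stamp (e.g. seconds since system start)

def at_time(event, t):
    """Record that 'event' occurred at time t (analogue of the atTime schema)."""
    events[event] = t

def before(event_a, event_b):
    """Analogue of the 'before' schema for precisely time-stamped events."""
    return events[event_a] < events[event_b]

at_time("step_A", 10.0)
at_time("step_B", 12.5)
assert before("step_A", "step_B")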

On the map level, there are two kinds of implicit temporal relationship:

·        That which emerges from explicitly encoded temporal relationships among the Atoms in the maps

·        Actual temporal relationships among the maps, e.g. "map A often occurs before map B" (these may be recognized as explicitly encoded temporal relationships between encapsulated versions of the maps, as usual with implicit map relationships).

2.16     Counterfactuals (e.g. “Dracula is a vampire.” “There are no human vampires.”)

On the Atom level, we have HypotheticalLink, which is used not only for counterfactuals as such but also to represent embedded knowledge more generally.

There is implicit propagation of this to the map level, as usual.

2.17     Percepts

There are special node and link types representing perceptual data.  For instance, an incoming word "cat" might be represented as a ConcatContainsLink containing three CharacterInstanceNodes (for "c", "a" and "t").  An observed black pixel at coordinate (130,444) on a screen might be represented as EvaluationLink PixelColor (130,444) Black [where (130,444) is a ConcatContainsLink of two NumberNodes; PixelColor is a PredicateNode; and Black is a ConceptNode].
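
In an illustrative nested-tuple notation (hypothetical; the actual system uses Atom objects carrying truth values and importances), these two percepts might be written as:

# Illustrative encodings of the two percepts described above, as nested tuples
# of (type, payload) pairs.

word_cat = ("ConcatContainsLink", [
    ("CharacterInstanceNode", "c"),
    ("CharacterInstanceNode", "a"),
    ("CharacterInstanceNode", "t"),
])

black_pixel = ("EvaluationLink", [
    ("PredicateNode", "PixelColor"),
    ("ConcatContainsLink", [("NumberNode", 130), ("NumberNode", 444)]),
    ("ConceptNode", "Black"),
])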

2.18     Thoughts (considered as temporary mental phenomena, actively evolving and not committed to memory)

A thought is a pattern of activation among Atoms.  Specifically, it involves either:

·        A set of Atoms highly active at a given time

·        A set of Atoms that, on average, is highly active over a period of time

and the set of patterns emergent from this set of Atoms and their activation levels.

2.19     Beliefs about named concepts

On the Atom level, beliefs are represented as links (relationships) of various types; the concepts the relationships are “about” are the nodes the links join.

On the map level, beliefs are represented as implicit relationships between maps.

The name of a concept is represented as a WordNode or PhraseNode (a WordNode is basically a list of CharacterNodes; a PhraseNode is basically a list of WordNodes).  The node containing the name is linked explicitly to the node representing the concept, or linked implicitly to the map representing the concept.

2.20     Beliefs about unnamed concepts

Beliefs about concepts are represented the same way regardless of whether the concepts have names or not.   Most concepts may not have names in a mature Novamente.

2.21     Beliefs about procedures

ExecutionLinks describe the activity of procedures, and beliefs about procedures can be represented in terms of them.

2.22     The degree/strength/certainty of a belief

Atoms carry TruthValue objects, which may take any of several different forms: single numbers (probabilities), (probability, weight-of-evidence) pairs, (probability, weight-of-evidence, variance) triples, or distributional approximations.

Maps and implicit map links have implicit truth values.
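
As a minimal sketch, the two simplest TruthValue forms mentioned might look as follows in Python (class names and fields are illustrative; the real objects also support variance and distributional forms):

# Two of the simpler TruthValue forms described above, as small Python classes.

class SimpleTruthValue:
    """Just a probability."""
    def __init__(self, strength):
        self.strength = strength            # in [0, 1]

class CountTruthValue(SimpleTruthValue):
    """(probability, weight-of-evidence) pair."""
    def __init__(self, strength, weight_of_evidence):
        super().__init__(strength)
        self.weight_of_evidence = weight_of_evidence   # roughly, the amount of observation behind the strength

tv = CountTruthValue(strength=0.8, weight_of_evidence=25.0)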

2.23     Hypothetical knowledge (“Pei believes the Earth is flat”)

Dealt with via HypotheticalLink, as mentioned above.

2.24     Contextual knowledge (“At parties, Jill is cute.”)

There is a link type called ContextLink, with semantics “ContextLink Context Relationship,” e.g.

ContextLink parties (InheritanceLink Jill cute)

This is just a convenience, as contextual relationships can be represented in terms of generic logical relationships.   In dealing with numerical data, we have a special node type called NumericalContextNode, which caches some numbers relevant to a context defined by numerical data, such as mean, variance, etc.
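
In the illustrative nested-tuple notation used earlier (hypothetical, not a Novamente serialization format), this example reads:

# "At parties, Jill is cute": a ContextLink wrapping an ordinary InheritanceLink.

jill_cute_at_parties = ("ContextLink", [
    ("ConceptNode", "parties"),                         # the context
    ("InheritanceLink", [("ConceptNode", "Jill"),       # the relationship that holds
                         ("ConceptNode", "cute")]),     # within that context
])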

2.25     Fuzzy lexical terms (“near” “very close to” “approximately” “sort of” “might be”)

Each such term will be represented as a collection of predicate schema, embodying different arithmetical functions, associated with different (general or specific) contexts.

One could try providing such predicate schema in advance, but this seems an example of something that can be relatively easily experientially learned, given careful and systematic instruction.

2.26     Reflection -- the representation of the AGI by concepts and relationships, in particular its behavior

For the system, expressing concepts, ideas and relationships about itself is no different from expressing concepts, ideas and relationships about other things.

For reflection to exist, there must be one or more Atoms or maps representing the system as a whole.  Supposing such an Atom exists and is called "I", we may have relationships such as

atTime (TimeNode: 2002) HypotheticalLink (believe I (InheritanceLink cat animal <.8>))

which states that in 2002 the system held the belief “InheritanceLink cat animal” with strength .8.

Atoms and maps representing the system as a whole should not be programmed in, they should be acquired via experiential learning.

2.27     Associations between concepts and natural language words & phrases

Where a concept is represented by an Atom, a SimilarityLink joins a concept and the word/phrase that represents it.   Where a concept is represented as a map, an implicit map-level SimilarityLink joins the concept map and the word that represents it.  (Representation of words as maps is also possible, but is not expected to be common, as an encapsulated representation is far more convenient for a perceptual construct of such small size.)

2.28     Generic associational relationships between entities

AssociativeLinks, formed by a mechanism related to Hebbian learning, join Atoms that have been found to be generally associated in the system’s experience.

Implicit association relationships exist between maps that are commonly active at the same time, and these will normally correspond to bundles of AssociativeLinks joining the Atoms in one map to the Atoms in another.
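
A crude sketch of this Hebbian-style mechanism: whenever two Atoms are simultaneously active, the weight of the (here hypothetical, dictionary-based) AssociativeLink between them is nudged upward; otherwise it slowly decays.  The learning and decay rates below are arbitrary illustrations.

# Toy Hebbian update for AssociativeLink weights; the learning/decay rates are arbitrary.

assoc = {}   # (atom_a, atom_b) -> association weight in [0, 1]

def hebbian_update(active_atoms, all_atoms, rate=0.1, decay=0.01):
    for a in all_atoms:
        for b in all_atoms:
            if a >= b:
                continue
            key = (a, b)
            w = assoc.get(key, 0.0)
            if a in active_atoms and b in active_atoms:
                w += rate * (1.0 - w)      # strengthen co-active pairs
            else:
                w -= decay * w             # slowly forget stale associations
            assoc[key] = w

hebbian_update(active_atoms={"cat", "fur"}, all_atoms=["cat", "fur", "wheel"])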

2.29     Is there a representational and/or dynamical distinction between feature structure, category structure, and event structure?

The features of a perceived object or enacted action are represented as (explicit or implicit) relationships involving Atoms or maps representing the object or action.

Categorial structure involves logical relationships – grounded in InheritanceLinks and related link types – spanning conceptual Atoms or maps.

Event structure involves relationships concerning temporality; built-in schema like before and after, plus any learned or built-in templates (complex predicates) embodying the “archetypal structure of events.”

So, feature structure, category structure and event structure will habitually involve different types of relationships, but they do not correspond to explicitly separate knowledge representation mechanisms.

They do correspond to explicitly distinct learning mechanisms, in the sense that there are specialized (and to some extent, modality-dependent) techniques for finding features in perceptual data and action procedures, special techniques for category formation and maintenance (e.g. clustering, supervised categorization), and specialized techniques for recognizing event structure (time-based data mining).  However, in each of these cases, the specialized techniques are only part of the story, and much of the relevant dynamics is generic.

2.30     Is there a serialization of the knowledge store in XML or some other standard format(s)?

The knowledge of the system at any given time can be saved to XML.  However, the vast majority of this knowledge will not be meaningful for humans to read, as it will involve large collections of relationships of various types between unnamed concepts, complex unnamed encapsulated schema and predicates, etc.  A highly and carefully filtered version of the knowledge store might be meaningful to humans, and of course other software programs may be able to make use of the more complex knowledge stored.

3         Combinatorial Patterns

 

3.1         What kind of combinatorial patterns does the system contain?  (I.e., complex patterns formed by combining simpler ones)

What we call “maps” are sets of Atoms that are active at about the same time; or whose activity follows a certain simple dynamical pattern.  This is one kind of combinatorial pattern.

SchemaNodes and PredicateNodes, and schema and predicate maps, may embody arbitrarily complex mathematical functions of Atoms.  The basic combinatory operators used here are:

·        Arithmetic functions

·        Logical operators

·        Combinators (S, K, Y, B, C, etc.)

·        List and set operators

 

3.2         What kind of combinatorial operators are used to construct them?

 

Many schema are operators that combine Atoms (and these may be executed in a distributed way, or wrapped up inside compound SchemaNodes).   A set of atomic schema is provided, including basic logical and arithmetic operators.  More complex schema may be formed through various learning processes; these may then be encapsulated and used just as easily as the original atomic schema.

 

3.3         How large are the individual patterns that are combined?

What are explicitly combined are Atoms.  But some Atoms may potentially encapsulate very large schema or predicate networks. 

A pragmatic upper bound for a single encapsulated compound schema or predicate is probably a few thousand schema instances.  However, some of these schema may themselves represent complex encapsulated schema or predicates.

3.4         How large are the combinations?

Novamente’s cognition mechanisms are intended to build up large combinations in a hierarchical way.  So a given combination is likely to involve dozens of Atoms directly, hundreds in some cases and maybe thousands in rare cases.  But these Atoms being combined may themselves be “encapsulated combinations,” so that when the whole hierarchy is unraveled, a combination could actually involve millions of Atoms.

This hierarchical approach is viewed as a very important heuristic for allowing useful large combinations to be formed without searching the whole space of all possible large combinations.

3.5         When the patterns combine, do they yield a new, larger representation of the same type, or do they yield a different kind of representation?  In the latter case, can the combined representation be transformed back into a bigger building block?

Atoms (which may be schema or predicates) combine to form schema or predicates.  These may be encapsulated to form new Atoms. 

Also, these newly formed schema and predicates may be used to create other Atom types, such as ExecutionLinks, ConceptNodes, etc.  And they may give rise to maps of various sorts, automatically.

 

4           Attention Allocation

 

4.1         Of the many processes the system can carry out at a given time, what determines which ones get the most attention?

Each process has an “importance level”, and processes are given CPU time proportional to their importance.  The importances of processes may be adapted by the system to ensure contextual appropriateness.

In a multi-machine Novamente, different processes may have different importance levels on different machines.

4.2         Of the many knowledge items the system can focus on at a given time, what determines which ones get the most attention?

Each Atom has an “importance,” and many dynamical processes act on Atoms proportionally to their importance.

The importance is determined by a number of factors, including: the neural-net-like “activation” of the Atom in the recent past, and the value gained by processing the Atom in the recent past.  There is a nonlinear iteration, the Importance Updating Function, that incorporates these factors and updates the importance of each Atom periodically.

There are also “interaction channel specific” importances.  So in the case of a system with one interaction channel (one set of sensors and actuators regarding a common environment), each Atom has a generic importance and an interaction-channel importance.  Interaction-channel importances are key to real-time activity.
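
The exact Importance Updating Function is not reproduced here; the following sketch merely illustrates the kind of nonlinear iteration described, blending recent activation and recent processing value with a decay term (all coefficients and the clipping rule are invented for illustration).

# Illustrative importance update: blend recent activation, recent usefulness, and decay.
# The real Importance Updating Function differs; this only shows the general shape.

def update_importance(importance, recent_activation, recent_value,
                      decay=0.9, act_weight=0.3, value_weight=0.2):
    raw = decay * importance + act_weight * recent_activation + value_weight * recent_value
    return min(1.0, raw)      # keep importances bounded

imp = 0.4
for activation, value in [(0.9, 0.5), (0.1, 0.0), (0.0, 0.0)]:
    imp = update_importance(imp, activation, value)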

 

5          Collaborative Processing among Multiple AGI’s

 

5.1         Can the system collaborate with others of its kind?

Yes.  Communication can take place using ordinary human languages (assuming a Novamente with advanced human-language understanding), or using a special Novamente communication language called Psynese.  Psynese lets Novamentes exchange “hunks of mind” and involves methods for dealing with the (usual) case where two minds define the same concepts a little differently.   Psynese should allow Novamentes to collaborate much more closely than humans can.

5.2         How is knowledge replicated?

Knowledge is transferred from one system to another using Psynese.

5.3         How is knowledge merged when two similar concepts are independently created by distributed instances of the AGI?

This is the same as the “revision” process by which identical concepts or relationships from different sources are merged within a single Novamente.  The revision process requires each version being merged to have a certain “confidence” attached to it.  In the case of merging things formed by different Novamentes, the confidence indicates how reliable each Novamente in question is, in the particular contexts involved.
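
One simple way to realize such confidence-weighted merging, given purely as an illustrative sketch and not as the actual revision rule of the inference component, is to average the strengths weighted by confidence and let the evidence accumulate:

# Illustrative confidence-weighted revision of two versions of the same relationship.

def revise(strength_a, confidence_a, strength_b, confidence_b):
    total = confidence_a + confidence_b
    if total == 0:
        return 0.0, 0.0
    merged_strength = (confidence_a * strength_a + confidence_b * strength_b) / total
    return merged_strength, total     # evidence from both sources accumulates

# e.g. merging "cats are animals" as learned by two Novamentes of unequal reliability
print(revise(0.9, confidence_a=10.0, strength_b=0.6, confidence_b=2.0))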

5.4         How do distributed instances of the AGI collaborate to achieve goals?

There are many options here.  Novamentes can collaborate loosely through ordinary communication, much as in human collaborative groups.  Or, one can define a “controller” Novamente for a community of Novamentes, and give it the ability to sculpt the Novamentes under its control.  There are also many other possibilities.

5.5         What is the security model? (how are “viruses” and the like prevented).

Inter-Novamente messages will be wrapped using public-key encryption.  No Novamente will accept a Psynese package of mind-stuff from another Novamente unless it has the right key, verifying that the package actually comes from where it is supposed to come from.

Within a single multi-machine Novamente instance, on the other hand, it is unacceptably inefficient to encrypt traffic.  So it is assumed, in the simplest case, that a single Novamente instance lies behind a firewall.  A more complex Novamente configuration may involve one or more “mind clusters,” with the machines in each cluster communicating unencrypted, and some machines that do not live in any cluster but only handle certain tasks that don’t require high-bandwidth communication with the other machines.

 

6          Knowledge Acquisition

 

6.1         What facilities are provided for acquisition of knowledge from experts?

Declarative knowledge can be entered in a format called KNOW (which has textual and XML variants).

Procedural knowledge can be entered in a simple functional programming language called Sasha.

This is not intended as the system’s primary mode of knowledge acquisition; we believe this kind of knowledge acquisition is secondary to experientially acquired knowledge.

 

6.2         Can information be extracted from text?

A mature Novamente that knows a human language will be able to read text and extract knowledge from it.

Prior to that, in the very short run, Novamente can also be coupled with standard NLP systems to aid the semantic aspect of information extraction; but this is not part of the long-term Novamente plan.

 

6.3         Can information be extracted from specialized databases?

This requires translation of DB formats into Novamente Atoms.  Currently we are doing this by hand; it’s not too difficult, but it requires special attention to each DB separately. 

Eventually, a mature Novamente will be able to automatically figure out the format of a DB and extract the information from it.

 

6.4         Can information be extracted from sensory devices such as cameras, medical imaging devices, audio signals, etc.

Novamente represents quantitative data in a natural way, using two representations: a special repository of numerical multidimensional arrays, and an expanded representation in terms of ListLinks of NumberNodes.

In many cases, appropriate processing of data from a sensory device requires substantial “preprocessing.”  Currently it is a matter of judgment in each case, which preprocessing is done before Novamente sees the data, and which is done by schema within Novamente.   Ultimately Novamente should have complete control over all data preprocessing; it should be noted, however, that humans do not have very much control over their own data preprocessing, which is wired into low levels of the brain and into sensory organs. 

6.5         Reasoning

 How are the following types of reasoning carried out?

    1. Deduction
    2. Induction
    3. Abduction (hypothesis formation, and best explanation for observed evidence)
    4. Analogy
    5. Revision (combination of evidence from different sources about the same concepts or relationships)

There is an inference MindAgent that carries out deduction, induction, abduction, analogy and revision, along with other related inference rules.

It is based on a foundation of probabilistic term logic.  Unlike in Bayes nets and related approaches, probabilistic consistency is not assumed across the whole knowledge base.  Rather, a locally consistent probability model is constructed in the context of each individual inference step.  The inference rules follow the “pattern matching” format of term logic rather than predicate logic.

The inference system handles both first-order inference and higher-order inference (HOI).  HOI deals with logical combinations like AND, OR, NOT; and with relationships that join relationships rather than nodes.  Variables and quantifiers are not explicitly used; instead a combinator-based representation is used to reduce such constructions to complex networks of elementary schema and predicates.

Inference on maps takes place implicitly via implicit logical relationships, but will have a much lesser degree of precision.

6.6         Planning

Planning is a matter of learning implication relationships of the form

PredictiveImplicationLink Predicate Goal

where the Predicate is a combination of ExecutionLinks denoting the execution of particular actions, and (often) temporal relations interrelating these ExecutionLinks.
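
As a toy rendering of this idea (the predicates, goals and numbers below are invented), a planner holding a handful of learned PredictiveImplicationLinks can select the action-predicate with the strongest predictive link to the currently active goal:

# Toy use of PredictiveImplicationLink-style knowledge for action selection.
# Each entry: (predicate describing an action combination, goal it predicts, strength).

predictive_implications = [
    ("grasp_cup_then_raise_arm", "cup_is_held_up", 0.85),
    ("raise_arm_only",           "cup_is_held_up", 0.20),
    ("grasp_cup_then_raise_arm", "cup_is_broken",  0.05),
]

def choose_action_predicate(goal):
    candidates = [(s, pred) for pred, g, s in predictive_implications if g == goal]
    return max(candidates)[1] if candidates else None

assert choose_action_predicate("cup_is_held_up") == "grasp_cup_then_raise_arm"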

6.7         Finding of generic “associations” between mental entities

In Novamente lingo, association-finding is not a reasoning process, but rather a specialized process using neural-net-like activation spreading to find entities that generally are active together.  It results in the construction of AssociativeLinks.

6.8         Prediction (“I expect that I will run much faster on the new computer.”)

Predictive relationships are simply logical implication relationships with extra time constraints attached; some simple examples can be represented as PredictiveImplicationLinks.

In practice, finding predictive relationships through general inference tends to be intractable unless one uses a special control mechanism that looks specifically for simple predictive relationships that have held up historically.   We have put a lot of effort into designing this kind of control mechanism, because much of our practical datamining work with the current version of the system involves prediction.

6.9         Truth Maintenance (when a fact or rule is retracted, in what way and to what extent are solely supported facts automatically retracted).

If a piece of knowledge is retracted (i.e. its truth value, which used to be nonzero, becomes zero), then “facts that were solely supported by it” (i.e. Atoms that are primarily derived from it) will not automatically and immediately be retracted.  Rather, the system’s inference processes will naturally revise the truth values of these Atoms, at its own speed.  If the retracted knowledge was very important, then this will happen quickly; otherwise it may not.

 

7          Classification

 

7.1         Classification based on background knowledge (“This small animal has fur and barks, so it could be a dog.”)

In Novamente terms, this is an example of inference.  Member-category relationships are represented by InheritanceLinks and related link types, which are formed based on raw data and based on inference from other links.

7.2         Clustering-like unsupervised concept formation

This happens in two ways. 

First of all, there is a node fusion process that creates new nodes by merging existing ones.   Acting over time, this behaves something like agglomerative clustering, with the Novamente system as a whole providing adaptive control. 

Secondly, there is an explicit clustering MindAgent based on the Bioclust algorithm, which finds approximate cliques in the subgraph of Novamente defined by nodes and the SimilarityLinks that span them.
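
The Bioclust algorithm itself is not described here; the following greedy routine, which grows one approximate clique by repeatedly adding the node best connected to the current cluster, is only a stand-in conveying the general flavor.

# Greedy growth of one approximate clique in a similarity graph (illustrative stand-in,
# not the Bioclust algorithm itself).

def grow_cluster(seed, similarity, nodes, threshold=0.5):
    cluster = {seed}
    while True:
        best, best_score = None, threshold
        for n in nodes:
            if n in cluster:
                continue
            # average similarity of n to the current cluster members
            score = sum(similarity.get(frozenset((n, m)), 0.0) for m in cluster) / len(cluster)
            if score > best_score:
                best, best_score = n, score
        if best is None:
            return cluster
        cluster.add(best)

sim = {frozenset(("dog", "wolf")): 0.9, frozenset(("dog", "cat")): 0.6,
       frozenset(("wolf", "cat")): 0.55, frozenset(("dog", "car")): 0.1}
print(grow_cluster("dog", sim, ["dog", "wolf", "cat", "car"]))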

7.3         Supervised-learning-like model-building based on examples of given categories

Again, this happens in two ways.

There is concept-based predicate learning, which seeks to build a PredicateNode modeling a ConceptNode.  The compound predicate inside the PredicateNode is the “model.”

Alternately, one can run decision tree induction (or some other ML algorithm) on data consisting of categories of Novamente nodes and links.  The model learned by the ML method can then be imported back into the Novamente system and reasoned on, and potentially refined.  Decision tree learning is particularly amenable to this.

 

8          Learning

 

8.1         What kinds of things does the system learn?

 

Declarative knowledge, procedural knowledge, optimal settings for its own parameters in various settings, knowledge about how to achieve its goals and about how it and others behave in various circumstances.  Basically the same variety of things that a human learns.

 

8.2         What kinds of things are preprogrammed?

Ultimately, all things that are preprogrammed will be susceptible to intelligence-directed modification and replacement.

For now, we do not preprogram knowledge but we preprogram a collection of node and link types representing different types of information and relationship, and a set of processes called MindAgents that act on these nodes and links.

 

8.3         What kind of things are invented on the spur of the moment?

 

New concepts, new knowledge, new thoughts, new procedures for doing things, new systemwide parameter settings (embodying new states of mind), and so forth.

 

8.4         Do the answers to these questions reflect structural distinctions - i.e. different kinds of cognitive content within the AI?

 

There are different kinds of cognitive content, but there are deep similarities among the different kinds, both in terms of data structures and in terms of dynamics.  The biggest difference is between procedural knowledge and declarative knowledge. 

In practice, we program in more procedural knowledge than declarative knowledge.   But both are learned fairly intensively right from the get-go, both via long-term background learning processes and on the spur of the moment.

 

8.5         How does the system learn things that were not on the programmers' minds when they created the system?  For example, if the AI had no built-in code to identify when two groups contained the same number of objects, how would it learn the concepts "five" and "number"?

 

New declarative and procedural knowledge is built up from the “atoms” of the system, from basic perceptions, actions and relationships.   This building-up occurs via the cooperation of a variety of different AI processes.

 

8.6         When the system learns something, what are the specific ways in which the learned content and the experience of learning contribute to solving similar future problems?

There are a lot of different learning algorithms that enable this kind of knowledge transfer: probabilistic reasoning, neural-net-like association spreading, and node fusion and fission, which create new structures out of old ones.

 

9          Procedure Learning

 

9.1         How are new procedures learned?

By a combination of inference, association-spreading, reinforcement learning, evolutionary programming, and self-assembly based on simple “construction rules.”

9.2         How are existing procedures reinforced when they lead to desirable behavior?

Using a special variant of the “bucket brigade” reinforcement learning algorithm.  GoalNodes, when satisfied, spread a special kind of activation back to SchemaNodes whose activity helped them to be satisfied. They pass this activation yet further back to the SchemaNodes that helped them, etc.
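
A heavily simplified sketch of this backward-spreading reward, assuming the system has recorded the chain of schema that fired on the way to goal satisfaction (the per-step discounting and rates below are invented; the real mechanism is a variant of the classifier-system bucket brigade):

# Toy backward credit assignment along the chain of schema that led to a satisfied goal.
# The 0.5 discount per step is arbitrary; only the general shape is intended.

schema_strength = {"look_for_cup": 0.2, "grasp_cup": 0.2, "lift_cup": 0.2}

def reward_chain(chain, reward, discount=0.5, rate=0.1):
    """chain: schema that fired, in order; the last one directly satisfied the goal."""
    credit = reward
    for schema in reversed(chain):
        schema_strength[schema] += rate * credit * (1.0 - schema_strength[schema])
        credit *= discount        # earlier helpers get progressively less credit

reward_chain(["look_for_cup", "grasp_cup", "lift_cup"], reward=1.0)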

9.3         How is declarative knowledge converted into procedural knowledge?

The system rewards procedures satisfying (declaratively given) constraints relating to its goals.  This causes the system to behave like an “evolutionary programming” system, evolving new procedures and letting the fittest survive, where fitness is determined by declarative knowledge.

9.4         How is procedural knowledge converted into declarative knowledge?

ExecutionLinks record the input and output of a SchemaNode every time it executes.  These can be reasoned on. 

 

10     Goals

10.1     How are the system’s goals represented?

Using GoalNodes, and maps dominated by GoalNodes.

 

10.2     Does it have initial built-in basic goals, and if so can it modify them?

Yes, it does have initial basic built-in goals.  Modifying the built-in goals is an advanced function, to come along with general sourcecode modifiability in a later system version.

10.3     How does it create new subgoals and supergoals from its existing goals?

Via inference and node combination operators.

10.4     How do the goals guide the system’s various processes?

 Goals are constantly pulsing out activation, so things related to them become active.  Goals also use reinforcement learning to reinforce schema that act in their interest.

 

11     Internal Sensation (“Feelings”)

 

11.1     How does the system sense its own internal state?

There are a number of built-in “internal sensors”, and there are nodes and links whose purpose is to constantly return the output values of these sensors.  These include FeelingNodes, and also special schema such as GetImportance, which returns the importance value of its Atom argument.

11.2     Can it create new ways of sensing its own internal state?

Initially it can just combine the existing internal sensor outputs in various ways.  Since these sensors are very low-level in some cases, this in principle gives the system the ability to sense just about any internal data-pattern there is.   But some higher-level internal sensors are given in advance, and this does bias the system’s internal-data-pattern-finding in certain directions.

11.3     How does the output of the system’s internal-state-sensors affect the system’s behavior?

Mainly because many GoalNodes refer to the states of FeelingNodes; since goals guide the system’s processes, the outputs of the internal sensors thereby shape behavior.

 

12     Consciousness

 

12.1     Do you expect the system will at some point consider itself “conscious”?

Yes, if it is taught this human-language word and concept, as is likely.

It is interesting to ask whether a community of nonhuman intelligences would naturally create its own concept similar to our concept of “consciousness.”  But Novamente will presumably not be a good subject for this experiment, as it will be closely taught by humans and will hence absorb a good bit of human culture, though it will interpret it in its own way.

12.2     If so, what parts of the system will be most closely associated with its self-described “conscious” experience?

The most important Atoms in the system at a given time.

12.3     Will the system have different states of consciousness (as subjectively perceived by itself)?

Yes, and these will correspond to different system parameter settings, as well as to different mixtures of Atoms or different dynamical patterns in the set of “maximally important Atoms.”

12.4     If so, what will some of the most important ones likely be? 

In general, it is hard to foresee the precise breakdown of the states of consciousness that will be experienced by a Novamente system.

There will definitely be externally-focused versus internally-focused states of consciousness.  Also, adventurous versus conservative, and intuition-oriented versus reasoning-oriented. It should also be possible to induce a meditative, “oceanic awareness” state of consciousness as well, in which the system’s attention is continually drawn to the oneness of all things.

But these are really all speculations.  The geography of Novamente’s consciousness landscape will have to be discovered through experience.

 

13     Linguistics

 

How are the following processes carried out?

13.1     Noun phrase parsing (“young computer scientists”, “laptop computer parts”) and sentence parsing

Parsing is carried out via logical unification among PredicateNodes (with other cognitive processes playing a supporting role).  This is in the spirit of “feature structure unification grammars” which have been used in linguistics for some time.  Predicates representing grammar rules must be unified with node and link constructs representing sentences and sentence fragments.

Grammar rules must be learned through experience; the notion of “grammar” is not embedded explicitly in the system and nor are any particular grammatical rules for combining words or word categories.  What the system has are cognitive mechanisms that are known to be appropriate for representing, applying and learning grammars.

13.2     Syntactic disambiguation

Parsing and syntactic disambiguation are carried out largely using logical unification, vaguely in the manner of feature structure unification grammars in standard computational linguistics.  Constraints representing grammatical and lexical rules are represented as Atomic relationships.

The unique facets here are the probabilistic nature of the unification process, and the diversity of nonlinguistic knowledge that is used to guide the process via inference and association-spreading.

There must be control schema guiding the parsing process, e.g. telling it when to stop parsing and go on to the next sentence, in cases of syntactic ambiguity.

13.3     Semantic disambiguation and Anaphor resolution

These are carried out via a combination of inferential and associative methods.  Inference is used to try to reason out which of many possible meanings is intended.  If that fails, then the most closely-associated meaning is chosen.  As with parsing, control schema must guide this process.

13.4     Discourse (understanding the rhetorical structure of a document or speech)

This is handled mainly by reasoning, based on a combination of linguistic and nonlinguistic knowledge.

13.5     Sentence production (generation)

Sentence production is viewed as a kind of growth process in which more and more words progressively attach to a “thought”, obeying the rules of syntax as they accrete.

Inference is used to ensure grammaticality, based on the same grammar-embodying relationships that are unified during the parsing process.  Two complementary mappings between thoughts and sentences are involved here:

·        Transformation of nonlinguistic “thought forms” into sentences

·        “Semantic mapping” of sentences into nonlinguistic thought-forms

 

13.6      If you stripped away all the English names from all AI content and replaced them with randomly generated strings, so that the AI could no longer handle human-supplied problems and human-supplied data which required identification or invocation of concepts by English name, what capabilities would the AI still possess? 

 

This question (posed by Eliezer Yudkowsky) is intended to identify AI systems that are based on explicit encoding of human knowledge using human language.  It asks what happens if, in Novamente, one does the following process:

 For all WordNodes and WordInstanceNodes in the system, replace the ListLink of CharacterNodes that is connected to the WordNode or WordInstanceNode by a ConcatContainsLink, with a ListLink of a random sequence of CharacterNodes.  Do this in such a way that the WordInstanceNodes with MemberLinks to WordNode X all get the same random sequence of CharacterNodes.

The answer is:

 

1) The links to the random sequences will be forgotten in time, as they will prove useless; so will the ListLinks of random character sequences and the bogus WordNodes/WordInstanceNodes.

2) In time, through exposure to human language again, new WordNodes will form representing actual linguistic groundings for the system's concepts.

 

However, process 2 may take a while.  It is easier than first-language learning because a linguistically inspired concept structure has been left intact; it's basically like learning a second language that has no phonological or typographical similarity to one's first language.

 

14     Perception

 

14.1     How are perceptual gestalts recognized?

Via the creation of maps or new (map-encapsulating) nodes embodying the gestalts.   The different perceptual nodes forming the gestalt will interlink to each other, forming a “mind-attractor” or map with dynamical coherence.  This map may then be encapsulated in a single compound PredicateNode.

14.2     How are inputs from different sensory modalities merged and interrelated?

Inputs from different modalities may correspond to different perceptual node types, e.g. CharacterInstanceNode, PixelInstanceNode, SoundInstanceNode, etc.  However, these perceptual nodes are all interrelated by the same types of relationships, and they may all link to ConceptNodes, PredicateNodes, SchemaNodes, etc.  The merging is thus done by the generic cognitive mechanisms, using these relationships between modality-specific Atoms and generic cognitive Atoms.

14.3     How are sensations turned into perceptions?

Individual perceptual nodes (say, PixelInstanceNodes) may be thought of as “sensations”; it is the integration of an incoming node of this sort with other nodes in the system (via various relationship-building processes), that turns it into a full-fledged “perception”.

14.4     Given a huge set of perceptions, how does the system recognize potentially salient patterns in the set?

The same way it recognizes potentially salient patterns in actions or abstract thoughts: using a combination of mechanisms.  Inference and association-finding are pattern recognition methods, as is activation spreading (via map formation).  We also do explicit mining of repeated patterns in the Atomspace using a variant of the Apriori datamining method.
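
To convey the flavor of the Apriori-style mining mentioned (this is generic frequent-itemset mining over illustrative data, not the Novamente-specific miner), one pass over a set of “snapshots” of simultaneously important Atoms might look like:

# One Apriori-style pass: find pairs of Atoms that are frequently important together.
# Generic frequent-itemset mining for illustration only.

from itertools import combinations
from collections import Counter

snapshots = [
    {"cat", "fur", "purr"},
    {"cat", "fur"},
    {"dog", "fur", "bark"},
    {"cat", "purr"},
]

min_support = 2
item_counts = Counter(a for snap in snapshots for a in snap)
frequent_items = {a for a, c in item_counts.items() if c >= min_support}

pair_counts = Counter()
for snap in snapshots:
    for pair in combinations(sorted(snap & frequent_items), 2):
        pair_counts[pair] += 1

frequent_pairs = [p for p, c in pair_counts.items() if c >= min_support]
print(frequent_pairs)   # [('cat', 'fur'), ('cat', 'purr')]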

14.5     What needs to happen in order for the system to notice an implication between two perceived events?

Nodes representing the events must be formed, and then these nodes must be selected by the inference MindAgent, which can create an appropriate ImplicationLink between them.  If the nodes are important they will be selected rapidly by the inference MindAgent.

14.6     What needs to happen in order for the system to notice an implication between two perceptual features of an object?

The answer here is the same as for 14.5, with “feature” substituted for “event.”

 

14.7     How does the system track object-part hierarchies in perceptual data?

The “part-of” relationship is a subtle one, and is not embodied in the system as a primitive relationship type.   We believe that the notion of “part-of” is a complex predicate which takes different forms in different contexts.

Much of the system’s recognition of “part-of” relationships, however, is grounded in its primitive schema and predicates for manipulating and evaluating lists, e.g.

elementOf x L

elementAt x L 1

These relationships between lists and elements give a low-level means of representing part-whole relationships, out of which complex “part-of” concepts can be built in context-appropriate ways.

 

14.8     Can the system imagine perceptual data of the same kind as is produced by its sensory capacities?

Yes, it is possible for new perceptual nodes like PixelInstanceNodes to be created via cognition rather than via direct sensation of external-world data.  However, this will rarely be useful; more often it will suffice to create hypothetical relationships among hypothetical perceptual nodes such as PixelNodes, without going down to the “instance” level.

15     Action

15.1     Does the system originate internal actions?

Yes, the nodes called SchemaInstanceNodes contain small programs that carry out actions -- some internal, some external.  Complex actions are carried out by networks of interconnected SchemaInstanceNodes – “schema maps” or “distributed schema.”

 

15.2     How does the system choose between actions?

Schema execution is activation-driven: SchemaInstanceNodes getting sufficient activation are allowed to execute the programs they contain.  Complex distributed schema are then enacted as activation spreads among the SchemaInstanceNodes they contain.

 

15.3     How does the system learn real-time skills?

Real-time skill learning is a special case of general schema learning (procedure learning), which uses a combination of Novamente learning processes.   However, different processes are valuable on different time-scales; inference is good for incremental adaptational learning, whereas evolution is better at providing massive leaps in effectiveness over long periods of time. 

 

15.4     How does the system learn real-time reflective skills?

Reflection is carried out via cognitive schema, which are learned via the same basic mechanisms as perceptual or motor schema.

 

16     Memory

 

How are the following distinctions represented, in terms of both structures and dynamics?

 

16.1     Short-term vs. long-term memory

Long-term memory, in Novamente, is the default situation: it’s the Novamente Atomspace.

Short-term memory, roughly speaking, seems to correspond to the collection of maps that are active in the system at a given point in time.  

16.2     Declarative versus procedural memory

Declarative and procedural memory are stored in Novamente in an “interlocking” way, i.e. there are not separate memory tables for declarative and procedural knowledge.  However, they involve different types of Atoms: ConceptNodes and PredicateNodes in the declarative case, and SchemaNodes in the procedural case.

 

16.3     The modality-dependent “perceptual store” aspect of short-term memory, versus the “blackboard” aspect of short-term memory

The “blackboard” is simply the set of highest-importance Atoms (which may often include new Atoms created to embody new thoughts, images, etc.). 

The “perceptual store” is accomplished via the notion of interaction-channel-specific importance.  Items that have recently been perceived have high interaction-channel-specific importance.

16.4     Memory of perceptions versus memory of ideas

Perceptions and abstract ideas are represented by different node types, but the mechanisms for dealing with these node types are the same. 

 

16.5     What determines when an item is “forgotten” from the system’s memory

The Atom parameter “long-term importance” (LTI) governs forgetting from memory.  When an Atom’s LTI falls too low, the item is forgotten to make room for new knowledge.
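
A minimal sketch of such a forgetting policy, with the budget mechanism and names purely illustrative: when the Atom table outgrows its budget, the Atoms with the lowest long-term importance are dropped.

# Illustrative forgetting: when over budget, drop Atoms with the lowest long-term importance.

def forget(atom_table, max_atoms):
    """atom_table: dict mapping atom -> long-term importance."""
    if len(atom_table) <= max_atoms:
        return atom_table
    keep = sorted(atom_table, key=atom_table.get, reverse=True)[:max_atoms]
    return {a: atom_table[a] for a in keep}

atoms = {"cat": 0.9, "fleeting_percept_417": 0.01, "dog": 0.7, "noise_pixel": 0.02}
atoms = forget(atoms, max_atoms=2)    # keeps "cat" and "dog"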

 

16.6     Is there a mechanism for saving forgotten memories in long-term storage (e.g. on disk) and then retrieving them (and if so, by what retrieval mechanism?)

 

Yes.  Saving is easy, retrieving is the tricky part.  There is a mechanism of “proxy Atoms” which allows Atoms in memory to link to stubs that point to Atoms on disk.  When a stub has been activated enough, the corresponding Atom can be retrieved from disk (and the loading of this new Atom may lead to the loading of new stubs that were not previously pointed to by any active Atom). 

 

17     Specialized Cognitive Components

 

What primitive representations and/or dynamics are provided for dealing with

 

17.1     Time

Temporal relationships are dealt with via the TimeNode Atom type, and some associated built-in elementary schema, principally the atTime schema, which takes the form

atTime Atom TimeNode

There is also a special mechanism called the TimeSeriesRepository, which allows efficient storage and manipulation of series of events observed over time.

 

17.2     Space

In the current version of the system, there are no explicit mechanisms for the representation of spatial location.  In the ShapeWorld context, we have PixelNodes and PixelInstanceNodes, which indicate observations of color values at particular coordinates within the 2D “spatial” domain of the ShapeWorld viewing area. 

In future, it may be worthwhile to introduce explicit Atoms representing locations within the visual field of a particular sensor, or other such mechanisms.

 

17.3     Other minds

At this time, we have no specific representational or dynamic mechanisms in place to help Novamente understand other minds.   If we were to introduce such a thing, our first step would be simply to bias the cognitive MindAgents to look for analogies between external actions induced by others and external actions induced by itself.  This kind of analogical reasoning will cause the system to hypothesize other minds that have goals, feelings, etc. much as it does.  However, it’s quite plausible that such analogies will emerge spontaneously without any nudging.

Humans seem to have specialized neural mechanisms for carrying out various sorts of social inference, e.g. for detecting deception in social situations.  We are guessing that these won’t be necessary in Novamente, but are open to the possibility of introducing such things as the work unfolds.

17.4     Self-perception and self-analysis

All of Novamente’s cognitive mechanisms are, in a sense, instances of self-perception and self-analysis.  What Novamente MindAgents do, by and large, is to look at the Atoms existent in the system, analyze relations between them, and create new Atoms embodying these relations.

There are also FeelingNodes, which report overall statistics of the system state.  These may be reasoned on and generally incorporated into the system’s thinking.  And there are schema that allow the system to observe various aspects of itself, e.g. the number of Atoms of a given type.

Based on Feelings and observational schema, the system may construct one or more overall models of itself (in general or in different contexts).

 

 

18     Causality

 

How does the system represent a causal relationship between two:

 

18.1     Events or Event-categories

There is an explicit link type, CausalLink, which measures the degree of causality between two events or categories of events.  Its truth value is based on a combination of two factors: “predictive implication”, and the presence of a comprehensible causal mechanism.

This is not viewed as a universal explanation of causality, but simply as a basic tool that the system can use, along with other relationship types, to construct its own understanding of causal relationships as they occur in different situations.

18.2     Processes

CausalLinks between SchemaNodes are built, essentially, based on the principle that process A causes process B to the extent that events of the form “process A executed” are causally related to events of the form “process B executed”. 

In the case where a process is represented by a time series of results of the process’s execution, specialized mechanisms are used to estimate CausalLink strength efficiently.

18.3     Actions taken by agents

There is evidence that humans use special means to assess causality of actions taken by humans and other agents.  However, this doesn’t seem to be the sort of thing that should be “wired into” Novamente.   Rather, we believe this part of causal inference must emerge on the map level, rooted in the system’s intuitive understanding of agency and its basic mechanisms for assessing event and process causality.

 

19     Specialized Data Types

 

How does the system represent and manipulate:

 

19.1     Perceptual data, such as visual, acoustic, tactile, olfactory?

This data is represented as various nodes of the form XXInstanceNode, such as SoundInstanceNode, PixelInstanceNode, etc.  These may be linked to conceptual nodes such as PixelNodes and ConceptNodes, as well as linked to each other using logical links, ListLinks and other sorts of relationships.  They are manipulated just like other types of nodes are manipulated.

19.2     Software code?

A software program in a purely functional language has a particularly simple representation: it’s just a Novamente schema.

A software program in an imperative language is a different story.  In the case of a language like Java that lacks direct heap memory access, it can be translated into purely functional form (as is done, in essence, inside the Java supercompiler) with only a moderate amount of difficulty.

In general, a software program can be represented as an expression in a formal grammar (the grammar embodies the programming language), where the formal grammar is described by a set of implication relationships.  The semantics of the program is then represented by another set of implication relationships.

19.3     Quantitative data tables?

These can be represented using logical relationships, or more compactly using the NumericSeriesRepository.  These two representations can be freely converted between based on the system’s dynamic needs.

19.4     Database records?

These can be directly translated into logical links, but the best way of doing so is dependent upon the structure of the database in question.  Initially, a different translation script must be written for each database; in the long run, this process may be automated, but such automation will require a system with a fairly high degree of general intelligence.

19.5     Text files?

A text itself may be represented as a ListLink of CharacterInstanceNodes, or a ListLink of ListLinks of CharacterNodes (the ListLinks of CharacterNodes representing words), etc.

The file may be represented using a FileNode, as produced e.g. by perception from the FileWorld interface.

 

19.6     Knowledge records from repositories such as the Cyc KB or the WordNet lexical KB?

These can be directly translated into logical links, if one so desires.

 

19.7     Mathematical knowledge

Mathematical facts can be directly translated into logical links, with maximal strength and weight-of-evidence.

 

20     Concept Formation

 

20.1     What is the internal structure of a “concept”?

A concept is a set of relationships among percepts, actions and concepts.

 

20.2     In what sort of external relationships is a “concept” involved?

·        Logical and associative relationships

·        Relationships with predicates (indicating if the concept satisfies a predicate)

·        Relationships with schema (indicating what happens when the concept is the argument of a schema, or what sorts of arguments cause the schema to give the concept as an output).

 

20.3     By what methods are new concepts formed?

·        Higher-order inference rules

·        Mutation and combination operators

·        Random generation

 

20.4     What factors determine the choice of method for concept-formation in a given context?

There are parameters determining the frequency of each concept-formation method in each CIM-Unit.  There may also be cognitive schema that enact concept-formation processes, or that adapt the relevant system parameters based on context.
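
For illustration, one may picture each CIM-Unit holding a small table of frequencies and choosing a concept-formation method stochastically according to it; the method names, frequencies and selection scheme below are our own assumptions for the sketch:

    import random

    # Illustrative sketch: per-CIM-Unit frequencies for concept-formation methods.
    # The method names and numbers are assumptions, not actual Novamente settings.
    concept_formation_frequencies = {
        "higher_order_inference":   0.5,
        "mutation_and_combination": 0.4,
        "random_generation":        0.1,
    }

    def choose_concept_formation_method(freqs):
        """Pick a method with probability proportional to its configured frequency."""
        methods = list(freqs.keys())
        weights = list(freqs.values())
        return random.choices(methods, weights=weights, k=1)[0]

    print(choose_concept_formation_method(concept_formation_frequencies))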

 

 

21     Integration

 

21.1     How does cognition guide perception?

Through links from ConceptNodes to perceptual nodes (and processes that exploit these links), and through links between cognitive schema and perceptual schema.

 

21.2     How does action guide perception?

Via links between

1.      schema carrying out actions, and

2.      schema carrying out perception, and Atoms representing perceived entities

 

21.3     How do cognition, perception and action guide the process of storing an item in memory, or retrieving an item from memory?

Memory storage and retrieval have to do with importance updating, and with the creation of relationships binding remembered items to other remembered items.   Cognition, perception and action are all involved in relationship-building, and also indirectly affect importance updating.

In the “attentional focus”, cognition, perception and action often work together to create a coherent whole, a “map”, which is then “remembered” as a mind-attractor pattern emerging among many connections in the Atomspace.

 

21.4     How do concepts relate to “mental imagery”?

A mental image is a collection of perceptual nodes that are evoked by a collection of conceptual nodes.  This is the reverse of the direction of causality in ordinary perception, where percepts evoke concepts.

 

 

22     High-Level Structures

 

22.1     What high-level structures exist in the system?

At the highest level, there are patterns of organization that we consider essential:

·        The “dual network”, a superposition of hierarchical and heterarchical connection patterns among Atoms and maps

·        The self, a subnetwork that approximately mirrors the structure of the entire network

·        There may also be subdivisions of the dual network, corresponding to particular perceptual channels, particular contexts, etc.

 

22.2     Which of these structures are explicit in the codebase and which are expected to emerge dynamically?

No high-level structures are programmed in; however, parameters of the system are explicitly tuned to encourage these structures to emerge dynamically.

 

22.3     What are the primary categories of dynamics in the system?

·        Probabilistic logical inference

·        Association formation via activation spreading

·        Clustering

·        Mutation and crossover of complex schema and predicates

·        Encapsulation of maps

·        Attention allocation via activation spreading and related formulas (the Importance Updating Function), which leads to map formation

·        Activation-driven schema execution

 

22.4     Which of these dynamics are explicit in the codebase and which are expected to emerge?

Atom-level dynamics are programmed into the system, and are expected to give rise to corresponding map-level dynamics.  This emergence of corresponding map-level dynamics requires the Importance Updating Function, activation spreading, and associated dynamics to be in a cooperative, emergence-friendly regime.

 

22.5     Does the system involve a perceptual hierarchy?  A motor hierarchy?  Are these hierarchies interconnected?  How are they constructed?  How do the parts of the hierarchy corresponding to different sensory modality or motor components interact?

These hierarchies are a part of the overall “dual network” structure mentioned above.  They are not “wired into” the system in any way, but they are expected to emerge spontaneously via the system’s AI processes.  Various system parameters encourage hierarchicality; most notably, schema and predicate learning are restricted to the explicit learning of relatively small schema and predicates, so that large schema and predicates must be constructed by piecing together smaller ones in a hierarchical manner.

Hierarchies take the form of hierarchically constructed schema, hierarchically constructed concept maps, and logical links that form a hierarchy of relationships.  These various hierarchies all overlap and synergize.

Perceptual and motor subhierarchies corresponding to particular sensory or action modalities are expected to arise in the same way, as overlapping regions of the dual network that interact through the concept maps and schema they share.

 

22.6     How does the system embody functional specialization, e.g. different “cortices” or “units” devoted to different functions?

 

One early analyst of the Novamente design asked us to clarify “How to build a visual cortex with Novamente atoms, or what a olfactory subsystem would look like. Granted these are strained examples given the proposed environment for Novamente, but the point still stands. Presumably Novamente-AI will need a "FileWorld" cortex of some kind.”

My answer was: “A fileworld cortex starts out as a CIMUnit (a localized collection of Atoms and MindAgents, which may span many machines) that has a certain mix of Atom types in it, and a certain mix of MindAgents (rather, a certain set of importance levels for MindAgents).   It then grows into a fileworld cortex by learning appropriate schema and concepts....   This is a matter of what we call Novamente configuration.”
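
The kind of “Novamente configuration” alluded to in this answer may be pictured, very roughly, as a declarative specification of a CIM-Unit’s Atom-type mix and MindAgent importance levels.  The dictionary below is an invented illustration; the MindAgent names, proportions and importance values are assumptions, and this is not an actual configuration-file format:

    # Illustrative sketch of a CIM-Unit configuration for a "FileWorld cortex".
    # Atom-type proportions and MindAgent importance levels are invented values,
    # shown only to convey the idea of configuration-driven specialization.
    fileworld_cortex_config = {
        "atom_type_mix": {
            "CharacterInstanceNode": 0.40,
            "FileNode":              0.10,
            "ConceptNode":           0.30,
            "SchemaNode":            0.20,
        },
        "mindagent_importances": {
            "PerceptionMindAgent":       0.9,
            "LogicalInferenceMindAgent": 0.5,
            "SchemaLearningMindAgent":   0.4,
            "ConceptFormationMindAgent": 0.3,
        },
    }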

 

23     Belief Systems

 

23.1     How is a “system of beliefs” represented in the system?

Beliefs are simply relationships (links).  A system of beliefs is a set of intersupporting relationships, i.e. a set of relationships such that, if one removed a small subset, the remainder would be closely reconstructed by the system’s learning processes.   Generally, belief systems will correspond to maps, often very large and weak ones (with the property that when one element of the system is strongly activated, the other elements become weakly activated).

 

23.2     What mechanisms does a belief system use to preserve itself over time?

Inference (via the self-regenerative property of belief systems) and activation spreading (via the generic processes of map formation and map maintenance).

 

24     Mind and Reality

 

24.1     How is the system’s “model of reality” defined?

Its model of reality is simply the subset of relationships that pertain directly to the external world (i.e. that are relationships among externally-grounded perception and action nodes, relationships among relationships among externally-grounded perception and action nodes, etc.)

 

24.2     How is this model constructed?

Through the combination of learning processes that comprises generic Novamente cognition.

24.3     What is the relation between its model of self and its model of reality?

The two will be built up cooperatively and synergetically. 

 

 

25     Creativity

 

25.1     How are fundamentally new ideas, not directly derived from existing ideas in the system, created?

There are only two sources for new ideas in Novamente: combination or mutation of existing ideas in the system, and random creation of new ideas from scratch.   Purely random creation of new ideas plays a small role, but randomness plays a large role in the combination and mutation processes.

25.2     Is there a special component of the system, special structures, or special dynamics devoted to creativity?

We believe it may be useful to create semi-isolated CIM-Units (pools of nodes and links) which have parameter values specifically set to encourage “wild idea formation.”  The basic learning processes and representational mechanisms will be the same here as in the rest of the system, but the parameters will be set less conservatively.

25.3     If creative thinking corresponds to a particular aspect or property of the system, which one (or ones)?

In Novamente, creative thinking involves the same processes as generic learning and routine concept creation.  The difference is that in a creative thinking mode or subunit, concept creation methods are set to create concepts deviating further from existing concepts, inference rules are set to produce more speculative inferences, association spreading is set to favor looser, more remote associations, etc.
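
To make this concrete, one can imagine a “creative” parameter profile obtained by loosening a routine profile; the parameter names, values and loosening factor below are illustrative assumptions, not actual Novamente parameters:

    # Illustrative sketch: deriving a "creative mode" parameter profile from a
    # routine one by loosening selected parameters.  Names and values are assumed.
    routine_profile = {
        "concept_mutation_rate":     0.05,  # how far new concepts stray from old ones
        "inference_speculativeness": 0.20,  # willingness to draw weakly supported conclusions
        "association_spread_radius": 2.0,   # how remote spread associations may be
    }

    def creative_profile(profile, looseness=3.0):
        return {name: value * looseness for name, value in profile.items()}

    print(creative_profile(routine_profile))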

 

26     Personality

 

26.1     Will different instances of the system have different “personalities”?

Definitely so – the high-level patterns of action and response will be different for different Novamentes, based on the patterns that self-organize within each one, particularly the different cognitive schemata.

 

26.2     Might there be multiple subpersonalities, in some sense – if so, in what sense? How might different personalities be grounded in different aspects of the system’s structure and dynamics?

Subpersonalities would correspond to very large, fairly weak maps, each one involving a host of cognitive schema as well as particular nodes and relationships embodying declarative knowledge.  

If a system is involved in sharply different external-world contexts, one could expect subpersonalities to form corresponding to the different contexts.  (E.g., chatting on the Web with random users, versus carrying out biological data analysis in conjunction with scientists.)

26.3     In what ways is the system’s understanding of itself affected by its interactions with others?

Others may lead it to carry out actions, among which it recognizes patterns, hence learning about itself.

And, by observing others similar to itself, it learns generalities about how entities live in the world, which it may then apply to itself.

 

26.4     How is morality expressed in the system?

In two ways:

·        Explicitly as part of the system’s goal structure (it may have a goal of causing others to be happy, not causing others to be unhappy, etc.)

·        Implicitly as habits of behavior

 

26.5     To what extent is morality learned versus in-built?  What is the relation between learned and in-built morality in the system?

Essentially, morality is to be taught, not built in.  There is no way to build in a moral rule like “don’t hurt others” because there is no built-in notion of “hurt.”   By interacting with the system and rewarding it when it does good things, its teachers will lead it to internalize general rules regarding what sorts of things are good to do or not….

What is built in is a goal framework that allows the system to learn general moral rules.

 

27     Evolution

 

27.1     Are there elements of the system that “evolve” in the sense of surviving with probability roughly proportional to “fitness” in some sense?  If so, what are they?

Atoms and maps all survive differentially with respect to fitness, where fitness is defined roughly as “usefulness in creating new information, and relatedness to other Atoms” (these criteria are embedded in the Importance Updating Function, or IUF).

GoalNodes cause schema to survive differentially with respect to fitness, where fitness is defined by the degree to which the activation of a schema causes achievement of a goal.

 These two aspects of evolution interact significantly.

27.2     Is this evolution ecological, in the sense that the fitness of one evolving entity depends on the states of other entities in the system?  If so, specify details.

Yes: the fitness functions mentioned above have to do with “relatedness to other Atoms in the system” and “helping with the satisfaction of GoalNodes.”  Thus nearly all evolution is ecological in nature, though there is also an “intrinsic” aspect to entities’ fitness.

 

27.3     How does this evolutionary process relate to standard models of evolution by natural selection, such as GA/GP, biological population genetics, mathematical ecology, etc.?

Some Atoms cross over and mutate in a manner very similar to standard GA/GP.   Other Atoms’ “evolution” involves different “reproductive operators,” e.g. logical inference rules.  

Some evolution is implicit, where the “selection mechanism” is the Importance Updating Function, and the “reproductive mechanisms” are the system’s cognitive processes as a whole.   On the other hand, there is also explicit GP-style evolution, in which schema are evolved using a GA with the fitness function being estimated degree of satisfaction of a GoalNode.
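
The explicit, GP-style side of this can be pictured as a conventional evolutionary loop whose fitness function is the estimated degree of satisfaction of a GoalNode.  Everything in the sketch below (the treatment of schemata as opaque objects, the placeholder crossover and mutation operators, the elitist selection scheme) is an assumption made for illustration, not Novamente’s actual evolutionary machinery:

    import random

    # Illustrative sketch of explicit GP-style schema evolution.  The caller
    # supplies estimate_goal_satisfaction (the fitness function), plus crossover
    # and mutate operators; all of these are placeholders, not Novamente code.
    def evolve_schemata(population, estimate_goal_satisfaction,
                        crossover, mutate, generations=50, elite_fraction=0.2):
        for _ in range(generations):
            ranked = sorted(population, key=estimate_goal_satisfaction, reverse=True)
            survivors = ranked[:max(2, int(elite_fraction * len(ranked)))]
            children = []
            while len(survivors) + len(children) < len(population):
                parent_a, parent_b = random.sample(survivors, 2)
                children.append(mutate(crossover(parent_a, parent_b)))
            population = survivors + children
        return max(population, key=estimate_goal_satisfaction)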

 

 

28     Computation Theory

 

28.1     Does the system rely heavily on (quasi-)stochastic processes (e.g. random number generation)?  For what purposes?

Very heavily.  Creation of new concepts is driven by stochastic methods, and most cognitive processes work by stochastically gathering a set of Atoms to intercombine.
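
A typical pattern, sketched below with invented names, is importance-biased random selection of a handful of Atoms for a cognitive process to act on; the sampling scheme shown (with replacement, for simplicity) is illustrative rather than a description of the actual code:

    import random

    # Illustrative sketch: stochastically gathering Atoms to intercombine,
    # biased by their importance values.  `atoms` maps atom name -> importance.
    # Sampling is done with replacement purely for simplicity.
    def gather_atoms(atoms, count=5):
        names = list(atoms.keys())
        weights = [atoms[name] for name in names]
        return random.choices(names, weights=weights, k=count)

    print(gather_atoms({"cat": 0.9, "dog": 0.7, "qualia": 0.1, "teapot": 0.05}))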

 

28.2     What are the biggest general computational-complexity issues you’ve faced in building the system (or that you envision you’ll face)?

The notion of computational complexity has not proved very useful in Novamente work, because what we are concerned with is the average-case time and space complexity of algorithms given problem instances satisfying particular statistical properties characteristic of the real world.  In many cases this is different from average-case complexity given random problem instances.  Because of this, there is basically no theoretical computational-complexity work that is directly useful for Novamente.

Of course, this same problem plagues many real-world software systems, but it’s worse for Novamente because so many of Novamente’s algorithms are not only intractable in the worst case, but intractable in the random average case – yet tractable in most real-world situations….

 

28.3     What aspects of the system’s structure/dynamics are the most difficult from a computational-complexity perspective?

Schema learning, and unification of complex predicates, are the most computationally intractable problems we’ve encountered so far.   Of course, nearly all the problems faced by Novamente are intractable in the worst case, but these two are tough even in the average case. 

29     Implementation

 

29.1     What kind of hardware does the system require?

It is designed to run on a cluster of SMP computers, each embodying the standard von Neumann architecture.  To run it on a different sort of architecture would require a lot of low-level changes, though no significant changes to the fundamental AI underlying the system. 

Webmind 2000 was written so as to be easily portable to massively parallel hardware, but this resulted in inefficiency on standard hardware platforms; Novamente does not boast this sort of cross-architectural portability.

29.2     If it is intended for distributed implementation, how are knowledge and processes to be distributed among machines?

There are two levels of structure in a distributed Novamente.  First, there is a division into CIM-Units, each of which represents a “functionally specialized subunit” and may span many machines.  Second, within each CIM-Unit, Atoms are distributed across the machines in its cluster; we have developed special algorithms for dynamically moving Atoms among these machines so as to approximate maximum efficiency.

29.3     How much RAM and processing power do you think will be necessary to achieve roughly human-level intelligence using the system?

At this stage, any estimate involves a good deal of guesswork.  Our current guess is a few hundred GB of RAM, serviced by roughly one 1 GHz processor per GB of RAM.

29.4     Does the system involve the “scheduling” of different processes?  If so how is this accomplished?

There are about a dozen mental processes; each one has an “importance” (a different quantity from Atom importance), which determines the amount of CPU time that it gets.   A table is kept defining processes that cannot run simultaneously in the same CIM-Unit (acting on the same Atomspace) without causing conflict; processes that conflict are not run concurrently.
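
In rough outline only, the idea can be sketched as follows; the process names, the conflict-table contents and the greedy, importance-ordered admission scheme are assumptions for illustration, not the actual scheduler:

    # Illustrative sketch: importance-weighted, conflict-aware scheduling of
    # mental processes within one CIM-Unit.  Conflicting processes are never
    # admitted to the same round; CPU share within a round is proportional
    # to process importance.  All names and values are assumed.
    conflicts = {("SchemaExecution", "SchemaLearning")}   # assumed conflict table

    def conflicting(a, b):
        return (a, b) in conflicts or (b, a) in conflicts

    def schedule_round(process_importances):
        """Greedily admit processes by importance, skipping any that conflict."""
        admitted = []
        for proc, _importance in sorted(process_importances.items(),
                                        key=lambda kv: kv[1], reverse=True):
            if all(not conflicting(proc, other) for other in admitted):
                admitted.append(proc)
        total = sum(process_importances[p] for p in admitted)
        return {p: process_importances[p] / total for p in admitted}   # CPU shares

    print(schedule_round({"Inference": 0.5, "SchemaExecution": 0.3, "SchemaLearning": 0.2}))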

29.5     Is an Application Programming Interface (API) provided that helps integrate foreign systems?

Interaction between Novamente and other systems is most conveniently done via the nmshell, a specialized Unix shell that allows messages to be sent to and from Novamente in a simple shell-script-style syntax. 

Integration of new mental processes into Novamente is a subtler matter.  It is easy to create new Atom types and new MindAgents (objects representing mental processes); the code is object-oriented and the design is clean.  However, the conceptual integration of new mental processes and Atom types with existing ones is a very subtle matter.  Software issues pale here, compared to issues of semantic consistency and dynamic stability.  The collection of Novamente MindAgents is carefully chosen with emergent functionality in mind, and great care must be exercised in expanding or modifying it.

29.6     Is the implementation open source – entirely or in part or none?

Currently the system is not open-source.  We are considering making the core system open-source, or even making the whole codebase open-source.  There are many issues involved here – not just intellectual property issues, but issues such as finding a small team willing and able to work full-time managing an open-source effort (or finding an appropriate set of Novamente-savvy volunteers).   Productively developing this sort of system in an open-source manner is a very different kettle of fish, as compared to what one has with something like an OS, a Web browser, etc.  Novamente is not a system into which new features can be plugged by relative amateurs; it’s a system in which most substantial modifications and additions must be made in the context of a deep understanding of the whole.

30     Parameters

 

The questions in this section are quantitative in nature, and the answers given are rough estimates.

 

30.1     How many non-fixed parameters does the system have?

Around 100.  The exact number changes as implementation details are experimented with.

 

30.2     How many of these can be optimized once for a system’s whole lifetime and how many have to be tuned adaptively?

All of these can be tuned adaptively, but about 80% of them only need to be tuned very infrequently.

 

30.3     How many of these have to do with the internal performance of some relatively localized system component, versus having to do with the interactions between components?

About 80% have to do with individual components.

30.4     What mechanisms are used for automatic optimization of parameters, either offline or adaptively?

Genetic algorithm based optimization is used to study parameter space offline, resulting in long-term improvements in parameter settings.

There are also simple adaptive mechanisms in place for some parameters, which increase or decrease one parameter when another parameter exits its acceptable range.
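
A minimal sketch of this kind of coupled adjustment, with invented parameter names, range and step size:

    # Illustrative sketch: nudge a control parameter when a monitored parameter
    # leaves its acceptable range.  The names, range and step size are assumed.
    def adapt(control_value, monitored_value, acceptable=(0.2, 0.8), step=0.05):
        low, high = acceptable
        if monitored_value > high:
            return control_value - step    # damp the system
        if monitored_value < low:
            return control_value + step    # stimulate the system
        return control_value               # within range: leave it alone

    # e.g. if average Atom importance drifts too high, lower an assumed
    # activation-spreading rate:
    print(adapt(0.30, monitored_value=0.85))   # 0.25 under these assumed settings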

 

31     Experiential Learning

 

31.1     What kind of environment is the system intended to operate in?  (Physical or virtual, and what kind within these categories)

The cognitive, perception and action mechanisms are basically environment-independent.  However, the current system is tuned for textual, quantitative and database-record inputs.  Enabling it to deal effectively with, e.g., 3D visual inputs and robot-arm actuators would require the introduction of appropriate new perceptual node types, and might also require the addition of specialized structures similar to the TimeSeriesRepository, synchronized with the Atomspace but providing efficient access to various modality-specific operations.

31.2     Does the system have a reflective "sensorimotor" environment in which it can learn skills and perceptual categories?

For early training, we have created artificial environments we call ShapeWorld and FileWorld.   These enable simple perception, cognition, action and communication in the contexts of (respectively) 2D shapes and generic data files.  How far we’ll be able to go with these simple environments remains to be seen.  The Internet provides a rich source of data and a rich variety of interactional modes.  However, it may prove useful to connect Novamente directly to robot-style sensors and actuators at some point in the future.

31.3     What kind of instructional guidance will it be given, and by whom?

The idea is for humans to teach the system how to do things – first simple things in simulated environments, then more realistic and useful tasks in online or physical environments.  This teaching may take place by example, by reinforcement, and by explicit instruction.

Explicit teaching of facts may also play a role, but is viewed as secondary, and will take place mainly in the context of “teaching how to.”

 

31.4     Will there be a community of AI’s built according to the design, which learn from each other?

Yes.  We cannot be certain, but we believe it likely that interaction with others of like kind will be an extremely valuable guide to self-formation in young Novamentes.   To understand oneself, it is useful to observe others who are similar to oneself.  This is not the only route to self-understanding – especially for an AI, which can observe its own brain and mind processes far more thoroughly than a human can – but it is one route.

 



[1] Deliberative General Intelligence

[2] The inclusion of some questions by Novamente outsiders has resulted in some questions that may seem at first not quite apropos to Novamente.  However, these are questions that seemed natural to these AI experts, and so they seem worth addressing here.