
Interacting robot agents

Synthesising the origins of language and meaning using co-evolution, self-organisation and level formation.

Luc Steels reports on experiments in which robotic agents and software agents are set up to originate language and meaning. The experiments test the hypothesis that mechanisms for generating complexity commonly found in biosystems, in particular self-organisation, co-evolution, and level formation, may also explain the spontaneous formation, adaptation, and growth in complexity of language.

Luc Steels [1] is director of the Artificial Intelligence Laboratory [2] and Professor of Computer Sciences and Artificial Intelligence at the Vrije Universiteit Brussel.

Introduction

A good way to test a model of a particular phenomenon is to build simulations or artificial systems that exhibit the same or similar phenomena to those one is trying to model. This methodology can also be applied to the problem of the origins of language and meaning. Concretely, experiments with robotic agents and software agents could be set up to test whether certain hypothesised mechanisms indeed lead to the formation of language and the creation of new meaning.

By a language, I will mean an adaptive system of representation used by distributed agents for communication (and possibly other things). As a communication system, language allows the transmission of meanings from one agent to another using some physical medium such as sound. The agents are distributed in the sense that there is no central controlling agency that defines and imposes a language. Each agent can only gain knowledge of others by interaction. A language is adaptive when it expands or changes in order to cope with new meanings that have to be expressed. Moreover, new agents should be allowed to enter the group and existing agents may leave.

Meaning is here equated with a distinction relevant to the agent. Some meanings, like colors, are perceptually grounded. Others, such as hierarchical relations, are socially grounded. Still others, like intentions or descriptions of current actions, are behaviorally grounded. Meanings may arise in any kind of domain and task setting, and need not necessarily be expressible in the language. The set of meanings that humans have access to is ever expanding as new environments are entered and new interactions take place. This should also be the case for artificial agents which operate in open, dynamically changing environments.

My main hypothesis is that language is an emergent phenomenon. Language is emergent in two ways. First of all, it is a mass phenomenon actualised by the different agents interacting with each other. No single individual has a complete view of the language nor does anyone control the language. In this sense, language is like a cloud of birds which attains and keeps its coherence based on individual rules enacted by each bird. Second, language is emergent in the sense that it spontaneously forms itself once the appropriate physiological, psychological and social conditions are satisfied. The main puzzle to be solved is how.

The origins of complexity are currently being studied in many different areas of science, ranging from chemistry [3] to biology [4]. The general study of complex systems, which started in earnest in the sixties with the study of dissipative systems [5], synergetics [6], and chaos [7], is trying to identify general mechanisms that give rise to complexity. These mechanisms include evolution, co-evolution, self-organisation and level formation.

This paper does not focus on the origin of cooperation or the origin of communication in itself, although these are obviously prerequisites for language. These topics are being investigated by other researchers, using a similar biological point of view. For example, Dawkins [8] has argued that two organisms will cooperate if they share enough of the same genes, because what counts is the further propagation of these genes, not the survival of the individual organism. Axelrod [9], Lindgren [10], and others have shown that cooperation will arise even if every agent is entirely selfish. McLennan [11] and Werner and Dyer [12] have experimentally shown that communication arises as a side effect of cooperation if it is beneficial for cooperation.

The emergent communication systems discussed in these papers do not constitute a language in the normal definition of the word, however. The number of agents is small and fixed. The repertoire of symbols is small and fixed. None of the other properties of a natural language, such as multiple levels, synonymy, ambiguity, or syntax, are observed. The main target of the research surveyed in this paper is to study the origins of communication systems that do have all these properties.

Before starting, an important disclaimer must be made. This work does not make any empirical claim that the proposed mechanisms are an explanation of how language actually originated in humans. Such investigations must be (and are) carried out by neurobiologists, anthropologists, and linguists studying historical evolution, child language, or creolisation.

Here, I only propose and examine a theoretical possibility. If this possibility can be shown to lead to the formation of language and meaning in autonomous distributed artificial agents, then it is at least coherent and plausible. Thus, if meaning creation mechanisms enable agents to autonomously construct and ground meaning in perception, action, and interaction, then it is no longer self-evident that meaning has to be universal and innate [13]. And if the proposed language formation mechanisms enable artificial agents to create their own language, then it is no longer self-evident that linguistic knowledge must for the most part be universal and innate [14] or that language can only be explained by genetic mutation and selection [15].

Evolutionary language games

Since Darwin, evolution by natural selection has played a key role in attempts to explain the origin and diversity of species. Evolution takes place when there is a source for generating variety and a selective force that retains those variations that are most adapted. Evolution is not restricted to genetic evolution, however. The only requirements are a mechanism of information preservation (i.e. a representation) over which transformations can take place to create variation, and a feedback loop between the occurrence of a specific representation and selective 'success'. In the case of genetic evolution, the information is preserved in the genes. Transformations take place by operations like mutation and crossover. The feedback loop is established through the reproductive success of the organism expressed by a particular set of genes. This success will depend as much on the environment in which the organism finds itself as on the genes.

I propose that evolution is a major force in the origin of language as well. However, I do not mean evolution in the genetic sense. Rather, I see an analogy between species and language and between genes and language rules: A language is the consequence of the distributed behavior of a large group of agents. The behavior of each agent is based on a set of linguistic rules. How these rules are encoded is not essential at this point. Variation in a language results when individuals construct new rules or change their existing rules, just like variation in a species results when there are mutations in individual genes.

Genetic selection is based on reproductive success. Linguistic selection can be based on criteria which depend on the linguistic and communicative effect of a rule. For example, the selectionist criteria for a new phoneme or phoneme combination are: ease of production and reproduction for the speaker, which requires that the articulators can reach the desired goal states, possibly with minimal energy; and ease of understanding for the hearer, which requires that there is enough information in the signal to reliably detect the sound and distinguish it from others in the repertoire. Structures and rules at other linguistic levels have analogous criteria. The whole system bootstraps itself when variation is not only created by mutation of existing rules but also by the creation of new rules from scratch. For example, a new gestural phonetic score could be created by assembling a random sequence of articulatory goals, a new word could be created by a random combination of phoneme segments, etc.

Evolution is often studied in terms of games [16]. A game is an interaction between two agents or between an agent and an environment. A game has a certain outcome and changes may occur as a side effect of the outcome. For language games, the changes are in terms of the linguistic rules that the participating agent(s) employ. For example, if a certain association between a word and a meaning is not effective, then the agent should potentially review whether this association is to be part of his lexicon. If a certain phoneme cannot be recognised reliably by the hearer, the producing agent should reconsider whether it is to be part of his phonology.

Many different games can be defined: There are games between the agent and the world that result in the creation of meaning, language games between agents that will lead to the buildup of a lexicon, imitation games that lead to a shared repertoire of sounds (phonology), games that exercise and extend syntax, games that explore and evolve possible interactions (speech acts). The following subsections give concrete examples for some of these games. Because this is a survey paper, the discussion of each example will be brief, but more detail can be found in the cited papers.

The term language game will be used for a complete interaction in which all linguistic levels are implied as well as a context and social relations between speaker and hearer. This use of the term is compatible with Wittgenstein's concept of a language game [17], although he did not consider language games to have an evolutionary dimension.

Meaning creation through discrimination games

In our Brussels laboratory we have built a variety of robots that execute dynamical behaviors in a robotic ecosystem. These robots, which are the ultimate hardware platform for our language experiments, have a wealth of data channels resulting from internal or external sensors, internal motivational states, actuator states, etc. Perceptually grounded meaning creation operates over these data channels.

One way in which meaning creation can take place is through discrimination games: The agent attempts to distinguish two objects or situations using a repertoire of feature detectors which transform the continuous values of the real world, as experienced via these data channels, into discrete features. For example, a value from a continuously varying scale for left visible light is divided into regions, which could be lexicalised in English as strong, medium, and weak. When a particular value falls within one of these regions a discrete feature is output by the feature detector. When the game fails, i.e. when an object cannot be distinguished from a set of other objects, the agent extends its repertoire, either by refining existing feature detectors or by creating new ones for unexplored sensory channels.
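The following minimal sketch illustrates the flavour of such a game in Python. It is purely illustrative: the channel names, the interval representation of feature detectors, and the random refinement policy are invented for the example and are not the implementation used on the actual robots.

```python
import random

class DiscriminationAgent:
    """Minimal sketch: a feature detector is a (channel, low, high) interval
    over a sensory channel; an object is a dict of channel values in [0, 1)."""

    def __init__(self, channels):
        self.channels = list(channels)   # names of available data channels
        self.detectors = []              # list of (channel, low, high) intervals

    def features(self, obj):
        """Indices of the detectors that fire for this object."""
        return {i for i, (ch, lo, hi) in enumerate(self.detectors)
                if ch in obj and lo <= obj[ch] < hi}

    def play(self, topic, context):
        """One discrimination game: succeed if some feature of the topic is
        shared by no other object in the context; otherwise grow the repertoire."""
        topic_feats = self.features(topic)
        other_feats = set().union(*(self.features(o) for o in context if o is not topic))
        if topic_feats - other_feats:
            return True
        if self.detectors and random.random() < 0.5:
            # Refine: split a randomly chosen detector's interval in two.
            i = random.randrange(len(self.detectors))
            ch, lo, hi = self.detectors[i]
            mid = (lo + hi) / 2.0
            self.detectors[i] = (ch, lo, mid)
            self.detectors.append((ch, mid, hi))
        else:
            # Explore: add a coarse detector covering a whole (possibly new) channel.
            self.detectors.append((random.choice(self.channels), 0.0, 1.0))
        return False

agent = DiscriminationAgent(["left-light", "right-light", "temperature"])
context = [{"left-light": 0.9, "temperature": 0.2},
           {"left-light": 0.3, "temperature": 0.2}]
for _ in range(50):
    agent.play(context[0], context)          # repeated failures grow the repertoire
print(agent.play(context[0], context))       # almost always True by now
```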

As shown in another paper [18], this indeed leads to the buildup of a repertoire of features adequate for distinguishing objects. Moreover, the repertoire is adaptive. When new objects are to be considered or new data channels become available, the feature repertoire will expand if necessary.

Many kinds of meaning creation games can be imagined. For example, an agent can attempt to classify an object against a list of classes and extend the feature repertoire or change the definitions of classes in order to be successful. An agent can attempt to predict aspects of a situation by formulating some features and deducing other features from them. Success in a game equals predictive success and failure leads to the construction of more refined features or the revision of prediction rules.

Lexicons through language games

We now turn to the lexicon. As discussed in other papers [19], evolutionary language games can be defined that lead to the formation of a lexicon. In each game, there is a speaker and a hearer and a set of objects making up a context. The speaker identifies one object (the topic), for example by pointing. Both then find a feature set discriminating the topic with respect to the other objects in the context by playing discrimination games.

The speaker attempts to code this feature set into language, using a lexicon that associates words with meanings. The hearer decodes the resulting expression using his own lexicon, and the game succeeds if the distinctive feature set decoded by the hearer matches the expected one. When the game fails, the speaker or the hearer adapt their lexicons. For example, if the speaker does not yet have a word for the distinctive feature set that he wants to express, he may create a new word and add a new association to his lexicon; if the feature set decoded by the hearer is more general than the one expected, then the hearer can refine his associations between words and meanings, etc. There is additional complexity because one word may have multiple meanings and one meaning may be expressed by multiple (competing) words.
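A minimal sketch of such a game is given below; the word forms, scoring constants and update rules are invented for illustration and are not taken from the cited experiments. The meaning argument stands in for the distinctive feature set, and the sketch assumes both agents already agree on the topic (for example through pointing); synonymy and ambiguity are handled only crudely, through scored word-meaning associations.

```python
import random
import string

def new_word(length=4):
    """Invent a random word form; real agents would combine phoneme segments."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))

class LexiconAgent:
    """Minimal sketch of a lexicon agent: the lexicon maps (word, meaning)
    pairs to a score between 0 and 1."""

    def __init__(self):
        self.lexicon = {}                      # (word, meaning) -> score

    def encode(self, meaning):
        """Pick the best-scoring word for a meaning, inventing one if needed."""
        candidates = [(w, s) for (w, m), s in self.lexicon.items() if m == meaning]
        if not candidates:
            word = new_word()
            self.lexicon[(word, meaning)] = 0.5
            return word
        return max(candidates, key=lambda ws: ws[1])[0]

    def decode(self, word):
        """Return the best-scoring meaning for a word, or None if unknown."""
        candidates = [(m, s) for (w, m), s in self.lexicon.items() if w == word]
        return max(candidates, key=lambda ms: ms[1])[0] if candidates else None

    def update(self, word, meaning, success):
        """Reinforce or weaken a word-meaning association after a game."""
        score = self.lexicon.get((word, meaning), 0.5)
        self.lexicon[(word, meaning)] = min(1.0, max(0.0, score + (0.1 if success else -0.1)))

def language_game(speaker, hearer, meaning):
    """One game over an agreed topic meaning; returns True on success."""
    word = speaker.encode(meaning)
    success = hearer.decode(word) == meaning
    speaker.update(word, meaning, success)
    if success:
        hearer.update(word, meaning, True)
    else:
        hearer.lexicon.setdefault((word, meaning), 0.5)   # hearer adopts the association
    return success

a, b = LexiconAgent(), LexiconAgent()
meaning = frozenset({"left-light:strong"})                # stands in for a feature set
print([language_game(a, b, meaning) for _ in range(3)])   # first game fails, then the convention holds
```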

Obviously many variants of such language games can be developed, including games where the feedback of success comes from the world. For example, when one agent asks another agent for an object using a linguistic expression, success occurs when the agent gets the object he wants.

Phonology through imitation games

In another series of experiments, conducted together with Bart de Boer, we have shown that a phonology can evolve through imitation games.

The phonology consists of a repertoire of phonemes and phoneme segments which are admissible and distinctive in the language. Agents must develop both the capacity to produce the phonemes and the capacity to recognise them from acoustic signals.

An imitation game works as follows: A speaker picks a phoneme or phoneme sequence from his repertoire, or possibly creates a new one by producing a random gestural score, and articulates it. A gestural score is a sequence of articulatory targets. The hearer then applies low-level feature detectors to the resulting acoustic signal in order to recognise the phonemes. When this recognition is successful, the hearer attempts to reproduce the phonemes. The speaker now attempts recognition and can provide feedback on whether the result is compatible with the originally produced phoneme sequence.
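The sketch below compresses this game drastically and is only meant to convey the protocol: instead of articulatory synthesis and acoustic feature detection, a phoneme is abstracted as a point in a two-dimensional acoustic space, production adds Gaussian noise, and recognition is nearest-neighbour matching against stored prototypes. These simplifications and the noise level are assumptions made for the example, not the setup of the actual experiments.

```python
import math
import random

class ImitationAgent:
    """Minimal sketch: a phoneme is abstracted to a point in a 2-D acoustic
    space (think of two formant values); production adds Gaussian noise and
    recognition is nearest-neighbour matching against stored prototypes."""

    def __init__(self, noise=0.05):
        self.phonemes = []                     # list of (x, y) prototypes
        self.noise = noise

    def produce(self, phoneme):
        x, y = phoneme
        return (x + random.gauss(0, self.noise), y + random.gauss(0, self.noise))

    def recognise(self, signal):
        if not self.phonemes:
            return None
        return min(self.phonemes, key=lambda p: math.dist(p, signal))

    def invent(self):
        """Stand-in for producing a random gestural score."""
        p = (random.random(), random.random())
        self.phonemes.append(p)
        return p

def imitation_game(speaker, hearer):
    """Speaker utters a phoneme (inventing one if needed), hearer imitates it,
    and the speaker checks whether the imitation maps back onto the original."""
    phoneme = random.choice(speaker.phonemes) if speaker.phonemes else speaker.invent()
    signal = speaker.produce(phoneme)
    heard = hearer.recognise(signal)
    if heard is None:
        hearer.phonemes.append(signal)         # hearer adopts a prototype for the new sound
        return False
    echo = hearer.produce(heard)
    return speaker.recognise(echo) == phoneme
```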

Our experiments show that a common phoneme and phoneme sequence repertoire indeed develops and expands under the demands of the lexicon. The observed evolution in the complexity of the phoneme repertoire remains to be compared with what is known about natural phonological evolution.

Linguistic co-evolution

In genetic evolution the selectionist criteria are not fixed but derive from an environment which is constantly changing due to co-evolution. For example, one species, acting as prey, evolves to become better at escaping its predator. But this causes the predator to evolve in turn, towards becoming better at catching the prey. Whereas evolution in itself causes an equilibrium to be reached, co-evolution causes a self-reinforcing spiral towards greater complexity.

Also in the case of language, co-evolution appears to play a crucial role in pushing a language towards greater complexity, at all levels. The ultimate pressure comes from the growing complexity of agent-agent and agent-environment interaction, partly enabled by an increasingly powerful linguistic ability. Thus language complexity feeds on itself and escalates. The lexicon puts pressure on phonology creation to create an adequate repertoire of phonemes. If there are not enough phonemes, new ones will be generated through imitation games. The language game puts pressure first on the meaning creation modules, for example, to have enough distinctions. When there are more different types of objects, more distinctions are needed. It also puts pressure on the lexicon to lexicalise the meanings that need to be communicated. Thus the more meanings are used in language games, the bigger the lexicon will have to be.

In the experiments conducted so far, multiple-word sentences start to emerge because, as feature sets become more elaborate, more than one word is needed to code a given feature set into words. Syntax becomes a natural need when greater and greater pressure is exerted to express more and more sophisticated meaning within as few elements as possible. The possible origins of syntax are discussed in more detail later.

Different games are coupled because the output of one is used as input by the other. This also introduces additional selectionist pressures, so that there is a two-way flow between two interdependent modules: When module 1 delivers input to module 2, module 2 will exert additional selectionist constraints on module 1. For example, a feature used in discrimination is more appropriate in a language game if it has also been lexicalised. When one agent uses a word and thus certain features, the other agent may have to expand his feature repertoire accordingly before being able to decode the word. Thus there are two selectionist pressures on features: are they adequate for discrimination, and are they used or needed for lexicalisation?

Similarly, a phoneme is not only appropriate as part of the phonological repertoire when it can be produced and understood; it must also be used by the lexicon.
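As a purely illustrative toy (the additive form and the weights are assumptions made for the example, not something taken from the experiments), such a two-way pressure can be captured by scoring an item on both counts at once:

```python
def preference(own_game_successes, uses_one_level_up, w_own=1.0, w_up=1.0):
    """Toy combined selectionist score: an item (a feature, a phoneme) is
    preferred when it both succeeds in its own games and is actually used
    by the level above it, e.g. a feature that has been lexicalised."""
    return w_own * own_game_successes + w_up * uses_one_level_up
```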

Another paper [20] illustrates this in more detail based on experiments for the co-evolution of words and meanings by a combination of discrimination games and language games. The agents engage in a series of language games and as part of each language game each agent performs one discrimination game.

Self-organisation

Evolution and co-evolution are in themselves not enough. Usually there are many possible structures which are equally plausible from the viewpoint of the selectionist criteria discussed so far. But out of the many possibilities usually only one is selected and adopted by the whole linguistic population. Language and meanings are shared. This is a big puzzle for anyone who seeks a non-nativist theory of language and meaning. If meaning and language are innate, then they are genetically shared and coherence comes for free. But if language and meaning are not innate, we must explain how coherence may arise without central control and with agents having access to each other's states only through localised interactions.

The origins of coherence in a distributed system with many interacting elements have been studied in biology and other sciences under the heading of self-organisation. A typical example is a cloud of birds or a path formed by ants. Examples of self-organisation are also found at lower levels. For example, regular temporal or spatial patterns in the Belousov-Zhabotinsky reaction or the sudden appearance of coherent light in lasers are chemical and physical examples of self-organisation.

The principle of self-organisation prescribes two necessary ingredients: there must be a set of possible variations, and random fluctuations that may temporarily cause one variation to gain prominence. Most of the time these fluctuations are damped and the system is in a (dynamic) equilibrium state. However, if there is a positive feedback loop causing a certain fluctuation to become reinforced, then one fluctuation eventually dominates. The feedback loop is typically a function of the environment, so that self-organisation only takes place for specific parameter regimes. When these parameters are in a lower regime they leave the system in equilibrium. When the parameters are in a higher regime, they bring the system into (deterministic) chaos. Structure arises and is maintained on the edge of chaos [21].

In the computational experiments, self-organisation has proven to be an effective way to arrive at coherence. The positive feedback loop is based on success in games that involve multiple agents. Those rules are preferred that are the most used and the most successful in use. For example, for each word-meaning pair a record is kept of how many times the pair has been used and how many times its use in a specific language game was successful. The (speaking) agent always prefers the most successful word. This causes the positive feedback effect: the more a word is used, the more successful it will be, and hence the more it will be used. Initially there will be a struggle between the different word-meaning pairs until one dominates. Coherence crystallises quite rapidly once a word starts to dominate, similar to a phase transition.
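The self-contained sketch below illustrates this dynamic for a single meaning with a handful of competing word forms. The forms, scores, and update constants are invented for the illustration and are not those of the actual experiments; the point is only the feedback loop between use and success.

```python
import random
from collections import Counter

def naming_run(n_agents=20, words=("bola", "kima", "suri"), n_games=2000):
    """Toy positive-feedback dynamic for one meaning: every agent keeps a score
    per competing word form and speakers always utter their best-scoring form,
    so success breeds further use until one form tends to dominate."""
    scores = [{w: random.random() for w in words} for _ in range(n_agents)]
    for _ in range(n_games):
        s, h = random.sample(range(n_agents), 2)           # speaker and hearer
        spoken = max(scores[s], key=scores[s].get)          # speaker's preferred form
        success = spoken == max(scores[h], key=scores[h].get)
        if success:
            scores[s][spoken] += 0.1                        # success reinforces use
            scores[h][spoken] += 0.1
        else:
            scores[s][spoken] -= 0.02                       # mild penalty for failure
            scores[h][spoken] += 0.05                       # hearer drifts toward the heard form
    return Counter(max(sc, key=sc.get) for sc in scores)

print(naming_run())   # usually one word form ends up preferred by (almost) all agents
```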

The same principle can also be applied at other levels. Phonemes and phoneme segments that have most success in imitation games are preferred over those that have less success, even if the latter satisfy all other constraints. Coherence of meaning arises indirectly: features that have been lexicalised are preferred, and hence the most common features will be shared by all agents.

Level formation

Many linguists argue that a representation system is only a language when it also features a complex syntax. In the experiments discussed so far there is no syntax yet, although there is a steadily growing and adaptive lexicon, phonology and meaning repertoire. All the ingredients of a protolanguage [22] are therefore in place. Although syntax is obviously important, these other aspects of language are just as crucial, and no theory of the origins of language is complete without explaining how they might evolve. Nevertheless the origin of syntax is an essential part of the problem and must be addressed. I hypothesise that level formation is the key towards solving it.

Level formation is very common in biosystems. It occurs when a number of independent units, due to co-occurrence, develop a symbiotic relationship, eventually making the units no longer independent. Level formation has for example been used to explain the formation of cells [23] and the origin of chromosomes [24], which group individual genes. In the case of the cell, there were initially free-floating organisms and structures which came to depend on each other, for example because one organism produces products for another one or destroys lethal products. Gradually the relationship between these organisms and structures becomes so strong that they give up some of their independence to become a fixed part of the whole. For example, mitochondria are organisms with their own genetic information that used to be independent but are now so intertwined with the cell that they need the genetic information in the nucleus to duplicate. Thus a new unit at a new level is created, which can itself then start to form part of larger units.

Experiments for applying these principles to the formation of syntactic units are currently in progress. The following steps are envisaged:

1. The starting point is individual words, as they originate from the meaning creation and lexicon formation processes discussed earlier. These words acquire preferred semantic functions and therefore fall into (emergent) grammatical categories. For example, some words, like 'tables' in the phrase 'the white tables', pick out the main target set; others, like 'white' in the same phrase, carve out a subset from this set; still others, like '-s', indicate that several objects are involved, etc. Initially the function of a word is undifferentiated, but based on the actual use of the word and on group dynamics, words become specialised. For example, 'white' may come to be used only for delineating a subset from an already defined set.

2. Words with different functions naturally co-occur in recurrent constellations. Thus an indication that several objects from a set are referred to requires that such a set is first identified.

3. These constellations then become conventionalised, meaning that some functions become obligatory and hence words of a certain category have to be present.

4. The functions then become grammatically codified using word order and/or morphology. This makes it possible to derive the grammatical function of a word even if it has multiple uses or if its function is not yet known; a toy sketch of this last step follows below.
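The following toy sketch is a construction made for this survey, not a description of the experiments in progress: once a constellation of functions has been conventionalised as a word order, position alone lets a hearer infer the function of an unfamiliar word. The category names, word forms, and the assumption that only leading slots may be omitted are all invented for the example.

```python
# Assumed convention: quantifier words come first, then modifiers, then the head.
CONSTELLATION = ["quantifier", "modifier", "head"]

def produce(words_by_function):
    """Speaker orders its words according to the shared constellation,
    simply skipping functions it has nothing to express."""
    return [words_by_function[f] for f in CONSTELLATION if f in words_by_function]

def parse(utterance, known_functions):
    """Hearer assigns functions by position; an unfamiliar word inherits the
    function of the slot it occupies. (Simplification: only leading slots
    may be omitted.)"""
    slots = CONSTELLATION[-len(utterance):]
    return {word: known_functions.get(word, slot)
            for word, slot in zip(utterance, slots)}

# 'wemu' is unknown to the hearer, but its position marks it as a modifier.
utterance = produce({"modifier": "wemu", "head": "tabo"})
print(parse(utterance, known_functions={"tabo": "head"}))   # {'wemu': 'modifier', 'tabo': 'head'}
```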

Once these basic principles are in place, the syntactic level may undergo various additional evolutionary processes on its own. For example, in some very interesting experiments, Hashimoto and Ikegami [25] have already shown that there can be evolution towards greater grammatical complexity (from regular grammars to context-sensitive grammars) based on language games which select for producibility and parsability.

Much further work remains to be done for syntax, but the same mechanisms as used for lexicon formation, meaning creation and emergent phonologies appear to be applicable. The principle of level formation needs to be better understood, but this is also true for biological instances of level formation.

Conclusions

This paper proposed a number of mechanisms that together might explain the origins of language: evolution, co-evolution, self-organisation and level formation. Each of these mechanisms is known to play a critical role in the origins of complexity in biosystems [26], which justifies applying them also to the origins and evolution of language. However, in contrast with most researchers today, I propose to apply these mechanisms not to biological structures (for example genes or neural networks), but rather to language itself. For example, no 'catastrophic' genetic mutation [27] is proposed to explain the origins of syntax; rather, syntax is hypothesised to originate spontaneously through level formation, based on the pressure to express more meanings with limited resources of time, memory and processing power.

An analogy was proposed between language and species and between an individual's language rules (at the different linguistic levels) and genes. Evolutionary processes operate on the individual rules, causing the language to bootstrap and evolve. The selectionist criteria are not in terms of reproductive success (as in the case of genetic evolution) but rather of success, ease and efficiency in linguistic communication. Coherence emerges through self-organisation.

Of course, the individual brain must have the appropriate capabilities to engage in the operations necessary to represent and enact the linguistic rules. These include fine motor control of the articulatory system, frequency analysis of the speech signal, associative memory, discretisation of continuous sensory data channels, set operations over feature structures, monitoring and establishment of feedback loops between use and success, planning and recognition of sequences, etc. But none of these functions is unique to language. The fine motor control needed for the articulatory system is similar to that needed for controlling a hand. The frequency analysis of the speech signal is identical to that needed for recognising other kinds of sounds. Set operations, associative memory, planning and recognition of action sequences are all needed for daily survival and can be found in lower animals, albeit with much less sophistication.

Testing the adequacy of mechanisms for the origins of language by building software simulations and robotic agents has proven to be a very effective methodology, although it requires a large amount of work. So far, concrete positive results have been obtained for meaning, lexicons and phonology. Much more research needs to be done, particularly in the area of syntax and in the evolution of language games and speech acts. But many exciting new insights are clearly within reach.

Acknowledgement [28]


URL of this article:
https://www.heise.de/-3445777

Links in this article:
[1] http://arti.vub.ac.be/steels/welcome.html
[2] http://arti.vub.ac.be
[3] https://www.heise.de/tp/subtext/telepolis_subtext_3478346.html?artikel_cid=3445777&row_id=1
[4] https://www.heise.de/tp/subtext/telepolis_subtext_3478376.html?artikel_cid=3445777&row_id=2
[5] https://www.heise.de/tp/subtext/telepolis_subtext_3478402.html?artikel_cid=3445777&row_id=3
[6] https://www.heise.de/tp/subtext/telepolis_subtext_3478410.html?artikel_cid=3445777&row_id=4
[7] https://www.heise.de/tp/subtext/telepolis_subtext_3478420.html?artikel_cid=3445777&row_id=5
[8] https://www.heise.de/tp/subtext/telepolis_subtext_3478432.html?artikel_cid=3445777&row_id=6
[9] https://www.heise.de/tp/subtext/telepolis_subtext_3478440.html?artikel_cid=3445777&row_id=7
[10] https://www.heise.de/tp/subtext/telepolis_subtext_3478442.html?artikel_cid=3445777&row_id=8
[11] https://www.heise.de/tp/subtext/telepolis_subtext_3478446.html?artikel_cid=3445777&row_id=9
[12] https://www.heise.de/tp/subtext/telepolis_subtext_3478348.html?artikel_cid=3445777&row_id=10
[13] https://www.heise.de/tp/subtext/telepolis_subtext_3478352.html?artikel_cid=3445777&row_id=11
[14] https://www.heise.de/tp/subtext/telepolis_subtext_3478354.html?artikel_cid=3445777&row_id=12
[15] https://www.heise.de/tp/subtext/telepolis_subtext_3478356.html?artikel_cid=3445777&row_id=13
[16] https://www.heise.de/tp/subtext/telepolis_subtext_3478360.html?artikel_cid=3445777&row_id=14
[17] https://www.heise.de/tp/subtext/telepolis_subtext_3478362.html?artikel_cid=3445777&row_id=15
[18] https://www.heise.de/tp/subtext/telepolis_subtext_3478364.html?artikel_cid=3445777&row_id=16
[19] https://www.heise.de/tp/subtext/telepolis_subtext_3478368.html?artikel_cid=3445777&row_id=17
[20] https://www.heise.de/tp/subtext/telepolis_subtext_3478370.html?artikel_cid=3445777&row_id=18
[21] https://www.heise.de/tp/subtext/telepolis_subtext_3478372.html?artikel_cid=3445777&row_id=19
[22] https://www.heise.de/tp/subtext/telepolis_subtext_3478378.html?artikel_cid=3445777&row_id=20
[23] https://www.heise.de/tp/subtext/telepolis_subtext_3478380.html?artikel_cid=3445777&row_id=21
[24] https://www.heise.de/tp/subtext/telepolis_subtext_3478384.html?artikel_cid=3445777&row_id=22
[25] https://www.heise.de/tp/subtext/telepolis_subtext_3478386.html?artikel_cid=3445777&row_id=23
[26] https://www.heise.de/tp/subtext/telepolis_subtext_3478388.html?artikel_cid=3445777&row_id=24
[27] https://www.heise.de/tp/subtext/telepolis_subtext_3478392.html?artikel_cid=3445777&row_id=25
[28] https://www.heise.de/tp/subtext/telepolis_subtext_3478394.html?artikel_cid=3445777&row_id=26