
Robot Behaviour Conflicts: Can Intelligence Be Modularized?

Amol Dattaraya Mali and Amitabha Mukerjee


Center for Robotics, I.I.T. Kanpur 208016

Abstract
In this paper, we examine the modularity assumption of behaviour-based models: that complex functionalities can be achieved by decomposition into simpler behaviours. In particular, we look at the issue of conflicts among robot behaviour modules. The chief contribution of this work is a formal characterization of temporal cycles in behaviour systems and the development of an algorithm for detecting and avoiding such conflicts. We develop the mechanisms of stimulus specialization and response generalization for eliminating conflicts. The probable conflicts can be detected and eliminated before implementation. However, the process of cycle elimination weakens the behaviour structure. We show how (a) removing conflicts results in less flexible and less useful behaviour modules, and (b) the probability of conflict is greater for more powerful behaviour systems. We conclude that purely reactive systems are limited by cyclic behaviours in the complexity of tasks they can perform.

(This paper appeared in AAAI-94, pp. 1279-1284.)

1 Introduction

Complex robot interactions are conveniently modeled in terms of stimulus-response sequences, often called behaviours. It is easier to model and debug the behaviour modules than the larger and more integrated centralized controllers. Impressive results have been achieved using this strategy in a can-collecting robot (Connell 1990), navigation of a mobile robot (Arkin 1992), a prototype airplane controller (Hartley & Pipitone 1991), an office rearrangement robot in the AAAI-93 robot competition, etc. This behaviour-based intelligence paradigm propounded by Brooks and others has challenged the role of representation in AI.

In a sense, these approaches treat the world as an external memory from which knowledge can be retrieved just by perception. Brooks argues that when intelligence is approached in such an incremental manner, reliance on representation disappears (Brooks 1991). In response, traditional AI researchers such as Kirsh have argued that control cannot serve as a complete substitute for representation (Kirsh 1991). At the same time, behaviour systems have also been moving away from the purely reactive paradigm. A well-known extension of Brooks' approach is the SONAR MAP (Brooks 1986), a module that learns what looks suspiciously like a central representation. Some researchers (Gat 1993) are beginning to propose that internal state be used, but only for modeling highly abstract information.

All modular designs (databases, architectures, factories) face the problem of inter-modular conflict. In robot behaviour implementations, conflicts which do not result in cycles can be removed by prioritization schemes (e.g. suppressive links), but this is usually ad hoc, with the primary objective of demonstrating success in the current task objectives. Brooks stresses that additional layers can be added without changing the initial system; our results show that such a claim is most probably not tenable. How, then, does one put behaviour modules together and get useful performance? This depends on identifying the possible sources of inter-modular conflict that are likely to arise in behaviour chains.

This paper is one of the first formal investigations of the issue of combining behaviours and of inter-behaviour conflicts. Despite the attention such models have been receiving, the issue of inter-behaviour conflict, which challenges the fundamental assumption of modularity, has not been investigated. For example, if the consequence of a behaviour a triggers the stimulus for behaviour b, and b precedes a, then we have an unending cycle. Such conflicts are also beginning to show up in the more complex behaviour implementations.

For example, Connell records an instance where a can-collecting robot attempts to re-pick the can it has just deposited in the destination area (Figure 1); this conflict was detected only after a full implementation (Connell 1990). Cyclical wandering and the cyclic conflict of going back and forth between two obstacles have also been reported (Anderson & Donath 1990) (Miller 1993). Can behaviour systems be constructed so that such conflicts can be detected beforehand? How can one modify the structure so as to avoid such conflicts? These are some of the questions we set out to answer.

Our analysis in this paper depends on a crucial observation regarding the temporal structure of purely reactive systems. The conflicts we are addressing are not control conflicts but temporal sequence conflicts, for which it is necessary to define the temporal structure of behaviours. This structure is usually sequential, since one behaviour usually provides the stimulus for another, so that there is often a clear temporal sequence in which behaviours are executed. In this paper we show that cycles occurring in this temporal sequence can be avoided only by modifying the behaviours themselves, and we introduce two such modifications, based on specializing the stimulus or restricting the action of a behaviour. One of the key results of the paper is that any such modification reduces the usefulness of the behaviour structure and makes it less flexible.
Figure 1. Conflict in picking and placing the can. The behaviour chain (visible(x) ∧ can(x) ∧ graspable(x)) → pickup(x) → move(x) → drop(x) re-establishes its own initiating stimulus: the dropped can is once again visible and graspable.

2 What Is a Behaviour?

AI researchers, psychologists, cognitive scientists, ethologists and roboticists all use the term behaviour in senses that are related but fundamentally different. At one end of the spectrum is Brooks, who looks upon behaviours as a type of intelligent module, an input-output relation for solving small problems (Brooks 1986). The hope is that these modules can be combined to solve larger problems. There is no shared global memory. The stimulus to a behaviour is boolean and is tested by an applicability predicate. This is the model of behaviour investigated in this paper. Minsky suggests thinking about goal-directed behaviour as the output of a difference engine that measures the differences between the world state and the goal state and takes actions to reduce these differences (Minsky 1986). On the other hand, Simon feels that complex behaviour need not necessarily be a product of an extremely complex system; rather, complex behaviour may simply be the reflection of a complex environment (Simon 1969). Arkin proposes the motor schema as a model of behaviour specification for the navigation of a mobile robot (Arkin 1992).

Notation

Maes models a behaviour as a 4-tuple of pre-conditions, add list, delete list and level of activation (Maes 1990). In this work we have followed the behaviourists and adopted a 3-tuple model of behaviour: stimulus, action, consequence. An elemental behaviour module takes the form <s, a, c>, although the action a is not directly referred to by us, and we sometimes abbreviate the notation to <s, c>. Both the stimulus s and the consequence c are commonly defined in terms of a predicate. We define the dominant period of a behaviour as the period when the behaviour is active. In most behaviour implementations, behaviours become dominant in a temporal sequence. We use the symbol ":" (precedes) to denote this: β1 : β2 implies that behaviour β2 becomes dominant following behaviour β1. We define a behaviour chain as a sequence of behaviour modules {β1 : β2 : β3 : ... : βn}. Here the action of the earlier module changes the situation in such a way that the newly changed part of the situation is in turn a stimulus for the next module in the sequence. If the consequence and stimulus include a finite universal state as well, then we can say that the stimulus si+1 of the behaviour module βi+1 is logically implied by the consequence of the module βi, i.e. (ci ⇒ si+1). What we mean by the finite universal state can be clarified by an example. Let Universe V = X ∧ Y ∧ Z, c1 = A, and s2 = X ∧ A. Then β1 leads to β2 even though c1 alone does not imply s2. Thus when we say that (c1 ⇒ s2), we mean that a part of s2 was already true in the Universe and some literals in c1 cause the rest of s2 to come true. In order for (c1 ⇒ s2) to be true, both stimulus and consequence should always contain the "state of the universe" predicate. This allows us to develop the argument effectively, skirting the philosophical debate on the frame problem (Georgeff 1987). We define a behaviour space B as a set of behaviour modules. A temporal chain of behaviours C is said to be composable from B (written as C ⊑ B) if and only if C is an ordered set {βi} with (∀i) βi ∈ B. A stimulus space of a behaviour space B is the union of the stimuli of all behaviour modules in B.
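To make the chain condition concrete, the following minimal Python sketch (ours, not from the paper) represents stimuli and consequences as conjunctions of literals, so that the implication (ci ⇒ si+1) reduces to a subset test against the consequence together with the finite universal state. The Behaviour class and all names here are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Behaviour:
    """An elemental behaviour module <s, a, c>.

    Stimulus s and consequence c are conjunctions of literals,
    represented as frozensets of strings; the action a is elided,
    as in the paper's abbreviated <s, c> notation.
    """
    name: str
    stimulus: frozenset
    consequence: frozenset

def implies(c: frozenset, s: frozenset) -> bool:
    """For conjunctions of literals, (c => s) iff every literal of s is in c."""
    return s <= c

def triggers(b1: Behaviour, b2: Behaviour, universe: frozenset) -> bool:
    """The triggering b1 : b2 is possible when the consequence of b1,
    together with the finite universal state, implies the stimulus of b2."""
    return implies(b1.consequence | universe, b2.stimulus)

def is_chain(behaviours, universe: frozenset) -> bool:
    """A behaviour chain {b1 : b2 : ... : bn}: each module's consequence
    (plus the universe) supplies the stimulus of the next module."""
    return all(triggers(a, b, universe)
               for a, b in zip(behaviours, behaviours[1:]))

# The paper's example: Universe V = {X, Y, Z}, c1 = {A}, s2 = {X, A}.
universe = frozenset({"X", "Y", "Z"})
b1 = Behaviour("b1", frozenset({"P"}), frozenset({"A"}))
b2 = Behaviour("b2", frozenset({"X", "A"}), frozenset({"B"}))
assert triggers(b1, b2, universe)                # holds with the universal state
assert not implies(b1.consequence, b2.stimulus)  # c1 alone is insufficient
```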




Power, Usefulness and Flexibility of Behaviours


To compare different behaviour systems, we define a few relative measures.

Definition 1 (Power): A behaviour (β := <s, a, c>) is more powerful than (β′ := <s′, a′, c′>) iff (s′ ⇒ s) ∧ (c ⇒ c′). In other words, it can be triggered at least as frequently as a less powerful behaviour and results in at least as strong a consequence. A behaviour space B is more powerful than the behaviour space B′ if B′ can be obtained from B by replacing some module β ∈ B by a less powerful module β′.

Usefulness: A behaviour space B spans the task space T if and only if ∀(t ∈ T) ∃(C ⊑ B) fulfills(C, t). The greatest fulfillable task space G(B) is the largest task space that is spanned by the behaviour space B. The usefulness of a behaviour space is defined as the ratio |G(B)| / |B|.

Flexibility: A behaviour space B is at least as flexible as behaviour space B′ if ∀ t ∈ (G(B) ∩ G(B′)) ∃(C ⊑ B) { fulfills(C, t) ∧ ∀(C′ ⊑ B′) [ fulfills(C′, t) ⇒ |C| ≤ |C′| ] }.
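Under the conjunctive-literal representation sketched in the Notation section, Definition 1 reduces to two subset tests: a smaller stimulus set is a weaker precondition (triggered at least as often), and a larger consequence set is a stronger postcondition. A sketch, again purely illustrative and reusing the hypothetical Behaviour class from above:

```python
def more_powerful(b: "Behaviour", b_weak: "Behaviour") -> bool:
    """Definition 1: b is more powerful than b_weak iff
    (s_weak => s) and (c => c_weak).

    For conjunctions of literals, (x => y) iff y is a subset of x, so a
    more powerful behaviour has a subset-stimulus (it fires at least as
    frequently) and a superset-consequence (at least as strong an effect).
    """
    return (b.stimulus <= b_weak.stimulus
            and b.consequence >= b_weak.consequence)
```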


3 Detection of Conflicts

In the broad sense of the word, any behaviour chain leading to non-fulfillment of the desired objectives can be said to have a conflict. Let a chain C = {β1 : β2 : ... : βn} be the desirable behaviour sequence that achieves a desirable outcome. There are three types of conflicts that can prevent the chain C from being executed, by breaking the sequence βi : βi+1.

Definition 2:
(a) Extraneous behaviour conflict: βi : β′, β′ ∉ C.
(b) Cyclic conflict: βi : βk, βk ∈ C, k ≤ i (discussed later).
(c) Skipping conflict: βi : βk, βk ∈ C, k > (i+1).

A skipping conflict can be treated in a manner analogous to extraneous behaviour conflicts. The type of conflict that we are investigating is the cyclic conflict, where both βi+1 and βk may be triggered, and clearly the triggering of βk would lead to a cycle (Figure 2).

Figure 2. Cycle in a temporal chain of behaviours: the chain ... : <sk, ak, ck> : ... : <si, ai, ci> : <si+1, ai+1, ci+1> : ..., with a back link from the consequence ci to the stimulus sk.

Detecting Cycles in Behaviour Graphs

Representing a behaviour chain as a graph, we present without proof the following lemmas:

Lemma 1(a). Whenever there is a cyclic conflict, there is a cycle in the temporal graph of behaviours.

Lemma 1(b). Whenever there is a cycle in the temporal graph of behaviours that is not terminated by a recursive condition, there is a cyclic conflict.

Thus detecting conflicts in a behaviour chain is equivalent to detecting cycles in the corresponding graph.
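The paper does not spell out the detection algorithm itself; the following sketch (our illustration, built on the hypothetical triggers test from the Notation section) constructs the temporal graph of possible triggerings and finds cycles with a standard iterative depth-first search. A self-loop, where a behaviour's consequence re-implies its own stimulus, is the one-step case.

```python
def temporal_graph(behaviours, universe):
    """Edge i -> j whenever bi's consequence (plus the universal state)
    implies bj's stimulus, i.e. the triggering bi : bj is possible."""
    return {i: [j for j, bj in enumerate(behaviours)
                if triggers(bi, bj, universe)]
            for i, bi in enumerate(behaviours)}

def has_cycle(graph) -> bool:
    """Three-colour DFS over the temporal graph; a back edge to a
    grey (on-stack) node signals a potential cyclic conflict."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}
    for root in graph:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(graph[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if colour[child] == GREY:
                    return True          # back edge: cycle found
                if colour[child] == WHITE:
                    colour[child] = GREY
                    stack.append((child, iter(graph[child])))
                    break
            else:                        # all children explored
                colour[node] = BLACK
                stack.pop()
    return False
```

By Lemma 1(b), a reported cycle is a genuine cyclic conflict only if it is not cut by a termination condition, such as the one discussed next.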

Terminated Cycles

Not all cycles result in a conflict. Let us say that a large block of butter needs to be cut into a number of smaller pieces, and a module cut achieves this. Let sCut = butter(x) ∧ cuttable(x) ∧ breadth(x, b) ∧ (b ≥ 2ε). We specify the smallest acceptable size of the piece of butter by specifying the limit ε, which introduces the termination condition: cut re-triggers itself on each piece, but the cycle terminates once no piece is wide enough to cut.

Prioritization

There are three types of prioritization used in robot behaviour control. If βj : βk is a possible but undesirable sequence of behaviours, the sequence can be modified by a module βl. In suppression, βl suppresses the output of βk, and instead of βj : βk, the sequence βj : βl occurs. In inhibition, the action of βk may take place, but only after βl is no longer dominant; here the chain βj : βk has the module βl inserted, so βj : βl : βk may occur, if the stimulus for βk is not affected by βl. Delayed action is a special case of inhibition, where the inhibitive link remains effective for some time tdelay even after the inhibiting module is no longer dominant; Connell uses the term retriggerable monostable to capture this sense (Connell 1990). None of these mechanisms is guaranteed to kill the stimulus of βk, hence βk may become active after the dominant period of the suppressing module is over. Thus, within the scope of the three prioritization schemes discussed here, it is not possible to guarantee that cyclic conflicts will be avoided.
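A minimal arbiter sketch (ours; the class, names and structure are assumptions, loosely subsumption-flavoured) makes the guarantee gap visible: none of the three schemes falsifies the stimulus sk, they merely gate when βk may act.

```python
import time

class Arbiter:
    """Gates behaviours via suppression, inhibition and delayed action."""

    def __init__(self, suppressed_by=None, inhibited_by=None, delays=None):
        self.suppressed_by = suppressed_by or {}  # {b_k: b_l}
        self.inhibited_by = inhibited_by or {}    # {b_k: b_l}
        self.delays = delays or {}                # {b_k: (b_l, t_delay)}
        self.last_dominant = {}                   # behaviour -> timestamp

    def note_dominant(self, name):
        """Record that `name` is (or was just) dominant."""
        self.last_dominant[name] = time.monotonic()

    def may_act(self, name, active):
        """May `name` act, given the set of currently dominant modules?"""
        if self.suppressed_by.get(name) in active:
            return False   # suppression: b_l's output replaces b_k's
        if self.inhibited_by.get(name) in active:
            return False   # inhibition: b_k waits until b_l is done
        if name in self.delays:
            inhibitor, t_delay = self.delays[name]
            last = self.last_dominant.get(inhibitor)
            if inhibitor in active or (
                    last is not None
                    and time.monotonic() - last < t_delay):
                return False   # delayed action: retriggerable monostable
        return True
```

Once the suppressing module's dominant period (plus any delay) has passed, may_act returns True again; if sk still holds, βk re-fires and the cycle resumes, which is exactly the limitation identified above.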

4 Behaviour Refinement

From Definition 2, whenever there is a cycle in a behaviour chain C = {β1 : β2 : ... : βk : ... : βi : βi+1 : ... : βn}, there must be a triggering of the kind βi : βk, where k ≤ i. Then both βi : βi+1 and βi : βk are possible at this point. Our objective is to break the βi : βk link without disturbing the βi : βi+1 or βk−1 : βk triggerings, which are essential to the successful execution of the chain. We have seen that priority-based methods are not guaranteed to achieve this, so we look for behaviour modification approaches which will maintain (ci ⇒ si+1) while (ci ⇒ sk) is negated. We develop two methods for achieving this: in stimulus specialization, sk is specialized, and in response generalization, ci is generalized.

Stimulus Specialization

Let us consider the conflict in picking up the soda cans, where the freshly deposited can is picked up. If we were to add the condition not-deposited-just-now(x) to the stimulus predicate for pickup, then we would only need a small recency memory (the recently dropped can). Thus the stimulus for βk becomes more specialized. However, in doing this, one must be careful not to disturb the rest of the chain, i.e. (ck−1 ⇒ sk) should still hold while (ci ⇒ sk) must be broken. Clearly this will not be possible where (ci ⇒ ck−1): any change we make in sk such that ¬(ci ⇒ sk) will then also result in ¬(ck−1 ⇒ sk). Thus stimulus specialization can be used only if (ci ⇒ ck−1) is not true. One model for this is to say that there must be a literal γ such that (ck−1 ⇒ γ ∧ sk) but ¬(ci ⇒ γ ∧ sk). The conjunction of all such literals, Γ = (γ1 ∧ γ2 ∧ ... ∧ γm), is called the maximal difference between ck−1 and ci. Stimulus specialization works when Γ is non-empty, and involves modifying sk to (sk ∧ γ), γ ∈ Γ. It is advisable not to specialize more than necessary (e.g. by adding more than one literal), since this adversely affects the power of the behaviour. A simpler understanding of the process is obtained if both ci and ck−1 are in conjunctive form: then Γ is nothing but the difference (ck−1 − ci), and sk is modified by conjunction with one of the literals that is in ck−1 but not in ci. Note that since the stimulus is specialized, any stimuli that are required by the action are still available to it.
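For conjunctive forms the construction is mechanical. A sketch under the earlier illustrative representation (everything here, including the error handling, is our assumption; the recency predicate itself must of course still be implementable on the robot):

```python
def maximal_difference(c_prev: frozenset, c_i: frozenset) -> frozenset:
    """Gamma: literals of c_{k-1} that are absent from c_i
    (both in conjunctive form)."""
    return c_prev - c_i

def specialize_stimulus(b_k: Behaviour, c_prev: frozenset,
                        c_i: frozenset) -> Behaviour:
    """Break (c_i => s_k) while preserving (c_{k-1} => s_k) by conjoining
    a single literal from Gamma to s_k; adding more than one literal
    would weaken the behaviour's power more than necessary."""
    gamma = maximal_difference(c_prev, c_i)
    if not gamma:  # (c_i => c_{k-1}): specialization is impossible
        raise ValueError("maximal difference is empty; cannot specialize")
    literal = sorted(gamma)[0]  # any single literal from Gamma suffices
    return Behaviour(b_k.name, b_k.stimulus | {literal}, b_k.consequence)
```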

Response Generalization

Here the action is modified so that the consequence of the action is weaker, i.e. if the old consequence was c and the new one is c′, then (c ⇒ c′) but ¬(c′ ⇒ c). For example, we can modify the action of the module drop so that while dropping the can on the ground, the robot puts it in an inverted position, which prevents the robot from detecting that the object is a can. The original consequence was (visible(x) ∧ can(x) ∧ graspable(x)) and the modified consequence is (visible(x) ∧ graspable(x)), assuming the sensors cannot identify cans that have been inverted. Otherwise, we may modify the consequence by covering the can to make the predicate visible(x) false, but this leads to the addition of a new behaviour module or to modifying the action part of the original module, both of which require considerable re-programming and are expensive. In response generalization, (ci ⇒ sk) must be negated, while (ci ⇒ si+1) must hold. Hence response generalization can be used only when (si+1 ⇒ sk) does not hold. In fact, one could characterize the process by saying that there must exist a literal δ such that ((ci − δ) ⇒ si+1) but ¬((ci − δ) ⇒ sk); the disjunction of all such δ's is Δ. Again, if sk and si+1 are in conjunctive form, a simpler understanding is obtained, since Δ = (sk − si+1), i.e. the literals that appear in sk but not in si+1. Modifying ci is then better understood as dropping a literal δ already appearing in ci, written as (ci − δ). Since stimuli and consequences are often conjunctive, this difference notion is a useful concept in practice. Thus ci is modified to (ci − δ), where δ ∈ (sk − si+1). Stimulus specialization is easier to do than response generalization, since response generalization requires that the action be modified. However, stimulus specialization may not always be possible: with not-deposited-just-now(x) the robot may still pick up an older deposited can, and better solutions such as never-deposited-before(x) or not-at-depository(x) would require considerable global memory. Therefore stimulus specialization, while cheaper to implement, may not be available in many instances, since the actions require a minimum stimulus, and specializing it without memory may be counter-productive.
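The dual bookkeeping for response generalization, under the same illustrative representation; note that the set operation only records the weakened consequence, and the action itself must genuinely be changed (e.g. dropping the can inverted) so that the stronger literal really no longer results:

```python
def response_generalize(b_i: Behaviour, s_next: frozenset,
                        s_k: frozenset) -> Behaviour:
    """Weaken c_i so that (c_i => s_{i+1}) survives but (c_i => s_k) breaks.

    For conjunctive forms Delta = (s_k - s_{i+1}); dropping one literal
    delta in Delta from c_i removes the cyclic trigger b_i : b_k."""
    delta_candidates = (s_k - s_next) & b_i.consequence
    if not delta_candidates:  # (s_{i+1} => s_k): generalization impossible
        raise ValueError("no droppable literal; cannot generalize")
    delta = sorted(delta_candidates)[0]
    return Behaviour(b_i.name, b_i.stimulus, b_i.consequence - {delta})
```

In the can example, one valid δ is can(x), giving the weakened consequence (visible(x) ∧ graspable(x)) of the inverted-can modification.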

Effects Of Behaviour Refinement


Let us now investigate the effects of stimulus specialization and response generalization.

Lemma 2. Removing a cycle from chain C by stimulus specialization or response generalization cannot introduce a cycle in any other chain C′ that did not have a cycle before.

Proof: Let βk be the behaviour that was specialized. To introduce cycles where no cycles existed before, some link β′ : βk must have become possible, i.e. (c′ ⇒ sk) must have become true. This is not possible, since c′ is the same and sk is more specific. Similarly, since ck has not been modified, no new links βk : β′ could have been created. Hence no new cycle will be initiated. Similarly it can be shown that response generalization does not introduce new cycles. □

Lemma 3. Whenever a behaviour space is modified to eliminate cyclic conflicts by response generalization and/or stimulus specialization, either the flexibility or the usefulness of the behaviour space decreases.

Proof: Let the stimulus s of a behaviour β ∈ B be a conjunction of members of the stimulus space. Now let s be specialized to s′, so that s′ ⊂ s. Tasks or subsequent behaviours requiring the predicates in (s − s′) will no longer be attended to by β. Thus we need a new behaviour β″ such that s″ together with s′ covers s, so that β and β″ together serve the stimulus set s; this implies that |B| increases and the usefulness of B decreases. Similarly, if the response of β is generalized, then c has more literals than c′, and (c − c′) is no longer being performed by β. Hence other new behaviours are needed to complete tasks requiring (c − c′), which increases |B| and also increases the chain lengths for performing tasks, reducing flexibility. Otherwise, some tasks requiring (c − c′) cannot be performed, which implies that |G(B′)| < |G(B)|, which again means that the usefulness of the behaviour space decreases. □

Let us say that we have a behaviour β whose consequence c = p ∧ q leads to a cycle. If we use response generalization, we may have to design two behaviours β′ and β″ such that c′ = q and c″ = p. If β has a stimulus s = p ∨ q which is triggered leading to a cycle, and we use stimulus specialization, we may have to design two more behaviours β′ and β″ such that s′ = p and s″ = q. In some cases, it may not be possible to design an action that fulfills these conditions. This discussion brings us to our most important results, which have to do with the power and usefulness of behaviour spaces.

Behaviour Modification Theorem. Given two behaviour spaces B and B′ such that B is more powerful than B′ (i.e. B′ can be obtained from B by replacing some behaviours β of B by less powerful ones β′), then:
(a) The greatest fulfillable task space for behaviour space B′ is no larger than that for B, i.e. |G(B′)| ≤ |G(B)|.
(b) The usefulness of B is greater than that of B′, i.e. |G(B)| / |B| ≥ |G(B′)| / |B′|.
(c) The probability of a cycle is also greater in B.

Proof (a): First, let us consider the case where a single behaviour β has been replaced by the less powerful β′. The set of chains of behaviours composable from a behaviour space represents a tree, whose initial point corresponds to the availability of the right initial stimulus, and each node of which represents a world state that may correspond to the desired task. The greatest fulfillable task space is proportional to the total size of this tree of behaviour chains. Now, either the behaviour β will have more applicability due to its smaller stimulus as compared to the behaviour β′, or β will have stronger consequences, resulting in more behaviours being triggerable. In terms of the task tree, either β will have more parent nodes, or it will have more children. In either case, the branching factor is higher in B than in B′, and the size of the task tree will be as large or larger. Since |B| has not changed, the usefulness |G(B′)| / |B′| of the behaviour space has decreased, which proves part (b). This treatment can be extended to multiple instances of replacing a strong behaviour by a weak one. □

Proof (c): Let βi ∈ B and βi′ ∈ B′ be two behaviours such that βi is more powerful than βi′, i.e. (si′ ⇒ si), or si is weaker than si′. Now consider any chain of n modules composable in B and in B′, the two differing only in that the module βi is replaced by βi′. Consider all behaviours βj ∈ C, j ≤ i, with consequence cj. The probability of a cycle is prob-cycle(B) = Σj≤i prob(cj ⇒ si), and prob-cycle(B′) = Σj≤i prob(cj ⇒ si′). Clearly, since (si′ ⇒ si), ∀j [prob(cj ⇒ si) ≥ prob(cj ⇒ si′)]. Similarly, (ci ⇒ ci′), for which a similar analysis can be carried out. Thus prob-cycle(B) ≥ prob-cycle(B′). □

Corollary: If B and B′ have the same greatest fulfillable task space G, and ∃(β ∈ B) ∃(β′ ∈ B′) {β is more powerful than β′} but ¬∃(β ∈ B) ∃(β′ ∈ B′) {β′ is more powerful than β}, then |B| ≤ |B′|.

Residual

In this section we consider the parsimony of the logical chain underlying the behaviour chain. If βi : βi+1, then (ci ⇒ si+1). There may be some literals in ci which are not necessary for this implication, or there may be some disjunctive literals in si+1 not referred to by ci. We call this difference between ci and si+1 the residual. If ci and si+1 are identical, then their residual is null. If μ is the most general matching string between ci and si+1, i.e. the most general construct for which (ci ⇒ μ) and (μ ⇒ si+1), then we can write ci = μ ∧ ρ and si+1 = μ ∨ τ; the residual consequence is then ρ, the remedial stimulus is τ, and the total residual between (βi, βi+1), Ri, is defined as ρ ∧ ¬τ. Residuals are a measure of the degree of coupling in a behaviour chain. The intuition behind this is that as the residual increases, the flexibility as well as the probability of cycles increase. Stimulus specialization as well as response generalization decrease the residual.

Lemma 4. If two chains C and C′ have identical residuals except for some residuals R in C which are stronger than the corresponding R′ in C′, then the probability of a cycle is greater in C.

Proof: Consider two behaviour chains C and C′, where C′ is formed from C by replacing the single behaviour βi by βi′. Then all residuals in the two chains are identical except Ri−1 and Ri. If the residuals in C are stronger, then (Ri ⇒ Ri′) and (Ri−1′ ⇒ Ri−1), i.e. behaviour βi is more powerful than βi′. Hence, by part (c) of the Behaviour Modification Theorem, the probability of a cycle is greater in C than in C′. The same arguments can be extended to multiple behaviour changes between C and C′. □

Conclusion

In this paper we have focussed on the temporal relations between behaviours as opposed to the control relations. This has highlighted an important similarity between behaviour-based modeling and the classical models of planning in AI: the effects of actions, which become new stimuli, resemble the postcondition-precondition structure in means-ends planners such as STRIPS (Georgeff 1987). One of the approaches used to avoid cyclic conflicts in planning is the meta-level reasoner, an idea which has also been used in behaviour systems such as (Arkin 1992); but purists would not consider these to be true reactive behaviour systems. However, behaviour models differ from planning in some crucial aspects. Locality of behaviour programming makes opportunistic plan generation automatic, since the relevant behaviour is triggered automatically when its stimulus becomes very strong. Also, cycles are much more of a problem in behaviour models, since unlike an action in a planner, a behaviour does not "switch-off-and-die" after execution: if the stimulus reappears, it may re-execute, causing a cycle.

One of the benefits of this work is that by testing for cycles, designers will not have nasty surprises awaiting them after implementation. We also show that approaches such as prioritization will not avoid cycles. Thus the only guaranteed method for avoiding cycles is to modify the behaviour itself, and this can be done either by specializing the stimulus or by generalizing the response of some behaviour module. Unlike learning, which makes behaviours more powerful, this reduces the usefulness of the behaviour module. If a robot can pick up a soda can, it should be able to pick up a coffee cup or other similar object; using stimulus specialization, such a general behaviour would be split into many separate behaviours for picking up separate objects. The principal insight to be gained from this discussion is that in behaviour design there is a tradeoff between the power of a behaviour and the likelihood of cycles. The crucial task of the behaviour designer is to achieve just the right amount of refinement, without inviting conflicts and without sacrificing too much flexibility.

Can conflicts be avoided by using alternate architectures such as fuzzy logic (which allows behaviour designers to model strength of stimulus), meta-level reasoning (Yamauchi & Nelson 1991), or connectionist architectures (Payton, Rosenblatt & Keirsey 1990)? If we ported the can-pickup example into any of these representations, the conflict would not go away, since the conflict arises at the knowledge level and not at the representation level. Using internal state would not, in itself, remove this type of conflict, although it would make it easier to modify the behaviours so that the conflict can be avoided. Another issue related to internal state is the intention of the robot (psychologists read: will). Knowing the intention at some meta-level, it may be possible to construct tests for detecting conflicts, and possibly even for avoiding them. At the same time, models involving will or intention (as in Searle) are among the most debated and difficult quagmires in AI today. Is there then some limit on the complexity of a system of behaviours before self-referential cycles develop? A deeper question raised by the presence of such cycles in behaviour-based robotics, as well as in other branches of AI, is that of its significance for the entire search for artificial intelligence. Is there some bound on the complexity of any system claiming intelligence before it begins to develop cyclic conflicts? This paper is a beginning of the search for these answers, which are sure to affect the future of the behaviour-based robot modeling paradigm in particular and of models for intelligence in general.

References

[1] Anderson, T. L. and Donath, M. 1990. Animal Behaviour As A Paradigm For Developing Robot Autonomy. Robotics and Autonomous Systems, 6(1-2): 145-168.
[2] Arkin, R. C. 1992. Behaviour-Based Robot Navigation for Extended Domains. Adaptive Behaviour, 1(2): 201-225.
[3] Brooks, R. A. 1986. A robust layered control system for a mobile robot. IEEE Transactions on Robotics and Automation, 2(1): 14-23.
[4] Brooks, R. A. 1991. Intelligence without representation. Artificial Intelligence, 47(1-3): 139-159.
[5] Connell, J. 1990. Minimalist Mobile Robotics: A Colony-style Architecture for an Artificial Creature. Academic Press Inc.
[6] Gat, E. 1993. On the Role of Stored Internal State in the Control of Autonomous Mobile Robots. AI Magazine, 14(1): 64-73.
[7] Georgeff, M. P. 1987. Planning. Annual Review of Computer Science, 2: 359-400.
[8] Hartley, R. and Pipitone, F. 1991. Experiments with the subsumption architecture. In Proceedings of the IEEE Conference on Robotics and Automation, 1652-1658.
[9] Kirsh, D. 1991. Today the earwig, tomorrow man? Artificial Intelligence, 47(1-3): 161-184.
[10] Maes, P. 1990. Situated Agents Can Have Goals. Robotics and Autonomous Systems, 6(1-2): 49-70.
[11] Miller, D. P. 1993. A Twelve-Step Program to More Efficient Robotics. AI Magazine, 14(1): 60-63.
[12] Minsky, M. L. 1986. The Society of Mind. Simon and Schuster.
[13] Payton, D. W., Rosenblatt, J. K. and Keirsey, D. M. 1990. Plan guided reaction. IEEE Transactions on Systems, Man and Cybernetics, 20(6): 1370-1382.
[14] Simon, H. A. 1969. The Sciences of the Artificial. The MIT Press.
[15] Yamauchi, B. and Nelson, R. 1991. A behaviour-based architecture for robots using real-time vision. In Proceedings of the IEEE Conference on Robotics and Automation, 1822-1827.
