
ON ADAPTIVE CONTROL PROCESSES

Richard Bellman and Robert Kalaba

The RAND Corporation
Santa Monica, California

Summary
One of the most challenging areas in the field of automatic control is the design of automatic control devices that 'learn' to improve their performance based upon experience, i.e., that can adapt themselves to circumstances as they find them. The military and commercial implications of such devices are impressive, and interest in the two main areas of research in the field of control, the USA and the USSR, runs high. Unfortunately, though, both theory and construction of adaptive controllers are in their infancy, and some time may pass before they are commonplace. Nonetheless, development at this time of adequate theories of processes of this nature is essential.

The purpose of our paper is to show how the functional equation technique of a new mathematical discipline, dynamic programming, can be used in the formulation and solution of a variety of optimization problems concerning the design of adaptive devices. Although, occasionally, a solution in closed form can be obtained, in general, numerical solution via the use of high-speed digital computers is contemplated.

We discuss here the closely allied problems of formulating adaptive control processes in precise mathematical terms and of presenting feasible computational algorithms for determining numerical solutions.

To illustrate the general concepts, consider a system which is governed by the inhomogeneous Van der Pol equation

    ẍ + μ(x² − 1)ẋ + x = r(t),   0 ≤ t ≤ T,

where r(t) is a random function whose statistical properties are only partially known to a feedback control device which seeks to keep the system near the unstable equilibrium state x = 0, ẋ = 0. It proposes to do this by selecting the value of μ as a function of the state of the system at time t, and the time t itself. By observing the random process r(t), the controller may, with the passage of time, infer more and more concerning the statistical properties of the function r(t) and thus may be expected to improve the quality of its control decisions. In this way the controller adapts itself to circumstances as it finds them. The process is thus an interesting example of adaptive control, and, conceivably, one with some immediate applications.

Lastly, some areas of this general domain requiring additional research are indicated.

1. Introduction

In many engineering, economic, biological, and statistical control processes, a device of one type or another which we shall call a controller is called upon to perform under various conditions of uncertainty regarding the structure of the underlying physical processes. These conditions range all the way from complete knowledge to total ignorance. As the process unfolds, however, additional information concerning these factors may become available to the controller, which then has the possibility of 'learning' to improve its performance based upon experience, or, in fact, actual experimentation. In this case we say that the controller adapts itself to its environment.

In an earlier paper, [7], a broad and general foundation was laid for the mathematical study of adaptive processes, through the use of concepts from the field of dynamic programming, [4]. The specific purpose of this paper is to render these notions more concrete through the detailed study of some special control processes involving a nonlinear system with a tendency to be stimulated to undesirable oscillations.

We approach the adaptive control process in a series of steps. First we assume that the controller has complete information concerning the behavior of the forcing function over time, a process which is referred to as a deterministic control process. Then we introduce some unknown factors, which appear mathematically as random variables having distribution functions which are known to the controller. This leads to a stochastic control process. Lastly, we allow the controller still less information about the unknown factors and require that the controller learn to improve its performance through observation of the values of r(t): an adaptive control process.

In this paper, we limit the deficiency in the controller's knowledge to incomplete information concerning a random disturbing force. There are, needless to say, many other ways in which ignorance can manifest itself. Among these we may mention uncertainties concerning the determination of the state of the system and its environment by the sensing devices, the objective (figure of merit) of the process, the transformations of the state of the system produced by control decisions, the set of allowable decisions, and so on. These will be examined in subsequent investigations.

Although the design and operation of adaptive controllers are in their infancy, interest in these devices runs high, [10]. The functional equation technique of dynamic programming, [4], can be used to attack a wide variety of problems involving the determination of optimal control policies for control devices having the ability to adapt themselves to circumstances. In particular, it provides a useful conceptual framework for the very discussion of such devices. Though, on occasion, analytic results are obtained, [?], the emphasis is upon the development of methods which are suitable for use in conjunction with high-speed digital computers having large memories.

Authorized licensed use limited to: the Leddy Library at the University of Windsor. Downloaded on December 08,2023 at 16:48:18 UTC from IEEE Xplore. Restrictions apply.
Some of the advantages of the dynamic programming approach are its suitability for use with nonlinear, as well as linear, systems, its automatic production of a desirable parameter study (a 'sensitivity' or stability analysis), its straightforwardness and computational feasibility, and its ability to incorporate stochastic elements in a routine fashion.

Let us now turn our attention to an example which will serve to illustrate these remarks.

2. A Feedback Control Problem

Let us consider a system which, if uncontrolled, is governed by the well-known nonlinear differential equation

    ẍ + μ(x² − 1)ẋ + x = 0,   x(0) = c1,  ẋ(0) = c2,   (1)

(the Van der Pol equation) where the dots denote differentiation with respect to time. This equation is of fundamental importance in describing the development of relaxation oscillations in triode oscillator circuits and in describing the operation of multivibrators. We shall call μ the system parameter. If we introduce the function v(t) by means of the relation

    ẋ = v,   (2)

then the equation in (1) can be replaced by the first-order system

    ẋ = v,   x(0) = c1,   (3)
    v̇ = −μ(x² − 1)v − x,   v(0) = c2.

It is well known that if c1² + c2² ≠ 0 then the solution of the system (3) will tend toward a unique periodic solution. In the (x, v) phase plane, this periodic solution is represented by a closed curve which all trajectories (except x = 0, v = 0) approach. Thus, when the system is disturbed from its (unstable) equilibrium position (x = 0, v = 0), a periodic oscillation tends to develop. Full details are available in the book on nonlinear oscillations by Stoker, [1].

Let us assume, though, that the oscillations are undesirable ('parasitic'), and that the system can be controlled by varying the system parameter, μ, in a given range in an effort to maintain the system as close as possible to its equilibrium state.

Consider that the process begins at time 0 and terminates at time T, and that the system is initially in state (c1, c2), where c1 is the displacement, x, and c2 is the velocity, ẋ = v. We shall arbitrarily measure the 'cost' of deviation from equilibrium during the process by the integral

    J[μ] = ∫₀ᵀ (|x(t)| + |v(t)|) dt + exp(|x(T)| + |v(T)|),   (4)

where exp(z) is the exponential function of z. We deliberately use such a monstrous function in order to squelch any direct analytic approach in embryo. Our objective will be the determination of the system parameter μ as a function of the state of the system at time t and the time t itself, for 0 ≤ t ≤ T, in order to minimize J[μ]. The control function will be subject to a constraint

    m1 ≤ μ(t) ≤ m2,

where m1 may be negative. Notice that the criterion function is not the usual mean-square deviation, which in this case would be of little avail since the underlying equations are nonlinear. The first term on the right-hand side of Eq. (4) measures the cost of deviation during the entire course of the process and the second term measures the cost of deviation at the termination of control.

The temptation is to view this as a problem in the calculus of variations, [16], in which one seeks to determine μ = μ(t) as a function of time over the entire interval 0 ≤ t ≤ T in an attempt to minimize the functional J. The fact that μ is constrained to lie between m1 and m2 is a cause of some complication. Furthermore, there are no classical prototypes for the stochastic control processes we wish to study below.

The approach which we shall use, by contrast, emphasizes the feedback control nature of the problem. We shall imbed the original problem within a class of problems in which we regard the system as being in some general state x = c1, ẋ = c2 at the time t, and ask what the optimal choice of μ is under these circumstances. Notice (as a consequence of the usual existence and uniqueness results for differential equations) that the past history of the process is of no consequence in making this decision, only the current state. Pursuing this approach, in which we have a continuous decision process, [2], we characterize the curve μ = μ(t) as an envelope of tangents, rather than as a locus of points, as would be the case were the earlier viewpoint adopted.

In order to prepare the way, though, for the use of digital computing machines, we wish to reformulate the problem in terms of a discontinuous time variable, which will also materially simplify matters conceptually when we deal with the cases of stochastic and adaptive control.

3. A Discrete Version

The problem could be treated in the form in which it now stands, [2]. Since our objective is to devise methods which are particularly suitable for high-speed digital computational purposes, we prefer to reformulate the model itself in terms of discrete variables. It must be borne in mind that, in any event, digital computers consider all variables to be discrete.

The time interval from 0 to T is divided into N intervals of length h, so that

    Nh = T.   (1)

If at time kh the system is in state (x_k, v_k), and the control decision at that time is to have the system parameter be μ_k, then the new state at time (k + 1)h is given by the finite difference equations

    x_{k+1} = x_k + v_k h,   (2)
    v_{k+1} = v_k − [μ_k(x_k² − 1)v_k + x_k] h,

relations which hold for k = 0, 1, 2, ..., N − 1. These equations are the finite difference analogues of the equations in (2.3). The cost of the deviation from equilibrium from time kh to time (k + 1)h is taken to be (|x_k| + |v_k|)h, and the cost for deviation of the final state from equilibrium is exp(|x_N| + |v_N|). The total cost of deviation from the equilibrium state during the entire process is considered to be

    J = Σ_{k=0}^{N−1} (|x_k| + |v_k|) h + exp(|x_N| + |v_N|),   (3)

where μ_k is the value of the system parameter selected at time kh. Let us assume that the system is in state (c1, c2) initially and that we seek a set of parameter values, [μ_0, μ_1, ..., μ_{N−1}], which will minimize the total cost of deviation given in equation (3).

4. Deterministic Control

As stated, the problem requires a constrained N-dimensional minimization to be performed, and as such may be quite difficult to carry out computationally in its native form. Even so, this problem is conceptually much simpler than the original continuous version, which required a minimization over elements in a function space. To solve this new discrete problem we imbed the given decision process within a class of processes in such a way that we shall have a sequence of N simple one-dimensional optimizations to perform, rather than the one difficult N-dimensional problem. This decomposition makes possible an efficient machine solution.

The imbedding is accomplished by focusing our attention upon determining what value between m1 and m2 of the system parameter to choose at time (N − k)h if the system is then in some general state (a, b), where k may have any of the values 0, 1, 2, ..., N − 1 and a and b are any real numbers. The original discrete problem is one of the members of this class of problems. Notice that this is the general problem of interest to the feedback controller, for it must decide what value of the system parameter to call for with the system in some physical state (a, b) and kh time units remaining before the termination of the process.

To formulate the problem analytically, we note first that the minimal cost of deviation over the last k stages of the process, with the system starting this portion of the process in some state, say (c1, c2), is some definite function of k and c1 and c2. It is, namely, the cost that is incurred during the last k stages of the process using an optimal selection of the sequence of system parameter values with the system in state (c1, c2) at time (N − k)h. Let us therefore define, for k = 1, 2, ..., N, the functions

    f_k(c1, c2) = the cost of the last k stages of the control process, with the system beginning those last k stages in state (c1, c2) and using an optimal selection of the system parameter throughout the remainder of the process.   (1)

We shall determine first f_1(c1, c2), then f_2(c1, c2), and so on, until f_N(c1, c2) has been determined. At the same time, we shall determine the optimal choices of μ to make.

The function f_1(c1, c2) is easily determined. Here we are concerned with a process which begins at time (N − 1)h and terminates at time Nh = T, with the system in state (c1, c2) at time (N − 1)h. The cost of deviation during the process is (|c1| + |c2|)h. If the value of the system parameter selected is μ, then the state at the termination of the process will be given by the equations

    x_N = c1 + c2 h,   (2)
    v_N = c2 − [μ(c1² − 1)c2 + c1] h,

where use has been made of the formulas in (3.2). The cost of this terminal deviation is

    exp(|c1 + c2 h| + |c2 − [μ(c1² − 1)c2 + c1] h|).

Consequently, the system parameter μ must be selected so that the total cost, given by the expression

    (|c1| + |c2|) h + exp(|c1 + c2 h| + |c2 − [μ(c1² − 1)c2 + c1] h|),   (3)

is minimized. The minimizing value of μ for this one-stage process will depend on the state (c1, c2), and it can easily be determined by a digital computer using a search technique. The bracket is evaluated for sample values of μ in the range m1 ≤ μ ≤ m2, and the value of μ yielding the smallest value of the bracket is the optimal system parameter value. Here we see by inspection that μ should be chosen equal to m2 if (c1² − 1)c2 > 0 and equal to m1 if (c1² − 1)c2 < 0. If (c1² − 1)c2 = 0, the choice of μ is immaterial. Let us denote this optimal choice of μ for the one-stage process under consideration, initially in state (c1, c2), by μ_1(c1, c2).
We have the recurrence relation

    f_1(c1, c2) = (|c1| + |c2|) h + Min_{m1 ≤ μ ≤ m2} exp(|c1 + c2 h| + |c2 − [μ(c1² − 1)c2 + c1] h|),   (4)

where the right-hand side represents the minimum over all choices of μ between m1 and m2 of the expression in brackets. It is clear that this minimum value actually depends on c1 and c2.

Now let us assume that the functions f_1(c1, c2), f_2(c1, c2), ..., f_k(c1, c2), with k < N, have all been determined. We wish next to determine the function f_{k+1}(c1, c2). We do this by making use of the principle of optimality, which is a special, but quite important, application of the concept of invariant imbedding, [8]. According to this principle, an optimal sequence of decisions has the property that, whatever the initial decision and initial state are, the remaining decisions must constitute an optimal sequence of decisions with respect to the state which results from the first decision.

To apply this principle, let us suppose that at time (N − k − 1)h, with k + 1 decisions remaining and the system in some state (c1, c2), the first decision is to set the system parameter equal to μ. The effect of this decision is to transform the system into the state (x_k, v_k) at time (N − k)h (when k decisions remain), where

    x_k = c1 + c2 h,   (5)
    v_k = c2 − [μ(c1² − 1)c2 + c1] h.

From the cost point of view, we see that this results in a cost (|c1| + |c2|)h during the time interval from (N − k − 1)h to (N − k)h and (since optimal decisions must be made over the remaining k decisions, beginning with the system in the state (x_k, v_k) given by equation (5)) a cost of f_k(c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h) during the remainder of the process. Clearly, the choice of the system parameter at time (N − k − 1)h must be made so as to minimize the sum of these two costs. This observation results in the equation

    f_{k+1}(c1, c2) = (|c1| + |c2|) h + Min_{m1 ≤ μ ≤ m2} f_k(c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h).   (6)

In particular, since f_1(c1, c2) is known from the above discussion, f_2(c1, c2) can be determined from the formula

    f_2(c1, c2) = (|c1| + |c2|) h + Min_{m1 ≤ μ ≤ m2} f_1(c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h),   (7)

which follows from equation (6). The value of μ which minimizes the expression in brackets in equation (7) is the optimal value of μ to choose at time (N − 2)h (the initial stage of a two-stage decision process) with the system in state (c1, c2). We denote this optimal choice of μ by μ_2(c1, c2).

Similarly, the remaining minimal cost functions f_k(c1, c2) and optimal decision functions μ_k(c1, c2) can now be determined recursively.

In order to apply this solution, it is necessary, of course, to construct a control device that will call for the indicated optimal value of the system parameter μ for each state of the system and time remaining. Should this prove not to be feasible, other sub-optimal policies will have to be employed. These can also be determined by dynamic programming methods, by imposing suitable constraints on the allowable choice of μ and the information fed into the computer-controller. In the event of this sub-optimization, [13], the loss in system performance can be quantitatively assessed by comparison with the performance of an optimal controller.

5. Infinite Duration

In the event that the process is of sufficiently great duration, we may wish to approximate it by means of an infinitely long process. Furthermore, we may now wish to exert control so as to minimize the maximum deviation of the function

    |x(t)| + |v(t)|

over all time. Let us then define the function
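The backward recursion (6) can be sketched as a table-filling computation. This is a much-simplified illustration of ours, not the authors' program: the coarse state grid, the step size, the μ sample set, and the nearest-neighbour state lookup are all hypothetical stand-ins for whatever resolution and interpolation a production code would use.

```python
import math

H, M1, M2 = 0.1, -1.0, 1.0                       # hypothetical h, m1, m2
GRID = [i * 0.25 for i in range(-8, 9)]          # states from -2 to 2
MUS = [M1 + (M2 - M1) * i / 20 for i in range(21)]

def nearest(z):
    """Snap a transformed state coordinate back onto the grid."""
    return min(GRID, key=lambda g: abs(g - z))

def terminal(c1, c2):
    """Terminal deviation cost exp(|x_N| + |v_N|)."""
    return math.exp(abs(c1) + abs(c2))

def sweep(f_prev):
    """Given the table for f_k, return the table for f_{k+1} via (6)."""
    f_new = {}
    for c1 in GRID:
        for c2 in GRID:
            best = min(
                f_prev[nearest(c1 + c2 * H),
                       nearest(c2 - (mu * (c1**2 - 1) * c2 + c1) * H)]
                for mu in MUS)
            f_new[c1, c2] = (abs(c1) + abs(c2)) * H + best
    return f_new

# Start from the terminal cost and sweep backward five stages.
f = {(a, b): terminal(a, b) for a in GRID for b in GRID}
for _ in range(5):
    f = sweep(f)
```

Note the decomposition the text describes: each sweep performs only one-dimensional searches over μ, one per grid state, rather than a single N-dimensional minimization.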

    f(c1, c2) = the maximum value of |x| + |v| over all time, with the system initially in state (c1, c2), using an optimal control policy.   (1)

The relevant functional equation now becomes

    f(c1, c2) = Min_{m1 ≤ μ ≤ m2} max[ |c1| + |c2|, f(c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h) ].   (2)

This is based on the observation that μ must be chosen so that the larger of |c1| + |c2| and the greatest deviation over the remainder of the process will be as small as possible. A further discussion of such equations can be found in [2, 3].

6. Stochastic Control

Let us now complicate matters for the controller somewhat by assuming that the system is subjected to a random external force, the influence of which cannot be neglected. If we denote the random force which is exerted at time nh by r_n, then equations (3.2) become

    x_{n+1} = x_n + v_n h,   (1)
    v_{n+1} = v_n − [μ_n(x_n² − 1)v_n + x_n] h + r_n h.

The controller can no longer predict precisely what state the system will be transformed into when a value for μ is chosen, for the transformed state depends on the value that the random force r_n assumes.

For simplicity, we shall assume that the N random variables r_n are independent and that

    Prob(r_n = +δ) = p*,   Prob(r_n = −δ) = 1 − p*,   (2)

so that the disturbing force is either ±δ. The case where r_n is correlated with the value of x_n can also be considered, at some increase in complexity.

We do, however, wish to assume that the probability of a positive force has the known value p*. This last assumption is not justified in many situations. If the value of p* is not known, further complications arise, leading to the adaptive control processes discussed in §7.

Once again we wish to control the development of the system in such a way that the function

    J = Σ_{k=0}^{N−1} (|x_k| + |v_k|) h + exp(|x_N| + |v_N|)   (3)

is minimized. This can now only be accomplished in some average sense, since the x_k and v_k are random variables for k = 1, 2, ..., N. Once again we imbed the original process within a class of processes. Denoting the taking of an expected value by E, we define the sequence of functions

    f_k(c1, c2) = Min E{ Σ_{j=N−k}^{N−1} (|x_j| + |v_j|) h + exp(|x_N| + |v_N|) },   (4)

for k = 1, 2, ..., N, where

    x_{N−k} = c1,   v_{N−k} = c2.   (5)

Thus, f_k(c1, c2) represents the minimal expected total cost of deviation from equilibrium for a process beginning at time (N − k)h and terminating at time T = Nh, with the system initially in the state (c1, c2).

We first consider the one-stage process, which begins at time (N − 1)h and terminates at Nh = T. We have

    f_1(c1, c2) = Min_{m1 ≤ μ ≤ m2} E{ |c1| h + |c2| h + exp[ |c1 + c2 h| + |c2 − (μ(c1² − 1)c2 + c1) h + r h| ] },

where r is the force exerted during this stage, or, taking the expected value over r,

    f_1(c1, c2) = |c1| h + |c2| h + Min_{m1 ≤ μ ≤ m2} { p* exp[ |c1 + c2 h| + |c2 − (μ(c1² − 1)c2 + c1) h + δh| ]
                  + (1 − p*) exp[ |c1 + c2 h| + |c2 − (μ(c1² − 1)c2 + c1) h − δh| ] }.

Once again, this minimization can easily be performed by a computer, so that the functions f_1(c1, c2) and μ_1(c1, c2), the minimizing value of μ for each state (c1, c2), can be taken as known.
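The stochastic one-stage minimization admits the same direct-search treatment as the deterministic case. The sketch below is ours, with hypothetical numeric choices throughout; it averages the terminal cost over the two possible disturbances r = +δ and r = −δ with known probability p*.

```python
import math

def expected_one_stage_cost(c1, c2, mu, h, delta, p_star):
    """Expected total cost of the one-stage stochastic process:
    stage cost plus the p*-weighted average of the two terminal costs."""
    xN = c1 + c2 * h
    base = c2 - (mu * (c1**2 - 1.0) * c2 + c1) * h
    up = math.exp(abs(xN) + abs(base + delta * h))    # force was +delta
    down = math.exp(abs(xN) + abs(base - delta * h))  # force was -delta
    return (abs(c1) + abs(c2)) * h + p_star * up + (1 - p_star) * down

def best_mu(c1, c2, h, delta, p_star, m1, m2, samples=201):
    """Direct search over sample values of mu in [m1, m2]."""
    grid = [m1 + (m2 - m1) * i / (samples - 1) for i in range(samples)]
    return min(grid, key=lambda mu: expected_one_stage_cost(
        c1, c2, mu, h, delta, p_star))
```

With δ = 0 the expectation collapses and the search reduces to the deterministic one-stage problem of §4.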

Next we consider the process which is initiated at time (N − k)h with the system in a general state (c1, c2), so that the process involves k decisions. We wish to determine the optimal decision for the controller to make under these circumstances, and we denote the optimal value of μ for each state (c1, c2) by μ_k(c1, c2).

For any choice of the system parameter μ, the state of the system is transformed from the state (c1, c2) at time (N − k)h into the state

    (c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h + δh)

with probability p*, and into the state

    (c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h − δh)

with probability 1 − p*. Consequently, once again using the principle of optimality, we obtain the recurrence relation

    f_k(c1, c2) = (|c1| + |c2|) h + Min_{m1 ≤ μ ≤ m2} { p* f_{k−1}(c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h + δh)
                  + (1 − p*) f_{k−1}(c1 + c2 h, c2 − [μ(c1² − 1)c2 + c1] h − δh) },   (4)

k = 2, 3, ..., N. The first term represents the cost during the first period, from (N − k)h to (N − k + 1)h, and the second the minimal expected cost over the remaining k − 1 periods. As before in the deterministic case, we can determine computationally the desired functions f_k(c1, c2) and μ_k(c1, c2), using the foregoing recursive relations.

7. Adaptive Control

In some circumstances, even less information than was assumed in the previous section will be available to the controller concerning external influences which may affect the behavior of the system being controlled. Provision may be made, though, for the controller to "learn" about the nature of these influences as the process unfolds. It may then be able to improve its control decision-making capability in the course of time. In this sense the controller adapts itself to circumstances.

Observe that we are using the word in a quite precise sense. There is nothing mystical about the machine "thinking" or "creating" or "learning" in this restricted sense. That the human mind works in this way, or that the machine in any sense approximates the behavior of the human mind, can only be concluded on the basis of a rash evaluation of the possibilities of a digital computer or a brash contempt for the power of the human mind.

Let us return to our nonlinear system which is being disturbed by a random force. Let us now first deprive the controller of the knowledge of the exact value of p*. The controller still knows that r_n = ±δ, but the probability of each outcome is not known. Although this is an unpleasant situation, this controller is still much more fortunate than one that does not know even the form of the distributions of the variables r_n, or their degree of correlation. We shall not enter into a discussion of such matters here.

We can proceed with the design of an adaptive controller along the following lines. The state of the system will now be characterized not only by a position and a velocity, but also by a current estimate for p*, which in the absence of further information we shall agree to regard as the precise value of the probability that r_n = +δ.

At any particular stage of the process when a control decision is made, not only does the system change state physically, but, on the basis of the knowledge of the original physical state, the transformed physical state, and the parameter value chosen, the controller can determine the sign of the unknown force for that stage. This may lead the controller to change its estimate of p*. But how shall the estimate be changed?

Though there are many ways of answering this question, let us indicate one specific approach. Let us regard p itself as a random variable with an a priori probability density function w(x), i.e.,

    Prob(a ≤ p ≤ a + Δ) = w(a)Δ + O(Δ²).   (1)

As the initial estimate of p*, a value we shall call p_1, we take the expected value of p,

    p_1 = ∫₀¹ x w(x) dx.   (2)

Upon observing that a positive disturbing force is exerted, r = +δ, our new estimate of the probability density function for p will be given by

    w₊(x) = x w(x) / ∫₀¹ x w(x) dx.   (3)

Upon observing a negative disturbing force, r = −δ, we shall change our estimate of the probability density function to

    w₋(x) = (1 − x) w(x) / ∫₀¹ (1 − x) w(x) dx.   (4)

Here we have adopted a Bayes approach, [15]. This is the procedure adopted in [6,7,9]. Consequently, after observing a positive disturbing force, the new estimate of p* itself is
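The Bayes updates (3) and (4) can be carried out numerically by holding the density w(x) on a grid and renormalizing after each observed force. The sketch below is an illustration of ours, not the paper's procedure; the uniform prior and the 100-point midpoint grid are hypothetical choices.

```python
XS = [(i + 0.5) / 100 for i in range(100)]     # midpoint grid on (0, 1)
w = [1.0] * 100                                # uniform a priori density

def update(w, positive):
    """Multiply the density by x (positive force) or 1 - x (negative
    force), then renormalize so it integrates to one on (0, 1)."""
    w = [wi * (x if positive else 1.0 - x) for wi, x in zip(w, XS)]
    total = sum(w) / len(w)                    # midpoint-rule integral
    return [wi / total for wi in w]

def estimate(w):
    """Current estimate of p*: the mean of the posterior density."""
    return sum(wi * x for wi, x in zip(w, XS)) / len(w)

# After m positive and n negative forces under a uniform prior, the
# estimate should approach (m + 1)/(m + n + 2), the beta-style value
# of equation (7); e.g. three positive and one negative give about 2/3.
for _ in range(3):
    w = update(w, positive=True)
w = update(w, positive=False)
```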

heldfixed. Then we define the sequence of
functions.

fk(cl,c2;m,n) h
te expected cost of control
=
over the last k stages of a
and after observing a negative disturbing force, process in which i n i t i a l l y
the new estimate of p* i s the system is inphysical (10)
state (clc' 2 ) and m posi-
/- x(l - x)w(x)ax t i v e and n negative forces
havebeen observed, Using an
optimal control policy.
In taking expected values we shall assume that
current estimates of distributions are the true
Byway of s m r y we m y note that i f the a distributions, consistent with our previous
p r i o r i choice of probability density function i s practice.
w(x), then after m positive and n negative forces
havebeen observed the new estimate of the density For the k-stage
process we f i n d (U)
m c t i o n is

xm(l - xpW(x)
6 l x y 1 - X)%(X)dx '
+cl]h+Sh;m+l,n)
and p,, the new estim'e of p itself, i s

lxm+l(l - x)nw(x)ax
PmJn -4 lxm(l - x)nv(x)dx
(7) +cJbGh;m,n+l) ,
The controller i s t o a c t as i f this estimate i s f o r k=.2,3,...,N, and f o r the one-stage process
the exactvalueof p*. This should be recognized
as an assumption of our analysis.
The i n t e M s i n equation (7) s i m p ~ ~ if fy
w(x) is the density function for a beta distri-
bution,

where B(a,b) i s the beta function. G r e a t flexi-


b i l i t y i n the shape of the curve of w(x) can be
achieved by selecting the pamters a and b
appropriately. For this choiceof w(x) we have

4 xWa(l - x)n+bldx A s before, the functions rk and the decision


'm,n -l 1 xm+a-l(l - xln+b-ldx (9)
functions %(c1,c2;m,n)
recursively.
are t o be determined

me n u m e r i a r e s o l u t i o n of equations (11) and


- B(m + a + 1, n + b) ?resents some difficulties, however, i n that
(E)
- B(m + a, n + b) sequences of functions of four arguments a r e t o be
determined. Several methods present themselves
forconsideration, however. I n particular, we
- (m + a ) note that when m + n i s large we can essert with
(m +
a ) + (n + b )
some confidence that m/(m + n) i s a good est-te
The parameters a and b play the roles of t h e a f o r p*. The decisions called for in the solution
p r i o r i numbers of positive and negative forces of the stochastic control process discussed in 5'6
observed. If the sum i s small,not much w e i g h t should povide nearly optimal control decisions.
i s given t o t h e i n i t i a l estimate; i f it i s large, Some advantage may be gained by considering an
many periods of the process are required before i n f i n i t e stage process as i n 5 5 , which has the
the estimate of p* can be significantly changed. e f f e c t of eliminating one subscript. These, and
other matters, will be discussed i n a forthcoming
opt-
We now take up the problem of determining the
decisions for the controller to make.
thesis by M. Aoki fU] , .
F i r s t the function w(x) i s chosen and it i s then
7

Authorized licensed use limited to: the Leddy Library at the University of Windsor. Downloaded on December 08,2023 at 16:48:18 UTC from IEEE Xplore. Restrictions apply.
Notice that the adaptive controller discussed in this section does no "research" regarding the random variables. It merely observes the history of the process to date and combines this with its a priori knowledge in a way which is specified at the beginning of the process, in order to arrive at its current control decision. How the controller will act as the result of any particular observed history is fully specified initially. More sophisticated controllers will be designed to look for correlations, provide for non-stationarities in the unknown process, and so on. Much remains to be done along these lines.

8. An Extension

Let us now turn our attention to the case in which the system can be controlled both by modifying the system parameter p and also by exerting a control force, g_k, at time kh, where k = 0, 1, 2, ..., N - 1. The equation for the state of the system becomes

    x_{n+1} = x_n + v_n h,                                    x_0 = c_1,
    v_{n+1} = v_n - [p_n(x_n^2 - 1)v_n + x_n]h + r_n h + g_n h,    v_0 = c_2,    (1)

where now both p_n and g_n are at our disposal. As before, p_n is limited to lie within the region

    m_1 <= p_n <= m_2,    n = 0, 1, ..., N - 1,    (2)

and we shall also assume that the control force g_n is constrained by the inequality

    |g_n| <= G,    n = 0, 1, 2, ..., N - 1.

Upon introduction of the functions f_k(x, v; m, n), with

    f_1(c_1, c_2; m, n) = |c_1|h + |c_2|h + ...,

this case differs mathematically from the previous ones only in that the minimizations are now over a two-dimensional region rather than a one-dimensional region.

9. Discussion

In the earlier sections of this paper we have sketched a treatment of an adaptive control process from the dynamic programming viewpoint. Much remains to be done at various levels in the treatment of these fascinating control processes.

At the conceptual level, for example, models involving other types of uncertainties on the part of the controller, mentioned in §1, have yet to be constructed. One of the principal difficulties occurs in describing the state of knowledge of the controller and how this changes as new information is added. Furthermore, so much information may become available that a way must be found to summarize it succinctly without impairing the decision-making capability to a marked degree. In this connection, see the discussion of sufficient statistics in [14].

Insofar as the mathematical analysis itself is concerned, many perplexing problems arise, as, for example, questions concerning the convergence of discrete adaptive processes to continuous processes and the very formulation of adaptive processes of continuous type.
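The remark on sufficient statistics can be made concrete with a small sketch of our own (not the paper's): for the two-valued disturbing forces considered above, the pair of counts (m, n) summarizes the entire observed history, so any two histories with the same counts yield the same estimate of p* and hence the same control decision.

```python
# Sketch: the counts (m, n) of positive and negative forces are a
# sufficient statistic for the disturbance history; the posterior
# estimate of p* depends on the history only through (m, n).

def counts(history):
    """Reduce a history of +1/-1 forces to the pair (m, n)."""
    m = sum(1 for r in history if r > 0)
    return m, len(history) - m

def estimate(history, a=1, b=1):
    """Posterior-mean estimate of p*, computed only through (m, n)."""
    m, n = counts(history)
    return (m + a) / ((m + a) + (n + b))

h1 = [1, 1, -1, 1, -1, 1]    # 4 positive, 2 negative, one ordering
h2 = [-1, 1, 1, 1, -1, 1]    # same counts, different ordering
assert estimate(h1) == estimate(h2)   # identical decisions follow
```

This is the succinct summarization the text asks for: the growing history can be compressed to two integers without impairing the decision-making capability.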

Lastly, as we have already indicated, for the more realistic processes involving more state variables, the computational solutions present special problems of their own, all of which must be carefully investigated.

Another whole problem area beyond those already mentioned is encompassed by the actual construction of optimal adaptive controllers. Challenging problems arise in trying to pursue a straight and narrow path between the complexity of exact solutions and the fallibility of approximations.
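As one concrete instance of the computational questions raised above, the two-dimensional minimization of §8 might be carried out by a simple grid search over the rectangle m_1 <= p <= m_2, |g| <= G. The sketch below is ours, not the paper's: the one-stage cost |x|h + |v|h, the grid resolution, and the use of a current estimate p_est to weight the two possible disturbing forces are all illustrative assumptions.

```python
# A minimal sketch of the two-dimensional minimization of Section 8:
# at each stage both the parameter p and the control force g are chosen,
# subject to m1 <= p <= m2 and |g| <= G. Cost and grids are illustrative.

def step(x, v, p, g, r, h):
    """One Euler step of the controlled system of eq. (1)."""
    x_new = x + v * h
    v_new = v - (p * (x * x - 1.0) * v + x) * h + r * h + g * h
    return x_new, v_new

def best_choice(x, v, m1, m2, G, p_est, h, grid=11):
    """Grid search over [m1, m2] x [-G, G] for the pair (p, g)
    minimizing the expected next-stage cost |x|h + |v|h, with the
    random force r = +1 or r = -1 weighted by the estimate p_est."""
    best = None
    for i in range(grid):
        p = m1 + (m2 - m1) * i / (grid - 1)
        for j in range(grid):
            g = -G + 2.0 * G * j / (grid - 1)
            cost = 0.0
            for r, w in ((1.0, p_est), (-1.0, 1.0 - p_est)):
                xn, vn = step(x, v, p, g, r, h)
                cost += w * (abs(xn) + abs(vn)) * h
            if best is None or cost < best[0]:
                best = (cost, p, g)
    return best[1], best[2]

p, g = best_choice(x=0.5, v=0.0, m1=0.1, m2=1.0, G=2.0, p_est=0.5, h=0.1)
```

Even this crude search makes the computational burden visible: the grid must be revisited at every stage and for every (m, n) pair, which is exactly the difficulty the text points to for processes with more state variables.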

References

1. Stoker, J. J., Nonlinear Vibrations in Mechanical and Electrical Systems, Interscience Publishers, Inc., New York, 1950.
2. Bellman, R., 'On the application of the theory of dynamic programming to the study of control processes,' Proc. Symposium on Nonlinear Circuit Analysis, Polytechnic Institute of Brooklyn, Brooklyn, New York, 1957, pp. 199-213.

3. Bellman, R., 'Dynamic programming and stochastic control processes,' Information and Control, vol. 1, 1958, pp. 228-239.

4. Bellman, R., Dynamic Programming, Princeton University Press, Princeton, New Jersey, 1957.

5. Bellman, R., and R. Kalaba, 'On the role of dynamic programming in statistical communication theory,' IRE Trans. on Information Theory, vol. IT-3, 1957, pp. 197-203.

6. Bellman, R., and R. Kalaba, 'On communication processes involving learning and random duration,' 1958 IRE National Convention Record, part 4, July 1958, pp. 16-21.

7. Bellman, R., and R. Kalaba, Dynamic Programming and Adaptive Processes--I: Mathematical Foundation, The RAND Corporation, Paper P-1416, July 3, 1958.

8. Bellman, R., and R. Kalaba, 'On the principle of invariant imbedding and propagation through inhomogeneous media,' Proc. Nat. Acad. Sci. USA, vol. 42, 1956, pp. 629-632.

9. Bellman, R., 'A problem in the sequential design of experiments,' Sankhya, vol. 16, 1956, pp. 221-229.

10. Aseltine, J. A., A. R. Mancini, and C. W. Sarture, 'A survey of adaptive control systems,' IRE Trans. on Automatic Control, PGAC-6, Dec. 1958, pp. 102-108.

11. Aoki, M., Ph.D. Thesis, University of California at Los Angeles, to appear.

12. Freimer, M., Ph.D. Thesis, Harvard University, to appear.

13. Bellman, R., and R. Kalaba, On k-th Best Policies, The RAND Corporation, Paper P-1417, July 1958.

14. Mood, A. M., Introduction to the Theory of Statistics, McGraw-Hill Book Company, Inc., New York, 1950.

15. Cramer, H., Mathematical Methods of Statistics, Princeton University Press, Princeton, New Jersey, 1951.

16. Courant, R., and D. Hilbert, Methods of Mathematical Physics, vol. 1, Interscience Publishers, Inc., New York, 1953.