5.4 ONLINE SOLUTION
After $J_0, \ldots, J_K, J_{\bar{K}}$ and $D_0, \ldots, D_K, D_{\bar{K}}$ have been computed offline, they are used together online to determine the approximately optimal action to execute from the current state. For any discrete state $s$ in the original state space, the expected cost $J^*_K(s, a)$ may be computed as follows:

$$J^*_K(s, a) = D_{\bar{K}}(s_u)\, J_{\bar{K}}(s_c, a) + \sum_{k=0}^{K} D_k(s_u)\, J_k(s_c, a), \qquad (27)$$

where $s_u$ is the discrete uncontrolled state and $s_c$ is the discrete controlled state associated with $s$. Combining the controlled and uncontrolled solutions online in this way requires time linear in the size of the horizon. Multilinear interpolation can be used to estimate $J^*_K(x, a)$ for an arbitrary state $x$, and from this the optimal action may be obtained.
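The online combination in equation (27) can be sketched in a few lines of Python. This is a minimal illustration, not the report's implementation: the array layout, the function names `combine_online` and `best_action`, and the use of NumPy are all assumptions, and the multilinear interpolation step is omitted. Because the quantity is an expected cost, the sketch takes the optimal action to be the one that minimizes it.

```python
import numpy as np

def combine_online(D, D_bar, J, J_bar, su, sc):
    """Expected cost of each action at the discrete state (su, sc).

    Assumed (hypothetical) table layout:
      D[k, su]     entry-time weight D_k(s_u),     shape (K+1, n_su)
      D_bar[su]    weight for the beyond-K term,   shape (n_su,)
      J[k, sc, a]  controlled cost J_k(s_c, a),    shape (K+1, n_sc, n_a)
      J_bar[sc, a] beyond-K controlled cost,       shape (n_sc, n_a)
    """
    # Sum over k of D_k(s_u) * J_k(s_c, a); work is linear in the horizon K.
    horizon_term = D[:, su] @ J[:, sc, :]
    # Add the term for entry times beyond the horizon.
    return horizon_term + D_bar[su] * J_bar[sc]

def best_action(D, D_bar, J, J_bar, su, sc):
    """Index of the action with the lowest expected cost (assumes cost minimization)."""
    return int(np.argmin(combine_online(D, D_bar, J, J_bar, su, sc)))

# Tiny made-up tables: K = 1, one uncontrolled state, one controlled state, two actions.
D = np.array([[0.2], [0.3]])
D_bar = np.array([0.5])
J = np.array([[[1.0, 2.0]], [[3.0, 4.0]]])
J_bar = np.array([[10.0, 20.0]])

costs = combine_online(D, D_bar, J, J_bar, su=0, sc=0)
action = best_action(D, D_bar, J, J_bar, su=0, sc=0)
```

Only one row of each table is touched per query, which is why the online cost grows with the horizon rather than with the size of the state space.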
The memory requirement for directly storing the true $J^*_K(s, a)$ is $O(|A||S_c||S_u|)$. However, the hybrid offline-online method presented in this section allows the solution to be represented using $O(K|A||S_c| + K|S_u|)$ storage, which can be a tremendous saving when $|S_c|$ and $|S_u|$ are large. For the collision avoidance problem, this method allows the cost table to be stored in 500 MB instead of over 1 TB. The offline computational savings are even more significant.
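To make the two storage bounds concrete, the sketch below counts table entries for each representation. The function names and the problem sizes are hypothetical and chosen only for illustration; they are not the sizes of the collision avoidance problem in this report.

```python
def direct_entries(n_actions, n_sc, n_su):
    """Entries needed to store one cost per (action, s_c, s_u) triple."""
    return n_actions * n_sc * n_su

def hybrid_entries(n_tables, n_actions, n_sc, n_su):
    """Entries for n_tables controlled cost tables plus n_tables entry-time tables."""
    return n_tables * n_actions * n_sc + n_tables * n_su

# Hypothetical sizes (assumptions, not values from the report).
n_actions, n_sc, n_su = 3, 1_000, 100_000
n_tables = 40 + 2  # tables for k = 0..K with K = 40, plus the beyond-horizon table

direct = direct_entries(n_actions, n_sc, n_su)
hybrid = hybrid_entries(n_tables, n_actions, n_sc, n_su)
ratio = direct / hybrid
```

The direct count grows with the product of the controlled and uncontrolled state-space sizes, while the hybrid count grows with their sum scaled by the horizon, so the gap widens as either space grows.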
[Figure 16: panels for the Simple, Monte Carlo, and Dynamic Programming methods at rv = 250 ft/s and rv = 500 ft/s; axes in units of ×10^4 ft.]
Figure 16. Mean of the entry distribution for two slices of the state space when the relative horizontal velocity is pointing directly left. Horizontal axis represents the relative x displacement (ft) and the vertical axis the relative y displacement (ft).
5.5 EXAMPLE ENCOUNTER
Figure 17 shows an example encounter comparing the behavior of the system using the DP entry time distribution against the TCAS logic. The encounter was produced using the white-noise encounter model. Figure 18 shows the entry time distribution computed using the three methods of Section