Robust Airborne Collision Avoidance through Dynamic Programm(26)


1. The state su(t + 1) depends only upon su(t). The probability of transitioning from su to s′u is given by T(su, s′u).
2. The immediate cost c(t + 1) depends only upon sc(t) and a(t). If the controlled state is sc and action a is executed, the immediate cost is denoted C(sc, a).

3. The episode terminates when su ∈ G ⊂ Su with immediate cost C(sc).

Figure 14 shows the influence diagram for this model.
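
The three assumptions above can be illustrated with a small simulation sketch. All names here (`simulate_episode`, `T_u`, `T_c`, `C`, `C_term`) and the toy three-state model are hypothetical and chosen only for illustration. The sketch assumes finite state and action sets indexed by integers, and it assumes the controlled state evolves according to T(sc, a, s′c), the transition function used in Eq. (21).

```python
import numpy as np

def simulate_episode(T_u, T_c, C, C_term, G, su, sc, policy, rng, max_steps=1000):
    """Accumulate cost for one episode of a partially controlled model.

    T_u[su, su2]    : uncontrolled transition probabilities (assumption 1)
    T_c[sc, a, sc2] : controlled transition probabilities
    C[sc, a]        : immediate nonterminal cost (assumption 2)
    C_term[sc]      : terminal cost once su enters G (assumption 3)
    """
    total = 0.0
    for _ in range(max_steps):
        if su in G:                       # episode ends when su enters G
            return total + C_term[sc]
        a = policy(su, sc)
        total += C[sc, a]                 # cost depends only on sc and a
        # su evolves independently of the action; sc depends on the action.
        su = int(rng.choice(T_u.shape[0], p=T_u[su]))
        sc = int(rng.choice(T_c.shape[2], p=T_c[sc, a]))
    return total

# Toy model (illustrative values only): su moves 0 -> 1 -> 2 and G = {2}.
T_u = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
# sc = 1 is "in conflict", sc = 0 is "separated"; action 1 resolves the conflict.
T_c = np.array([[[1.0, 0.0], [1.0, 0.0]],
                [[0.0, 1.0], [1.0, 0.0]]])
C = np.array([[0.0, 0.25], [0.0, 0.25]])  # action 1 costs 0.25 per step
C_term = np.array([0.0, 1.0])
rng = np.random.default_rng(0)

cost_alert = simulate_episode(T_u, T_c, C, C_term, {2}, 0, 1, lambda su, sc: 1, rng)
cost_quiet = simulate_episode(T_u, T_c, C, C_term, {2}, 0, 1, lambda su, sc: 0, rng)
```

Because the toy transitions are deterministic, the always-alert policy pays the alert cost twice and avoids the terminal cost, while the never-alert policy pays only the terminal cost.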

In the collision avoidance problem, sc represents the state of the vertical motion variables, and su represents the state of the horizontal motion variables. The first assumption is satisfied because the advisories issued by the collision avoidance system do not influence the horizontal motion. The second assumption is satisfied because the immediate nonterminal cost depends only on the advisory state and the advisory being issued.
The third assumption requires the episode to terminate when su enters G. In this problem, G is the set of states where there is a horizontal NMAC, defined to be when an intruder comes within 500 ft horizontally. The immediate cost when this occurs is given by C(sc), which is one when the intruder is within 100 ft vertically and zero otherwise. In simulation, the episode does not necessarily terminate when su enters G, since entering G does not necessarily imply that there has been an NMAC (e.g., the two aircraft may have safely missed each other by 1000 ft vertically). However, it is generally sufficient to plan up to the moment where su enters G because adequate separation at that moment generally indicates that the encounter has been resolved.
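
As a minimal sketch of the two thresholds just described, the membership test for G and the terminal cost C(sc) might look as follows. The function names, the use of continuous range and relative altitude in feet as inputs, and the strict inequality for "within" are assumptions made for illustration.

```python
HORIZONTAL_NMAC_FT = 500.0  # horizontal threshold that defines G
VERTICAL_NMAC_FT = 100.0    # vertical threshold used by the terminal cost

def in_G(horizontal_range_ft: float) -> bool:
    """True when the intruder is within 500 ft horizontally (su is in G)."""
    return horizontal_range_ft < HORIZONTAL_NMAC_FT

def terminal_cost(relative_altitude_ft: float) -> float:
    """C(sc): one if within 100 ft vertically when su enters G, else zero."""
    return 1.0 if abs(relative_altitude_ft) < VERTICAL_NMAC_FT else 0.0
```

For example, an intruder that enters G while 1000 ft above or below incurs a terminal cost of zero, which matches the safely-missed case mentioned in the text.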

5.2 CONTROLLED SUBPROBLEM
Solving the controlled subproblem involves computing the optimal policy for the controlled variables under the assumption that the time until su enters G, denoted τ, is known. In the collision avoidance problem, τ is the number of steps until another aircraft comes within 500 ft horizontally. Of course, τ cannot be determined exactly from su(t) because it depends upon an event that occurs in the future, but this will be addressed by the uncontrolled subproblem (Section 5.3).

The expected cost from sc given τ is denoted Jτ(sc). The series J0, ..., JK is computed recursively, starting with J0(sc) = C(sc) for all controlled states and iterating as follows:
\[ J_k(s_c) = \min_a \Big[ C(s_c, a) + \sum_{s'_c} T(s_c, a, s'_c)\, J_{k-1}(s'_c) \Big] \qquad (21) \]
The expected cost from sc when executing a for one step and then following the optimal policy is given by
\[ J_k(s_c, a) = C(s_c, a) + \sum_{s'_c} T(s_c, a, s'_c)\, J_{k-1}(s'_c) \qquad (22) \]
The K-step expected cost when τ > K is denoted $J_{\bar{K}}$. It is computed by initializing J0(sc) = 0 for all states and iterating Eq. (21) to horizon K. The series J0, ..., JK, $J_{\bar{K}}$ is saved in a table in memory, requiring O(K|A||Sc|) entries.
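
The recursion in Eqs. (21) and (22) and the two initializations can be sketched with dense NumPy arrays. This is only an illustrative sketch: the function name `controlled_tables`, the array layout, and the toy two-state model are assumptions, and the dense representation is a simplification (the text notes that only valid state-action pairs are stored).

```python
import numpy as np

def controlled_tables(T_c, C, C_term, K):
    """Compute J_0..J_K, the state-action costs, and the K-bar tables.

    T_c[s, a, s2] = T(sc, a, s'c), C[s, a] = C(sc, a), C_term[s] = C(sc).
    """
    n_s, n_a = C.shape

    def iterate(J0):
        J = np.empty((K + 1, n_s))
        Q = np.empty((K + 1, n_s, n_a))
        J[0] = J0
        Q[0] = J0[:, None]             # no action at k = 0; broadcast for shape
        for k in range(1, K + 1):
            Q[k] = C + T_c @ J[k - 1]  # Eq. (22): one step, then optimal
            J[k] = Q[k].min(axis=1)    # Eq. (21): minimize over actions
        return J, Q

    J, Q = iterate(C_term)             # tau known: start from J_0 = C(sc)
    J_zero, Q_zero = iterate(np.zeros(n_s))  # tau > K: start from J_0 = 0
    return J, Q, J_zero[K], Q_zero[K]

# Toy model (illustrative values only): state 1 is "in conflict",
# action 1 costs 0.25 and moves state 1 to the conflict-free state 0.
T_c = np.array([[[1.0, 0.0], [1.0, 0.0]],
                [[0.0, 1.0], [1.0, 0.0]]])
C = np.array([[0.0, 0.25], [0.0, 0.25]])
C_term = np.array([0.0, 1.0])

J, Q, J_bar, Q_bar = controlled_tables(T_c, C, C_term, K=2)
```

In this toy model, one step before entry the conflict state prefers to pay 0.25 for the action rather than incur the terminal cost of one, so `J[1]` is `[0, 0.25]`. The zero-initialized table sees no terminal cost within the horizon, so `J_bar` is zero everywhere.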
For the collision avoidance problem, the tables were computed offline in less than two minutes on a single 3 GHz Intel Xeon core using a horizon of K = 39 steps. Storing only the values for the valid state-action pairs requires 263 MB using a 64-bit floating point representation. For the experiments in this section, the same model and cost parameters assumed in Section 3 were used, except the cost of alerting was decreased to 0.001.
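
As a rough check on the storage figure quoted above, the number of stored values follows from dividing the table size by eight bytes per 64-bit value. The text does not say whether MB means 10^6 or 2^20 bytes, so this sketch (with the hypothetical helper `stored_values`) computes both as assumptions.

```python
BYTES_PER_VALUE = 8  # one 64-bit floating point number

def stored_values(size_mb: int, bytes_per_mb: int) -> int:
    """Number of 64-bit values that fit in a table of the given size."""
    return size_mb * bytes_per_mb // BYTES_PER_VALUE

decimal_estimate = stored_values(263, 10**6)  # assuming MB = 10^6 bytes
binary_estimate = stored_values(263, 2**20)   # assuming MB = 2^20 bytes
```

Under either reading, the table holds on the order of 3 x 10^7 values across all horizons.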

5.3 UNCONTROLLED SUBPROBLEM
Solving the uncontrolled subproblem involves using the probabilistic model of the uncontrolled dynamics to infer a distribution over τ for each uncontrolled state su. This distribution is referred to as the entry time distribution because it represents the distribution over the time for su to enter