Section 9 generalizes the logic to handle encounters in which multiple intruder aircraft pose a threat to the own aircraft. Because adding new state variables for each intruder does not scale, this section examines different approximation methods, such as command arbitration and utility fusion, that decompose the full problem into smaller, more manageable subproblems. The section presents ways to visualize the behavior of the multithreat logic as well as simulation results.
Section 10 concludes the report, summarizing the major contributions and outlining some directions for future work.
Appendix A describes the file format used to represent the expected cost table.

Appendix B provides an overview of the Kalman filter and an extension to the Kalman filter, called the unscented Kalman filter, used to approximate the effects of nonlinear dynamic and measurement functions. The appendix discusses the details of the horizontal and vertical filters used in Section 7.


2. PROBLEM FORMULATION

If a collision avoidance system is equipped with nearly perfect sensors, the problem of collision avoidance can be framed as a Markov decision process (MDP). In an MDP, the state of the world is assumed to evolve according to a fixed dynamic model. A solution to an MDP is a strategy that maximizes performance according to some metric. Due to the assumptions made by the dynamic model and the performance metric, to be outlined in this section, solutions may be found efficiently using a computational technique called dynamic programming (DP).
In cases where a collision avoidance system is equipped with noisy sensors, the problem is better framed as a partially observable Markov decision process (POMDP), which augments the MDP formulation with an observation model that is used to generate observations based on the state of the world. This section discusses both MDPs and POMDPs, and later sections will show how they can be applied to collision avoidance.
2.1 MARKOV DECISION PROCESSES
MDPs have been well studied since the 1950s and have been applied to a wide variety of problems [16–18]. They require that the dynamic model be Markovian, meaning that the probability of transitioning to state s′ depends only on the current state s and action a. This probability is denoted T(s, a, s′), and T is often called the state-transition function. So long as sufficient information about the problem can be encoded in the state, the Markov assumption is usually valid. The set of possible states is denoted S and the set of possible actions is denoted A.
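To make the notation concrete, the following toy sketch (a hypothetical illustration in Python, not taken from the report) encodes a two-state, two-action MDP, with T represented so that T[s][a][s2] gives the probability of transitioning from s to s2 under action a:

    # Hypothetical toy MDP; the state and action labels are illustrative only.
    S = ["near", "far"]          # set of possible states
    A = ["climb", "level"]       # set of possible actions

    # T[s][a][s2] = probability of moving from state s to s2 under action a
    T = {
        "near": {"climb": {"near": 0.2, "far": 0.8},
                 "level": {"near": 0.9, "far": 0.1}},
        "far":  {"climb": {"near": 0.0, "far": 1.0},
                 "level": {"near": 0.3, "far": 0.7}},
    }

    # Markov property: the next-state distribution depends only on (s, a),
    # so each T[s][a] must be a valid probability distribution.
    for s in S:
        for a in A:
            assert abs(sum(T[s][a].values()) - 1.0) < 1e-9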
Solving an MDP involves searching for a strategy for choosing actions, also called a policy, that maximizes a performance metric. Although policies can depend upon the entire history of states and actions, it is well known that under certain assumptions regarding the structure of the performance metric, it is sufficient to consider only policies that deterministically depend on the current state, without losing optimality [19]. Given a policy π, the action to execute from state s is denoted π(s).
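Continuing the toy example above (the state and action labels remain hypothetical), a deterministic policy of this kind is simply a lookup from the current state:

    # Deterministic stationary policy: the action depends only on the current state.
    policy = {"near": "climb", "far": "level"}

    def pi(s):
        return policy[s]    # corresponds to the notation pi(s)

A history-dependent policy would instead take the entire sequence of past states and actions as input; the result cited above says that, for performance metrics of the kind considered here, nothing is gained from that extra generality.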
There are several common performance metrics, also called optimality criteria, typically used with MDPs. One common metric is the expected sum of instantaneous rewards up to some fixed horizon. The optimal policy is the one that maximizes this metric. Because collision avoidance typically involves avoiding particular events, such as collision and alerting, it is a little easier to define the metric in terms of positive costs instead of negative rewards. The objective, then, is to minimize the expected sum of instantaneous costs. In this report, the word “cost” is used to mean “sum of instantaneous costs.” When “instantaneous cost” is intended, it will always be written out as such.¹
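As a sketch of how dynamic programming applies to this criterion (an illustrative assumption, not code from the report), the following reuses the toy S, A, and T defined above, together with a hypothetical instantaneous cost function c, and computes the minimum expected sum of instantaneous costs over a fixed horizon K by backward induction:

    # Hypothetical instantaneous cost: alerting (climb) costs 1 per step,
    # and remaining "near" while flying level costs 10 per step.
    def c(s, a):
        return 1.0 if a == "climb" else (10.0 if s == "near" else 0.0)

    K = 10                                   # fixed horizon
    J = {s: 0.0 for s in S}                  # zero cost with no steps remaining
    for k in range(1, K + 1):
        J_new, policy_k = {}, {}
        for s in S:
            # Expected cost of each action: instantaneous cost plus the
            # probability-weighted optimal cost with one fewer step remaining.
            q = {a: c(s, a) + sum(p * J[s2] for s2, p in T[s][a].items())
                 for a in A}
            best = min(q, key=q.get)
            J_new[s], policy_k[s] = q[best], best
        J = J_new    # J[s] is now the optimal expected cost over horizon k

After the loop, J[s] is the minimum expected cost starting from state s, and policy_k gives the cost-minimizing action for the first step of a K-step encounter.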