Uncoordinated Policy
Instead of the joint policy, the uncoordinated policy can be used. The uncoordinated policy assumes that the intruder never follows any advisories, which is the same assumption made by the current TCAS logic. However, instead of using linear projection as TCAS does, the uncoordinated MDP formulation models the intruder as following a white-noise acceleration model. The uncoordinated policy will tend to alert earlier and more often than the joint policy because it does not model the increased safety benefit due to the intruder responding to advisories.
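The difference between linear projection and a white-noise acceleration model can be illustrated with a minimal sketch. The function names, the one-second time step, and the acceleration standard deviation are assumptions chosen for illustration; they are not parameters given in the text.

```python
import random


def project_linear(h_ft, h_dot_fps, steps, dt=1.0):
    """Straight-line projection: the vertical rate is held constant."""
    return h_ft + h_dot_fps * dt * steps


def sample_white_noise_altitude(h_ft, h_dot_fps, steps, sigma_accel, dt=1.0, rng=None):
    """Sample one future altitude under zero-mean Gaussian vertical acceleration."""
    if rng is None:
        rng = random.Random()
    for _ in range(steps):
        accel = rng.gauss(0.0, sigma_accel)  # independent draw at every step
        h_ft += h_dot_fps * dt + 0.5 * accel * dt ** 2
        h_dot_fps += accel * dt
    return h_ft


rng = random.Random(0)
# sigma_accel = 3 ft/s^2 is an illustrative value, not taken from the text.
samples = [sample_white_noise_altitude(0.0, 0.0, 20, 3.0, rng=rng) for _ in range(1000)]
linear = project_linear(0.0, 0.0, 20)
spread = max(samples) - min(samples)
```

Linear projection yields a single future altitude, whereas the white-noise model yields a distribution whose spread grows with the look-ahead time. With the noise set to zero, the two models coincide.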
[Figure panels: (a) ḣ0 = 0 ft/min, ḣ1 = 0 ft/min, sRA = COC/COC; (b) ḣ0 = 1500 ft/min, ḣ1 = 0 ft/min, sRA = COC/COC. The horizontal axis of each panel runs from 0 to 40.]
If both aircraft use the uncoordinated policy, they will issue compatible advisories most of the time when there is no sensor noise. However, it is easy to construct situations where, in the absence of communication, the aircraft issue incompatible advisories even with perfect state information. For example, if the aircraft are co-altitude, level, and neither system has issued an advisory, both aircraft will be in the same state and, consequently, issue identical advisories, such as both climbing, which are incompatible. When there is sensor noise, the aircraft can have different views of the world, resulting in more incompatible advisories. Communicating intended advisories between aircraft during execution can help resolve this issue.
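The symmetric case can be made concrete with a toy example. The rule below is a hypothetical deterministic stand-in for a policy, not the optimized MDP policy, and it assumes that an alert is already warranted; the advisory names are illustrative.

```python
def toy_policy(relative_altitude_ft):
    """Hypothetical rule: climb if at or above the intruder, otherwise descend."""
    return "CLIMB" if relative_altitude_ft >= 0.0 else "DESCEND"


def advisories_for_pair(altitude_difference_ft):
    """Each aircraft applies the same rule to its own view of the encounter."""
    own_view = altitude_difference_ft        # aircraft A relative to aircraft B
    intruder_view = -altitude_difference_ft  # aircraft B relative to aircraft A
    return toy_policy(own_view), toy_policy(intruder_view)


def compatible(advisory_a, advisory_b):
    """Two maneuvers are treated as compatible only if their senses differ."""
    return advisory_a != advisory_b


co_altitude = advisories_for_pair(0.0)  # both aircraft observe the same state
offset = advisories_for_pair(100.0)     # the views are mirrored, not equal
```

At co-altitude both aircraft observe the same state and the rule returns the same sense for each, so the pair is incompatible. With a 100 ft offset the mirrored views produce opposite senses.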
Communication Policy
The problem with the uncoordinated policy is that it does not account for the other aircraft following its advisories. However, the uncoordinated model can be augmented to include the joint advisory state. During execution, updating the posterior distribution over the joint advisory state would involve taking into account the advisory being executed by the intruder, as inferred from the coordination message. The coordination scheme currently used by TCAS does not allow the exact advisory of the intruder to be communicated, only the complement of the sense employed by the aircraft. However, the VRC can be used to infer a distribution over the intruder advisories and, consequently, the posterior over the joint advisory state.
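One way such an inference could work is a Bayesian update in which the coordination message reveals only the sense of the intruder advisory. The sketch below is an assumption-laden illustration: the advisory set, the prior, and the encoding of the message as "up", "down", or None are all hypothetical and do not reproduce the actual TCAS message format.

```python
# Hypothetical advisory set mapped to the sense each advisory implies.
ADVISORY_SENSE = {
    "COC": None,
    "CL1500": "up",
    "SCL2500": "up",
    "DES1500": "down",
    "SDES2500": "down",
}


def posterior_over_intruder_advisory(prior, observed_sense):
    """Condition a prior over intruder advisories on the sense implied by the message."""
    weights = {
        adv: (p if ADVISORY_SENSE[adv] == observed_sense else 0.0)
        for adv, p in prior.items()
    }
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("observed sense has zero prior probability")
    return {adv: w / total for adv, w in weights.items()}


# Illustrative prior, not taken from the text.
prior = {"COC": 0.6, "CL1500": 0.15, "SCL2500": 0.05, "DES1500": 0.15, "SDES2500": 0.05}
posterior = posterior_over_intruder_advisory(prior, "up")
```

Because the own advisory is known to the own aircraft, a posterior over the intruder advisory directly gives a posterior over the joint advisory state. In this example the probability mass is shared between the two climb-sense advisories in proportion to their prior weights.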
Introducing the joint advisory state as a state variable in the MDP requires defining its dynamics. One way to do this is to assume that the intruder will continue executing its active advisory indefinitely. Although, in reality, the intruder may reverse, strengthen, or discontinue the advisory, modeling the intruder advisory as constant may still result in a sensible policy. Usually, advisories do not have to be reversed. If the intruder strengthens, it will only result in greater separation between the aircraft, so modeling the intruder as performing a less severe maneuver will only result in more conservative behavior. If the intruder discontinues the advisory, it is usually because the probability of collision at that point is negligible.
Assuming the intruder continues executing the same action indefinitely makes specifying the model very simple. It requires defining Pr(s′RA | sRA, a), where s′RA is the joint advisory state at the next step, sRA is the current joint advisory state, and a is the action executed by the own aircraft. If it is important to model changes in the advisory of the intruder, it can be done by introducing a dependence on the other state variables: h, ḣ0, ḣ1, and τ. These other state variables influence the action taken by the intruder and, consequently, sRA at the next step. It is not expected that the optimized policy will be overly sensitive to the action model of the intruder. The action model could be based on the uncoordinated policy, the joint policy, or the policy of any system (e.g., TCAS) with which the system must interoperate.
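Under the constant-intruder assumption, the advisory dynamics reduce to a deterministic transition. The following sketch assumes, for illustration only, that the joint advisory state is a pair (own advisory, intruder advisory), that the own component becomes the chosen action, and that the advisory set has three hypothetical members.

```python
from itertools import product

# Hypothetical advisory set; the real system uses a richer set.
ADVISORIES = ("COC", "CLIMB", "DESCEND")
JOINT_STATES = tuple(product(ADVISORIES, ADVISORIES))  # (own, intruder) pairs


def advisory_transition(s_ra, action):
    """Return Pr(s'_RA | s_RA, a) as a dict over all joint advisory states.

    Assumption: the own advisory becomes `action` and the intruder advisory
    is carried over unchanged, so all probability mass lies on one state.
    """
    _, intruder = s_ra
    return {
        s_next: (1.0 if s_next == (action, intruder) else 0.0)
        for s_next in JOINT_STATES
    }


dist = advisory_transition(("COC", "DESCEND"), "CLIMB")
```

Modeling changes in the intruder advisory would replace the single unit mass with a distribution over the intruder component, conditioned on the other state variables.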