when multiple human models were used
for the comparison.
The use of cosine similarity in Donaway et al. (2000) is more directly related to our work. They show that the difference between evaluations based on two different human models is about the same as the difference between the system ranking based on one model summary and the ranking produced using input-summary similarity. Inputs and summaries were compared using only one metric: cosine similarity.

³The scores were computed after stemming, but stop words were retained in the summaries.
Kullback-Leibler (KL) divergence: The KL divergence between two probability distributions P and Q is given by

D(P \| Q) = \sum_{w} p_P(w) \log_2 \frac{p_P(w)}{p_Q(w)}    (1)
It can be interpreted as the average number of bits wasted by coding samples from P using a code based on Q, an approximation of P. In our case, the two distributions are those of the words in the input and in the summary, respectively. Since KL divergence is not symmetric, both input-summary and summary-input divergences are used as features. In addition, the divergence is undefined when p_P(w) > 0 but p_Q(w) = 0. We perform simple smoothing to overcome this problem.
p(w) = \frac{C + \delta}{N + \delta \cdot B}    (2)

Here C is the count of word w and N is the number of tokens; B = 1.5|V|, where V is the input vocabulary, and \delta was set to a small value of 0.0005 to avoid shifting too much probability mass to unseen events.
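As a concrete illustration, here is a minimal Python sketch of the smoothed distribution of Equation 2 and the KL divergence of Equation 1. The function names, the dictionary representation of distributions, and the use of the input vocabulary for both distributions are assumptions of this sketch, not details from the paper.

```python
import math
from collections import Counter

def smoothed_distribution(tokens, vocab, delta=0.0005):
    # Eq. 2: p(w) = (C + delta) / (N + delta * B), with B = 1.5 * |V|,
    # where V is the input vocabulary and C is the count of w in tokens.
    counts = Counter(tokens)
    N = len(tokens)
    B = 1.5 * len(vocab)
    return {w: (counts[w] + delta) / (N + delta * B) for w in vocab}

def kl_divergence(p, q):
    # Eq. 1: D(P||Q) = sum_w p(w) * log2(p(w) / q(w)).
    # q must be smoothed so that q(w) > 0 whenever p(w) > 0.
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)
```

Because KL divergence is asymmetric, both kl_divergence(input_dist, summary_dist) and kl_divergence(summary_dist, input_dist) would serve as features.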
Jensen-Shannon (JS) divergence: The JS divergence incorporates the idea that the distance between two distributions cannot be very different from the average of their distances from their mean distribution. It is formally defined as

J(P \| Q) = \frac{1}{2} \left[ D(P \| A) + D(Q \| A) \right],    (3)

where A = \frac{P + Q}{2} is the mean distribution of P and Q. In contrast to KL divergence, the JS distance
is symmetric and always defined. We use
both smoothed and unsmoothed versions of the divergence
as features.
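Continuing the sketch above, a minimal implementation of Equation 3 might look as follows; it reuses kl_divergence from the previous snippet, and the dictionary representation of distributions is again an assumption.

```python
def js_divergence(p, q):
    # Eq. 3: J(P||Q) = 0.5 * [D(P||A) + D(Q||A)], where A is the mean
    # distribution. A(w) > 0 wherever p(w) > 0 or q(w) > 0, so the
    # divergence is always defined, even without smoothing.
    a = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in set(p) | set(q)}
    return 0.5 * (kl_divergence(p, a) + kl_divergence(q, a))
```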
Similarity between input and summary: The third metric is the cosine overlap between the tf \cdot idf vector representations (with max-tf normalization) of the input and summary contents:

\cos(v_{inp}, v_{summ}) = \frac{v_{inp} \cdot v_{summ}}{\|v_{inp}\| \, \|v_{summ}\|}    (4)
We compute two variants:
1. Vectors contain all words from input and summary.
2. Vectors contain only topic signatures from the input and all words of the summary.
Topic signatures are words highly descriptive of the input, as determined by the application of the log-likelihood test (Lin and Hovy, 2000). Using only the topic signatures from the input to represent the text is expected to be more accurate, because the reduced vector has fewer dimensions than one built from all the words of the input.
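A sparse-vector sketch of Equation 4 and the two variants is given below; the idf weights are assumed to be precomputed from some background corpus, and restrict_to is a hypothetical argument used to keep only the input's topic signatures.

```python
import math
from collections import Counter

def tfidf_vector(tokens, idf, restrict_to=None):
    # Max-tf normalized tf.idf: tf(w) = count(w) / max count of any word.
    # If restrict_to is given, keep only those words (second variant:
    # topic signatures for the input vector, all words for the summary).
    counts = Counter(tokens)
    max_tf = max(counts.values())
    return {w: (c / max_tf) * idf.get(w, 0.0)
            for w, c in counts.items()
            if restrict_to is None or w in restrict_to}

def cosine(u, v):
    # Eq. 4 over sparse dict vectors.
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```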
4.2 Summary likelihood
The likelihood of a word appearing in the summary is approximated by its probability in the input. We compute both the summary's unigram probability and its probability under a multinomial model.
Unigram summary probability:

\left(p^{inp}_{w_1}\right)^{n_1} \left(p^{inp}_{w_2}\right)^{n_2} \cdots \left(p^{inp}_{w_r}\right)^{n_r}    (5)

where p^{inp}_{w_i} is the probability in the input of word w_i, n_i is the number of times w_i appears in the summary, and w_1, \ldots, w_r are all the words in the summary vocabulary.

Multinomial summary probability:

\frac{N!}{n_1! \, n_2! \cdots n_r!} \left(p^{inp}_{w_1}\right)^{n_1} \left(p^{inp}_{w_2}\right)^{n_2} \cdots \left(p^{inp}_{w_r}\right)^{n_r}    (6)

where N = n_1 + n_2 + \cdots + n_r is the total number of words in the summary.
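A log-space sketch of Equations 5 and 6 follows (working in log space avoids underflow for long summaries); p_inp is assumed to be a smoothed input distribution, as in Equation 2, containing every summary word with nonzero probability.

```python
import math
from collections import Counter

def log_unigram_probability(summary_tokens, p_inp):
    # Log of Eq. 5: sum_i n_i * log p_inp(w_i).
    counts = Counter(summary_tokens)
    return sum(n * math.log(p_inp[w]) for w, n in counts.items())

def log_multinomial_probability(summary_tokens, p_inp):
    # Log of Eq. 6: Eq. 5 plus the multinomial coefficient
    # log(N! / (n_1! ... n_r!)), computed via lgamma(n + 1) = log(n!).
    counts = Counter(summary_tokens)
    N = len(summary_tokens)
    log_coef = math.lgamma(N + 1) - sum(math.lgamma(n + 1)
                                        for n in counts.values())
    return log_coef + log_unigram_probability(summary_tokens, p_inp)
```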
4.3 Use of topic words in the summary
Summarization systems that directly optimize for
more topic signatures during content selection
have fared very well in evaluations (Conroy et al.,
2006). Hence the number of topic signatures from
the input present in a summary might be a good
indicator of summary content quality. We experiment
with two features that quantify the presence
of topic signatures in a summary:
1. Fraction of the summary composed of input’s topic signatures.
2. Percentage of topic signatures from the input that also appear in the summary.
While both features will obtain higher values for summaries containing many topic words, the first is driven simply by the presence of any topic word, while the second measures the diversity of topic words used in the summary.
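Both features reduce to simple count and set operations; in the sketch below, topic_signatures is assumed to be the set of the input's topic signature words.

```python
def topic_word_features(summary_tokens, topic_signatures):
    # Feature 1: fraction of summary tokens that are topic signatures.
    # Feature 2: fraction of the input's topic signatures that appear
    # in the summary (diversity of topic words used).
    in_topic = sum(1 for t in summary_tokens if t in topic_signatures)
    frac_topic = in_topic / len(summary_tokens) if summary_tokens else 0.0
    coverage = (len(set(summary_tokens) & topic_signatures) /
                len(topic_signatures)) if topic_signatures else 0.0
    return frac_topic, coverage
```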
4.4 Feature combination using linear regression
We also evaluated the performance of a linear regression metric combining all of the above features. The value of the regression-based score for each summary was obtained using a leave-one-out approach. For a particular input and system summary