shown that the differences between evaluations based on two different models are about the same as the difference between system rankings based on one model summary and rankings produced using input-summary comparisons. Cosine similarity with singular value decomposition was used to compare inputs with summaries, and this was the only similarity measure examined. In contrast, we explore a variety of features, and the experiments outlined in this paper enable us to compare the usefulness of different similarity measures for automatic evaluation.
Kullback-Leibler (KL) divergence: The KL divergence between two probability distributions P and Q is given by
$$ D(P \,\|\, Q) = \sum_{w} p_P(w) \log_2 \frac{p_P(w)}{p_Q(w)} \qquad (1) $$
It can be interpreted as the average number of bits wasted by coding samples belonging to P using another distribution Q, an approximation of P. In our case, the two distributions are those of words in the input and in the summary, respectively. However, KL divergence is not symmetric, so the divergences computed in both directions, input-summary and summary-input, are used as features.
In addition, the divergence is undefined when $p_P(w) > 0$ but $p_Q(w) = 0$. We perform simple smoothing to overcome this problem:
$$ p(w) = \frac{C + \delta}{N + \delta B} \qquad (2) $$
Here C is the count of word w and N is the number of tokens. The number of possible outcomes of the probability distribution, B, was estimated as 1.5 times the input vocabulary size, and the smoothing constant δ was set to a small value of 0.0005 to avoid shifting too much probability mass to unseen events.
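To make the computation concrete, here is a minimal Python sketch of the smoothed distributions (Equation 2) and the two KL features (Equation 1); the function names and the tokenized-list inputs are our own illustration, not from the paper.

```python
import math
from collections import Counter

DELTA = 0.0005  # smoothing constant from Equation 2

def smoothed_dist(tokens, vocab, B):
    """Equation 2: p(w) = (C + delta) / (N + delta * B)."""
    counts = Counter(tokens)
    N = len(tokens)
    return {w: (counts[w] + DELTA) / (N + DELTA * B) for w in vocab}

def kl_divergence(p, q, vocab):
    """Equation 1: D(P||Q) = sum_w p_P(w) * log2(p_P(w) / p_Q(w))."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in vocab)

def kl_features(input_tokens, summary_tokens):
    vocab = set(input_tokens) | set(summary_tokens)
    B = 1.5 * len(set(input_tokens))  # 1.5 times the input vocabulary size
    p_inp = smoothed_dist(input_tokens, vocab, B)
    p_summ = smoothed_dist(summary_tokens, vocab, B)
    # KL divergence is asymmetric, so both directions are used as features.
    return (kl_divergence(p_inp, p_summ, vocab),
            kl_divergence(p_summ, p_inp, vocab))
```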
Jensen-Shannon (JS) divergence: The JS divergence is based on the idea that the distance between two distributions cannot be very different from the average of their distances to their mean distribution. It is formally defined as
$$ J(P \,\|\, Q) = \frac{1}{2} \left[ D(P \,\|\, A) + D(Q \,\|\, A) \right], $$

where $A = \frac{P + Q}{2}$ is the mean distribution of P and Q.
In contrast to KL divergence, the JS distance is symmetric
and always defined. We use both smoothed and unsmoothed
versions of the divergence as features.
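Under the same assumptions as the KL sketch above, the JS divergence might be computed as follows; words with zero probability are skipped, so the unsmoothed variant is handled as well.

```python
import math

def js_divergence(p, q, vocab):
    """J(P||Q) = 0.5 * [D(P||A) + D(Q||A)], where A = (P + Q) / 2.

    A(w) > 0 whenever p(w) > 0 or q(w) > 0, and zero-probability
    terms contribute nothing, so JS is defined even without smoothing.
    """
    a = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def d(r):  # KL divergence to the mean, skipping zero-probability words
        return sum(r[w] * math.log2(r[w] / a[w])
                   for w in vocab if r.get(w, 0.0) > 0)

    return 0.5 * (d(p) + d(q))
```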
Similarity between input and summary: The third metric is the cosine overlap between tf-idf vector representations of the input and summary contents:
$$ \cos(v_{inp}, v_{summ}) = \frac{v_{inp} \cdot v_{summ}}{\|v_{inp}\| \, \|v_{summ}\|} \qquad (3) $$

We compute two variants:
1. Cosine overlap between input and summary words
2. Cosine overlap between topic signatures of the input and words of the summary
Topic signatures are words highly descriptive of the input, as determined by the application of the log-likelihood ratio test (Lin and Hovy, 2000). Using only topic signatures from the input to represent its text is expected to be more accurate and to remove noise from peripherally related content. In addition, the refined input vector has a smaller dimension, making it better suited for comparison with the vector of summary words, which is typically small compared to a complete bag-of-words vector of the input.
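A sketch of both cosine variants, assuming the tf-idf vectors are given as sparse word-to-weight dictionaries and the topic-signature set has already been extracted; all names here are illustrative.

```python
import math

def cosine_overlap(v1, v2):
    """Equation 3: cosine between two sparse tf-idf vectors (dicts)."""
    dot = sum(weight * v2[word] for word, weight in v1.items() if word in v2)
    n1 = math.sqrt(sum(w * w for w in v1.values()))
    n2 = math.sqrt(sum(w * w for w in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def topic_signature_cosine(v_inp, v_summ, topic_signatures):
    """Variant 2: keep only the topic-signature dimensions of the input vector."""
    v_topic = {w: wt for w, wt in v_inp.items() if w in topic_signatures}
    return cosine_overlap(v_topic, v_summ)
```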
3.2 Summary Probabilities
These features capture the log likelihood of a summary given its input. The probability of a word appearing in the summary is estimated from the input. We compute both the unigram bag-of-words probability of the summary and its probability under a multinomial model.
The comparison with ROUGE in Lin et al. (2006) (described under Section 3.1) also included the unigram log likelihood alongside the KL and JS divergences; however, the JS divergence proved better than the other two.
Unigram summary probability:

$$ \left( p^{inp}_{w_1} \right)^{n_1} \left( p^{inp}_{w_2} \right)^{n_2} \cdots \left( p^{inp}_{w_r} \right)^{n_r} \qquad (4) $$

where $p^{inp}_{w_i}$ is the probability in the input of word $w_i$, $n_i$ is the number of times $w_i$ appears in the summary, and $w_1 \ldots w_r$ are all the words in the summary vocabulary.
Multinomial summary probability:

$$ \frac{N!}{n_1! \, n_2! \cdots n_r!} \left( p^{inp}_{w_1} \right)^{n_1} \left( p^{inp}_{w_2} \right)^{n_2} \cdots \left( p^{inp}_{w_r} \right)^{n_r} \qquad (5) $$

where $N = n_1 + n_2 + \ldots + n_r$ is the total number of words in the summary.
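Both quantities are more safely computed in log space. The sketch below assumes p_inp is the smoothed input distribution of Equation 2, so every summary word has nonzero probability; math.lgamma supplies the log-factorials for the multinomial coefficient.

```python
import math
from collections import Counter

def unigram_log_prob(p_inp, summary_tokens):
    """Log of Equation 4: sum_i n_i * log2 p_inp(w_i)."""
    counts = Counter(summary_tokens)
    return sum(n * math.log2(p_inp[w]) for w, n in counts.items())

def multinomial_log_prob(p_inp, summary_tokens):
    """Log of Equation 5: the unigram term plus the log multinomial coefficient."""
    counts = Counter(summary_tokens)
    N = len(summary_tokens)
    # lgamma(n + 1) = ln(n!); divide by ln(2) to convert to log base 2.
    log_coef = (math.lgamma(N + 1) -
                sum(math.lgamma(n + 1) for n in counts.values())) / math.log(2)
    return log_coef + unigram_log_prob(p_inp, summary_tokens)
```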
3.3 Use of input’s topic words in summary
Summarization systems that directly optimize for more topic signatures during content selection have fared very well in evaluations (Conroy et al., 2006). Hence the number of topic signatures from the input that appear in a summary might be a good indicator of summary content quality. We experiment with two features that quantify the presence of topic signatures in a summary (a short sketch follows the list):
1. Percentage of summary composed from input’s topic
signatures
2. Percentage of topic signature words from the input
that also appear in the summary
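A minimal sketch of the two percentages, assuming a non-empty summary token list and a non-empty set of topic-signature words.

```python
def topic_word_features(summary_tokens, topic_signatures):
    """Feature 1: fraction of summary tokens that are topic signatures.
    Feature 2: fraction of the input's topic signatures the summary covers."""
    hits = sum(1 for t in summary_tokens if t in topic_signatures)
    pct_of_summary = hits / len(summary_tokens)
    pct_covered = len(topic_signatures & set(summary_tokens)) / len(topic_signatures)
    return pct_of_summary, pct_covered
```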
While both features will obtain higher values for summaries
containing many topic signature words, the first is