Multilingual Summarization: Dimensionality Reduction and a Step Towards Optimal Term Coverage

Title: Multilingual Summarization: Dimensionality Reduction and a Step Towards Optimal Term Coverage
Publication Type: Journal Article
Year of Publication: 2013
Authors: Conroy, JM, Davis, ST, Kubina, J, Liu, Y-K, O'Leary, DP, Schlesinger, JD
Journal: MultiLing (Workshop on Multilingual Multi-document Summarization)
Pages: 55-63
Date Published: 2013/08/09
Abstract

In this paper we present three term weighting approaches for multi-lingual document summarization and give results on the DUC 2002 data as well as on the 2013 Multilingual Wikipedia feature articles data set. We introduce a new interval-bounded nonnegative matrix factorization. We use this new method, latent semantic analysis (LSA), and latent Dirichlet allocation (LDA) to give three term-weighting methods for multi-document multi-lingual summarization. Results on DUC and TAC data, as well as on the MultiLing 2013 data, demonstrate that these methods are very promising, since they achieve oracle coverage scores in the range of humans for 6 of the 10 test languages. Finally, we present three term weighting approaches for the MultiLing13 single document summarization task on the Wikipedia featured articles. Our submissions significantly outperformed the baseline in 19 out of 41 languages.

URL: http://aclweb.org/anthology/W/W13/W13-3108.pdf