Warm-Starting for Improving the Novelty of Abstractive Summarization
Document Type
Article
Publication Date
1-1-2023
Abstract
Abstractive summarization is distinguished by its use of novel phrases that do not appear in the source text. However, most previous research ignores this feature in favour of enhancing syntactic similarity with the reference summary. To improve novelty, we use multiple warm-started models with varying encoder and decoder checkpoints and vocabularies. These models are then adapted to the paraphrasing task and combined with a sampling decoding strategy to further boost novelty and quality. In addition, to avoid relying solely on syntactic similarity assessment, two additional abstractive summarization metrics are introduced: 1) NovScore, a new novelty metric that delivers a summary novelty score; and 2) NSSF, a new comprehensive metric that combines Novelty, Syntactic, Semantic, and Faithfulness features into a single score to simulate human assessment and provide a reliable evaluation. Finally, we compare our models to state-of-the-art sequence-to-sequence models using both the current and the proposed metrics. The results show that warm-starting, sampling, and paraphrasing improve novelty by 2%, 5%, and 14%, respectively, while maintaining comparable scores on the other metrics.
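As a rough illustration of what "novelty" can mean here, the sketch below computes the fraction of summary n-grams that do not occur in the source text. This is a common proxy and only an assumption for illustration: the abstract does not define NovScore, so this is not the paper's metric. The function names `ngrams` and `novel_ngram_ratio`, the whitespace tokenization, and the default `n=2` are all hypothetical choices.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_ratio(source, summary, n=2):
    """Fraction of summary n-grams absent from the source.

    Illustrative proxy only; NOT the NovScore metric from the paper.
    Tokenization is naive lowercase whitespace splitting (an assumption).
    """
    source_ngrams = set(ngrams(source.lower().split(), n))
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        # A summary shorter than n tokens has no n-grams to score.
        return 0.0
    novel = sum(1 for gram in summary_ngrams if gram not in source_ngrams)
    return novel / len(summary_ngrams)


if __name__ == "__main__":
    src = "the cat sat on the mat"
    hyp = "the cat rested on the mat"
    print(novel_ngram_ratio(src, hyp, n=2))
```

Under this proxy, a summary copied verbatim from the source scores 0, and a fully reworded summary approaches 1. A high value alone says nothing about faithfulness, which is presumably why the abstract pairs novelty with semantic and faithfulness features in NSSF.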
Keywords
Abstractive summarization, novelty, warm-started models, deep learning, metrics
Divisions
ai
Funders
Ministry of Education, Malaysia (JPT(BKPI)1000/016/018/25(58))
Publication Title
IEEE Access
Volume
11
Publisher
Institute of Electrical and Electronics Engineers
Publisher Location
445 Hoes Lane, Piscataway, NJ 08855-4141 USA