ACL 2023

Cross-lingual Science Journalism: Select, Simplify and Rewrite Summaries for Non-expert Readers

Mehwish Fatima, Michael Strube

2 citations

Abstract

Automating Cross-lingual Science Journalism (CSJ) aims to generate popular science summaries from English scientific texts for non-expert readers in their local language. We introduce CSJ as a downstream task of text simplification and cross-lingual scientific summarization to facilitate science journalists' work. We analyze the performance of existing solutions as baselines for the CSJ task. Based on these findings, we propose to combine three components, SELECT, SIMPLIFY and REWRITE (SSR), to produce cross-lingual simplified science summaries for non-expert readers. Our empirical evaluation on the WIKIPEDIA dataset shows that SSR significantly outperforms the baselines for the CSJ task and can serve as a strong baseline for future work. We also perform an ablation study investigating the impact of individual components of SSR. Further, we analyze the performance of SSR on a high-quality, real-world CSJ dataset with human evaluation and in-depth analysis, demonstrating the superior performance of SSR for CSJ.

A Scientific and News Structure

Figure A.1 presents the difference between a scientific text discourse and a news text discourse.