Effectiveness of Data Augmentation and Pretraining for Improving Neural Headline Generation in Low-Resource Settings


Citation

Martinc, M., Montariol, S., Pivovarova, L. & Zosa, E. 2022, 'Effectiveness of Data Augmentation and Pretraining for Improving Neural Headline Generation in Low-Resource Settings', in Proceedings of the 13th Language Resources and Evaluation Conference, European Language Resources Association (ELRA), pp. 3561–3570, LREC 2022, Marseille, France, 20/06/2022. <http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.381.pdf>

Title: Effectiveness of Data Augmentation and Pretraining for Improving Neural Headline Generation in Low-Resource Settings
Author: Martinc, Matej; Montariol, Syrielle; Pivovarova, Lidia; Zosa, Elaine
Contributor organization: Department of Digital Humanities; Department of Computer Science; Discovery Research Group/Prof. Hannu Toivonen
Publisher: European Language Resources Association (ELRA)
Date: 2022-07
Language: eng
Belongs to series: Proceedings of the 13th Language Resources and Evaluation Conference
ISBN: 979-10-95546-72-6
URI: http://hdl.handle.net/10138/346642
Abstract: We tackle the problem of neural headline generation in a low-resource setting, where only a limited amount of data is available to train a model. We compare an ideal high-resource scenario on English with results obtained on a smaller subset of the same data, and also run experiments on two small news corpora covering the low-resource languages Croatian and Estonian. Two options for headline generation in a multilingual low-resource scenario are investigated: a pretrained multilingual encoder-decoder model, and a combination of two pretrained language models, one used as the encoder and the other as the decoder, connected by a cross-attention layer that must be trained from scratch. The results show that the first approach outperforms the second by a large margin. We explore several data augmentation and pretraining strategies to improve the performance of both models, and show that while these strategies drastically improve the second approach, they have little to no effect on the performance of the pretrained encoder-decoder model. Finally, we propose two new measures for evaluating model performance in addition to the classic ROUGE scores.
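The second architecture described in the abstract, two pretrained language models joined by a cross-attention layer trained from scratch, corresponds to the encoder-decoder composition pattern available in the HuggingFace Transformers library. The sketch below is a minimal illustration of that pattern only; the mBERT checkpoints, example text, and sequence lengths are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch of the "two pretrained LMs + trainable cross-attention" setup,
# using HuggingFace Transformers' EncoderDecoderModel. Checkpoint names are
# illustrative assumptions, not necessarily those used in the paper.
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# Encoder and decoder weights both come from pretrained checkpoints; the
# cross-attention weights that connect them are randomly initialized and
# must be learned during fine-tuning on (article -> headline) pairs.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-cased",  # encoder
    "bert-base-multilingual-cased",  # decoder (cross-attention added, untrained)
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# One supervised forward pass; in practice this runs inside a training loop.
inputs = tokenizer("Article body text ...", return_tensors="pt",
                   truncation=True, max_length=512)
labels = tokenizer("Headline text", return_tensors="pt").input_ids
loss = model(input_ids=inputs.input_ids,
             attention_mask=inputs.attention_mask,
             labels=labels).loss

# After fine-tuning, headlines are produced by autoregressive decoding.
generated = model.generate(inputs.input_ids, max_length=32)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

The first approach in the abstract, a single pretrained multilingual encoder-decoder model, would instead load one seq2seq checkpoint (for example, AutoModelForSeq2SeqLM.from_pretrained on an mBART or mT5 model, named here only as examples), so that no weights, including cross-attention, start untrained.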
Subject: 113 Computer and information sciences
6121 Languages
Peer reviewed: Yes
Rights: CC BY-NC
Usage restriction: open access
Self-archived version: published version


Files in this item


File: Multilingual_he ... a_low_resource_setting.pdf (217.2 KB, PDF)
