EMBEDDIA at SemEval-2022 Task 8: Investigating Sentence, Image, and Knowledge Graph Representations for Multilingual News Article Similarity


Permalink

http://hdl.handle.net/10138/346641

Citation

Zosa, E, Boros, E, Koloski, B & Pivovarova, L 2022, EMBEDDIA at SemEval-2022 Task 8: Investigating Sentence, Image, and Knowledge Graph Representations for Multilingual News Article Similarity. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022). The Association for Computational Linguistics, Seattle, United States, pp. 1107–1113, International Workshop on Semantic Evaluation, Seattle, Washington, United States, 14/07/2022. <https://aclanthology.org/2022.semeval-1.156/>

Title: EMBEDDIA at SemEval-2022 Task 8: Investigating Sentence, Image, and Knowledge Graph Representations for Multilingual News Article Similarity
Author: Zosa, Elaine; Boros, Emanuela; Koloski, Boshko; Pivovarova, Lidia
Contributor organization: Department of Computer Science; Discovery Research Group/Prof. Hannu Toivonen
Publisher: The Association for Computational Linguistics
Date: 2022-07-11
Language: eng
Belongs to series: Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
ISBN: 978-1-955917-80-3
URI: http://hdl.handle.net/10138/346641
Abstract: In this paper, we present the participation of the EMBEDDIA team in SemEval-2022 Task 8 (Multilingual News Article Similarity). We cover several techniques and propose different methods for measuring multilingual news article similarity by exploring the dataset in its entirety. We take advantage of the textual content of the articles, the provided metadata (e.g., titles, keywords, topics), the translated articles, the images (where available), and knowledge graph-based representations of the entities and relations present in the articles. We then compute the semantic similarity between the different features and predict the similarity scores through regression. Our findings show that, while our proposed methods obtained promising results, none of them outperformed semantic textual similarity based on sentence representations. Finally, in the official SemEval-2022 Task 8 evaluation, we ranked fifth in the overall team ranking for the cross-lingual results and second for the English-only results.
Subject: 113 Computer and information sciences
Peer reviewed: Yes
Rights: cc_by
Usage restriction: openAccess
Self-archived version: publishedVersion


Files in this item

Files Size Format
2022.semeval_1.156.pdf 2.240 MB PDF
