Holes in the Outline: Subject-dependent Abstract Quality and its Implications for Scientific Literature Search

Permalink

http://hdl.handle.net/10138/313970

Citation

Huang, C, Casey, A, Glowacka, D & Medlar, A 2019, Holes in the Outline: Subject-dependent Abstract Quality and its Implications for Scientific Literature Search. in CHIIR '19: Proceedings of the 2019 Conference on Human Information Interaction and Retrieval. ACM, New York, pp. 289-293, ACM SIGIR Conference on Human Information Interaction and Retrieval, Glasgow, United Kingdom, 10/03/2019. https://doi.org/10.1145/3295750.3298953

Title: Holes in the Outline: Subject-dependent Abstract Quality and its Implications for Scientific Literature Search
Author: Huang, Chien-yu; Casey, Arlene; Glowacka, Dorota; Medlar, Alan
Contributor: University of Helsinki, Department of Computer Science
Publisher: ACM
Date: 2019
Language: eng
Number of pages: 5
Belongs to series: CHIIR '19 Proceedings of the 2019 Conference on Human Information Interaction and Retrieval
ISBN: 978-1-4503-6025-8
URI: http://hdl.handle.net/10138/313970
Abstract: Scientific literature search engines typically index abstracts instead of the full-text of publications. The expectation is that the abstract provides a comprehensive summary of the article, enumerating key points for the reader to assess whether their information needs could be satisfied by reading the full-text. Furthermore, from a practical standpoint, obtaining the full-text is more complicated due to licensing issues, in the case of commercial publishers, and resource limitations of public repositories and pre-print servers. In this article, we use topic modelling to represent content in abstracts and full-text articles. Using Computer Science as a case study, we demonstrate that how well the abstract summarises the full-text is subfield-dependent. Indeed, we show that abstract representativeness has a direct impact on retrieval performance, with poorer abstracts leading to degraded performance. Finally, we present evidence that how well an abstract represents the full-text of an article is not random, but is a consequence of style and writing conventions in different subdisciplines and can be used to infer an "evolutionary" tree of subfields within Computer Science.
Subject: scientific literature search; topic models; term taxonomy; INFORMATION; 113 Computer and information sciences