Huang, C, Casey, A, Glowacka, D & Medlar, A 2019, Holes in the Outline: Subject-dependent Abstract Quality and its Implications for Scientific Literature Search. in CHIIR '19: Proceedings of the 2019 Conference on Human Information Interaction and Retrieval. ACM, New York, pp. 289-293, ACM SIGIR Conference on Human Information Interaction and Retrieval, Glasgow, United Kingdom, 10/03/2019. https://doi.org/10.1145/3295750.3298953
Title: | Holes in the Outline: Subject-dependent Abstract Quality and its Implications for Scientific Literature Search |
Author: | Huang, Chien-yu; Casey, Arlene; Glowacka, Dorota; Medlar, Alan |
Contributor organization: | Department of Computer Science |
Publisher: | ACM |
Date: | 2019 |
Language: | eng |
Number of pages: | 5 |
Belongs to series: | CHIIR '19 |
ISBN: | 978-1-4503-6025-8 |
DOI: | https://doi.org/10.1145/3295750.3298953 |
URI: | http://hdl.handle.net/10138/313970 |
Abstract: | Scientific literature search engines typically index abstracts instead of the full-text of publications. The expectation is that the abstract provides a comprehensive summary of the article, enumerating key points for the reader to assess whether their information needs could be satisfied by reading the full-text. Furthermore, from a practical standpoint, obtaining the full-text is more complicated due to licensing issues, in the case of commercial publishers, and resource limitations of public repositories and pre-print servers. In this article, we use topic modelling to represent content in abstracts and full-text articles. Using Computer Science as a case study, we demonstrate that how well the abstract summarises the full-text is subfield-dependent. Indeed, we show that abstract representativeness has a direct impact on retrieval performance, with poorer abstracts leading to degraded performance. Finally, we present evidence that how well an abstract represents the full-text of an article is not random, but is a consequence of style and writing conventions in different subdisciplines and can be used to infer an "evolutionary" tree of subfields within Computer Science. |
Subject: | scientific literature search; topic models term taxonomy INFORMATION; 113 Computer and information sciences |
Peer reviewed: | Yes |
Rights: | cc_by_nc |
Usage restriction: | openAccess |
Self-archived version: | acceptedVersion |
Files | Size |
---|---|
huang_2019_1.pdf | 526.4Kb |