Lexical ambiguity detection in professional discourse


Permalink

http://hdl.handle.net/10138/346636

Citation

Liu, Y, Medlar, A & Głowacka, D 2022, 'Lexical ambiguity detection in professional discourse', Information Processing and Management, vol. 59, no. 5, 103000. https://doi.org/10.1016/j.ipm.2022.103000

Title: Lexical ambiguity detection in professional discourse
Author: Liu, Yang; Medlar, Alan; Głowacka, Dorota
Contributor organization: Department of Computer Science
Date: 2022-09
Language: eng
Number of pages: 12
Belongs to series: Information Processing and Management
ISSN: 0306-4573
DOI: https://doi.org/10.1016/j.ipm.2022.103000
URI: http://hdl.handle.net/10138/346636
Abstract: Professional discourse is the language used by specialists, such as lawyers, doctors and academics, to communicate the knowledge and assumptions associated with their respective fields. Professional discourse can be especially difficult for non-specialists to understand due to the lexical ambiguity of commonplace words that have a different or more specific meaning within a specialist domain. This phenomenon also makes it harder for specialists to communicate with the general public because they are similarly unaware of the potential for misunderstandings. In this article, we present an approach for detecting domain terms with lexical ambiguity versus everyday English. We demonstrate the efficacy of our approach with three case studies in statistics, law and biomedicine. In all case studies, we identify domain terms with a precision@100 greater than 0.9, outperforming the best-performing baseline by 18.1–91.7%. Most importantly, we show this ranking is broadly consistent with semantic differences. Our results highlight the difficulties that existing semantic difference methods have in the cross-domain setting, which rank non-domain terms highly due to noise or biases in the data. We additionally show that our approach generalizes to short phrases and investigate its data efficiency by varying the number of labeled examples.
Description: Publisher Copyright: © 2022 The Author(s)
Subject: Lexical ambiguity
Professional discourse
Specialist terminology
Word embeddings
6121 Languages
113 Computer and information sciences
Peer reviewed: Yes
Rights: cc_by
Usage restriction: openAccess
Self-archived version: publishedVersion


Files in this item

File: 1_s2.0_S0306457322001133_main.pdf
Size: 752.1Kb
Format: PDF
