Discovering Synonyms and Other Related Words

Permalink

http://hdl.handle.net/10138/33867

Citation

Linden, K & Piitulainen, J O 2004, Discovering Synonyms and Other Related Words. in S Ananiadou & P Zweigenbaum (eds), Proceedings of COLING 2004: CompuTerm 2004: 3rd International Workshop on Computational Terminology. pp. 63-70, CompuTerm 2004: 3rd International Workshop on Computational Terminology, Geneva, Switzerland, 29/08/2004.

Title: Discovering Synonyms and Other Related Words
Author: Linden, Krister; Piitulainen, Jussi Olavi
Editor: Ananiadou, Sophia; Zweigenbaum, Pierre
Contributor: University of Helsinki, Department of Modern Languages 2010-2017
Date: 2004-08
Language: eng
Belongs to series: Proceedings of COLING 2004 CompuTerm 2004: 3rd International Workshop on Computational Terminology
URI: http://hdl.handle.net/10138/33867
Abstract: Discovering synonyms and other related words among the words in a document collection can be seen as a clustering problem, where we expect the words in a cluster to be closely related to one another. The intuition is that words occurring in similar contexts tend to convey similar meaning. We introduce a way to use translation dictionaries for several languages to evaluate the rate of synonymy found in the word clusters. We also apply the information radius to calculating similarities between words using a full dependency syntactic feature space, and introduce a method for similarity recalculation during clustering as a fast approximation of the high-dimensional feature space. Finally, we show that 69-79% of the words in the clusters we discover are useful for thesaurus construction.
Subject: 612 Languages and Literature
113 Computer and information sciences
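
The information radius mentioned in the abstract can be illustrated with a small sketch. It compares two words by the distributions of their context features, using the common definition IRad(p, q) = D(p || m) + D(q || m) with m = (p + q) / 2 and base-2 logarithms. The paper's exact formulation (normalization, log base, conversion to a similarity score) may differ. The feature names and counts below are hypothetical toy data, not taken from the paper.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw feature counts into a probability distribution."""
    total = sum(counts.values())
    return {feature: count / total for feature, count in counts.items()}


def information_radius(p, q):
    """Information radius between two feature distributions.

    Assumed definition: D(p || m) + D(q || m), where m is the average
    of p and q. With base-2 logs the value lies between 0 (identical
    distributions) and 2 (no shared features).
    """
    irad = 0.0
    for feature in set(p) | set(q):
        pf = p.get(feature, 0.0)
        qf = q.get(feature, 0.0)
        m = (pf + qf) / 2.0
        # Terms with zero probability contribute nothing to the sum.
        if pf > 0:
            irad += pf * math.log2(pf / m)
        if qf > 0:
            irad += qf * math.log2(qf / m)
    return irad


# Hypothetical dependency-style context features for three words.
car = normalize(Counter({"obj_of:drive": 4, "mod:red": 2, "subj_of:stop": 2}))
automobile = normalize(Counter({"obj_of:drive": 3, "mod:red": 1, "subj_of:stop": 2}))
banana = normalize(Counter({"obj_of:eat": 5, "mod:yellow": 3}))

print(information_radius(car, automobile))
print(information_radius(car, banana))
```

In this toy setting, words that share most of their context features get a small information radius, while words with disjoint contexts reach the maximum. A clustering step could then group words whose pairwise values fall below some chosen threshold.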


Files in this item

File Size Format
linden04b.pdf 110.9 Kb PDF
