Contrastive pretraining in discourse change detection

Show full item record



Permalink

http://urn.fi/URN:NBN:fi:hulib-202205182021
Title: Contrastive pretraining in discourse change detection
Author: Lipsanen, Mikko
Other contributor: Helsingin yliopisto, Matemaattis-luonnontieteellinen tiedekunta
University of Helsinki, Faculty of Science
Helsingfors universitet, Matematisk-naturvetenskapliga fakulteten
Publisher: Helsingin yliopisto
Date: 2022
Language: eng
URI: http://urn.fi/URN:NBN:fi:hulib-202205182021
http://hdl.handle.net/10138/343851
Thesis level: master's thesis
Degree program: Datatieteen maisteriohjelma
Master's Programme in Data Science
Magisterprogrammet i data science
Specialisation: ei opintosuuntaa
no specialization
ingen studieinriktning
Abstract: The thesis presents and evaluates a model for detecting changes in discourses in diachronic text corpora. Detecting and analyzing discourses that typically evolve over a period of time and differ in their manifestations in individual documents is a challenging task, and existing approaches like topic modeling are often not able to reach satisfactory results. One key problem is the difficulty of properly evaluating the results of discourse detection methods, due in large part to the lack of annotated text corpora. The thesis proposes a solution where synthetic datasets containing non-stable discourse patterns are generated from a corpus of news articles. Using the news categories as a proxy for discourses allows both to control the complexity of the data and to evaluate the model results based on the known discourse patterns. The complex task of extracting topics from texts is commonly performed using generative models, which are based on simplifying assumptions regarding the process of data generation. The model presented in the thesis explores instead the potential of deep neural networks, combined with contrastive learning, to be used for discourse detection. The neural network model is first trained using supervised contrastive loss function, which teaches the model to differentiate the input data based on the type of discourse pattern it belongs to. This pretrained model is then employed for both supervised and unsupervised downstream classification tasks, where the goal is to detect changes in the discourse patterns at the timepoint level. The main aim of the thesis is to find out whether contrastive pretraining can be used as a part of a deep learning approach to discourse change detection, and whether the information encoded into the model during contrastive training can generalise to other, closely related domains. The results of the experiments show that contrastive pretraining can be used to encode information that directly relates to its learning goal into the end products of the model, although the learning process is still incomplete. However, the ability of the model to generalise this information in a way that could be useful in the timepoint level classification tasks remains limited. More work is needed to improve the model performance, especially if it is to be used with complex real world datasets.


Files in this item

Total number of downloads: Loading...

Files Size Format View
Lipsanen_Mikko_tutkielma_2022.pdf 8.556Mb PDF View/Open

This item appears in the following Collection(s)

Show full item record