Nothing but the Truth! : Deception Prediction on Hotel Reviews using Language Technology

Title: Nothing but the Truth! : Deception Prediction on Hotel Reviews using Language Technology
Author: Ciarlanti, Alberto
Other contributor: Helsingin yliopisto, Humanistinen tiedekunta, Nykykielten laitos
University of Helsinki, Faculty of Arts, Department of Modern Languages
Helsingfors universitet, Humanistiska fakulteten, Institutionen för moderna språk
Publisher: Helsingfors universitet
Date: 2016
Language: eng
Thesis level: master's thesis
Discipline: Språkteknologi
Language Technology
Abstract: This work surveys the study of deception in psychology, forensic sciences and language technology, focusing specifically on the techniques used in language technology to predict deception. Using a corpus of truthful and deceptive hotel reviews, this work presents a Naïve Bayes classifier which achieves a 90.4% accuracy rate. This thesis shows that even though text classifiers have been based on Support Vector Machines since 1998, with the corpus used and the features applied to that corpus, my Naïve Bayes classifier achieves better results than any of the possible SVM counterparts. By studying the resulting categorizer and noting which features are most relevant, I show that it is easy to write a deceptive review that the machine classifier labels as truthful. The use of the Regressive Imagery Dictionary as the psycholinguistic part of the classifier proved to be as effective as the more expensive, closed-source option known as Linguistic Inquiry and Word Count (LIWC). This is also the first thesis in the General Linguistics Department to use the new open-source Natural Language Processing library spaCy (
Subject (yso): deception (harhauttaminen)
textual deception (tekstuaalinen harhauttaminen)
Naïve Bayes classifier (naïve-bayes luokittelija)

Files in this item

There are no files associated with this item.
