Nothing but the Truth! : Deception Prediction on Hotel Reviews using Language Technology

Title: Nothing but the Truth! : Deception Prediction on Hotel Reviews using Language Technology
Author: Ciarlanti, Alberto
Contributor: University of Helsinki, Faculty of Arts, Department of Modern Languages
Publisher: Helsingfors universitet
Date: 2016
Language: eng
URI: http://urn.fi/URN:NBN:fi:hulib-201611283115
http://hdl.handle.net/10138/169705
Thesis level: master's thesis
Discipline: Language Technology
Abstract: This work surveys the study of deception in psychology, forensic science and language technology, focusing specifically on the techniques used in language technology to predict deception. Using a corpus of truthful and deceptive hotel reviews, this work presents a Naïve Bayes classifier that achieves an accuracy of 90.4%. This thesis shows that, even though text classifiers have mostly been based on Support Vector Machines since 1998, with the corpus used and the features applied to it, my Naïve Bayes classifier achieves better results than any of its possible SVM counterparts. By studying the resulting classifier and noting which features are most relevant, I show that it is easy to write a deceptive review that the machine classifier labels as truthful. The Regressive Imagery Dictionary, used as the psycholinguistic component of the classifier, proved to be as effective as the more expensive, closed-source Linguistic Inquiry and Word Count (LIWC). This is also the first thesis in the General Linguistics Department to use the new open-source natural language processing library spaCy (https://spacy.io/).
Subject (yso): deception
textual deception
corpus linguistics
Naïve Bayes classifier


