A method for estimating regression errors with application to virtual concept drift detection

Permalink

http://urn.fi/URN:NBN:fi:hulib-202001211114
Title: A method for estimating regression errors with application to virtual concept drift detection
Author: Tiittanen, Henri
Contributor: University of Helsinki, Faculty of Science
Publisher: Helsingin yliopisto
Date: 2019
Language: eng
URI: http://urn.fi/URN:NBN:fi:hulib-202001211114
http://hdl.handle.net/10138/310008
Thesis level: master's thesis
Degree program: Master's Programme in Data Science
Specialisation: no specialisation
Discipline: none
Abstract: Estimating the error level of models is an important task in machine learning. If the data are independent and identically distributed, as is usually assumed, standard methods exist for estimating the error level. However, if the data distribution changes, a phenomenon known as concept drift, those methods may no longer work properly. Most existing methods for detecting concept drift focus on the case in which the ground truth values are immediately known. In practice, that is often not the case. Even when the ground truth is unknown, a certain type of concept drift, called virtual concept drift, can be detected. In this thesis we present a method called drifter for estimating the error level of arbitrary regression functions when the ground truth is not known. Concept drift detection is a straightforward application of error level estimation. Error-level-based concept drift detection can be more useful than traditional approaches based on direct distribution comparison, since only changes that affect the error level are detected. In this work we describe the drifter algorithm in detail, including its theoretical basis, and present an experimental evaluation of its performance in virtual concept drift detection on multiple synthetic and real-world datasets and multiple regression functions. Our experiments show that the drifter algorithm can detect virtual concept drift with reasonable accuracy.
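
The idea of flagging virtual concept drift without ground truth can be illustrated with a minimal Python sketch. It is not the drifter algorithm, whose details the abstract does not give. It uses a simple stand-in, the disagreement between two regression models, and everything in it (the quadratic `target`, the helper names `fit_line`, `predict`, `rms` and `disagreement`, the input ranges, and the noise level) is a hypothetical choice made for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def predict(model, xs):
    """Evaluate a fitted line (a, b) at each input."""
    a, b = model
    return [a * x + b for x in xs]


def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def disagreement(model_f, model_g, xs):
    """RMS difference between two models' predictions; uses no labels."""
    pf = predict(model_f, xs)
    pg = predict(model_g, xs)
    return rms([p - q for p, q in zip(pf, pg)])


def target(x):
    # Hypothetical input-output relation. It never changes, so any drift
    # in this example is virtual (only the input distribution moves).
    return x * x


rng = random.Random(0)  # fixed seed for reproducibility

# Labeled training data with inputs in [0, 1].
train_x = [rng.uniform(0.0, 1.0) for _ in range(500)]
train_y = [target(x) + rng.gauss(0.0, 0.05) for x in train_x]

# Regression function under scrutiny, fitted on all training data.
model_f = fit_line(train_x, train_y)

# Auxiliary model fitted only on the upper half of the training inputs
# (an assumption of this sketch, not taken from the thesis).
segment = [(x, y) for x, y in zip(train_x, train_y) if x >= 0.5]
model_g = fit_line([x for x, _ in segment], [y for _, y in segment])

# Two unlabeled test sets: one like the training inputs, one shifted.
test_same = [rng.uniform(0.0, 1.0) for _ in range(200)]
test_shifted = [rng.uniform(2.0, 3.0) for _ in range(200)]

# Label-free proxy for the error level.
proxy_same = disagreement(model_f, model_g, test_same)
proxy_shifted = disagreement(model_f, model_g, test_shifted)

# True errors, available here only because the data are synthetic.
err_same = rms([p - target(x)
                for p, x in zip(predict(model_f, test_same), test_same)])
err_shifted = rms([p - target(x)
                   for p, x in zip(predict(model_f, test_shifted), test_shifted)])

print("proxy:", proxy_same, proxy_shifted)
print("true error:", err_same, err_shifted)
```

In this toy setting the inputs shift while the relation between inputs and outputs stays fixed, so the drift is virtual. The disagreement is computed from unlabeled inputs only, and it rises together with the true error on the shifted inputs. It is not a calibrated estimate of the error level.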

