Interactive visual data exploration with subjective feedback : an information-theoretic approach

Permalink

http://hdl.handle.net/10138/317145

Citation

Puolamäki, K, Oikarinen, E, Kang, B, Lijffijt, J & Bie, T D 2020, 'Interactive visual data exploration with subjective feedback : an information-theoretic approach', Data Mining and Knowledge Discovery, vol. 34, no. 1, pp. 21–49. https://doi.org/10.1007/s10618-019-00655-x

Title: Interactive visual data exploration with subjective feedback : an information-theoretic approach
Author: Puolamäki, Kai; Oikarinen, Emilia; Kang, Bo; Lijffijt, Jefrey; Bie, Tijl de
Contributor: University of Helsinki, Department of Computer Science
Date: 2020-01
Number of pages: 29
Belongs to series: Data Mining and Knowledge Discovery
ISSN: 1384-5810
URI: http://hdl.handle.net/10138/317145
Abstract: Visual exploration of high-dimensional real-valued datasets is a fundamental task in exploratory data analysis (EDA). Existing projection methods for data visualization use predefined criteria to choose the representation of data. There is a lack of methods that (i) use information on what the user has learned from the data and (ii) show patterns that she does not know yet. We construct a theoretical model where identified patterns can be input as knowledge to the system. The knowledge syntax here is intuitive, such as "this set of points forms a cluster", and requires no knowledge of maths. This background knowledge is used to find a maximum entropy distribution of the data, after which the user is provided with data projections for which the data and the maximum entropy distribution differ the most, hence showing the user aspects of data that are maximally informative given the background knowledge. We study the computational performance of our model and present use cases on synthetic and real data. We find that the model allows the user to learn information efficiently from various data sources and works sufficiently fast in practice. In addition, we provide an open source EDA demonstrator system implementing our model with tailored interactive visualizations. We conclude that the information theoretic approach to EDA where patterns observed by a user are formalized as constraints provides a principled, intuitive, and efficient basis for constructing an EDA system.
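The loop described in the abstract (update the background distribution from user feedback, then show the projection where the data and the background differ most) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, for simplicity, an axis-aligned Gaussian background instead of a general maximum entropy distribution, a fixed list of candidate 2D directions instead of an optimized projection, and a Gaussian KL divergence as the "difference" score. All function names and the toy data are hypothetical.

```python
import math


def whiten_rows(data, mean, std):
    """Rescale rows so that the assumed background N(mean, diag(std^2)) becomes N(0, I)."""
    return [[(x - m) / s for x, m, s in zip(row, mean, std)] for row in data]


def gaussian_kl_score(variance):
    """KL(N(0, variance) || N(0, 1)); zero exactly when variance == 1."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    return 0.5 * (variance - math.log(variance) - 1.0)


def projection_score(whitened, direction):
    """Score how much the whitened data differs from the background along one direction."""
    norm = math.sqrt(sum(w * w for w in direction))
    unit = [w / norm for w in direction]
    projected = [sum(x * w for x, w in zip(row, unit)) for row in whitened]
    # Second moment about zero: the background has zero mean after whitening,
    # so we compare a zero-mean Gaussian fit of the projection with N(0, 1).
    second_moment = sum(p * p for p in projected) / len(projected)
    return gaussian_kl_score(second_moment)


def most_informative_direction(whitened, candidates):
    """Return the candidate direction with the largest score."""
    return max(candidates, key=lambda d: projection_score(whitened, d))


# Toy data (hypothetical): wide spread along x, unit spread along y.
data = [(3.0, 1.0), (3.0, -1.0), (-3.0, 1.0), (-3.0, -1.0)]

# Candidate directions: unit vectors at 0, 5, ..., 175 degrees.
candidates = [
    (math.cos(math.radians(a)), math.sin(math.radians(a)))
    for a in range(0, 180, 5)
]

# Step 1: with no user knowledge, assume a unit Gaussian background.
initial = whiten_rows(data, mean=(0.0, 0.0), std=(1.0, 1.0))
best = most_informative_direction(initial, candidates)

# Step 2: the user reports the pattern seen along x (spread of about 3),
# which is absorbed into the background; the data is then re-whitened.
updated = whiten_rows(data, mean=(0.0, 0.0), std=(3.0, 1.0))
remaining = max(projection_score(updated, d) for d in candidates)
```

In this toy run the first selected direction is the x-axis, where the data deviates most from the unit Gaussian background. Once that spread is added to the background, every candidate direction scores approximately zero, meaning the simplified model has nothing further to show the user.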
Subject: 113 Computer and information sciences
Rights:


Files in this item

File: Puolamaki2020_A ... iveVisualDataExplorati.pdf
Size: 3.672 Mb
Format: PDF
