Supervised Classification Using Balanced Training

Du, M, Pierce, M, Pivovarova, L & Yangarber, R 2014, Supervised Classification Using Balanced Training. in Unknown host publication. Lecture notes in artificial intelligence, no. 8791, Springer-Verlag, International Conference on Statistical Language and Speech Processing (SLSP 2014), Grenoble, France, 14/10/2014.

Title: Supervised Classification Using Balanced Training
Author: Du, Mian; Pierce, Matthew; Pivovarova, Lidia; Yangarber, Roman
Contributor organization: Department of Computer Science
Computational Linguistics research group / Roman Yangarber
Publisher: Springer-Verlag
Date: 2014-10
Language: eng
Number of pages: 12
Belongs to series: Unknown host publication
Belongs to series: Lecture notes in artificial intelligence
Abstract: We examine supervised learning for multi-class, multi-label text classification. We are interested in exploring classification in a real-world setting, where the distribution of labels may change dynamically over time. First, we compare the performance of an array of binary classifiers trained on the label distribution found in the original corpus against classifiers trained on balanced data, where we try to make the label distribution as nearly uniform as possible. We discuss the performance trade-offs between balanced and unbalanced training, and highlight the advantages of balancing the training set. Second, we compare the performance of two classifiers, Naive Bayes and SVM, with several feature-selection methods, using balanced training. We combine a Named-Entity-based rote classifier with the statistical classifiers to obtain better performance than either method alone.
Subject: 113 Computer and information sciences
Peer reviewed: Yes
Usage restriction: restrictedAccess
Self-archived version: submittedVersion

Files in this item

Files Size Format View
main.pdf 252.5Kb PDF View/Open
