Supervised Classification Using Balanced Training

dc.contributor University of Helsinki, Department of Computer Science en
dc.contributor.author Du, Mian
dc.contributor.author Pierce, Matthew
dc.contributor.author Pivovarova, Lidia
dc.contributor.author Yangarber, Roman
dc.date.accessioned 2015-02-03T13:16:21Z
dc.date.available 2015-02-03T13:16:21Z
dc.date.issued 2014-10
dc.identifier.citation Du, M., Pierce, M., Pivovarova, L. & Yangarber, R. 2014, Supervised Classification Using Balanced Training. In Unknown host publication. Lecture Notes in Artificial Intelligence, no. 8791, Springer-Verlag, International Conference on Statistical Language and Speech Processing (SLSP 2014), Grenoble, France, 14/10/2014. <http://grammars.grlmc.com/slsp2014/> en
dc.identifier.citation conference en
dc.identifier.other PURE: 40404480
dc.identifier.other PURE UUID: b257c30a-5b0c-4059-9c26-48949710a596
dc.identifier.other Scopus: 84921646704
dc.identifier.other ORCID: /0000-0001-5264-9870/work/68618654
dc.identifier.other ORCID: /0000-0002-0026-9902/work/81734768
dc.identifier.uri http://hdl.handle.net/10138/153192
dc.description.abstract We examine supervised learning for multi-class, multi-label text classification. We are interested in exploring classification in a real-world setting, where the distribution of labels may change dynamically over time. First, we compare the performance of an array of binary classifiers trained on the label distribution found in the original corpus against classifiers trained on balanced data, where we try to make the label distribution as nearly uniform as possible. We discuss the performance trade-offs between balanced and unbalanced training, and highlight the advantages of balancing the training set. Second, we compare the performance of two classifiers, Naive Bayes and SVM, with several feature-selection methods, using balanced training. We combine a Named-Entity-based rote classifier with the statistical classifiers to obtain better performance than either method alone. en
dc.format.extent 12
dc.language.iso eng
dc.publisher Springer-Verlag
dc.relation.ispartof Unknown host publication
dc.relation.ispartofseries Lecture notes in artificial intelligence
dc.relation.uri http://grammars.grlmc.com/slsp2014/
dc.rights en
dc.subject 113 Computer and information sciences en
dc.title Supervised Classification Using Balanced Training en
dc.type Conference contribution
dc.type.uri info:eu-repo/semantics/other
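
The abstract describes training one binary classifier per label on balanced data. The record does not say how the authors balance the training set, so the following is only a minimal sketch of one common approach: random undersampling of the majority class for each label. The function name `balanced_binary_sets`, the `seed` parameter, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def balanced_binary_sets(docs, seed=0):
    """Build one balanced binary training set per label.

    docs: list of (text, set_of_labels) pairs (multi-label).
    Returns {label: [(text, 1 or 0), ...]} with equally many
    positive (1) and negative (0) examples for each label.
    Assumption: balancing is done by random undersampling.
    """
    rng = random.Random(seed)
    labels = sorted({lab for _, labs in docs for lab in labs})
    training_sets = {}
    for label in labels:
        pos = [text for text, labs in docs if label in labs]
        neg = [text for text, labs in docs if label not in labs]
        n = min(len(pos), len(neg))  # size of the smaller class
        examples = [(t, 1) for t in rng.sample(pos, n)]
        examples += [(t, 0) for t in rng.sample(neg, n)]
        rng.shuffle(examples)
        training_sets[label] = examples
    return training_sets


# Hypothetical toy corpus, for illustration only.
corpus = [
    ("oil prices rise", {"energy", "markets"}),
    ("stocks fall on weak earnings", {"markets"}),
    ("new wind farm opens", {"energy"}),
    ("bank raises interest rates", {"markets"}),
    ("cup final goes to penalties", {"sports"}),
    ("bond yields climb", {"markets"}),
]

sets = balanced_binary_sets(corpus)
for label, examples in sets.items():
    counts = Counter(y for _, y in examples)
    print(label, "positives:", counts[1], "negatives:", counts[0])
```

Each per-label set could then be passed to a binary classifier such as Naive Bayes or an SVM. Undersampling discards data from the larger class; oversampling the smaller class is an alternative with the opposite trade-off.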

Files in this item

Files Size Format View
main.pdf 252.5Kb PDF View/Open
