New construction of ensemble classifiers for imbalanced datasets

Yun Zhai, Bingru Yang, Nan Ma, Da Ruan, Jan Wagemans

Research output: peer-review

Abstract

Learning from imbalanced data is a major challenge in machine learning. Imbalanced data sets are a significant problem because a classifier trained on them tends to ignore samples from classes that are under-represented in the training set. In this paper, we propose an ensemble-based learning algorithm in the form of a new ensemble classifier model, the SVM-C5.0 Ensemble Classifier Model (SCECM). SCECM adopts a differentiated sampling rate algorithm (DSRA) based on an improved AdaBoost algorithm, and further employs a unique classifier-selection strategy, a novel classifier integration approach, and an original classification decision-making method. Comparative experimental results show that the proposed approach improves performance on the minority class while preserving the ability to recognize examples from the majority classes.
Original language: English
Title of host publication: Proceedings of 2010 IEEE International Conference on Intelligent Systems and Knowledge Engineering
Place of Publication: Beijing, China
Pages: 228-233
State: Published - Nov 2010
Event: 2010 IEEE International Conference on Intelligent Systems and Knowledge Engineering - ISKE2010, Hangzhou
Duration: 15 Nov 2010 to 16 Nov 2010

Conference

Conference: 2010 IEEE International Conference on Intelligent Systems and Knowledge Engineering
Country/Territory: China
City: Hangzhou
Period: 2010-11-15 to 2010-11-16
