Standard measure and SVM measure for feature selection and their performance effect for text classification

Yusuke Adachi, Naoya Onimura, Takanori Yamashita, Sachio Hirokawa

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

8 Citations (Scopus)

Abstract

This paper compares the prediction performance of document classification under a variety of feature selection measures. Empirical experiments were conducted on the re0 dataset with 10 measures for feature selection and with an SVM. The results confirm that feature selection based on the SVM-score proposed by Sakai and Hirokawa (2012) outperforms the standard measures when the number of features is small. In fact, 100 words are enough to achieve performance similar to that obtained with all words. This feature selection performs well because the SVM-score captures the characteristic words not only of positive samples but of negative samples as well.
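The idea of ranking words by an SVM-derived score can be illustrated with a minimal sketch. It rests on assumptions the abstract does not spell out: the score of a word is taken to be its weight in a trained linear SVM, the SVM is trained with a simple Pegasos-style subgradient method, and features are picked from both ends of the ranking. The corpus, the function names `train_linear_svm` and `select_features`, and all parameter values are hypothetical; this is not the authors' implementation and does not use the re0 dataset.

```python
# Toy labelled corpus (hypothetical; NOT the re0 dataset used in the paper).
# Label +1 marks the positive class, -1 the negative class.
DOCS = [
    ("wheat grain harvest export", 1),
    ("grain wheat price export", 1),
    ("wheat crop grain report", 1),
    ("bank interest rate report", -1),
    ("interest rate bank loan", -1),
    ("bank loan rate price", -1),
]


def train_linear_svm(docs, lam=0.01, epochs=50):
    """Train a linear SVM on binary bag-of-words features.

    Uses deterministic Pegasos-style subgradient steps on the
    L2-regularised hinge loss. Returns a dict mapping word -> weight,
    which this sketch treats as the word's score (an assumption).
    """
    vocab = sorted({t for text, _ in docs for t in text.split()})
    w = {t: 0.0 for t in vocab}
    step = 0
    for _ in range(epochs):
        for text, y in docs:
            step += 1
            words = text.split()
            # Margin of the current document under the current weights.
            margin = y * sum(w[t] for t in words)
            # Regularisation step: shrink every weight by (1 - 1/step).
            shrink = 1.0 - 1.0 / step
            for t in w:
                w[t] *= shrink
            # Hinge-loss step: only documents inside the margin push weights.
            if margin < 1.0:
                eta = 1.0 / (lam * step)
                for t in words:
                    w[t] += eta * y
    return w


def select_features(w, k):
    """Return the k highest-weighted and the k lowest-weighted words."""
    ranked = sorted(w, key=lambda t: w[t])  # ascending by weight
    return ranked[::-1][:k], ranked[:k]


if __name__ == "__main__":
    weights = train_linear_svm(DOCS)
    positive_words, negative_words = select_features(weights, 2)
    print("characteristic of positive samples:", positive_words)
    print("characteristic of negative samples:", negative_words)
```

Taking words from both ends of the ranking mirrors the abstract's observation that the score captures characteristic words of negative samples as well as positive ones; a measure that only rewards association with the positive class would keep just the first list.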

Original language: English
Title of host publication: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Proceedings
Editors: Maria Indrawan-Santiago, Gabriele Anderst-Kotsis, Matthias Steinbauer, Ismail Khalil
Publisher: Association for Computing Machinery
Pages: 262-266
Number of pages: 5
ISBN (Electronic): 9781450348072
Publication status: Published - Nov 28 2016
Event: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016 - Singapore, Singapore
Duration: Nov 28 2016 to Nov 30 2016

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 18th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2016
Country/Territory: Singapore
City: Singapore
Period: 11/28/16 to 11/30/16

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
