Improvement of automatic Chinese text classification by combining multiple features

Xi Luo, Wataru Ohyama, Tetsushi Wakabayashi, Fumitaka Kimura

Research output: Contribution to journal > Article > peer-review

1 Citation (Scopus)


In this paper, we present an effective way of combining character-based (N-gram) and word-based approaches for Chinese text classification. Uni-gram and bi-gram features serve as the baseline model and are then combined with word features of length greater than or equal to 3. We also introduce a weight coefficient that assigns higher weights to word features. We further employ a serial approach based on feature transformation and dimension reduction techniques. The results of McNemar's test indicate that the proposed method significantly improves classification performance.
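The feature combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the weight coefficient is applied, so multiplying word counts by a constant is an assumption here, as are the function names `char_ngrams` and `combined_features`, the parameter names, and the pre-segmented word list (a real pipeline would obtain it from a word segmenter). The feature transformation and dimension reduction steps are not shown.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all contiguous character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def combined_features(text, words, word_weight=2.0, min_word_len=3):
    """Build one feature vector from character and word features.

    `text` is the raw string; `words` is the output of a word segmenter
    (hypothetical input, supplied by the caller). `word_weight` is an
    assumed multiplicative weight coefficient for word features.
    """
    features = Counter()
    # Baseline model: character uni-gram and bi-gram counts.
    for n in (1, 2):
        for gram in char_ngrams(text, n):
            features[("char", gram)] += 1.0
    # Word features: keep only words of length >= min_word_len and
    # scale their counts by the weight coefficient.
    for word in words:
        if len(word) >= min_word_len:
            features[("word", word)] += word_weight
    return features


text = "我喜欢自然语言处理"
words = ["我", "喜欢", "自然语言", "处理"]  # hypothetical segmentation
vec = combined_features(text, words, word_weight=2.0)
```

Keys are tagged with `"char"` or `"word"` so that a two-character word and the identical bi-gram never collide; shorter words are left to the N-gram baseline, which already covers them.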

Original language: English
Pages (from-to): 166-174
Number of pages: 9
Journal: IEEJ Transactions on Electrical and Electronic Engineering
Issue number: 2
Publication status: Published - Mar 1 2015
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering

