A method of extracting related words using standardized mutual information

Tomohiko Sugimachi, Akira Ishino, Masayuki Takeda, Fumihiro Matsuo

Research output: Contribution to journal, Article, peer-review

2 Citations (Scopus)


Techniques for automatically extracting related words are of great importance in many applications, such as query expansion and automatic thesaurus construction. In this paper, we propose a method of extracting related words based on statistical information about word co-occurrences in huge corpora. Mutual information is one such statistical measure and has been used mainly in natural language processing applications. A drawback, however, is that mutual information depends mainly on the frequencies of words. To overcome this difficulty, we propose a new measure, a normalized deviation of mutual information. We also reveal a correspondence between word ambiguity and related words using word relation graphs constructed with this measure.
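The abstract does not give the formula for the proposed measure, so the following is only a minimal sketch of the general idea, not the authors' method. It computes pointwise mutual information from sentence-level co-occurrence counts and then, as an assumption made purely for illustration, standardizes each score as a z-score over all co-occurring partners of a target word. The function names, the sentence-level co-occurrence window, and the z-score normalization are all hypothetical choices.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_counts(sentences):
    """Count sentences containing each word and each unordered word pair."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence)  # count each word at most once per sentence
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts, len(sentences)


def pmi(x, y, word_counts, pair_counts, n):
    """Pointwise mutual information log2(P(x,y) / (P(x) P(y))), or None if unseen."""
    joint = pair_counts[frozenset((x, y))]
    if joint == 0:
        return None
    return math.log2(joint * n / (word_counts[x] * word_counts[y]))


def standardized_scores(target, word_counts, pair_counts, n):
    """Z-score of PMI over the target's co-occurring partners.

    Assumption: this z-score is an illustrative stand-in for a
    "normalized deviation"; the paper's actual definition may differ.
    """
    scores = {}
    for word in word_counts:
        if word == target:
            continue
        value = pmi(target, word, word_counts, pair_counts, n)
        if value is not None:
            scores[word] = value
    if len(scores) < 2:
        return {}
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        return {word: 0.0 for word in scores}
    return {word: (value - mu) / sigma for word, value in scores.items()}


# Toy corpus: each inner list is one tokenized sentence (hypothetical data).
corpus = [["a", "b"], ["a", "b"], ["a", "c"], ["c", "d"]]
wc, pc, n = cooccurrence_counts(corpus)
z = standardized_scores("a", wc, pc, n)
related = sorted(z, key=z.get, reverse=True)  # partners ranked by score
```

In this toy corpus, `related` ranks `b` ahead of `c` for the target `a`, because `a` and `b` co-occur more often than their individual frequencies would predict. A word relation graph of the kind the abstract mentions could then be sketched by adding an edge between two words whenever such a score exceeds a chosen threshold, though the threshold and graph construction used in the paper are not stated here.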

Original language: English
Pages (from-to): 478-485
Number of pages: 8
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publication status: Published - 2003

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

