Worst case and a distribution-based case analyses of sampling for rule discovery based on generality and accuracy

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

In this paper, we propose two sampling theories of rule discovery based on generality and accuracy. The first theory concerns the worst case: it extends a preliminary version of PAC learning, which represents a worst-case analysis for classification. In our analysis, a rule is defined as a probabilistic constraint of true assignment to the class attribute for corresponding examples, and we mainly analyze the case in which we try to avoid finding a bad rule. Effectiveness of our approach is demonstrated through examples for conjunction-rule discovery. The second theory concerns a distribution-based case: it represents the conditions that a rule exceeds pre-specified thresholds for generality and accuracy with high reliability. The idea is to assume a 2-dimensional normal distribution for two probabilistic variables, and obtain the conditions based on their confidence region. This approach has been validated experimentally using 21 benchmark data sets in the machine learning community against conventional methods each of which evaluates the reliability of generality. Discussions on related work are provided for PAC learning, multiple comparison, and analysis of association-rule discovery.
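To give a concrete sense of the kind of worst-case sample-size reasoning that PAC learning provides, the following is a minimal sketch of the standard textbook bound for a consistent learner over a finite hypothesis class, m >= (ln|H| + ln(1/delta)) / epsilon. It is not the bound derived in this paper, which extends the analysis to generality and accuracy of rules. The function names and the choice of counting conjunctions over Boolean attributes as 3^n are assumptions made for illustration.

```python
import math


def conjunction_class_size(num_attributes: int) -> int:
    """Number of conjunctions over Boolean attributes (illustrative assumption).

    Each attribute appears positively, appears negated, or is absent,
    which gives 3 ** n candidate conjunctions.
    """
    if num_attributes < 0:
        raise ValueError("num_attributes must be non-negative")
    return 3 ** num_attributes


def pac_sample_size(num_hypotheses: int, epsilon: float, delta: float) -> int:
    """Standard finite-class PAC bound for a consistent learner.

    Returns the smallest integer m with
        m >= (ln|H| + ln(1/delta)) / epsilon,
    which suffices so that, with probability at least 1 - delta, every
    hypothesis consistent with the sample has true error at most epsilon.
    This is the textbook bound, not the one derived in the paper.
    """
    if num_hypotheses < 1:
        raise ValueError("num_hypotheses must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    bound = (math.log(num_hypotheses) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)


if __name__ == "__main__":
    # Hypothetical setting: 10 Boolean attributes, error 0.1, confidence 0.95.
    size = conjunction_class_size(10)
    print(pac_sample_size(size, epsilon=0.1, delta=0.05))
```

The bound grows only logarithmically with the number of candidate rules but linearly with 1/epsilon, which is why tightening the error tolerance dominates the sample cost in this style of analysis.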

Original language: English
Pages (from-to): 29-36
Number of pages: 8
Journal: Applied Intelligence
Volume: 22
Issue number: 1
Publication status: Published - Jan 2005
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
