On learning multicategory classification with sample queries

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Consider the pattern recognition problem of learning multicategory classification from a labeled sample, for instance, the problem of learning character recognition where a category corresponds to an alphanumeric letter. The classical theory of pattern recognition assumes labeled examples appear according to the unknown underlying pattern-class conditional probability distributions where the pattern classes are picked randomly according to their a priori probabilities. In this paper we pose the following question: Can the learning accuracy be improved if labeled examples are independently randomly drawn according to the underlying class conditional probability distributions but the pattern classes are chosen not necessarily according to their a priori probabilities? We answer this in the affirmative by showing that there exists a tuning of the sub-sample proportions which minimizes a loss criterion. The tuning is relative to the intrinsic complexity of the Bayes-classifier. As this complexity depends on the underlying probability distributions which are assumed to be unknown, we provide an algorithm which learns the proportions in an on-line manner utilizing sample querying which asymptotically minimizes the criterion. In practice, this algorithm may be used to boost the performance of existing learning classification algorithms by apportioning better sub-sample proportions.
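The idea of tuning sub-sample proportions can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not state the loss criterion or the update rule, so the criterion below (a weighted sum of terms of the form sqrt(c_k / m_k)), the function names, and the numeric values are all assumptions chosen for illustration. The sketch also treats the per-class complexities as known, whereas the paper assumes they are unknown and learns the proportions on-line.

```python
import math

def criterion(sizes, weights, complexities):
    """Hypothetical loss criterion (an assumption, not taken from the paper):
    sum over classes of w_k * sqrt(c_k / m_k)."""
    return sum(w * math.sqrt(c / m)
               for w, c, m in zip(weights, complexities, sizes))

def gain(k, sizes, weights, complexities):
    """Decrease in the hypothetical criterion from one more example of class k."""
    a = weights[k] * math.sqrt(complexities[k])
    return a * (1 / math.sqrt(sizes[k]) - 1 / math.sqrt(sizes[k] + 1))

def greedy_allocation(budget, weights, complexities):
    """Query one labeled example at a time, always from the class whose
    next example lowers the hypothetical criterion the most."""
    n = len(weights)
    sizes = [1] * n  # start with one example per class
    for _ in range(budget - n):
        best = max(range(n), key=lambda k: gain(k, sizes, weights, complexities))
        sizes[best] += 1
    return sizes

# Illustrative values only: assumed class priors and assumed complexities.
weights = [0.5, 0.3, 0.2]
complexities = [1.0, 4.0, 16.0]

tuned = greedy_allocation(300, weights, complexities)
uniform = [100, 100, 100]

print("tuned sub-sample sizes:", tuned)
print("criterion (tuned):  ", criterion(tuned, weights, complexities))
print("criterion (uniform):", criterion(uniform, weights, complexities))
```

Under these assumptions the tuned allocation gives more examples to the class with the largest assumed complexity, even though that class has the smallest prior, and it attains a lower value of the hypothetical criterion than an equal split of the same budget.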

Original language: English
Pages (from-to): 298-327
Number of pages: 30
Journal: Information and Computation
Issue number: 2
State: Published - 15 Sep 2003
Externally published: Yes


  • Multicategory classification
  • On-line learning algorithm
  • Pattern recognition
  • Stochastic gradient descent learning
  • Structural risk minimization

