Efficient error setting for subspace miners

Eran Shaham, David Sarne, Boaz Ben-Moshe

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

A typical mining problem is the extraction of patterns from subspaces of multidimensional data. Such patterns, known as biclusters, comprise subsets of objects that behave similarly across subsets of attributes. They may overlap each other, i.e., objects/attributes may belong to several patterns, or to none. For many miners, a key input parameter is the maximum allowed error, which greatly affects the quality, quantity and coherency of the mined clusters. As the error is dataset dependent, setting it demands either domain knowledge or some trial-and-error. The paper presents a new method for automatically setting the error to the value that maximizes the number of clusters mined. This error value is strongly correlated with the value for which performance scores are maximized. The correlation is extensively evaluated using six datasets, two mining algorithms, and seven prevailing performance measures, and the method is compared with five methods from the prior literature, demonstrating a substantial improvement in the mining score.
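The selection rule stated in the abstract (choose the error for which the miner returns the most clusters) can be illustrated with a minimal sketch. This is a brute-force sweep over candidate values, not the paper's efficient method, which the abstract does not detail. The names `pick_error` and `toy_miner`, the one-dimensional data, and the minimum cluster size are all hypothetical and serve only to make the example runnable.

```python
from typing import Callable, List, Sequence, Tuple


def pick_error(
    candidates: Sequence[int],
    mine: Callable[[int], List[List[int]]],
) -> Tuple[int, int]:
    """Return (error, cluster_count) for the candidate yielding the most clusters.

    Ties are broken in favour of the earliest candidate in the sequence.
    """
    best_error, best_count = candidates[0], -1
    for err in candidates:
        count = len(mine(err))
        if count > best_count:
            best_error, best_count = err, count
    return best_error, best_count


def toy_miner(values: Sequence[int], err: int, min_size: int = 3) -> List[List[int]]:
    """Hypothetical stand-in for a real subspace miner.

    Greedily groups sorted values so that each group spans at most `err`,
    and keeps only groups with at least `min_size` members.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for v in sorted(values):
        # Close the current group once the next value would exceed the error.
        if current and v - current[0] > err:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(v)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Made-up illustrative data: three groups with different internal spreads.
data = [10, 11, 12, 50, 53, 56, 90, 98, 106]
best_error, best_count = pick_error([1, 2, 6, 16, 100], lambda e: toy_miner(data, e))
print(best_error, best_count)
```

In this toy setting a very small error yields no group that reaches the minimum size, while a very large error merges everything into a single group, so the cluster count peaks at an intermediate error value.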

Original language: English
Title of host publication: Machine Learning and Data Mining in Pattern Recognition - 10th International Conference, MLDM 2014, Proceedings
Pages: 1-15
Number of pages: 15
State: Published - 2014
Event: 10th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2014 - St. Petersburg, Russian Federation
Duration: 21 Jul 2014 to 24 Jul 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8556 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2014
Country/Territory: Russian Federation
City: St. Petersburg
Period: 21/07/14 to 24/07/14

Keywords

  • Biclustering
  • Error Setting
  • Subspace Mining
