Joint cluster analysis of attribute data and relationship data: The Connected k-Center problem

Martin Ester, Rong Ge, Byron J. Gao, Zengjian Hu, Boaz Ben-Moshe

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

36 Scopus citations

Abstract

Attribute data and relationship data are two principal types of data, representing the intrinsic and extrinsic properties of entities. While attribute data has been the main source of data for cluster analysis, relationship data such as social networks or metabolic networks are becoming increasingly available. It is also common for the two data types to carry orthogonal information, as in market segmentation and community identification, which calls for a joint cluster analysis of both data types to achieve more accurate results. For this purpose, we introduce the novel Connected k-Center problem, which takes into account attribute data as well as relationship data. We analyze the complexity of this problem and prove its NP-completeness. We also present a constant factor approximation algorithm, based on which we further design NetScan, a heuristic algorithm that is efficient for large, real databases. Our experimental evaluation demonstrates the meaningfulness and accuracy of the NetScan results.

Original language: English
Title of host publication: Proceedings of the Sixth SIAM International Conference on Data Mining
Pages: 246-257
Number of pages: 12
State: Published - 2006
Externally published: Yes
Event: Sixth SIAM International Conference on Data Mining - Bethesda, MD, United States
Duration: 20 Apr 2006 to 22 Apr 2006

Publication series

Name: Proceedings of the Sixth SIAM International Conference on Data Mining
Volume: 2006

Conference

Conference: Sixth SIAM International Conference on Data Mining
Country/Territory: United States
City: Bethesda, MD
Period: 20/04/06 to 22/04/06
