Nearly optimal classification for semimetrics

Lee-Ad Gottlieb, Aryeh Kontorovich, Pinhas Nisnevitch

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)

Abstract

We initiate the rigorous study of classification in semimetric spaces, which are point sets with a distance function that is non-negative and symmetric, but need not satisfy the triangle inequality. We define the density dimension dens and discover that it plays a central role in the statistical and algorithmic feasibility of learning in semimetric spaces. We compute this quantity for several widely used semimetrics and present nearly optimal sample compression algorithms, which are then used to obtain generalization guarantees, including fast rates. Our claim of near-optimality holds in both computational and statistical senses. When the sample has radius R and margin γ, we show that it can be compressed down to roughly d = (R/γ)^dens points, and further that finding a significantly better compression is algorithmically intractable unless P = NP. This compression implies generalization via standard Occam-type arguments, to which we provide a nearly matching lower bound.
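As a rough illustration of the compression idea (a sketch only, not the paper's algorithm), the Python snippet below greedily extracts a γ-net from a sample under an arbitrary semimetric: every sample point ends up within γ of some retained center. For a sample of radius R in a space of density dimension dens, such a net has on the order of (R/γ)^dens points, which is the compression size stated above. All names here (greedy_gamma_net, semimetric, gamma) are illustrative assumptions, not identifiers from the paper.

import numpy as np

def greedy_gamma_net(points, semimetric, gamma):
    # Greedy gamma-net: keep a point only if it lies farther than
    # gamma from every center kept so far. This uses only that
    # `semimetric` is non-negative and symmetric; no triangle
    # inequality is required. Illustrative sketch, not the paper's
    # nearly optimal compression scheme.
    centers = []
    for i, p in enumerate(points):
        if all(semimetric(p, points[c]) > gamma for c in centers):
            centers.append(i)
    return centers

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(200, 2))                # toy sample in the unit square
    l1 = lambda a, b: float(np.abs(a - b).sum())  # L1 distance as a stand-in semimetric
    net = greedy_gamma_net(X, l1, gamma=0.2)
    print(f"compressed {len(X)} sample points to {len(net)} net points")

The greedy pass is quadratic in the sample size; the point of the illustration is that the size of the retained set, not the runtime, is what drives the Occam-type generalization bound.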

Original language: English
Journal: Journal of Machine Learning Research
Volume: 18
Publication status: Published - 1 Apr 2017
