TY - GEN
T1 - Efficient regression in metric spaces via approximate Lipschitz extension
AU - Gottlieb, Lee-Ad
AU - Kontorovich, Aryeh
AU - Krauthgamer, Robert
PY - 2013
Y1 - 2013
N2 - We present a framework for performing efficient regression in general metric spaces. Roughly speaking, our regressor predicts the value at a new point by computing a Lipschitz extension - the smoothest function consistent with the observed data - while performing an optimized structural risk minimization to avoid overfitting. The offline (learning) and online (inference) stages can be solved by convex programming, but this naive approach has runtime complexity O(n^3), which is prohibitive for large datasets. We design instead an algorithm that is fast when the doubling dimension, which measures the "intrinsic" dimensionality of the metric space, is low. We make dual use of the doubling dimension: first, on the statistical front, to bound fat-shattering dimension of the class of Lipschitz functions (and obtain risk bounds); and second, on the computational front, to quickly compute a hypothesis function and a prediction based on Lipschitz extension. Our resulting regressor is both asymptotically strongly consistent and comes with finite-sample risk bounds, while making minimal structural and noise assumptions.
AB - We present a framework for performing efficient regression in general metric spaces. Roughly speaking, our regressor predicts the value at a new point by computing a Lipschitz extension - the smoothest function consistent with the observed data - while performing an optimized structural risk minimization to avoid overfitting. The offline (learning) and online (inference) stages can be solved by convex programming, but this naive approach has runtime complexity O(n^3), which is prohibitive for large datasets. We design instead an algorithm that is fast when the doubling dimension, which measures the "intrinsic" dimensionality of the metric space, is low. We make dual use of the doubling dimension: first, on the statistical front, to bound fat-shattering dimension of the class of Lipschitz functions (and obtain risk bounds); and second, on the computational front, to quickly compute a hypothesis function and a prediction based on Lipschitz extension. Our resulting regressor is both asymptotically strongly consistent and comes with finite-sample risk bounds, while making minimal structural and noise assumptions.
KW - convex program
KW - metric space
KW - regression
UR - http://www.scopus.com/inward/record.url?scp=84879861898&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-39140-8_3
DO - 10.1007/978-3-642-39140-8_3
M3 - Conference contribution
AN - SCOPUS:84879861898
SN - 9783642391392
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 43
EP - 58
BT - Similarity-Based Pattern Recognition - Second International Workshop, SIMBAD 2013, Proceedings
T2 - 2nd International Workshop on Similarity-Based Pattern Analysis and Recognition, SIMBAD 2013
Y2 - 3 July 2013 through 5 July 2013
ER -