SPEAKER: Eddie Aamari (École Normale Supérieure de Paris)
TITLE: A theory of stratification learning: clustering-by-dimensionality with reconstruction
ABSTRACT: Given i.i.d. random variables X_1, ..., X_n in R^D drawn from a stratified mixture \cup_k M_k of immersed C^2-manifolds of different dimensions d_k, with k at most K, we study the minimax estimation of the family M_k and the associated unsupervised clustering problem. We provide a constructive algorithm that adaptively estimates each mixture component M_k at its optimal dimension-specific rate (log n / n)^{2/d_k}. The method is based on an ascending hierarchical co-detection of points belonging to different layers, which also identifies the number of layers K and the dimensions d_k, assigns each point X_i to a layer accurately, and estimates tangent spaces optimally. The results hold without any reach assumption on the M_k's and regardless of the intersection configurations M_k \cap M_{k'}. They open the way to a broad clustering framework, where each mixture component (or stratum) M_k models a cluster, emanating from a specific nonlinear correlation phenomenon leaving only d_k local degrees of freedom.
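
To give a concrete feel for clustering-by-dimensionality, here is a minimal sketch that labels each sample point with a local dimension estimate obtained from PCA on its nearest neighbours. This is a standard local-PCA heuristic and not the speaker's co-detection algorithm; it carries none of the guarantees stated in the abstract. The function name `local_dimension`, the parameters `k` and `threshold`, and the toy data (a flat, noise-free segment and a flat square patch in R^3, kept well apart so that neighbourhoods never mix strata) are all assumptions made for illustration.

```python
import numpy as np


def local_dimension(points, k=15, threshold=0.05):
    """Estimate an intrinsic dimension at every point via local PCA.

    Hypothetical helper: `k` (neighbourhood size) and `threshold`
    (relative eigenvalue cutoff) are illustrative choices, not values
    taken from the talk.
    """
    n = points.shape[0]
    # Pairwise squared distances (O(n^2) memory, fine for small n).
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    dims = np.empty(n, dtype=int)
    for i in range(n):
        # Indices of the k nearest neighbours, the point itself included.
        nbrs = np.argsort(sq_dists[i])[:k]
        centered = points[nbrs] - points[nbrs].mean(axis=0)
        # Eigenvalues of the local covariance matrix, largest first.
        eigvals = np.linalg.eigvalsh(centered.T @ centered / k)[::-1]
        # Count the directions carrying a non-negligible share of variance.
        dims[i] = int(np.sum(eigvals > threshold * eigvals[0]))
    return dims


# Toy stratified sample in R^3: a 1-dimensional segment and a
# 2-dimensional square patch, separated along the third axis.
rng = np.random.default_rng(0)
n_per_stratum = 200
curve = np.column_stack([
    rng.uniform(0.0, 1.0, n_per_stratum),
    np.zeros(n_per_stratum),
    np.full(n_per_stratum, 5.0),
])
sheet = np.column_stack([
    rng.uniform(0.0, 1.0, n_per_stratum),
    rng.uniform(0.0, 1.0, n_per_stratum),
    np.zeros(n_per_stratum),
])
points = np.vstack([curve, sheet])

# Each point is assigned to the "layer" given by its estimated dimension.
dims = local_dimension(points)
print("estimated dimensions on the segment:", np.unique(dims[:n_per_stratum]))
print("estimated dimensions on the patch:  ", np.unique(dims[n_per_stratum:]))
```

A heuristic of this kind is expected to degrade near intersections of strata and on highly curved pieces, which are the configurations the abstract says the results cover.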