Department of Mathematics,
University of California San Diego

****************************

Math 278C - Optimization seminar and Data Science

Alex Cloninger

UCSD

Two-sample Statistics and Distance Metrics Based on Anisotropic Kernels

Abstract:

This talk introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions are locally low-dimensional, the proposed test can be made more powerful at distinguishing certain alternatives by incorporating local covariance matrices and constructing an anisotropic kernel. The kernel matrix is asymmetric; it computes the affinity between $n$ data points and a set of $n_R$ reference points, where $n_R$ can be drastically smaller than $n$. While the proposed statistic can be viewed as a special class of Reproducing Kernel Hilbert Space MMD, the consistency of the test is proved, under mild assumptions on the kernel, as long as $\Vert p-q \Vert \sim O(n^{-1/2+\delta})$ for any $\delta > 0$, based on a result on the convergence in distribution of the test statistic. Applications to flow cytometry and diffusion MRI datasets, which motivate the proposed approach to comparing distributions, are demonstrated.
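
As a rough illustration of the asymmetric kernel described in the abstract, the Python sketch below computes a reference-point statistic of the general kind the abstract outlines. It is not the speaker's implementation, and the exact definition used in the talk may differ. Several things here are assumptions made for the example: the Gaussian form of the affinity, the restriction to diagonal local covariances (the abstract refers to general local covariance matrices), the averaging over reference points, and all function and parameter names (`anisotropic_affinity`, `mean_embedding`, `reference_mmd`, `inv_vars`).

```python
import math


def anisotropic_affinity(x, r, inv_var):
    """Gaussian affinity between data point x and reference point r.

    Assumption: the local covariance at r is diagonal, and inv_var holds
    the reciprocal of each diagonal entry. Directions with larger local
    variance are therefore penalized less.
    """
    q = sum(w * (xi - ri) ** 2 for xi, ri, w in zip(x, r, inv_var))
    return math.exp(-0.5 * q)


def mean_embedding(samples, refs, inv_vars):
    """Average affinity of a sample set to each reference point.

    Conceptually this averages the columns of the n-by-n_R asymmetric
    kernel matrix, giving one number per reference point.
    """
    return [
        sum(anisotropic_affinity(x, r, w) for x in samples) / len(samples)
        for r, w in zip(refs, inv_vars)
    ]


def reference_mmd(xs, ys, refs, inv_vars):
    """Mean squared difference of the two embeddings over reference points.

    Hypothetical statistic for illustration only; cost is linear in the
    number of data points times the number of reference points.
    """
    hx = mean_embedding(xs, refs, inv_vars)
    hy = mean_embedding(ys, refs, inv_vars)
    return sum((a - b) ** 2 for a, b in zip(hx, hy)) / len(refs)
```

In this sketch, comparing a sample with itself gives exactly zero, and the work scales with $n \cdot n_R$ rather than $n^2$, which is where a small reference set would help.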

Jiawang Nie

October 11, 2017

4:00 PM

AP&M 2402

****************************