Department of Mathematics,
University of California San Diego
****************************
Math 278B: Mathematics of Information, Data, and Signals
Sanjoy Dasgupta
UCSD
Recent progress on interpretable clustering
Abstract:
The widely-used k-means procedure returns k clusters that have arbitrary convex shapes. In high dimension, such a clustering might not be easy to understand. A more interpretable alternative is to constraint the clusters to be the leaves of a decision tree with axis-parallel splits; then each cluster is a hyperrectangle given by a small number of features.
Is it always possible to find clusterings that are intepretable in this sense and yet have k-means cost that is close to the unconstrained optimum? A recent line of work has answered this in the affirmative and moreover shown that these interpretable clusterings are easy to construct.
I will give a survey of these results: algorithms, methods of analysis, and open problems.
January 24, 2025
11:00 AM
APM 2402
Research Areas
Mathematics of Information, Data, and Signals****************************