A simple algorithm for spectral clustering of random graphs
Samuel Cole , Professor, University of Illinois at Chicago
1 – 2PM
Friday Mar 24, 2017
POB 6.304
Abstract
A basic problem in data science is to partition a data set into “clusters" of similar data. When the data are represented in a matrix, the spectrum of the matrix can be used to capture this similarity. This talk will consider how this spectral clustering performs on random matrices. Specifically, we consider the planted partition model, in which $n = ks$ vertices of a random graph are partitioned into $k$ clusters, each of size $s$. Edges between vertices in the same cluster and different clusters are included with constant probability $p$ and $q$, respectively (where $0 \le q < p \le 1$). We present a simple, efficient algorithm that, with high probability, recovers the clustering as long as the cluster sizes are are least $\Omega(\sqrt{n})$.