VLFeat Tutorial: k-means

Source: 二三娱乐

Running K-means

KMeans is a clustering algorithm. Its purpose is to partition a set of vectors into K groups, each clustered around a common mean vector. This can also be thought of as approximating each input vector with one of the means, so the clustering process finds, in principle, the best dictionary or codebook with which to vector quantize the data.
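To make the codebook view concrete, here is a small sketch in plain MATLAB (no VLFeat required). The points X and the two means C are made-up toy values, not taken from the tutorial. Each point is assigned to its closest mean, and the sum of the resulting squared distances is the quantization energy that k-means tries to make small under the Euclidean distance.

```matlab
% Toy data: four 2D points (one per column) and a hypothetical codebook of two means.
X = [0 0 4 4 ;
     0 1 0 1] ;
C = [0   4 ;
     0.5 0.5] ;

% D(j, i) is the squared Euclidean distance between mean j and point i.
D = zeros(size(C, 2), size(X, 2)) ;
for j = 1:size(C, 2)
  D(j, :) = sum(bsxfun(@minus, X, C(:, j)).^2, 1) ;
end

% Vector quantization: each point is represented by the index of its closest mean.
[minDist, assignments] = min(D, [], 1) ;

% Quantization energy: total squared distance between the points and their means.
energy = sum(minDist) ;
```

Here the first two points fall to the first mean and the last two to the second, each at squared distance 0.25, so the energy is 1.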
Consider a dataset containing 5000 randomly sampled 2D points:
numData = 5000 ;
dimension = 2 ;
data = rand(dimension, numData) ;
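The clustering step itself can be sketched as follows, assuming VLFeat is installed and on the MATLAB path. The value numClusters = 30 is an illustrative choice, not one stated in the text above. The call produces the centers variable that the later nearest-center lookup relies on.

```matlab
% Requires VLFeat on the MATLAB path (e.g. after running vl_setup).
numData = 5000 ;
dimension = 2 ;
data = rand(dimension, numData) ;

% Illustrative number of clusters (an assumption, not given in the text above).
numClusters = 30 ;

% centers: dimension x numClusters matrix, one cluster mean per column.
% assignments: 1 x numData vector holding the cluster index of each data point.
[centers, assignments] = vl_kmeans(data, numClusters) ;
```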

KMeans clustering of 5000 randomly sampled data points. The black dots are the cluster centers.

Given a new data point x, this can be mapped to one of the clusters by looking for the closest of the cluster centers, stored as the columns of centers:
x = rand(dimension, 1) ;
[~, k] = min(vl_alldist(x, centers)) ;

Comparisons of Elkan, Lloyd and ANN for different values of MaxNumComparison, expressed as a fraction of the number of clusters. The figure reports the duration, final energy value, and speedup factor, using both the serial and parallel versions of the code. The figure was generated using .
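The three algorithms named in the caption can be selected through the 'Algorithm' option of vl_kmeans. The following is a minimal sketch, assuming VLFeat is on the MATLAB path. The option names are written as recalled from the vl_kmeans help text and should be checked with help vl_kmeans. numClusters = 30 and the one-tenth fraction are illustrative choices, not the settings used for the figure.

```matlab
% Requires VLFeat on the MATLAB path (e.g. after running vl_setup).
data = rand(2, 5000) ;
numClusters = 30 ;                    % illustrative value (assumption)
maxComp = ceil(numClusters / 10) ;    % one tenth of the clusters (assumed fraction)

% Lloyd: compares each point with every center at each iteration.
[cLloyd, aLloyd, eLloyd] = vl_kmeans(data, numClusters, 'Algorithm', 'Lloyd') ;

% Elkan: uses the triangle inequality to skip some distance computations.
[cElkan, aElkan, eElkan] = vl_kmeans(data, numClusters, 'Algorithm', 'Elkan') ;

% ANN: approximate assignment; MaxNumComparisons bounds the comparisons per point.
[cAnn, aAnn, eAnn] = vl_kmeans(data, numClusters, ...
                               'Algorithm', 'ANN', ...
                               'MaxNumComparisons', maxComp) ;

% Report the final energy returned by each run.
fprintf('energy: Lloyd %.3f, Elkan %.3f, ANN %.3f\n', eLloyd, eElkan, eAnn) ;
```

Because each call starts from its own random initialization, the three energies are only roughly comparable from a single run. Wrapping each call in tic and toc gives a rough timing comparison.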