Clustering API Reference

K-means Elbow Plot Recipe

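  using Plots
  # Assumes the clustering functions documented below are loaded and X holds the data.
  ExplainedVar = Float64[]
  for K in 1:10
      km = KMeans( X, K; tolerance = 1e-14, maxiters = 1000 )
      TCSS = TotalClusterSS( km )
      WCSS = WithinClusterSS( km )
      #BCSS = BetweenClusterSS( km )
      # WCSS / TCSS is the share of total variation left within clusters,
      # so the curve falls as K grows; look for the K where it flattens.
      push!(ExplainedVar, WCSS / TCSS)
  end
  scatter(1:10, ExplainedVar, title = "Elbow Plot", ylabel = "WCSS/TCSS", xlabel = "Clusters (#)", label = "K-means" )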

Functions

BetweenClusterSS( Clustered::ClusterModel )

Returns the between-cluster sum of squares of a ClusterModel object as a scalar.

KMeans( X, Clusters; tolerance = 1e-8, maxiters = 200 )

Returns a ClusterModel object after clustering the data in X with MacQueen's K-means algorithm. Clusters is the K parameter, that is, the number of clusters. Iteration is controlled by the tolerance and maxiters keyword arguments.

MacQueen, J. B. (1967). Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1. University of California Press. pp. 281–297.
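
For orientation, the kind of iteration a K-means routine performs can be sketched in plain Julia. This is a batch (Lloyd-style) variant written for illustration only; it is not the package's implementation, nor MacQueen's original online update. The name simple_kmeans, its rng argument, and the convention that rows of X are observations are all assumptions made for this sketch.

```julia
using Random, Statistics

# Hypothetical helper: batch (Lloyd-style) K-means on the rows of X.
function simple_kmeans(X::AbstractMatrix{<:Real}, K::Integer;
                       tolerance = 1e-8, maxiters = 200,
                       rng = Random.default_rng())
    n = size(X, 1)
    # Start from K distinct observations chosen at random.
    centroids = float.(X[randperm(rng, n)[1:K], :])
    labels = zeros(Int, n)
    for _ in 1:maxiters
        # Assignment step: each row goes to its nearest centroid.
        for i in 1:n
            labels[i] = argmin([sum(abs2, X[i, :] .- centroids[k, :]) for k in 1:K])
        end
        # Update step: move each centroid to the mean of its members.
        updated = copy(centroids)
        for k in 1:K
            members = findall(==(k), labels)
            isempty(members) || (updated[k, :] = vec(mean(X[members, :], dims = 1)))
        end
        # Stop once the centroids have essentially stopped moving.
        shift = maximum(abs.(updated .- centroids))
        centroids = updated
        shift < tolerance && break
    end
    return labels, centroids
end

# Two well-separated pairs of points (made-up data).
X = [0.0 0.0; 0.1 0.0; 5.0 5.0; 5.1 5.0]
labels, centroids = simple_kmeans(X, 2)
```

The tolerance and maxiters keywords here mirror the names in the KMeans signature, but whether the package measures convergence the same way is not stated in this reference.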

TotalClusterSS( Clustered::ClusterModel )

Returns the total sum of squares of a ClusterModel object as a scalar.

WithinClusterSS( Clustered::ClusterModel )

Returns the within-cluster sum of squares of a ClusterModel object as a scalar.
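
Under the conventional definitions, the total, within-cluster, and between-cluster sums of squares satisfy TCSS = WCSS + BCSS. The sketch below computes all three from scratch to show that identity. It assumes rows are observations and uses a hypothetical helper, ss_decomposition; it illustrates the textbook definitions and is not necessarily how the functions above compute their results.

```julia
using Statistics

# Hypothetical helper: conventional sum-of-squares decomposition for labelled rows of X.
function ss_decomposition(X::AbstractMatrix{<:Real}, labels::AbstractVector{<:Integer})
    grand = mean(X, dims = 1)                  # 1 × p grand mean
    total = sum(abs2, X .- grand)              # total sum of squares
    within = 0.0
    between = 0.0
    for k in unique(labels)
        members = X[labels .== k, :]           # rows assigned to cluster k
        center = mean(members, dims = 1)       # 1 × p cluster mean
        within += sum(abs2, members .- center)
        between += size(members, 1) * sum(abs2, center .- grand)
    end
    return (total = total, within = within, between = between)
end

# Four made-up points in two clusters.
X = [0.0 0.0; 0.0 2.0; 4.0 0.0; 4.0 2.0]
ss = ss_decomposition(X, [1, 1, 2, 2])
```

Because the three terms sum this way, the ratio WCSS / TCSS used in the elbow recipe equals 1 - BCSS / TCSS under these definitions.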
