kmeans
kmeans(X2d, k, options?): object
Defined in: packages/pleco-xa/src/cluster/kmeans.js:34
K-means clustering — Lloyd’s algorithm with greedy k-means++ seeding.
Faithful port of scikit-learn’s sklearn.cluster.KMeans (algorithm=“lloyd”):
- greedy k-means++ initialization with 2 + floor(log k) local trials (Arthur & Vassilvitskii 2007; sklearn _kmeans_plusplus, _kmeans.py l.180)
- Lloyd expectation-maximization with strict-label and center-shift tolerance convergence (sklearn _kmeans_single_lloyd, _kmeans.py l.630)
- nInit independent restarts, keeping the lowest-inertia result
- dataset-scaled tolerance mean(var(X, axis=0)) * tol (sklearn _tolerance)
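As a rough illustration of the Lloyd step named in the list above, the sketch below assigns each row to its nearest center, recomputes the centers as cluster means, and reports the total squared center shift that a tolerance check would use. `sqDist` and `lloydStep` are hypothetical helpers written for this example, not functions from the library, and the handling of empty clusters is an assumption.

```javascript
// Squared Euclidean distance between two equal-length vectors.
function sqDist(a, b) {
  let s = 0;
  for (let j = 0; j < a.length; j++) {
    const d = a[j] - b[j];
    s += d * d;
  }
  return s;
}

// One Lloyd iteration (hypothetical helper, not the library's API):
// assign each row to its nearest center, then move each center to the
// mean of the rows assigned to it.
function lloydStep(X, centers) {
  const k = centers.length;
  const nFeatures = X[0].length;
  const labels = new Int32Array(X.length);
  const sums = Array.from({ length: k }, () => new Array(nFeatures).fill(0));
  const counts = new Array(k).fill(0);

  for (let i = 0; i < X.length; i++) {
    let best = 0;
    let bestDist = Infinity;
    for (let c = 0; c < k; c++) {
      const d = sqDist(X[i], centers[c]);
      if (d < bestDist) {
        bestDist = d;
        best = c;
      }
    }
    labels[i] = best;
    counts[best]++;
    for (let j = 0; j < nFeatures; j++) sums[best][j] += X[i][j];
  }

  // Assumption for this sketch: an empty cluster keeps its previous center.
  const newCenters = sums.map((row, c) =>
    counts[c] === 0 ? Array.from(centers[c]) : row.map((v) => v / counts[c])
  );

  // Total squared shift of all centers; a caller would compare this
  // against the dataset-scaled tolerance.
  let shift = 0;
  for (let c = 0; c < k; c++) shift += sqDist(centers[c], newCenters[c]);

  return { labels, centers: newCenters, shift };
}

const X = [[0, 0], [0, 2], [10, 0], [10, 2]];
const step = lloydStep(X, [[1, 1], [9, 1]]);
// step.labels  -> Int32Array [0, 0, 1, 1]
// step.centers -> [[0, 1], [10, 1]]
// step.shift   -> 2
```

A full run would repeat this step until the labels stop changing or the shift falls to the tolerance or below, up to maxIter iterations.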
Determinism is total: the run is driven by a seeded mulberry32 PRNG. With a
fixed seed the labels, centers and inertia are reproducible bit-for-bit;
there is deliberately NO Date.now()/Math.random() fallback.
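For readers unfamiliar with mulberry32, the sketch below shows the commonly published form of the generator and why a fixed seed yields a repeatable stream. It is an illustration only; the library's own copy may differ in detail.

```javascript
// mulberry32: a small seeded 32-bit PRNG that returns floats in [0, 1).
// This is the widely circulated form, shown here as an assumption about
// what a "seeded mulberry32 PRNG" looks like.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators built from the same seed produce the same sequence.
const a = mulberry32(0);
const b = mulberry32(0);
const seqA = [a(), a(), a()];
const seqB = [b(), b(), b()];
```

Because all state lives in a single integer derived from the seed, nothing outside the seed can influence the sequence, which is what makes bit-for-bit reproducibility possible.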
Validated against committed reference fixtures (three separable blobs, k=3, generated by sklearn.cluster.KMeans).
Parameters
X2d
ArrayLike<number>[]
Observations, shape (nSamples, nFeatures). Each row is a plain array or a typed array; all rows must share the same length.
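The shape requirement can be checked up front. The sketch below is a hypothetical helper, `shapeOf`, written for this example (it is not part of the documented API); it accepts a mix of plain and typed rows and rejects ragged input.

```javascript
// Hypothetical helper: verifies that X2d is a non-empty list of
// equal-length rows and returns its (nSamples, nFeatures) shape.
// Rows may be plain arrays or typed arrays, since only .length is used.
function shapeOf(X2d) {
  if (!X2d || X2d.length === 0) {
    throw new RangeError('X2d must contain at least one row');
  }
  const nFeatures = X2d[0].length;
  for (let i = 1; i < X2d.length; i++) {
    if (X2d[i].length !== nFeatures) {
      throw new RangeError(
        `row ${i} has length ${X2d[i].length}, expected ${nFeatures}`
      );
    }
  }
  return { nSamples: X2d.length, nFeatures };
}

const mixed = [[0, 1], new Float64Array([2, 3]), new Float32Array([4, 5])];
const shape = shapeOf(mixed);
// shape -> { nSamples: 3, nFeatures: 2 }
```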
k
number
number of clusters (1 ≤ k ≤ nSamples).
options?
maxIter?
number = 300
max Lloyd iterations per restart (≥ 1).
nInit?
number = 10
number of k-means++ restarts (≥ 1).
number = 0
PRNG seed for reproducible seeding.
number = 1e-4
relative center-shift tolerance (scaled by the mean feature variance, matching sklearn).
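To make the scaling concrete, the sketch below evaluates mean(var(X, axis=0)) * tol for a small dataset. `scaledTolerance` is a hypothetical helper for this example, and it assumes the population variance (dividing by nSamples), which is NumPy's default for `var`.

```javascript
// Hypothetical helper mirroring mean(var(X, axis=0)) * tol.
// Assumption: population variance (divide by nSamples) per feature.
function scaledTolerance(X, tol) {
  const n = X.length;
  const nFeatures = X[0].length;
  let varSum = 0;
  for (let j = 0; j < nFeatures; j++) {
    let mean = 0;
    for (let i = 0; i < n; i++) mean += X[i][j];
    mean /= n;
    let v = 0;
    for (let i = 0; i < n; i++) {
      const d = X[i][j] - mean;
      v += d * d;
    }
    varSum += v / n;
  }
  return (varSum / nFeatures) * tol;
}

// Feature 0 has variance 1, feature 1 has variance 4, so the mean
// variance is 2.5 and the scaled tolerance is about 2.5e-4.
const data = [[0, 0], [2, 0], [0, 4], [2, 4]];
const scaled = scaledTolerance(data, 1e-4);
```

Scaling by the data's variance means the same relative tolerance behaves similarly whether features are measured in millimetres or kilometres.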
Returns
object
labels[i] is the cluster index of observation i, centers[c] is the
centroid of cluster c, and inertia is the summed squared distance of
every observation to its assigned centroid.
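The definition of inertia above can be written out directly. The sketch below recomputes it from observations, labels and centers; `inertiaOf` is a hypothetical helper for this example, not a library export.

```javascript
// Hypothetical helper: sums the squared Euclidean distance from each
// observation to the centroid of the cluster it is labelled with.
function inertiaOf(X, labels, centers) {
  let total = 0;
  for (let i = 0; i < X.length; i++) {
    const c = centers[labels[i]];
    for (let j = 0; j < c.length; j++) {
      const d = X[i][j] - c[j];
      total += d * d;
    }
  }
  return total;
}

// Every point sits at distance 1 from its centroid, so the total is 4.
const pts = [[0, 0], [0, 2], [10, 0], [10, 2]];
const inertia = inertiaOf(pts, Int32Array.from([0, 0, 1, 1]), [[0, 1], [10, 1]]);
// inertia -> 4
```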
centers
centers: number[][]
inertia
inertia: number
labels
labels: Int32Array