Given the sample matrix of 5 samples

$$
{\bm X}^\top = \begin{bmatrix} 0 & 0 & 1 & 5 & 5 \\ 2 & 0 & 0 & 0 & 2 \end{bmatrix}
$$

use the $K$-means clustering algorithm to partition the samples into two classes. Choose the two sample points ${\bm x}_1 = (0,2)^\top$ and ${\bm x}_2 = (0,0)^\top$ as the initial class centers.
Solution
(1) Initial class centers:

$$
{\bm m}_1^{(0)} = {\bm x}_1 = (0,2)^\top, \quad {\bm m}_2^{(0)} = {\bm x}_2 = (0,0)^\top
$$
(2) Taking ${\bm m}_1^{(0)}$ and ${\bm m}_2^{(0)}$ as the centers of classes $C_1^{(0)}$ and $C_2^{(0)}$, compute the squared Euclidean distance $d$ from each of ${\bm x}_3 = (1,0)^\top$, ${\bm x}_4 = (5,0)^\top$, ${\bm x}_5 = (5,2)^\top$ to ${\bm m}_1^{(0)} = (0,2)^\top$ and ${\bm m}_2^{(0)} = (0,0)^\top$.
(a) For ${\bm x}_3 = (1,0)^\top$: $d({\bm x}_3, {\bm m}_1^{(0)}) = 5$ and $d({\bm x}_3, {\bm m}_2^{(0)}) = 1$, so ${\bm x}_3$ is assigned to class $C_2^{(0)}$.
(b) For ${\bm x}_4 = (5,0)^\top$: $d({\bm x}_4, {\bm m}_1^{(0)}) = 29$ and $d({\bm x}_4, {\bm m}_2^{(0)}) = 25$, so ${\bm x}_4$ is assigned to class $C_2^{(0)}$.
(c) For ${\bm x}_5 = (5,2)^\top$: $d({\bm x}_5, {\bm m}_1^{(0)}) = 25$ and $d({\bm x}_5, {\bm m}_2^{(0)}) = 29$, so ${\bm x}_5$ is assigned to class $C_1^{(0)}$.
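The distances in step (2) can be checked with a few lines of Python. This is a minimal sketch, not part of the original solution: the helper names `squared_distance` and `nearest_center` are illustrative, and breaking ties toward the lower-indexed center is an assumption (no ties occur in this example).

```python
# Samples x1..x5 from the problem statement and the initial centers.
samples = {
    "x1": (0, 2), "x2": (0, 0), "x3": (1, 0), "x4": (5, 0), "x5": (5, 2),
}
centers = [(0, 2), (0, 0)]  # m1^(0), m2^(0)


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_center(point, centers):
    """Index of the closest center; ties go to the lower index (assumption)."""
    distances = [squared_distance(point, c) for c in centers]
    return distances.index(min(distances))


for name in ("x3", "x4", "x5"):
    d = [squared_distance(samples[name], c) for c in centers]
    print(name, d, "-> C%d" % (nearest_center(samples[name], centers) + 1))
# x3 [5, 1] -> C2
# x4 [29, 25] -> C2
# x5 [25, 29] -> C1
```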
(3) This gives the new classes $C_1^{(1)} = \{{\bm x}_1, {\bm x}_5\}$ and $C_2^{(1)} = \{{\bm x}_2, {\bm x}_3, {\bm x}_4\}$. Compute the class centers ${\bm m}_1^{(1)}$ and ${\bm m}_2^{(1)}$ as the means of their members:

$$
{\bm m}_1^{(1)} = \tfrac{1}{2}({\bm x}_1 + {\bm x}_5) = (2.5, 2.0)^\top
$$

$$
{\bm m}_2^{(1)} = \tfrac{1}{3}({\bm x}_2 + {\bm x}_3 + {\bm x}_4) = (2, 0)^\top
$$
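The center update in step (3) is a component-wise mean. A small sketch of it, with `centroid` as a hypothetical helper name rather than anything from the original text:

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


cluster_1 = [(0, 2), (5, 2)]          # x1, x5
cluster_2 = [(0, 0), (1, 0), (5, 0)]  # x2, x3, x4

m1 = centroid(cluster_1)
m2 = centroid(cluster_2)
print(m1, m2)  # (2.5, 2.0) (2.0, 0.0)
```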
(4) Repeat steps (2) and (3) with the centers ${\bm m}_1^{(1)}$ and ${\bm m}_2^{(1)}$. ${\bm x}_1$ is assigned to class $C_1^{(1)}$, ${\bm x}_2$ to class $C_2^{(1)}$, ${\bm x}_3$ to class $C_2^{(1)}$, ${\bm x}_4$ to class $C_2^{(1)}$, and ${\bm x}_5$ to class $C_1^{(1)}$, giving the new classes $C_1^{(2)} = \{{\bm x}_1, {\bm x}_5\}$ and $C_2^{(2)} = \{{\bm x}_2, {\bm x}_3, {\bm x}_4\}$.
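The assignments in step (4) are stated without their distances. Worked out from ${\bm m}_1^{(1)} = (2.5, 2.0)^\top$ and ${\bm m}_2^{(1)} = (2, 0)^\top$, with $d$ again the squared Euclidean distance, they are:

$$
\begin{aligned}
d({\bm x}_1, {\bm m}_1^{(1)}) &= 6.25, & d({\bm x}_1, {\bm m}_2^{(1)}) &= 8 \\
d({\bm x}_2, {\bm m}_1^{(1)}) &= 10.25, & d({\bm x}_2, {\bm m}_2^{(1)}) &= 4 \\
d({\bm x}_3, {\bm m}_1^{(1)}) &= 6.25, & d({\bm x}_3, {\bm m}_2^{(1)}) &= 1 \\
d({\bm x}_4, {\bm m}_1^{(1)}) &= 10.25, & d({\bm x}_4, {\bm m}_2^{(1)}) &= 9 \\
d({\bm x}_5, {\bm m}_1^{(1)}) &= 6.25, & d({\bm x}_5, {\bm m}_2^{(1)}) &= 13
\end{aligned}
$$

Each sample goes to the center with the smaller value in its row.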
Because the new classes are identical to the previous ones, clustering stops. The final clustering result is:

$$
C_1^* = \{{\bm x}_1, {\bm x}_5\}
$$

$$
C_2^* = \{{\bm x}_2, {\bm x}_3, {\bm x}_4\}
$$
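The whole procedure can be reproduced end to end. The following is a rough pure-Python sketch under stated assumptions: the function name `k_means` and the `max_iter` cap are illustrative choices, ties go to the lower-indexed center, and no cluster is assumed to become empty (true for this data).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def k_means(samples, centers, max_iter=100):
    """Alternate assignment and center update until labels stop changing.

    Assumes no cluster ever becomes empty, which holds for this example.
    """
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every sample.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(x, centers[k]))
            for x in samples
        ]
        if new_labels == labels:
            break  # classes unchanged, so clustering stops
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        centers = [
            centroid([x for x, lab in zip(samples, labels) if lab == k])
            for k in range(len(centers))
        ]
    return labels, centers


samples = [(0, 2), (0, 0), (1, 0), (5, 0), (5, 2)]  # x1..x5
labels, centers = k_means(samples, centers=[(0, 2), (0, 0)])
print(labels)   # [0, 1, 1, 1, 0]
print(centers)  # [(2.5, 2.0), (2.0, 0.0)]
```

Label `0` corresponds to $C_1^*$ and label `1` to $C_2^*$, so the output matches the hand computation.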