Perceptron Algorithm: A Worked Example

For the training data set shown in the figure, the positive samples are $x_1 = (3,3)^\top$ and $x_2 = (4,3)^\top$, and the negative sample is $x_3 = (1,1)^\top$, so that $y_1 = y_2 = +1$ and $y_3 = -1$. Use the stochastic gradient descent form of the perceptron learning algorithm to find the perceptron model $f(x) = \text{sign}(w \cdot x + b)$. Here $w = (w^{(1)}, w^{(2)})^\top$ and $x = (x^{(1)}, x^{(2)})^\top$.


Solution
Formulate the optimization problem:

$$\min_{w,b} L(w,b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$

where $M$ is the set of misclassified points. Solve for $w$ and $b$ with learning rate $\eta = 1$.
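
The update rule applied in the steps that follow comes from the gradient of $L$ over a fixed misclassified set $M$. A short derivation, given here as a sketch for reference:

$$\nabla_w L(w,b) = -\sum_{x_i \in M} y_i x_i, \qquad \nabla_b L(w,b) = -\sum_{x_i \in M} y_i$$

Stochastic gradient descent picks a single misclassified point $(x_i, y_i)$, that is, one with $y_i (w \cdot x_i + b) \le 0$, and steps against its contribution to the gradient:

$$w \leftarrow w + \eta\, y_i x_i, \qquad b \leftarrow b + \eta\, y_i$$

With $\eta = 1$ this reduces to $w \leftarrow w + y_i x_i$ and $b \leftarrow b + y_i$.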

(1) Take the initial values $w_0 = 0$, $b_0 = 0$.

(2) For $x_1 = (3,3)^\top$, $y_1 (w_0 \cdot x_1 + b_0) = 0$, so $x_1$ is not correctly classified. Update $w$ and $b$:

$$w_1 = w_0 + y_1 x_1 = (3,3)^\top, \quad b_1 = b_0 + y_1 = 1$$

This gives the linear model:

$$w_1 \cdot x + b_1 = 3x^{(1)} + 3x^{(2)} + 1$$

(3) For $x_1$ and $x_2$, clearly $y_i (w_1 \cdot x_i + b_1) > 0$, so both are correctly classified and $w$, $b$ are left unchanged. For $x_3 = (1,1)^\top$, $y_3 (w_1 \cdot x_3 + b_1) < 0$, so it is misclassified. Update $w$ and $b$:

$$w_2 = w_1 + y_3 x_3 = (2,2)^\top, \quad b_2 = b_1 + y_3 = 0$$

This gives the linear model:

$$w_2 \cdot x + b_2 = 2x^{(1)} + 2x^{(2)}$$
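
The checks in step (3) can be written out explicitly, taking the labels $y_1 = y_2 = +1$ and $y_3 = -1$ from the problem statement. Under $(w_1, b_1)$:

$$\begin{aligned}
y_1 (w_1 \cdot x_1 + b_1) &= 3 \cdot 3 + 3 \cdot 3 + 1 = 19 > 0 \\
y_2 (w_1 \cdot x_2 + b_1) &= 3 \cdot 4 + 3 \cdot 3 + 1 = 22 > 0 \\
y_3 (w_1 \cdot x_3 + b_1) &= -(3 \cdot 1 + 3 \cdot 1 + 1) = -7 < 0
\end{aligned}$$

Under $(w_2, b_2)$ the same check gives $12 > 0$ for $x_1$ and $14 > 0$ for $x_2$, but $y_3 (w_2 \cdot x_3 + b_2) = -(2 + 2 + 0) = -4 < 0$, so $x_3$ is still misclassified and triggers the next update.
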
Continue in this way until

$$w_7 = (1,1)^\top, \quad b_7 = -3$$

$$w_7 \cdot x + b_7 = x^{(1)} + x^{(2)} - 3$$

For all data points, $y_i (w_7 \cdot x_i + b_7) > 0$. There are no misclassified points, and the loss function attains its minimum.

The separating hyperplane is $x^{(1)} + x^{(2)} - 3 = 0$, and the perceptron model is $f(x) = \text{sign}(x^{(1)} + x^{(2)} - 3)$.
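
The whole procedure can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `train_perceptron`, the `max_updates` guard, and the selection policy (always update on the first misclassified sample when scanning $x_1, x_2, x_3$ in order) are assumptions made for illustration. A point with $y_i (w \cdot x_i + b) \le 0$ is treated as misclassified, matching step (2).

```python
def train_perceptron(samples, labels, eta=1, max_updates=1000):
    """Original-form perceptron with w = 0, b = 0 as initial values.

    Selection policy (an assumption of this sketch): after every update,
    rescan from the first sample and update on the first misclassified one.
    """
    w = [0] * len(samples[0])
    b = 0
    history = []  # 1-based indices of the samples used for each update
    for _ in range(max_updates):
        for i, (x, y) in enumerate(zip(samples, labels), start=1):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin <= 0:  # misclassified, or exactly on the hyperplane
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b = b + eta * y
                history.append(i)
                break
        else:
            # A full pass found no misclassified point: training is done.
            return w, b, history
    raise RuntimeError("no convergence within max_updates")


X = [(3, 3), (4, 3), (1, 1)]
Y = [1, 1, -1]
w, b, history = train_perceptron(X, Y)
print(w, b, history)  # [1, 1] -3 [1, 3, 3, 3, 1, 3, 3]
```

Under this policy the updates use $x_1, x_3, x_3, x_3, x_1, x_3, x_3$ and end at $w = (1,1)^\top$, $b = -3$, the same result as the hand calculation.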

The iteration process is shown in the table.

Table: Iteration process of the solution

| Iteration | Misclassified point | $w$ | $b$ | $w \cdot x + b$ |
|---|---|---|---|---|
| 0 | - | $0$ | $0$ | $0$ |
| 1 | $x_1$ | $(3,3)^\top$ | $1$ | $3x^{(1)} + 3x^{(2)} + 1$ |
| 2 | $x_3$ | $(2,2)^\top$ | $0$ | $2x^{(1)} + 2x^{(2)}$ |
| 3 | $x_3$ | $(1,1)^\top$ | $-1$ | $x^{(1)} + x^{(2)} - 1$ |
| 4 | $x_3$ | $(0,0)^\top$ | $-2$ | $-2$ |
| 5 | $x_1$ | $(3,3)^\top$ | $-1$ | $3x^{(1)} + 3x^{(2)} - 1$ |
| 6 | $x_3$ | $(2,2)^\top$ | $-2$ | $2x^{(1)} + 2x^{(2)} - 2$ |
| 7 | $x_3$ | $(1,1)^\top$ | $-3$ | $x^{(1)} + x^{(2)} - 3$ |
| 8 | none | $(1,1)^\top$ | $-3$ | $x^{(1)} + x^{(2)} - 3$ |

This is the separating hyperplane and perceptron model obtained when the misclassified points are taken in the order $x_1, x_3, x_3, x_3, x_1, x_3, x_3$. If the misclassified points are instead taken in the order $x_1, x_3, x_3, x_3, x_2, x_3, x_3, x_3, x_1, x_3, x_3$, the resulting separating hyperplane is $2x^{(1)} + x^{(2)} - 5 = 0$.

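Thus the perceptron learning algorithm can yield different solutions depending on the initial values chosen or on the order in which misclassified points are selected.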
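
To see this dependence on selection order concretely, the following Python sketch replays a prescribed sequence of updates starting from $w = 0$, $b = 0$ with $\eta = 1$, and raises an error if a listed point is not actually misclassified when its turn comes. The helper name `replay_updates` and the 1-based indexing are illustrative choices, not part of the original text. The second order passed in is $x_1, x_3, x_3, x_3, x_2, x_3, x_3, x_3, x_1, x_3, x_3$.

```python
def replay_updates(samples, labels, order, eta=1):
    """Apply perceptron updates in a prescribed order of 1-based sample indices."""
    w = [0] * len(samples[0])
    b = 0
    for i in order:
        x, y = samples[i - 1], labels[i - 1]
        margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        if margin > 0:
            # The perceptron only updates on misclassified points.
            raise ValueError(f"x_{i} is not misclassified at this step")
        w = [wj + eta * y * xj for wj, xj in zip(w, x)]
        b += eta * y
    return w, b


X = [(3, 3), (4, 3), (1, 1)]
Y = [1, 1, -1]
first = replay_updates(X, Y, [1, 3, 3, 3, 1, 3, 3])
second = replay_updates(X, Y, [1, 3, 3, 3, 2, 3, 3, 3, 1, 3, 3])
print(first)   # ([1, 1], -3)  ->  x1 + x2 - 3 = 0
print(second)  # ([2, 1], -5)  ->  2*x1 + x2 - 5 = 0
```

Both results separate the three training points, which illustrates that the solution found by the algorithm is not unique.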
