RLS Algorithm: A First Look at the Formulas

RLS Algorithm: Deriving the Formulas

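For the derivation without a forgetting factor, see the Zhihu article by 阿Q在江湖, "Derivation of Recursive Least Squares (RLS): the simplest, easiest-to-follow derivation on the web":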
https://zhuanlan.zhihu.com/p/111758532

Given a set of observations $(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$, we fit the line $\hat{y} = kx + b$ by solving the following optimization problem, in which $\zeta$ is the forgetting factor ($0 < \zeta \le 1$), so the newest sample has weight $\zeta^0 = 1$ and older samples are discounted geometrically:

$$
err_{min} = \min \sum_{i=1}^n \zeta^{n-i} (\hat{y}_i - y_i)^2 = \min \sum_{i=1}^n \zeta^{n-i} (kx_i + b - y_i)^2
$$

Let $f(k,b) = \sum_{i=1}^n \zeta^{n-i} (kx_i + b - y_i)^2$. Taking the partial derivatives with respect to $k$ and $b$, setting them to $0$, and dropping the common factor of 2 gives

$$
\begin{cases}
\dfrac{\partial f}{\partial k} = 0 \;\Rightarrow\; \sum_{i=1}^n \zeta^{n-i} (kx_i + b - y_i)\,x_i = 0 \\[2ex]
\dfrac{\partial f}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^n \zeta^{n-i} (kx_i + b - y_i) = 0
\end{cases}
$$
Rewritten in matrix form:

$$
\begin{pmatrix}
\sum_{i=1}^n \zeta^{n-i} x_i^2 & \sum_{i=1}^n \zeta^{n-i} x_i \\[1ex]
\sum_{i=1}^n \zeta^{n-i} x_i & \sum_{i=1}^n \zeta^{n-i}
\end{pmatrix}
\begin{pmatrix} k \\[1ex] b \end{pmatrix}
=
\begin{pmatrix}
\sum_{i=1}^n \zeta^{n-i} x_i y_i \\[1ex]
\sum_{i=1}^n \zeta^{n-i} y_i
\end{pmatrix}
\tag{1}
$$

Written out in more detail:

$$
\begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{pmatrix}
\begin{pmatrix}
\zeta^{n-1} & 0 & \cdots & 0 \\
0 & \zeta^{n-2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \zeta^0
\end{pmatrix}
\begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix}
\begin{pmatrix} k \\ b \end{pmatrix}
=
\begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{pmatrix}
\begin{pmatrix}
\zeta^{n-1} & 0 & \cdots & 0 \\
0 & \zeta^{n-2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \zeta^0
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
$$
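As a concrete check of the normal equations in (1), the following minimal sketch builds the five weighted sums and solves the 2x2 system by Cramer's rule. The function name `weighted_fit`, the sample data, and the choice `zeta=0.95` are illustrative assumptions, not part of the original derivation.

```python
def weighted_fit(xs, ys, zeta=0.95):
    """Solve the 2x2 normal equations (1) for slope k and intercept b."""
    n = len(xs)
    # Sample i (1-based) has weight zeta**(n - i); with 0-based j this is zeta**(n - 1 - j).
    w = [zeta ** (n - 1 - j) for j in range(n)]
    sxx = sum(wi * x * x for wi, x in zip(w, xs))        # sum of w * x^2
    sx = sum(wi * x for wi, x in zip(w, xs))             # sum of w * x
    s1 = sum(w)                                          # sum of w
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys)) # sum of w * x * y
    sy = sum(wi * y for wi, y in zip(w, ys))             # sum of w * y
    det = sxx * s1 - sx * sx                             # determinant of the coefficient matrix
    k = (s1 * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return k, b


# Hypothetical noise-free data on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
k, b = weighted_fit(xs, ys)
print(k, b)
```

On noise-free data the weights do not change the solution, so the fit should recover the slope and intercept of the generating line up to rounding error. This batch solve recomputes every sum each time a sample arrives, which is the cost the recursive form avoids.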

Let $R(n)$ denote the inverse of the coefficient matrix (the autocorrelation matrix) in equation $(1)$, where $n$ is the number of observations. Then:

$$
\begin{aligned}
R(n) &=
\begin{pmatrix}
\sum_{i=1}^n \zeta^{n-i} x_i^2 & \sum_{i=1}^n \zeta^{n-i} x_i \\[1ex]
\sum_{i=1}^n \zeta^{n-i} x_i & \sum_{i=1}^n \zeta^{n-i}
\end{pmatrix}^{-1} \\[1ex]
&= \Bigg( \zeta
\begin{pmatrix}
\sum_{i=1}^{n-1} \zeta^{n-1-i} x_i^2 & \sum_{i=1}^{n-1} \zeta^{n-1-i} x_i \\[1ex]
\sum_{i=1}^{n-1} \zeta^{n-1-i} x_i & \sum_{i=1}^{n-1} \zeta^{n-1-i}
\end{pmatrix}
+
\begin{pmatrix} x_n^2 & x_n \\[1ex] x_n & \zeta^0 \end{pmatrix}
\Bigg)^{-1} \\[1ex]
&= \Bigg( \zeta R^{-1}(n-1) + \begin{pmatrix} x_n \\[1ex] 1 \end{pmatrix} \begin{pmatrix} x_n & 1 \end{pmatrix} \Bigg)^{-1}
\end{aligned}
\tag{2}
$$

Define

$$
\phi(i) = \begin{pmatrix} x_i \\[1ex] 1 \end{pmatrix}
$$

so that the rank-one term in $(2)$ is $\phi(n)\phi^T(n)$. Applying the matrix inversion lemma to $(2)$ and expanding gives:

$$
R(n) = \frac{R(n-1)}{\zeta} - \frac{R(n-1)\,\phi(n)\,\phi^T(n)\,R(n-1)}{\zeta^2 + \zeta\,\phi^T(n)\,R(n-1)\,\phi(n)}
\tag{3}
$$
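The rank-one update in (3) can be checked numerically against a direct inversion of $\zeta R^{-1}(n-1) + \phi(n)\phi^T(n)$. This is a small sketch; the helper names `inv2` and `update_R` and the test matrix are assumptions chosen for illustration.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def update_R(R_prev, x_n, zeta):
    """Equation (3): update the inverse R(n-1) -> R(n) with phi = (x_n, 1)^T."""
    phi = [x_n, 1.0]
    # R(n-1) is symmetric, so R*phi*phi^T*R equals the outer product (R*phi)(R*phi)^T.
    Rphi = [R_prev[0][0] * phi[0] + R_prev[0][1] * phi[1],
            R_prev[1][0] * phi[0] + R_prev[1][1] * phi[1]]
    denom = zeta * zeta + zeta * (phi[0] * Rphi[0] + phi[1] * Rphi[1])
    return [[R_prev[i][j] / zeta - Rphi[i] * Rphi[j] / denom for j in range(2)]
            for i in range(2)]


# Hypothetical symmetric positive-definite matrix standing in for R^{-1}(n-1).
A_prev = [[2.0, 0.5], [0.5, 1.0]]
zeta, x_n = 0.9, 3.0
phi = [x_n, 1.0]

# Direct route: form zeta * A_prev + phi * phi^T and invert it, as in (2).
A_new = [[zeta * A_prev[i][j] + phi[i] * phi[j] for j in range(2)] for i in range(2)]
R_direct = inv2(A_new)

# Recursive route: apply (3) to R(n-1) = A_prev^{-1}.
R_rec = update_R(inv2(A_prev), x_n, zeta)

max_diff = max(abs(R_direct[i][j] - R_rec[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The two routes should agree to within floating-point rounding. The recursive route needs no matrix inversion per step, only a matrix-vector product and a scalar division.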
Let $D(n)$ denote the right-hand-side vector (the cross-correlation vector) in equation $(1)$. Then

$$
\begin{aligned}
D(n) =
\begin{pmatrix}
\sum_{i=1}^n \zeta^{n-i} x_i y_i \\[1ex]
\sum_{i=1}^n \zeta^{n-i} y_i
\end{pmatrix}
&= \zeta
\begin{pmatrix}
\sum_{i=1}^{n-1} \zeta^{n-1-i} x_i y_i \\[1ex]
\sum_{i=1}^{n-1} \zeta^{n-1-i} y_i
\end{pmatrix}
+
\begin{pmatrix} x_n y_n \\[1ex] y_n \end{pmatrix} \\[1ex]
&= \zeta D(n-1) + \phi(n)\, y_n
\end{aligned}
\tag{4}
$$
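The recursion in (4) can be compared with the direct weighted sums in a few lines. The function names `D_direct` and `D_recursive`, the data, and the starting value $D(0) = 0$ (the empty sum) are assumptions made for this sketch.

```python
def D_direct(xs, ys, zeta):
    """Right-hand side of (1), computed from the full weighted sums."""
    n = len(xs)
    w = [zeta ** (n - 1 - j) for j in range(n)]  # weight zeta**(n - i) for 1-based i
    return (sum(wi * x * y for wi, x, y in zip(w, xs, ys)),
            sum(wi * y for wi, y in zip(w, ys)))


def D_recursive(xs, ys, zeta):
    """Equation (4): D(n) = zeta * D(n-1) + phi(n) * y_n, starting from D(0) = 0."""
    d = (0.0, 0.0)
    for x, y in zip(xs, ys):
        d = (zeta * d[0] + x * y, zeta * d[1] + y)
    return d


# Hypothetical sample data.
xs = [0.5, 1.5, 2.0, 3.5]
ys = [1.0, 2.5, 2.0, 4.0]
d_dir = D_direct(xs, ys, 0.9)
d_rec = D_recursive(xs, ys, 0.9)
print(d_dir, d_rec)
```

Both functions should return the same pair up to rounding, since each pass through the loop multiplies every earlier term by one more factor of `zeta`.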



Let

$$
\Theta = \begin{pmatrix} k \\[1ex] b \end{pmatrix}
$$

From equation $(1)$, $\Theta(n) = R(n)D(n)$. Substituting $(4)$, then $D(n-1) = R^{-1}(n-1)\Theta(n-1)$, and finally $\zeta R^{-1}(n-1) = R^{-1}(n) - \phi(n)\phi^T(n)$ from $(2)$, we get:

$$
\begin{aligned}
\Theta(n) = R(n)D(n) &= R(n)\big[\zeta D(n-1) + \phi(n)\, y_n\big] \\
&= R(n)\big[\zeta R^{-1}(n-1)\,\Theta(n-1) + \phi(n)\, y_n\big] \\
&= R(n)\Big[\zeta\, \frac{R^{-1}(n) - \phi(n)\phi^T(n)}{\zeta}\,\Theta(n-1) + \phi(n)\, y_n\Big] \\
&= \Theta(n-1) + R(n)\,\phi(n)\big[y_n - \phi^T(n)\,\Theta(n-1)\big]
\end{aligned}
\tag{5}
$$

Summary: once $\Theta(n-1)$ and $R(n)$ are known, with $R(n)$ itself obtained from $R(n-1)$ through $(3)$, $\Theta(n)$ follows recursively from $(5)$ using only the new sample $(x_n, y_n)$.
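Putting (3) and (5) together gives one update step per sample. The sketch below is a minimal illustration, not a definitive implementation. The derivation does not say how to start the recursion, so the initial values $\Theta(0) = 0$ and $R(0) = \delta I$ with a large $\delta$ are an assumption (a common convention), as are the name `rls_step` and the sample data.

```python
def rls_step(theta, R, x_n, y_n, zeta):
    """One RLS update: R(n) from equation (3), then Theta(n) from equation (5)."""
    phi = (x_n, 1.0)
    # R is symmetric, so R*phi*phi^T*R equals the outer product (R*phi)(R*phi)^T.
    Rphi = (R[0][0] * phi[0] + R[0][1] * phi[1],
            R[1][0] * phi[0] + R[1][1] * phi[1])
    denom = zeta * zeta + zeta * (phi[0] * Rphi[0] + phi[1] * Rphi[1])
    R_new = [[R[i][j] / zeta - Rphi[i] * Rphi[j] / denom for j in range(2)]
             for i in range(2)]
    # Gain vector R(n) * phi(n) and prediction error y_n - phi^T(n) * Theta(n-1).
    gain = (R_new[0][0] * phi[0] + R_new[0][1] * phi[1],
            R_new[1][0] * phi[0] + R_new[1][1] * phi[1])
    err = y_n - (phi[0] * theta[0] + phi[1] * theta[1])
    theta_new = (theta[0] + gain[0] * err, theta[1] + gain[1] * err)
    return theta_new, R_new


# Assumed initialization (not from the derivation): Theta(0) = 0, R(0) = delta * I.
delta = 1e6
theta = (0.0, 0.0)
R = [[delta, 0.0], [0.0, delta]]

# Hypothetical noise-free samples on the line y = 2x + 1, fed one at a time.
for i in range(50):
    x = 0.1 * i
    theta, R = rls_step(theta, R, x, 2.0 * x + 1.0, zeta=0.95)

k, b = theta
print(k, b)
```

With this initialization the estimate is not exactly the solution of (1): $R(0) = \delta I$ acts like a small regularization term whose influence shrinks as $\delta$ grows and as more samples arrive. On the noise-free data above, the printed slope and intercept should be close to those of the generating line.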
