3-11 Rayleigh Quotient


Definition 3.11.1 Let $\boldsymbol{A}^{\mathrm{H}}=\boldsymbol{A}$. The real number

$$R(\boldsymbol{X})=\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}} \quad\left(\boldsymbol{X} \in \mathbf{C}^n,\ \boldsymbol{X} \neq 0\right)$$

is called the Rayleigh quotient of the Hermitian matrix $\boldsymbol{A}$.
Since the eigenvalues of a Hermitian matrix $\boldsymbol{A}$ are all real, we may assume that the $n$ eigenvalues of $\boldsymbol{A}$ are ordered as

$$\lambda_1 \leqslant \lambda_2 \leqslant \cdots \leqslant \lambda_n$$

Theorem 3.11.1 The Rayleigh quotient of a Hermitian matrix $\boldsymbol{A}$ has the following properties:
(1) $R(k \boldsymbol{X})=R(\boldsymbol{X}) \quad(k \in \mathbf{R},\ k \neq 0)$
(2) $\lambda_1 \leqslant R(\boldsymbol{X}) \leqslant \lambda_n$
(3) $\min _{\boldsymbol{X} \neq 0} R(\boldsymbol{X})=\lambda_1, \quad \max _{\boldsymbol{X} \neq 0} R(\boldsymbol{X})=\lambda_n$
Proof. (1) This follows directly from Definition 3.11.1: the factor $k^2$ appears in both the numerator and the denominator and cancels.
(2) The matrix $\boldsymbol{A}$ is unitarily diagonalizable, that is, there is a unitary matrix $\boldsymbol{U}$ with

$$\boldsymbol{U}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{U}=\operatorname{diag}\left(\lambda_1, \lambda_2, \cdots, \lambda_n\right)=\boldsymbol{\Lambda}$$

Let $\boldsymbol{X}=\boldsymbol{U} \boldsymbol{Y}$. Since $\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}=\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{U}^{\mathrm{H}} \boldsymbol{U} \boldsymbol{Y}=\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{Y}$, we get

$$\begin{aligned} R(\boldsymbol{X}) & =\frac{\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{U}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{U} \boldsymbol{Y}}{\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{Y}}=\frac{\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{\Lambda} \boldsymbol{Y}}{\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{Y}} \\ & =\frac{\lambda_1 y_1 \bar{y}_1+\lambda_2 y_2 \bar{y}_2+\cdots+\lambda_n y_n \bar{y}_n}{\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{Y}} \end{aligned}$$

Because each $y_i \bar{y}_i \geqslant 0$,

$$\begin{aligned} \lambda_1\left(y_1 \bar{y}_1+\cdots+y_n \bar{y}_n\right) & \leqslant \lambda_1 y_1 \bar{y}_1+\cdots+\lambda_n y_n \bar{y}_n \\ & \leqslant \lambda_n\left(y_1 \bar{y}_1+\cdots+y_n \bar{y}_n\right) \end{aligned}$$

that is,

$$\lambda_1 \boldsymbol{Y}^{\mathrm{H}} \boldsymbol{Y} \leqslant \boldsymbol{Y}^{\mathrm{H}} \boldsymbol{\Lambda} \boldsymbol{Y} \leqslant \lambda_n \boldsymbol{Y}^{\mathrm{H}} \boldsymbol{Y}$$

Dividing by $\boldsymbol{Y}^{\mathrm{H}} \boldsymbol{Y}>0$ gives

$$\lambda_1 \leqslant R(\boldsymbol{X}) \leqslant \lambda_n$$

(3) With the matrix $\boldsymbol{U}$ from (2), choose $\boldsymbol{X}$ so that $y_2=y_3=\cdots=y_n=0$ (for example, take $\boldsymbol{X}$ to be the first column of $\boldsymbol{U}$). Then

$$R(\boldsymbol{X})=\lambda_1$$

Similarly, choosing $\boldsymbol{X}$ so that $y_1=y_2=\cdots=y_{n-1}=0$ gives

$$R(\boldsymbol{X})=\lambda_n$$

Combining these with (2), we obtain

$$\min _{\boldsymbol{X} \neq 0} R(\boldsymbol{X})=\lambda_1, \quad \max _{\boldsymbol{X} \neq 0} R(\boldsymbol{X})=\lambda_n$$
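The three properties can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `rayleigh_quotient`, the random test matrix, and the tolerance `tol` are illustrative choices and do not come from the text above.

```python
import numpy as np

def rayleigh_quotient(A, x):
    """Return R(x) = (x^H A x) / (x^H x) for Hermitian A and nonzero x."""
    # np.vdot conjugates its first argument, so vdot(x, A @ x) = x^H A x.
    return (np.vdot(x, A @ x) / np.vdot(x, x)).real

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2          # Hermitian by construction (assumed test matrix)

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors
# as the columns of U, matching lambda_1 <= ... <= lambda_n in the text.
lam, U = np.linalg.eigh(A)
tol = 1e-10

def random_vector():
    return rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Property (2): every sampled quotient lies in [lambda_1, lambda_n].
samples = [rayleigh_quotient(A, random_vector()) for _ in range(1000)]
bounds_hold = all(lam[0] - tol <= r <= lam[-1] + tol for r in samples)

# Property (3): the bounds are attained at the extreme eigenvectors.
min_attained = abs(rayleigh_quotient(A, U[:, 0]) - lam[0]) < tol
max_attained = abs(rayleigh_quotient(A, U[:, -1]) - lam[-1]) < tol

# Property (1): scaling x by a nonzero real k leaves R unchanged.
x = random_vector()
scale_invariant = abs(rayleigh_quotient(A, 2.5 * x) - rayleigh_quotient(A, x)) < tol

print(bounds_hold, min_attained, max_attained, scale_invariant)
```

Random sampling only illustrates the bounds; it is the eigenvector evaluations that show they are sharp.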

Theorem 3.11.2 Let $\boldsymbol{X}_1, \boldsymbol{X}_2, \cdots, \boldsymbol{X}_{k-1}$ be eigenvectors of the Hermitian matrix $\boldsymbol{A}$ belonging to the eigenvalues $\lambda_1, \lambda_2, \cdots, \lambda_{k-1}$ respectively, and let $R_k$ be the orthogonal complement of the subspace $\operatorname{span}\left(\boldsymbol{X}_1, \boldsymbol{X}_2, \cdots, \boldsymbol{X}_{k-1}\right)$. Then

$$\lambda_k=\min _{\boldsymbol{X} \in R_k,\ \boldsymbol{X} \neq 0} R(\boldsymbol{X})$$

Proof. Without loss of generality, let $\boldsymbol{X}_1, \boldsymbol{X}_2, \cdots, \boldsymbol{X}_{k-1}, \boldsymbol{X}_k, \cdots, \boldsymbol{X}_n$ be $n$ orthonormal eigenvectors of $\boldsymbol{A}$. Clearly

$$R_k=\operatorname{span}\left(\boldsymbol{X}_k, \boldsymbol{X}_{k+1}, \cdots, \boldsymbol{X}_n\right)$$

Every $n$-dimensional vector $\boldsymbol{X}$ can be written as

$$\boldsymbol{X}=C_1 \boldsymbol{X}_1+C_2 \boldsymbol{X}_2+\cdots+C_n \boldsymbol{X}_n$$

Hence, for $\boldsymbol{X} \neq 0$,

$$\begin{aligned} R(\boldsymbol{X}) & =\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}} \\ & =\frac{\left(C_1 \boldsymbol{X}_1+C_2 \boldsymbol{X}_2+\cdots+C_n \boldsymbol{X}_n\right)^{\mathrm{H}} \boldsymbol{A}\left(C_1 \boldsymbol{X}_1+C_2 \boldsymbol{X}_2+\cdots+C_n \boldsymbol{X}_n\right)}{\left(C_1 \boldsymbol{X}_1+C_2 \boldsymbol{X}_2+\cdots+C_n \boldsymbol{X}_n\right)^{\mathrm{H}}\left(C_1 \boldsymbol{X}_1+C_2 \boldsymbol{X}_2+\cdots+C_n \boldsymbol{X}_n\right)} \\ & =\frac{\lambda_1 \bar{C}_1 C_1+\lambda_2 \bar{C}_2 C_2+\cdots+\lambda_n \bar{C}_n C_n}{\bar{C}_1 C_1+\bar{C}_2 C_2+\cdots+\bar{C}_n C_n} \\ & =\lambda_1 a_1+\lambda_2 a_2+\cdots+\lambda_n a_n \end{aligned}$$

where

$$a_i=\frac{\bar{C}_i C_i}{\bar{C}_1 C_1+\bar{C}_2 C_2+\cdots+\bar{C}_n C_n} \geqslant 0, \quad \text{and} \quad \sum_{i=1}^n a_i=1$$

When $k=1$, $R_1=\mathbf{C}^n$, and the statement is Theorem 3.11.1.
When $k=2$ and $\boldsymbol{X} \in R_2$, we have $C_1=0$, so

$$\begin{gathered} \boldsymbol{X}=C_2 \boldsymbol{X}_2+C_3 \boldsymbol{X}_3+\cdots+C_n \boldsymbol{X}_n \\ R(\boldsymbol{X})=\lambda_2 a_2+\lambda_3 a_3+\cdots+\lambda_n a_n \geqslant \lambda_2\left(a_2+a_3+\cdots+a_n\right)=\lambda_2 \end{gathered}$$

with equality for $\boldsymbol{X}=\boldsymbol{X}_2$. Therefore

$$\lambda_2=\min _{\boldsymbol{X} \in R_2,\ \boldsymbol{X} \neq 0} R(\boldsymbol{X})$$

The remaining cases follow in the same way.
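The convex-combination form $R(\boldsymbol{X})=\sum_i \lambda_i a_i$ used in this proof can be illustrated with a short numerical sketch. It assumes NumPy; the names `basis_Rk`, `weights_ok` and `worst_gap`, the matrix size, and the choice $k=3$ are hypothetical and serve only as an example.

```python
import numpy as np

def rayleigh_quotient(A, x):
    """Return R(x) = (x^H A x) / (x^H x) for Hermitian A and nonzero x."""
    return (np.vdot(x, A @ x) / np.vdot(x, x)).real

rng = np.random.default_rng(1)
n, k = 6, 3                        # illustrative sizes
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2           # Hermitian test matrix
lam, U = np.linalg.eigh(A)         # lam ascending; columns of U play the role of X_1..X_n

# R_k = span(X_k, ..., X_n); in 0-based indexing these are columns k-1 .. n-1.
basis_Rk = U[:, k - 1:]

weights_ok = True
worst_gap = np.inf                 # smallest observed value of R(X) - lambda_k
for _ in range(1000):
    c = rng.standard_normal(n - k + 1) + 1j * rng.standard_normal(n - k + 1)
    x = basis_Rk @ c               # a random nonzero vector in R_k
    C = U.conj().T @ x             # coefficients C_i = X_i^H x
    a = np.abs(C) ** 2 / np.sum(np.abs(C) ** 2)   # weights a_i >= 0 summing to 1
    r = rayleigh_quotient(A, x)
    # R(X) equals sum_i lambda_i a_i, and a_1 = ... = a_{k-1} = 0 on R_k.
    weights_ok &= bool(abs(r - lam @ a) < 1e-10 and np.allclose(a[:k - 1], 0))
    worst_gap = min(worst_gap, r - lam[k - 1])

# The minimum lambda_k is attained at the eigenvector X_k.
attained = abs(rayleigh_quotient(A, U[:, k - 1]) - lam[k - 1]) < 1e-10

print(weights_ok, worst_gap >= -1e-10, attained)
```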
Similarly, one can prove:
Theorem 3.11.3 Let $\boldsymbol{X} \in \operatorname{span}\left(\boldsymbol{X}_r, \boldsymbol{X}_{r+1}, \cdots, \boldsymbol{X}_s\right)$, $1 \leqslant r<s \leqslant n$, where $\boldsymbol{X}_1, \cdots, \boldsymbol{X}_n$ are the orthonormal eigenvectors used in the proof of Theorem 3.11.2. Then

$$\min _{\boldsymbol{X} \neq 0} R(\boldsymbol{X})=\lambda_r, \quad \max _{\boldsymbol{X} \neq 0} R(\boldsymbol{X})=\lambda_s$$

Theorem 3.11.4 Let $V_k$ denote an arbitrary $k$-dimensional subspace of the $n$-dimensional complex vector space. Then the min-max principle holds:

$$\lambda_k=\min _{V_k} \max _{\boldsymbol{X} \in V_k} R(\boldsymbol{X})$$

Equivalently, the max-min principle holds:

$$\lambda_k=\max _{V_{n-k+1}} \min _{\boldsymbol{X} \in V_{n-k+1}} R(\boldsymbol{X})$$

Here and below, the inner extremum is taken over nonzero $\boldsymbol{X}$.

Proof. The orthogonal complement $R_k$ of the $(k-1)$-dimensional subspace $\operatorname{span}\left(\boldsymbol{X}_1, \boldsymbol{X}_2, \cdots, \boldsymbol{X}_{k-1}\right)$ has dimension $n-k+1$. Since $\dim V_k+\dim R_k=n+1>n$, the subspaces $V_k$ and $R_k$ must share a nonzero vector $\boldsymbol{Y}_k$, so by Theorem 3.11.2

$$\min _{\boldsymbol{X} \in R_k} R(\boldsymbol{X})=\lambda_k \leqslant R\left(\boldsymbol{Y}_k\right)$$

Since also $\boldsymbol{Y}_k \in V_k$,

$$R\left(\boldsymbol{Y}_k\right) \leqslant \max _{\boldsymbol{X} \in V_k} R(\boldsymbol{X})$$

This holds for every $k$-dimensional subspace $V_k$; therefore

$$\lambda_k \leqslant \min _{V_k} \max _{\boldsymbol{X} \in V_k} R(\boldsymbol{X})$$

On the other hand, by Theorem 3.11.3,

$$\min _{V_k} \max _{\boldsymbol{X} \in V_k} R(\boldsymbol{X}) \leqslant \max _{\boldsymbol{X} \in \operatorname{span}\left(\boldsymbol{X}_1, \boldsymbol{X}_2, \cdots, \boldsymbol{X}_k\right)} R(\boldsymbol{X})=\lambda_k$$

Combining the two inequalities gives

$$\lambda_k=\min _{V_k} \max _{\boldsymbol{X} \in V_k} R(\boldsymbol{X})$$

Let $\boldsymbol{B}=-\boldsymbol{A}$. Arrange the eigenvalues of $\boldsymbol{B}$ in increasing order,

$$\mu_1 \leqslant \mu_2 \leqslant \cdots \leqslant \mu_n$$

so that $\mu_k=-\lambda_{n-k+1}$. Applying the min-max principle just proved to $\boldsymbol{B}$, we have

$$\begin{aligned} \lambda_{n-k+1} & =-\mu_k=-\min _{V_k} \max _{\boldsymbol{X} \in V_k} \frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{B} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}} \\ & =-\min _{V_k}\left\{\max _{\boldsymbol{X} \in V_k} \frac{-\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}\right\} \\ & =-\min _{V_k}\left\{-\min _{\boldsymbol{X} \in V_k} \frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}\right\} \\ & =\max _{V_k} \min _{\boldsymbol{X} \in V_k} \frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}=\max _{V_k} \min _{\boldsymbol{X} \in V_k} R(\boldsymbol{X}) \end{aligned}$$

Replacing $n-k+1$ by $i$ in this identity (so that $k=n-i+1$) gives

$$\lambda_i=\max _{V_{n-i+1}} \min _{\boldsymbol{X} \in V_{n-i+1}} R(\boldsymbol{X})$$
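Both principles can be explored numerically with a standard reduction that is not part of the text above: if the columns of $\boldsymbol{Q}$ form an orthonormal basis of a subspace $V$, then $\boldsymbol{X}=\boldsymbol{Q}\boldsymbol{z}$ gives $R(\boldsymbol{X})=\boldsymbol{z}^{\mathrm{H}}(\boldsymbol{Q}^{\mathrm{H}}\boldsymbol{A}\boldsymbol{Q})\boldsymbol{z} / \boldsymbol{z}^{\mathrm{H}}\boldsymbol{z}$, so the extremes of $R$ on $V$ are the extreme eigenvalues of $\boldsymbol{Q}^{\mathrm{H}}\boldsymbol{A}\boldsymbol{Q}$. The sketch below assumes NumPy; the helpers `extreme_rq_on_subspace` and `random_subspace` are hypothetical names chosen for this example.

```python
import numpy as np

def extreme_rq_on_subspace(A, Q):
    """Return (min, max) of R(X) over nonzero X in the column space of Q.

    Q must have orthonormal columns, so the extremes are the smallest and
    largest eigenvalues of the compressed matrix Q^H A Q.
    """
    w = np.linalg.eigvalsh(Q.conj().T @ A @ Q)
    return w[0], w[-1]

def random_subspace(rng, n, dim):
    """Orthonormal basis (n x dim) of a random dim-dimensional subspace."""
    G = rng.standard_normal((n, dim)) + 1j * rng.standard_normal((n, dim))
    Q, _ = np.linalg.qr(G)         # reduced QR: Q has orthonormal columns
    return Q

rng = np.random.default_rng(2)
n, k = 6, 3                        # illustrative sizes
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2
lam, U = np.linalg.eigh(A)         # ascending eigenvalues, orthonormal eigenvectors
tol = 1e-9

# Min-max: on every k-dimensional subspace the maximum of R is >= lambda_k ...
minmax_bound = all(
    extreme_rq_on_subspace(A, random_subspace(rng, n, k))[1] >= lam[k - 1] - tol
    for _ in range(200)
)
# ... and span(X_1, ..., X_k) attains lambda_k.
minmax_attained = abs(extreme_rq_on_subspace(A, U[:, :k])[1] - lam[k - 1]) < tol

# Max-min: on every (n-k+1)-dimensional subspace the minimum of R is <= lambda_k ...
maxmin_bound = all(
    extreme_rq_on_subspace(A, random_subspace(rng, n, n - k + 1))[0] <= lam[k - 1] + tol
    for _ in range(200)
)
# ... and span(X_k, ..., X_n) attains lambda_k.
maxmin_attained = abs(extreme_rq_on_subspace(A, U[:, k - 1:])[0] - lam[k - 1]) < tol

print(minmax_bound, minmax_attained, maxmin_bound, maxmin_attained)
```

Sampling random subspaces cannot prove the outer minimum or maximum; it only shows that no sampled subspace beats the eigenvector subspaces, which is what the theorem predicts.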

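Finally, we use the Rayleigh quotient to study a perturbation theorem for the eigenvalues of Hermitian matrices, that is, to bound how far the eigenvalues can move when the entries of the matrix change slightly.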

Theorem 3.11.5 Let $\boldsymbol{A}, \boldsymbol{B}$ be Hermitian matrices, and let $\lambda_i(\boldsymbol{A})$, $\lambda_i(\boldsymbol{B})$ and $\lambda_i(\boldsymbol{A}+\boldsymbol{B})$ denote the eigenvalues of $\boldsymbol{A}$, $\boldsymbol{B}$ and $\boldsymbol{A}+\boldsymbol{B}$ respectively, each arranged in increasing order. Then for every $k$,

$$\lambda_k(\boldsymbol{A})+\lambda_1(\boldsymbol{B}) \leqslant \lambda_k(\boldsymbol{A}+\boldsymbol{B}) \leqslant \lambda_k(\boldsymbol{A})+\lambda_n(\boldsymbol{B})$$

Proof. By Theorem 3.11.1 applied to $\boldsymbol{B}$, every $\boldsymbol{X} \neq 0$ satisfies $\lambda_1(\boldsymbol{B}) \leqslant \frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{B} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}} \leqslant \lambda_n(\boldsymbol{B})$. Hence, by the max-min principle,

$$\begin{aligned} \lambda_k(\boldsymbol{A}+\boldsymbol{B}) & =\max _{V_{n-k+1}} \min _{\boldsymbol{X} \in V_{n-k+1}} \frac{\boldsymbol{X}^{\mathrm{H}}(\boldsymbol{A}+\boldsymbol{B}) \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}} \\ & =\max _{V_{n-k+1}} \min _{\boldsymbol{X} \in V_{n-k+1}}\left[\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}+\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{B} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}\right] \\ & \leqslant \max _{V_{n-k+1}} \min _{\boldsymbol{X} \in V_{n-k+1}}\left[\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}+\lambda_n(\boldsymbol{B})\right] \\ & =\lambda_k(\boldsymbol{A})+\lambda_n(\boldsymbol{B}) \end{aligned}$$

and likewise

$$\begin{aligned} \lambda_k(\boldsymbol{A}+\boldsymbol{B}) & =\max _{V_{n-k+1}} \min _{\boldsymbol{X} \in V_{n-k+1}}\left[\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}+\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{B} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}\right] \\ & \geqslant \max _{V_{n-k+1}} \min _{\boldsymbol{X} \in V_{n-k+1}}\left[\frac{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{A} \boldsymbol{X}}{\boldsymbol{X}^{\mathrm{H}} \boldsymbol{X}}+\lambda_1(\boldsymbol{B})\right] \\ & =\lambda_k(\boldsymbol{A})+\lambda_1(\boldsymbol{B}) \end{aligned}$$
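A quick numerical check of these two bounds may help make the perturbation reading concrete. This is a sketch under stated assumptions: NumPy is available, `random_hermitian` is a hypothetical helper, and the perturbation scale `1e-3` is an arbitrary choice. The final comparison uses a direct consequence of the theorem, namely $\left|\lambda_k(\boldsymbol{A}+\boldsymbol{B})-\lambda_k(\boldsymbol{A})\right| \leqslant \max \left(\left|\lambda_1(\boldsymbol{B})\right|,\left|\lambda_n(\boldsymbol{B})\right|\right)$.

```python
import numpy as np

def random_hermitian(rng, n, scale=1.0):
    """Random n x n Hermitian matrix with entries of the given scale."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return scale * (M + M.conj().T) / 2

rng = np.random.default_rng(3)
n = 8
A = random_hermitian(rng, n)
B = random_hermitian(rng, n, scale=1e-3)   # a small Hermitian perturbation (assumed scale)

# eigvalsh returns the eigenvalues of a Hermitian matrix in ascending order.
lam_A = np.linalg.eigvalsh(A)
lam_B = np.linalg.eigvalsh(B)
lam_AB = np.linalg.eigvalsh(A + B)

tol = 1e-10
# lambda_k(A) + lambda_1(B) <= lambda_k(A+B) <= lambda_k(A) + lambda_n(B) for all k.
weyl_lower = bool(np.all(lam_A + lam_B[0] - tol <= lam_AB))
weyl_upper = bool(np.all(lam_AB <= lam_A + lam_B[-1] + tol))

# Consequence: no eigenvalue moves by more than the largest |eigenvalue| of B.
max_shift = float(np.max(np.abs(lam_AB - lam_A)))
shift_bound = float(max(abs(lam_B[0]), abs(lam_B[-1])))

print(weyl_lower, weyl_upper, max_shift <= shift_bound + tol)
```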

Example 3.11.1 Let $\boldsymbol{A}, \boldsymbol{B}$ be Hermitian matrices with $\boldsymbol{B}$ positive semidefinite. Then

$$\lambda_k(\boldsymbol{A}) \leqslant \lambda_k(\boldsymbol{A}+\boldsymbol{B})$$

Solution. By Theorem 3.11.5,

$$\lambda_k(\boldsymbol{A}+\boldsymbol{B}) \geqslant \lambda_k(\boldsymbol{A})+\lambda_1(\boldsymbol{B})$$

Since $\boldsymbol{B}$ is positive semidefinite, $\lambda_1(\boldsymbol{B}) \geqslant 0$, and the claim follows.
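As an illustration of this example, the sketch below adds a rank-one positive semidefinite matrix $\boldsymbol{B}=\boldsymbol{v}\boldsymbol{v}^{\mathrm{H}}$ to a random Hermitian $\boldsymbol{A}$ and compares the ordered eigenvalues. NumPy is assumed, and the rank-one choice of $\boldsymbol{B}$ is only one convenient way to build a semidefinite matrix.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                  # Hermitian test matrix

v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
B = np.outer(v, v.conj())                 # B = v v^H, positive semidefinite of rank one

lam_A = np.linalg.eigvalsh(A)             # ascending order
lam_AB = np.linalg.eigvalsh(A + B)

tol = 1e-10
b_is_psd = bool(np.linalg.eigvalsh(B)[0] >= -tol)   # lambda_1(B) >= 0 up to rounding
monotone = bool(np.all(lam_A <= lam_AB + tol))      # lambda_k(A) <= lambda_k(A+B) for all k

print(b_is_psd, monotone)
```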
