Andrew Ng Machine Learning – Programming Exercise 1: Linear Regression

Reference answers for the assignment:
https://blog.csdn.net/qq_35564813/article/details/79229413?utm_source=blogxgwz2

% Gradient descent for one feature: theta(1) and theta(2) updated explicitly.
for iter = 1:num_iters
    % X(:,1) is the all-ones column, so no feature factor appears for theta(1)
    theta1 = theta(1,1) - alpha/m * sum(X*theta - y);
    theta2 = theta(2,1) - alpha/m * sum((X*theta - y) .* X(:,2));
    theta(1,1) = theta1;   % assign both only after both are computed,
    theta(2,1) = theta2;   % so the update is simultaneous
end


for iter = 1:num_iters
    for i = 1:length(theta)
        temp(i,1) = theta(i,1) - alpha/m * sum((X*theta - y) .* X(:,i));
    end
    theta = temp;   % write back only after every component is computed
end
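
For comparison, both loops above can be collapsed into a single fully vectorized update. This is only a sketch using the same variables as the exercise stubs (design matrix X with a leading column of ones, column vector y, scalar learning rate alpha, m training examples); all three versions compute the same result.

for iter = 1:num_iters
    % X' * (X*theta - y) yields every partial derivative at once,
    % so all components of theta are updated simultaneously
    theta = theta - alpha / m * (X' * (X * theta - y));
end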

There are two questions about this gradient descent code:
1. Why multiply by X(:,2) or X(:,i)?
2. Why is it an element-wise product, .* X(:,2) and .* X(:,i)?

Answers:
1. X(:,2) comes from the result of taking the derivative (partial derivative):
[Figure: derivation of the partial derivatives used in the gradient descent update]
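
For reference, the derivation the figure showed is the standard course result for the single-feature cost function (restated here from the lecture material, not copied from the image):

J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2, \qquad h_\theta(x) = \theta_0 + \theta_1 x

\frac{\partial J}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr), \qquad
\frac{\partial J}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x^{(i)}

The update \theta_j := \theta_j - \alpha\,\partial J / \partial\theta_j therefore multiplies the error by x^{(i)} exactly when j = 1.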

When differentiating with respect to θ1, x is its coefficient, so the x factor survives; the coefficient of θ0 is 1, so no x factor remains.

2. Each row of X is one sample, and the samples do not interact with each other, so an element-wise product solves all of them in one batch. No cross terms are needed: .* multiplies corresponding rows (the i-th error with the i-th feature value).
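
A minimal sketch of point 2 (the names err, g_vectorized and g_loop are only illustrative; X, theta, y and m are the exercise variables): the element-wise product pairs each sample's error with that same sample's feature value, which is exactly what a per-sample loop would do.

err = X * theta - y;                  % m x 1 vector of per-sample errors

g_vectorized = sum(err .* X(:,2));    % element i contributes err(i) * X(i,2)

g_loop = 0;                           % the equivalent explicit loop
for i = 1:m
    g_loop = g_loop + err(i) * X(i,2);
end
% g_vectorized and g_loop agree up to floating-point rounding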

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X 
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));
for i=1:size(X,2)
    mu(i)=mean(X(:,i));
    sigma(i)=std(X(:,i));
end
X_norm=(X_norm - mu) ./ sigma;
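
A usage sketch, assuming the multi-feature data file ex1data2.txt from the exercise (house size, number of bedrooms, price): the returned mu and sigma must be applied, with the same values, to any new example before prediction, and the loop above can also be replaced by calling mean and std on the whole matrix.

data = load('ex1data2.txt');          % assumed exercise data file
X = data(:, 1:2);
y = data(:, 3);

[X_norm, mu, sigma] = featureNormalize(X);

% a new example must be scaled with the SAME mu and sigma
x_new = ([1650 3] - mu) ./ sigma;

% loop-free equivalent of the statistics computed above:
%   mu    = mean(X);    % 1 x n row vector of column means
%   sigma = std(X);     % 1 x n row vector of column standard deviations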

standard deviation: 标准差
size(X, 2) returns the number of columns of the matrix, size(X, 1) returns the number of rows, and size(X) returns both dimensions (rows first, then columns).

X(:,i) selects all rows of column i; inside the parentheses, the index before the comma is the row and the one after it is the column, i.e. (row, column).
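
A tiny illustration of the two notes above (the matrix A is just an example):

A = [1 2 3; 4 5 6];   % 2 rows, 3 columns

size(A)               % ans = 2 3   (rows first, then columns)
size(A, 1)            % ans = 2     (number of rows)
size(A, 2)            % ans = 3     (number of columns)

A(:, 2)               % all rows of column 2  ->  [2; 5]
A(1, :)               % all columns of row 1  ->  [1 2 3]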

The normalization formula is (x minus the mean) divided by the standard deviation.
Note that it is the standard deviation, not (max minus min) as mentioned in the lecture.
