The linear regression model is a particular type of supervised learning model.
Terminology
Training set: the data used to train the model
Notation
$x$ = "input" variable / feature / input feature
$y$ = "output" variable / "target" variable
$m$ = number of training examples
$(x, y)$ = single training example
$(x^{(i)}, y^{(i)})$ = $i^{th}$ training example ($1^{st}, 2^{nd}, 3^{rd}, \dots$)
training set (features, targets) $\rightarrow$ learning algorithm $\rightarrow$ $f$
$x \rightarrow f$ (function) $\rightarrow \hat{y}$ ("y-hat", the estimate or prediction of $y$)
feature $\rightarrow$ model $\rightarrow$ prediction (estimated $y$)
How to represent $f$?
$f_{w,b}(x) = wx + b$
Linear regression with one variable (univariate linear regression).
Model: $f_{w,b}(x) = wx + b$
$w, b$: parameters / coefficients / weights
They can be adjusted during training to improve the model.
What do $w, b$ do?
$\hat{y}^{(i)} = f_{w,b}(x^{(i)})$
$f_{w,b}(x^{(i)}) = wx^{(i)} + b$
Find $w, b$ such that $\hat{y}^{(i)}$ is close to $y^{(i)}$ for all $(x^{(i)}, y^{(i)})$.
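The model and its per-example prediction can be sketched in Python. The parameter values and the training example below are made-up placeholders, not values from the course:

```python
# Univariate linear regression model: f_{w,b}(x) = w*x + b
def predict(x, w, b):
    """Return the prediction y-hat for input feature x."""
    return w * x + b

# Hypothetical parameters and one training example (placeholder values)
w, b = 2.0, 1.0
x_1 = 3.0                      # x^(1), the first example's input feature
y_hat_1 = predict(x_1, w, b)   # y-hat^(1) = 2.0 * 3.0 + 1.0 = 7.0
```

Different choices of $w$ (slope) and $b$ (intercept) give different lines, and hence different predictions for the same $x$.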
Cost function: squared error cost function
$J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}(\hat{y}^{(i)} - y^{(i)})^2$
or equivalently
$J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}(f_{w,b}(x^{(i)}) - y^{(i)})^2$
Our goal: minimize $J(w,b)$ by adjusting $w, b$.
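The cost function above translates directly into code. This sketch uses a tiny made-up dataset that lies exactly on the line $y = 2x + 1$, so the cost at $w=2, b=1$ is zero:

```python
def compute_cost(x, y, w, b):
    """Squared error cost J(w,b) = (1/2m) * sum_i (f_{w,b}(x^(i)) - y^(i))^2."""
    m = len(x)                      # number of training examples
    total = 0.0
    for x_i, y_i in zip(x, y):
        f_i = w * x_i + b           # prediction f_{w,b}(x^(i))
        total += (f_i - y_i) ** 2   # squared error for example i
    return total / (2 * m)

# Placeholder training data on the line y = 2x + 1
x_train = [1.0, 2.0, 3.0]
y_train = [3.0, 5.0, 7.0]

cost_at_true_params = compute_cost(x_train, y_train, w=2.0, b=1.0)  # 0.0
cost_at_zero = compute_cost(x_train, y_train, w=0.0, b=0.0)         # nonzero
```

Parameters that fit the data poorly (here $w=0, b=0$) give a larger $J(w,b)$; minimizing $J$ over $w, b$ picks out the best-fitting line.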
Source: Andrew Ng's Machine Learning course