Logistic Regression

1. Logistic regression model

1.1 Classification

For classification we want $0 \le h_\theta(x) \le 1$, so the hypothesis is defined as

$$h_\theta(x) = g(\theta^Tx),\qquad g(z) = \frac{1}{1 + e^{-z}}$$

where $g$ is the sigmoid (logistic) function. $h_\theta(x)$ is interpreted as the estimated probability that $y = 1$ for input $x$, i.e. $h_\theta(x) = P(y = 1\mid x;\theta)$.
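A minimal Octave sketch of this hypothesis (the function name is illustrative, not from the course code):

% sigmoid hypothesis: returns a value in (0, 1), read as P(y = 1 | x; theta)
function h = hypothesis(theta, x)
  h = 1 / (1 + exp(-(theta' * x)));
end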


1.2 Cost function

The cost for a single training example is

$$\mathrm{Cost}\big(h_\theta(x), y\big) = \begin{cases} -\log\big(h_\theta(x)\big) & \text{if } y = 1 \\ -\log\big(1 - h_\theta(x)\big) & \text{if } y = 0 \end{cases}$$

[Figure 1: plot of $-\log(h_\theta(x))$, the cost when $y = 1$]

[Figure 2: plot of $-\log(1 - h_\theta(x))$, the cost when $y = 0$]
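The two plots can be reproduced with a short Octave sketch (purely illustrative):

h = linspace(0.001, 0.999, 200);   % possible values of h_theta(x)
plot(h, -log(h)); hold on;         % cost when y = 1: zero at h = 1, blows up as h -> 0
plot(h, -log(1 - h));              % cost when y = 0: zero at h = 0, blows up as h -> 1
xlabel('h_\theta(x)'); ylabel('cost');
legend('y = 1', 'y = 0');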

simplified cost function:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log h_\theta(x^{(i)}) + \big(1 - y^{(i)}\big)\log\big(1 - h_\theta(x^{(i)})\big)\Big]$$

gradient descent:
repeat {
$$\theta_j := \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_j^{(i)}$$
} (simultaneously update all $\theta_j$)
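A vectorized Octave sketch of this cost and gradient, assuming X is the m x (n+1) design matrix with an intercept column of ones and y is the m x 1 label vector (these names are assumptions, not course-provided code):

function [jVal, gradient] = logisticCost(theta, X, y)
  m = length(y);
  h = 1 ./ (1 + exp(-(X * theta)));                        % h_theta(x) for all examples
  jVal = -(1/m) * (y' * log(h) + (1 - y)' * log(1 - h));   % J(theta)
  gradient = (1/m) * (X' * (h - y));                       % dJ/dtheta_j for all j
end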

1.3 Optimization algorithms

  • Gradient descent
  • Conjugate gradient
  • BFGS
  • L-BFGS

To use these algorithms, we only need to supply a function that computes the cost and its gradient:
function [jVal, gradient] = costFunction(theta)
  jVal = [...code to compute J(theta)...];
  gradient = [...code to compute derivative of J(theta)...];
end

Then we can use Octave's fminunc() optimization function, together with optimset(), which creates an object containing the options we want to send to fminunc():

options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2,1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
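As a concrete illustration, the costFunction template above could be filled in with a toy objective such as $J(\theta) = (\theta_1 - 5)^2 + (\theta_2 - 5)^2$ (an assumed example, chosen because its minimum at $\theta = (5, 5)$ is easy to verify):

function [jVal, gradient] = costFunction(theta)
  jVal = (theta(1) - 5)^2 + (theta(2) - 5)^2;   % J(theta)
  gradient = zeros(2, 1);                       % gradient of J(theta)
  gradient(1) = 2 * (theta(1) - 5);
  gradient(2) = 2 * (theta(2) - 5);
end

Running the fminunc call above on this function should return optTheta close to [5; 5].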

1.4 Multiclass classification

Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class $i$ to predict the probability that $y = i$. To make a prediction on a new $x$, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$.

[Figure 3: one-vs-all classification]
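A minimal sketch of the one-vs-all prediction step, assuming all_theta is a K x (n+1) matrix whose i-th row holds the parameters of the classifier for class i (the names all_theta and predictOneVsAll are hypothetical):

function p = predictOneVsAll(all_theta, x)
  % x is an (n+1) x 1 feature vector including the intercept term x0 = 1
  probs = 1 ./ (1 + exp(-(all_theta * x)));   % h_theta^(i)(x) for every class i
  [~, p] = max(probs);                        % pick the class with the largest probability
end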

1.5 How to address overfitting

  • reduce the number of features
  • regularization (keep all the features, but keep the parameter values small)
1.5.1 Regularized Linear Regression

Small values for the parameters give:
① a simpler hypothesis
② a hypothesis less prone to overfitting


The regularized cost function penalizes large parameter values:

$$J(\theta) = \frac{1}{2m}\Big[\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\Big]$$

If $\lambda$ is too large, the result is underfitting, because $\theta_1,\dots,\theta_n$ will all be close to 0 and $h_\theta(x) \approx \theta_0$.

gradient descent:
repeat {
$$\theta_0 := \theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_0^{(i)}$$
$$\theta_j := \theta_j\Big(1 - \alpha\frac{\lambda}{m}\Big) - \alpha\frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_j^{(i)}\qquad (j = 1,\dots,n)$$
}
($\theta_0$ is excluded from the regularization term)
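One regularized gradient descent step can be sketched in Octave as follows, assuming X (with an intercept column of ones), y, theta, alpha, and lambda already exist:

m = length(y);                         % number of training examples
h = X * theta;                         % linear hypothesis for all examples
grad = (1/m) * (X' * (h - y));         % unregularized gradient
reg = (lambda/m) * theta;              % regularization term ...
reg(1) = 0;                            % ... theta_0 is not regularized
theta = theta - alpha * (grad + reg);  % simultaneous update of all theta_j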

Normal equation:

$$\theta = \big(X^TX + \lambda\cdot L\big)^{-1}X^Ty,\qquad L = \mathrm{diag}(0, 1, 1, \dots, 1)\in\mathbb{R}^{(n+1)\times(n+1)}$$

Adding the $\lambda\cdot L$ term also makes $X^TX + \lambda\cdot L$ invertible even when $m \le n$.
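A minimal sketch of the regularized normal equation in Octave, under the same assumptions about X, y, and lambda:

n = size(X, 2) - 1;                        % number of features (excluding the intercept)
L = eye(n + 1);
L(1, 1) = 0;                               % do not regularize theta_0
theta = (X' * X + lambda * L) \ (X' * y);  % closed-form solution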

1.5.2 Regularized Logistic Regression


$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log h_\theta(x^{(i)}) + \big(1 - y^{(i)}\big)\log\big(1 - h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

attention: the regularization sum runs over $j = 1,\dots,n$, so $\theta_0$ is not regularized.

gradient descent:
repeat {
$$\theta_0 := \theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_0^{(i)}$$
$$\theta_j := \theta_j - \alpha\Big[\frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_j^{(i)} + \frac{\lambda}{m}\theta_j\Big]\qquad (j = 1,\dots,n)$$
}
The update looks identical to regularized linear regression, but here $h_\theta(x) = \frac{1}{1 + e^{-\theta^Tx}}$.
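A regularized logistic regression cost function that could be handed to fminunc might look like this (a sketch; the name costFunctionReg and the way X, y, and lambda are passed in are assumptions):

function [jVal, gradient] = costFunctionReg(theta, X, y, lambda)
  m = length(y);
  h = 1 ./ (1 + exp(-(X * theta)));                           % sigmoid hypothesis
  reg = (lambda / (2*m)) * sum(theta(2:end) .^ 2);            % theta_0 is not penalized
  jVal = -(1/m) * (y' * log(h) + (1 - y)' * log(1 - h)) + reg;
  gradient = (1/m) * (X' * (h - y));
  gradient(2:end) = gradient(2:end) + (lambda/m) * theta(2:end);
end

It could then be minimized with, for example, fminunc(@(t) costFunctionReg(t, X, y, lambda), initialTheta, options).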


Appendices

The derivation of the cost function
first, the hypothesis gives the class probabilities:

$$P(y = 1\mid x;\theta) = h_\theta(x),\qquad P(y = 0\mid x;\theta) = 1 - h_\theta(x)$$


so we can get the general formula:

$$P(y\mid x;\theta) = h_\theta(x)^{y}\big(1 - h_\theta(x)\big)^{1-y}\tag{1}$$

then use maximum likelihood estimation (MLE); the likelihood of the training set is

$$L(\theta) = \prod_{i=1}^{n}P\big(y^{(i)}\mid x^{(i)};\theta\big)\tag{2}$$
note that $h_\theta(x^{(i)}) = g(z^{(i)})$, where $z^{(i)} = \theta^Tx^{(i)}$ and $g(z) = \frac{1}{1+e^{-z}}$ is the sigmoid function.
substitute equation (1) into equation (2):

$$L(\theta) = \prod_{i=1}^{n}g(z^{(i)})^{y^{(i)}}\big(1 - g(z^{(i)})\big)^{1-y^{(i)}}\tag{3}$$

take the natural logarithm of both sides of equation (3):

$$\ln L(\theta) = \sum_{i=1}^{n}\Big[y^{(i)}\ln g(z^{(i)}) + \big(1 - y^{(i)}\big)\ln\big(1 - g(z^{(i)})\big)\Big]\tag{4}$$

we know that MLE's goal is to find the $\theta$ that makes equation (4) maximal, so we define the cost function as its negative:

$$J(\theta) = -\sum_{i=1}^{n}\Big[y^{(i)}\ln g(z^{(i)}) + \big(1 - y^{(i)}\big)\ln\big(1 - g(z^{(i)})\big)\Big]$$

Minimizing $J(\theta)$ is then equivalent to maximizing the log-likelihood.

next, take the partial derivative with respect to $\theta_j$, using $\frac{\partial g(z^{(i)})}{\partial \theta_j} = g(z^{(i)})\big(1 - g(z^{(i)})\big)x_j^{(i)}$:
$$\begin{aligned}
\frac{\partial J}{\partial \theta_j} &= -\sum_{i=1}^{n}\Big[y^{(i)}\frac{1}{g(z^{(i)})} + \big(1 - y^{(i)}\big)\frac{-1}{1 - g(z^{(i)})}\Big]\frac{\partial g(z^{(i)})}{\partial \theta_j} \\
&= -\sum_{i=1}^{n}\Big[\frac{y^{(i)}}{g(z^{(i)})} - \frac{1 - y^{(i)}}{1 - g(z^{(i)})}\Big]g(z^{(i)})\big(1 - g(z^{(i)})\big)x_j^{(i)} \\
&= -\sum_{i=1}^{n}\Big[y^{(i)}\big(1 - g(z^{(i)})\big) - \big(1 - y^{(i)}\big)g(z^{(i)})\Big]x_j^{(i)}
\end{aligned}$$
Simplifying further:

$$\frac{\partial J}{\partial \theta_j} = \sum_{i=1}^{n}\big(g(z^{(i)}) - y^{(i)}\big)x_j^{(i)} = \sum_{i=1}^{n}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_j^{(i)}$$

which is exactly the gradient used in the gradient descent update for logistic regression above.
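This result can be sanity-checked numerically by comparing the analytic gradient with a central-difference approximation on random data (an illustrative Octave sketch, not part of the derivation):

n = 20;                                     % number of examples
X = [ones(n,1), randn(n,2)];                % features with an intercept column
y = double(rand(n,1) > 0.5);                % random 0/1 labels
theta = randn(3,1);
g = @(t) 1 ./ (1 + exp(-(X * t)));          % sigmoid hypothesis for all examples
J = @(t) -sum(y .* log(g(t)) + (1 - y) .* log(1 - g(t)));   % cost J(theta) from above
analytic = X' * (g(theta) - y);             % the derived gradient
delta = 1e-6;
numeric = zeros(3,1);
for j = 1:3
  e = zeros(3,1); e(j) = delta;
  numeric(j) = (J(theta + e) - J(theta - e)) / (2 * delta);  % central difference
end
disp([analytic, numeric]);                  % the two columns should agree closely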
