today we will talk about the following points:
- model representation
- cost function
- gradient descent
- gradient descent for linear regression
- what's next
model representation
training set notation:
- m = number of training examples
- x's = input variable / features
- y's = output variable / target variable
(x, y) denotes one training example; (x^(i), y^(i)) denotes the i-th training example.
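as a minimal sketch of this notation in Python (the arrays and values below are illustrative, not from the lecture):

```python
import numpy as np

# a toy training set; the values are illustrative only
x = np.array([1.0, 2.0, 3.0, 4.0])   # x's = input variable / features
y = np.array([2.0, 4.0, 6.0, 8.0])   # y's = output / target variable

m = len(x)   # m = number of training examples
i = 2        # Python is 0-indexed, so this picks the 3rd example
print(m, x[i], y[i])   # m and the i-th training example (x^(i), y^(i))
```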
pipeline: training set → learning algorithm → hypothesis h, where h maps an input x to a predicted y.

cost function

what is the cost function, and where does it come from? for linear regression the hypothesis is h(x) = theta0 + theta1 * x, and the squared-error cost function J(theta0, theta1) = (1/(2m)) * sum over i of (h(x^(i)) - y^(i))^2 measures how far the predictions are from the targets; a sketch in code follows.
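a minimal sketch of this cost function, assuming the hypothesis h(x) = theta0 + theta1 * x above (the function names h and cost are mine):

```python
import numpy as np

def h(theta0, theta1, x):
    # hypothesis: predicted y for input x
    return theta0 + theta1 * x

def cost(theta0, theta1, x, y):
    # J(theta0, theta1) = (1/(2m)) * sum of squared prediction errors
    m = len(x)
    errors = h(theta0, theta1, x) - y
    return np.sum(errors ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(cost(0.0, 1.0, x, y))   # perfect fit -> J = 0.0
print(cost(0.0, 0.5, x, y))   # worse fit  -> larger J
```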
gradient descent

if we have a training set, a learning algorithm, and a hypothesis, then we want to minimize the cost function J(theta0, theta1). one way of finding the minimum of the cost function is gradient descent.
method:
- start with some theta0, theta1
- keep changing theta0, theta1 to reduce J(theta0, theta1) until we end up at a minimum

graphs: (surface and contour plots of J(theta0, theta1), showing the downhill path from a starting point)
problem: different starting positions may lead to different local optima.
definition: repeat until convergence: theta_j := theta_j - alpha * (d/d theta_j) J(theta0, theta1), for j = 0 and j = 1, where alpha is the learning rate; both parameters must be updated simultaneously (see the sketch below).
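a sketch of one such step in Python, with a toy cost standing in for J (the names step, g0, g1 are mine, not from the lecture):

```python
# one gradient descent step with the simultaneous update;
# grad0 and grad1 stand in for the partial derivatives of J
def step(theta0, theta1, grad0, grad1, alpha):
    temp0 = theta0 - alpha * grad0(theta0, theta1)   # uses the OLD thetas
    temp1 = theta1 - alpha * grad1(theta0, theta1)   # uses the OLD thetas
    return temp0, temp1                              # assign both at once

# toy cost J(theta0, theta1) = theta0^2 + theta1^2, minimum at (0, 0)
g0 = lambda t0, t1: 2 * t0   # d J / d theta0
g1 = lambda t0, t1: 2 * t1   # d J / d theta1

t0, t1 = 5.0, -3.0
for _ in range(100):
    t0, t1 = step(t0, t1, g0, g1, alpha=0.1)
print(t0, t1)   # both approach 0
```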
intuition: let's just think about the single parameter theta1. the update theta1 := theta1 - alpha * (d/d theta1) J(theta1) always moves theta1 downhill: where the slope is positive theta1 decreases, and where the slope is negative theta1 increases.

the problem of learning rate: if alpha is too small, gradient descent is slow; if alpha is too large, it can overshoot the minimum, fail to converge, or even diverge.
why can gradient descent converge with a fixed learning rate, without making alpha smaller over time? because as theta1 approaches a local minimum, the derivative (d/d theta1) J automatically gets smaller, so the steps alpha * (d/d theta1) J shrink on their own.
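a small demo of this on the toy cost J(theta1) = theta1^2 (my own example): the derivative shrinks as theta1 nears the minimum, so the steps shrink too even though alpha never changes:

```python
theta1, alpha = 4.0, 0.1   # alpha stays fixed the whole time
for it in range(8):
    grad = 2 * theta1            # d/d theta1 of J(theta1) = theta1^2
    step = alpha * grad          # the step shrinks with the gradient
    theta1 -= step
    print(f"iter {it}: theta1 = {theta1:.4f}, step = {step:.4f}")
```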
gradient descent for linear regression
outcome: plugging the linear regression hypothesis into the update rule and working out the partial derivatives gives
- theta0 := theta0 - alpha * (1/m) * sum over i of (h(x^(i)) - y^(i))
- theta1 := theta1 - alpha * (1/m) * sum over i of (h(x^(i)) - y^(i)) * x^(i)
batch gradient descent: each step uses all m training examples (the sums above run over the entire training set), as sketched below.
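putting the pieces together, a minimal sketch of batch gradient descent for linear regression (the data and the hyperparameters alpha and iterations are illustrative):

```python
import numpy as np

def batch_gradient_descent(x, y, alpha=0.05, iterations=2000):
    # minimize J(theta0, theta1) for h(x) = theta0 + theta1 * x;
    # "batch" because every step sums over all m training examples
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = (theta0 + theta1 * x) - y   # h(x^(i)) - y^(i)
        grad0 = np.sum(errors) / m           # d J / d theta0
        grad1 = np.sum(errors * x) / m       # d J / d theta1
        theta0 -= alpha * grad0              # simultaneous update:
        theta1 -= alpha * grad1              # both grads were computed first
    return theta0, theta1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # generated by y = 1 + 2x
print(batch_gradient_descent(x, y))  # approaches (1.0, 2.0)
```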
what's next

two extensions:
- solve for theta0, theta1 exactly, without iteration (the normal equation)
- learn with a larger number of features, using subscripts to describe them, e.g. x1, x2, x3; this calls for linear algebra (matrices and vectors)
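as a preview of the first extension, a sketch of solving for theta0, theta1 without iteration using NumPy (this is the normal-equation approach; the design matrix X with a prepended column of ones is my own setup):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])          # generated by y = 1 + 2x

X = np.column_stack([np.ones_like(x), x])   # prepend x0 = 1 for theta0
# solve (X^T X) theta = X^T y rather than inverting X^T X explicitly
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)                                 # [1. 2.] -- no iteration needed
```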