Contents
Preface
I. Progress
II. Main Content
1. Definition of ML
2. Definitions of Supervised/Unsupervised Learning
3. Hypothesis Function
4. Cost Function
5. Relationship Between the Hypothesis Function and the Cost Function
6. Gradient Descent
7. Feature Scaling
8. Normal Equation
9. Assignments
Summary
Preface
I didn't get much done in college. Recently someone recommended Coursera to me, so I'm trying to develop an interest in Machine Learning.
For once I won't bail on something; let's see if I can actually finish the course.
I. Progress
Week 1 and Week 2 (36%)
II. Main Content
1. Definition of ML
The definition is about the relationship among E, T, and P: a program is said to learn from experience E with respect to task T and performance measure P if its performance on T, as measured by P, improves with experience E.
2. Definitions of Supervised/Unsupervised Learning
1. Classification (Supervised)
2. Regression (Supervised)
3. Clustering (Unsupervised)
My personal understanding is that the difference lies in whether a training set exists whose desired outputs are known in advance. (The course quiz question used as the example here was an image and is omitted.)
3. Hypothesis Function
The hypothesis function is simply the prediction function. The first two weeks only cover simple ones, for example $h_\theta(x) = \theta_0 + \theta_1 x$ for one feature, or more generally $h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^T x$ (with $x_0 = 1$).
4. Cost Function
The core formula of this section computes the cost; it is a function of $\theta$:
$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
5. Relationship Between the Hypothesis Function and the Cost Function
Using the course figure as an example (figure omitted): suppose $\theta_0 = 0$. Then every hypothesis function with a fixed $\theta_1$ corresponds to a single point on the cost-function curve $J(\theta_1)$. For the example in the figure, the cost is minimized when $\theta_1 = 1$.
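To make the correspondence concrete, here is a minimal sketch that evaluates the cost at a few values of $\theta_1$, reusing the computeCost function from the assignment backup below; the tiny dataset (y = x, so $\theta_1 = 1$ is optimal) is invented for illustration:

X = [1 1; 1 2; 1 3];        % first column of ones corresponds to theta_0
y = [1; 2; 3];              % targets satisfy y = x exactly
for t1 = [0 0.5 1 1.5 2]
    J = computeCost(X, y, [0; t1]);      % theta_0 held at 0
    printf('theta_1 = %.1f -> J = %.4f\n', t1, J);
end
% J is smallest (exactly 0) at theta_1 = 1, matching the course figure.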
6. Gradient Descent
There are two core formulas: the definition, and the expression with the partial derivative already worked out:
$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$$
$$\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$
In practice you iterate with the second form, where the derivative has already been taken.
Basic idea: start from given initial values for all the θ and iterate. Each iteration modifies every θ, and the modification is driven by the cost function. It reminds me a bit of the Simulated Annealing I learned before.
$\alpha$ is the learning rate. It must be neither too large nor too small: too large and the update overshoots the minimum and can diverge; too small and convergence is inefficient. Note in particular that on a roughly quadratic curve, the slope near the start of gradient descent is large, so the partial derivative is large and θ approaches the optimum quickly; as the minimum nears, the derivative shrinks on its own, so $\alpha$ does not need to be adjusted on the fly.
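A hedged sketch of that learning-rate trade-off, reusing the gradientDescent function from the assignment backup below; the toy data and the two alpha values are invented for illustration:

X = [ones(5,1) (1:5)'];                % toy design matrix with a bias column
y = 2 * (1:5)';                        % targets generated from y = 2x
[~, J_small] = gradientDescent(X, y, zeros(2,1), 0.001, 50);
[~, J_big]   = gradientDescent(X, y, zeros(2,1), 0.5,   50);
printf('alpha = 0.001: J decreases slowly to %.3f\n', J_small(end));
printf('alpha = 0.5:   J blows up to %.3e\n', J_big(end));
% Too small an alpha crawls toward the minimum; too large an alpha
% overshoots on every step and the cost diverges.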
When iterating, first compute the vector of already-differentiated values for all the θ, which in vectorized form is $\frac{\alpha}{m} X^T (X\theta - y)$,
and only then update all the θ values simultaneously (this is exactly where my code for the assignment went wrong; see the buggy and corrected versions in section 9 below).
7. Feature Scaling
We can rescale all the sample values so that, plotted on a contour map, the contours are more even (closer to circles), which helps gradient descent converge faster. The standard form is $x_j := \frac{x_j - \mu_j}{\sigma_j}$, subtracting each feature's mean and dividing by its standard deviation.
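A small usage sketch of the featureNormalize function from the assignment backup below; the feature values (loosely echoing the course's house-size and bedroom features) are invented:

X = [2104 3; 1600 3; 2400 3; 1416 2];     % each row: size, bedrooms
[X_norm, mu, sigma] = featureNormalize(X);
disp(mean(X_norm));    % each column now has mean approximately 0
disp(std(X_norm));     % and standard deviation 1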
8. Normal Equation
First comes a formula I find extremely elegant; I was genuinely stunned when I saw it in week one. The core equation is the closed-form solution:
$$\theta = (X^T X)^{-1} X^T y$$
Take the course's sample housing table as an example (the table image is omitted). Note that you must manually prepend a column of 1s as the first column of X; then, given the sample matrix X and the result vector y, the Normal Equation solves for θ directly. Andrew mentions that $X^T X$ may be non-invertible, so watch out for linearly dependent features, e.g. the same size recorded once in feet and once in meters; that makes $X^T X$ singular.
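A minimal Normal Equation sketch on invented data (the course's real housing table is not reproduced here), using pinv just like the normalEqn function in the backup below:

data = [2104 399900; 1600 329900; 2400 369000; 1416 232000];  % size, price
X = [ones(size(data,1),1) data(:,1)];   % prepend the column of 1s
y = data(:,2);
theta = pinv(X'*X) * X'*y;              % closed-form solution
disp(theta);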
As for when to use Gradient Descent versus the Normal Equation: when the number of features is very large, the Normal Equation is a poor fit because computing $(X^T X)^{-1}$ takes too long (roughly cubic in the number of features), whereas Gradient Descent can be used at any scale.
9. Assignments
The part worth noting is the simultaneous update of the θ values from section 6. For one problem, what I originally wrote amounted to the following (the original screenshot is not preserved; this is a sketch of that kind of element-wise update):
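% Buggy: theta(j) is written back immediately, so later features
% compute X*theta with a theta that has already been partially updated.
for j = 1:size(X,2)
    theta(j) = theta(j) - alpha/m * sum((X*theta - y) .* X(:,j));
end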
You can clearly see that, within a single iteration, the value each θ subtracts keeps changing, when in fact the update vector for all parameters must stay fixed during one simultaneous update.
The version I changed to after consulting other people's materials is:
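% The corrected update, as used in the gradientDescent function below:
% compute the full gradient step first, then update every theta at once.
parameter = alpha * X' * (X*theta - y) / m;
theta = theta - parameter;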
It operates directly on matrices, while my thinking was still stuck at the level of arrays :(
Finally, a backup of the assignment code:
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
J = sum((X*theta - y).^2) / (2*m);   % vectorized cost: (1/2m) * sum of squared errors
% =========================================================================
end
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
% J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
J = sum((X*theta - y).^2) / (2*m);   % vectorized cost: (1/2m) * sum of squared errors
% =========================================================================
end
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
% FEATURENORMALIZE(X) returns a normalized version of X where
% the mean value of each feature is 0 and the standard deviation
% is 1. This is often a good preprocessing step to do when
% working with learning algorithms.
% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));
% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
% of the feature and subtract it from the dataset,
% storing the mean value in mu. Next, compute the
% standard deviation of each feature and divide
% each feature by it's standard deviation, storing
% the standard deviation in sigma.
%
% Note that X is a matrix where each column is a
% feature and each row is an example. You need
% to perform the normalization separately for
% each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%
mu = mean(X);                 % column-wise means
sigma = std(X);               % column-wise standard deviations
X_norm = (X - mu) ./ sigma;   % broadcast: center each feature, scale to unit std
% ============================================================
end
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCost) and gradient here.
%
parameter = alpha * X' * (X*theta - y) / m;   % vectorized gradient step for all theta
theta = theta - parameter;                    % simultaneous update
% ============================================================
% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);
end
end
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
% theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (computeCostMulti) and gradient here.
%
parameter = alpha * X' * (X*theta - y) / m;   % vectorized gradient step for all theta
theta = theta - parameter;                    % simultaneous update
% ============================================================
% Save the cost J in every iteration
J_history(iter) = computeCostMulti(X, y, theta);
end
end
function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
% NORMALEQN(X,y) computes the closed-form solution to linear
% regression using the normal equations.
theta = zeros(size(X, 2), 1);
% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
% to linear regression and put the result in theta.
%
% ---------------------- Sample Solution ----------------------
theta = (pinv(X'*X))*X'*y;   % pinv stays stable even if X'*X is singular
% -------------------------------------------------------------
% ============================================================
end
function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
% A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix
A = [];
% ============= YOUR CODE HERE ==============
% Instructions: Return the 5x5 identity matrix
% In octave, we return values by defining which variables
% represent the return values (at the top of the file)
% and then set them accordingly.
A = eye(5);   % return the 5x5 identity matrix
% ===========================================
end
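A quick end-to-end sanity check for the functions above, on invented data (y = 1 + 2x, so θ should approach [1; 2]); both solvers should agree:

X = [ones(4,1) (1:4)'];                  % bias column plus one feature
y = [3; 5; 7; 9];                        % generated from y = 1 + 2x
[theta, J_history] = gradientDescent(X, y, zeros(2,1), 0.1, 1000);
disp(theta);                             % approaches [1; 2]
disp(normalEqn(X, y));                   % closed form gives [1; 2] directly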
Summary
Don't bail; keep going : )
Andrew Ng is the greatest.