Notes on Linear Regression in Machine Learning

Most readers of this article will already have a fair idea of what machine learning is. In lay terms, it is an application of artificial intelligence in which we give a machine a set of data and have it train on and learn from that data. How does the machine learn on its own? It needs an algorithm, that is, a method that lets it learn from the data we provide. Many such algorithms come from mathematics, and the ones used in machine learning are broadly divided into two categories:

  1. Regression

  2. Classification

Regression is used when the value we want to predict is continuous (a number such as a weight or a price), and classification is used when the value we want to predict is discrete, that is, one of a fixed set of categories.

To start, we will discuss one of the simplest regression methods, linear regression, and code a simple machine learning program that models the relationship between the head size and the brain weight of different users.

We use the data of 237 users. The data is stored in a .csv file that contains the following details about each user:

  1. Gender

  2. Age range

  3. Head size

  4. Brain weight
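As a rough sketch of how a file with these four columns can be read and sliced, the snippet below parses a few made-up rows with pandas. The column names in the header and all of the values are assumptions for illustration only; the real headbrain.csv may label its columns differently.

```python
import io

import pandas as pd

# Made-up rows for illustration only; the real headbrain.csv has 237 rows.
# The column names below are an assumption about the file's header.
sample_csv = io.StringIO(
    "Gender,Age Range,Head Size(cm^3),Brain Weight(grams)\n"
    "1,1,4500,1500\n"
    "1,2,3700,1300\n"
    "2,1,3400,1200\n"
    "2,2,3300,1150\n"
)
data = pd.read_csv(sample_csv)

# Select the feature and the target by position:
# column 2 is head size, column 3 is brain weight.
# Slicing with 2:3 (rather than 2) keeps the result two-dimensional,
# which is the shape scikit-learn expects for the feature matrix.
x = data.iloc[:, 2:3].values  # shape (n_rows, 1)
y = data.iloc[:, 3:4].values  # shape (n_rows, 1)

print(data.columns.tolist())
print(x.shape, y.shape)
```

Selecting by position means the code does not depend on the exact header text, but it does depend on the column order staying the same.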

The best way to describe the relationship is with a graph. We put head size on the x-axis (the independent variable) and brain weight on the y-axis (the dependent variable) and plot one against the other. We start by splitting the data into a training set and a test set: the training set is used to train our model, and the test set is used to check its accuracy. The split is done with the following code:

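    # train_test_split lives in sklearn.model_selection
    # (the older sklearn.cross_validation module was removed in scikit-learn 0.20)
    from sklearn.model_selection import train_test_split
    x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=1/4,random_state=0)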
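To make the effect of the split concrete, here is a small self-contained sketch. Because headbrain.csv is not bundled with this text, it uses synthetic stand-in data with the same number of rows; the slope, intercept, and noise level used to generate it are arbitrary assumptions, not properties of the real dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 237 head-size / brain-weight pairs.
# The numbers 0.26, 330 and 70 are arbitrary choices for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(2700, 4800, size=(237, 1))
y = 0.26 * x + 330 + rng.normal(0, 70, size=(237, 1))

# Hold out a quarter of the rows for testing; random_state fixes the shuffle
# so that the same rows land in each set on every run.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

print(x_train.shape, x_test.shape)
```

With 237 rows and `test_size=0.25`, scikit-learn rounds the test set up to 60 rows and leaves the remaining 177 for training.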

After drawing a scatter plot of the training set, we get the following result:

From this plot we can easily see that brain weight and head size follow a positive linear pattern: as head size increases, brain weight tends to increase along a roughly straight line. Therefore, we will use linear regression to predict the values in the test set.
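Besides reading the pattern off a plot, one common numerical check for a positive linear relationship is the Pearson correlation coefficient. The sketch below computes it with NumPy on synthetic stand-in data (the generating numbers are arbitrary assumptions), so the printed value says nothing about the real dataset.

```python
import numpy as np

# Synthetic stand-in data; 0.26, 330 and 70 are arbitrary illustrative values.
rng = np.random.default_rng(0)
head_size = rng.uniform(2700, 4800, size=237)
brain_weight = 0.26 * head_size + 330 + rng.normal(0, 70, size=237)

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry
# is the Pearson correlation between the two variables.
r = np.corrcoef(head_size, brain_weight)[0, 1]

# r close to +1 means a strong positive linear relationship,
# r close to 0 means little or no linear relationship.
print(round(r, 3))
```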

A scatter plot of the test values shows a similar pattern:

To predict the test set values, we first need to fit a linear regression model to the values in the training set, using the following code:

    from sklearn.linear_model import LinearRegression
    regressor=LinearRegression()
    regressor.fit(x_train,y_train)
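For a single feature, fitting a linear regression amounts to finding the slope and intercept of the least-squares line. The sketch below computes both with the textbook closed-form formulas and compares them with what `LinearRegression` reports. It runs on synthetic stand-in data whose generating numbers are arbitrary assumptions, and the variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data; 0.26, 330 and 70 are arbitrary illustrative values.
rng = np.random.default_rng(0)
x = rng.uniform(2700, 4800, size=(237, 1))
y = 0.26 * x + 330 + rng.normal(0, 70, size=(237, 1))

# Closed-form least squares for one feature:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
#   intercept = mean_y - slope * mean_x
x_flat, y_flat = x.ravel(), y.ravel()
x_centered = x_flat - x_flat.mean()
y_centered = y_flat - y_flat.mean()
slope = np.sum(x_centered * y_centered) / np.sum(x_centered ** 2)
intercept = y_flat.mean() - slope * x_flat.mean()

# The same fit with scikit-learn.
regressor = LinearRegression()
regressor.fit(x, y)
sk_slope = regressor.coef_.ravel()[0]
sk_intercept = np.ravel(regressor.intercept_)[0]

print(slope, intercept)
print(sk_slope, sk_intercept)
```

The two pairs of numbers should agree up to floating-point error, which is a quick way to confirm what `fit` computes in the one-feature case.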


After fitting the linear regression function, these are the predicted values of brain weight that we get:

Here the upward-sloping line is the set of values predicted by the linear regression algorithm, and the red dots are the actual test values. From this we can say that our model performed fairly well in predicting brain weight values from head size values in the test set.
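"Fairly well" can also be expressed as a number. A minimal sketch, assuming we score the test-set predictions with two standard scikit-learn metrics, R² and mean absolute error, is shown below. It again uses synthetic stand-in data (the generating numbers are arbitrary assumptions), so the printed scores are not results for the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; 0.26, 330 and 70 are arbitrary illustrative values.
rng = np.random.default_rng(0)
x = rng.uniform(2700, 4800, size=(237, 1))
y = 0.26 * x + 330 + rng.normal(0, 70, size=(237, 1))

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

regressor = LinearRegression()
regressor.fit(x_train, y_train)
y_pred = regressor.predict(x_test)

# R^2: fraction of the variance in y_test explained by the model (1.0 is perfect).
r2 = r2_score(y_test, y_pred)
# Mean absolute error: average size of the prediction error, in the units of y.
mae = mean_absolute_error(y_test, y_pred)

print("R^2:", round(r2, 3))
print("MAE:", round(mae, 1))
```

Scoring on the test set rather than the training set matters here, because the model has never seen those rows during fitting.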

Below is the complete linear regression code, written in Python. It is advisable to run it in Spyder, the IDE provided with Anaconda; the code was originally written for Python 3.6. The library used for regression is scikit-learn. The dataset is a .csv file that can be downloaded from here (headbrain.CSV).

Python code

# -*- coding: utf-8 -*-
"""
Created on Sun Jul 29 22:21:12 2018

@author: Logan
"""

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#reading the data
#headbrain.csv must be stored in the same folder (directory) as this script
data=pd.read_csv('headbrain.csv')
print(data.head())

#column 2 is the head size (feature), column 3 is the brain weight (target)
x=data.iloc[:,2:3].values
y=data.iloc[:,3:4].values

#splitting the data into training and test
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=1/4,random_state=0)

#fitting simple linear regression to the training set
from sklearn.linear_model import LinearRegression
regressor=LinearRegression()
regressor.fit(x_train,y_train)

#predict the test result
y_pred=regressor.predict(x_test)

#to see the relationship between the training data values
plt.scatter(x_train,y_train,c='red')
plt.show()

#to compare the predicted brain weight values (line) with the actual test values (red dots)
plt.plot(x_test,y_pred)
plt.scatter(x_test,y_test,c='red')
plt.xlabel('head size')
plt.ylabel('brain weight')
plt.show()

That is all for today. I hope you liked it.

Translated from: https://www.includehelp.com/ml-ai/linear-regression-in-machine-learning.aspx
