Forward Propagation, Backpropagation, and the Perceptron


Design Approach

Forward Propagation

Forward propagation computes the output $\hat y = W^T x + b$ and the loss $\mathrm{Loss}(\hat y, y)$.


Concretely, in a neural network this means producing the output $\hat y$ from the input data $x$ and the network weights $w$ through the activation function, then computing the loss from $\hat y$ and the target $y$.
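
To make this concrete, here is a minimal sketch of a single forward pass with a linear activation, using the squared-error loss that the experiment below adopts (all values here are illustrative):

import numpy as np

w = np.array([3.0])        # weight
b = 1.0                    # bias
x = np.array([2.0])        # one input sample
y = 7.2                    # its target value

y_hat = np.dot(w, x) + b   # forward pass: y_hat = w^T x + b
loss = (y_hat - y) ** 2    # squared-error loss
print(y_hat, loss)         # 7.0 and about 0.04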


Backpropagation

Backpropagation computes the gradient of the loss function, i.e. $\Delta w = \alpha \frac{\partial L}{\partial w}$ and $\Delta b = \alpha \frac{\partial L}{\partial b}$, and then updates $w = w - \Delta w$ and $b = b - \Delta b$, where $\alpha$ is the learning rate.


Since, by a basic result of calculus, the gradient points in the direction in which the function value grows, updating the weights and bias in the direction opposite to the gradient drives the loss function toward its minimum.
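
As a sketch of a single gradient-descent step, using the squared-error derivatives derived below (the sample values are illustrative):

import numpy as np

alpha = 0.1                    # learning rate
w, b = np.array([2.0]), 0.0    # current parameters
x, y = np.array([1.0]), 4.0    # one training sample

y_hat = np.dot(w, x) + b       # forward pass
dL_dw = 2 * (y_hat - y) * x    # dL/dw for L = (y_hat - y)^2
dL_db = 2 * (y_hat - y)        # dL/db
w -= alpha * dL_dw             # step against the gradient direction
b -= alpha * dL_db
print(w, b)                    # [2.4] and 0.4; the loss drops from 4.0 to 1.44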


Code

The experiment below uses the simple one-variable linear function $y = wx + b$ to implement a perceptron.


The activation function is linear, and the loss is the squared error $\mathrm{Loss}(\hat y, y) = (\hat y - y)^2$. Its partial derivative with respect to $w$ is $\frac{\partial L}{\partial w} = 2(\hat y - y) \cdot x$, and with respect to $b$ it is $\frac{\partial L}{\partial b} = 2(\hat y - y)$.
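
These analytic derivatives can be checked numerically with central finite differences, a quick sanity test (the sample values are arbitrary):

import numpy as np

w, b, x, y = 1.5, 0.5, 2.0, 3.0
loss = lambda w, b: (w * x + b - y) ** 2   # squared error for y_hat = w*x + b

analytic_dw = 2 * (w * x + b - y) * x      # dL/dw
analytic_db = 2 * (w * x + b - y)          # dL/db

eps = 1e-6
numeric_dw = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)
numeric_db = (loss(w, b + eps) - loss(w, b - eps)) / (2 * eps)

print(analytic_dw, numeric_dw)             # both about 2.0
print(analytic_db, numeric_db)             # both about 1.0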


The code below converges after roughly 30 epochs.


Pure Python
import numpy as np

# generate the data and add random noise
a = 3
b = 1
data_size = 1000
train_x = np.random.randn(1, data_size)
train_y = a * train_x + b + 0.1 * np.random.randn(1, data_size)


class Network:
    def __init__(self, size, learning_rate):
        # randomly initialize the weight(s) and bias
        self.w = np.random.randn(size)
        self.b = np.random.randn(1)
        self.learning_rate = learning_rate

    def y_hat(self, x):
        # forward pass: y_hat = w^T x + b (linear activation)
        return np.dot(self.w, x) + self.b

    def loss(self, x, y):
        # squared-error loss
        return np.square(self.y_hat(x) - y)

    def update(self, x, y):
        # backward pass: apply the analytic gradients of the squared error
        gradient = 2 * (self.y_hat(x) - y)
        self.w -= gradient * self.learning_rate * x
        self.b -= gradient * self.learning_rate

    def train(self, x, y):
        # stochastic gradient descent: one update per sample
        for index in range(y.shape[1]):
            self.update(x[:, index], y[:, index])


network = Network(1, 1e-4)

# held-out validation data drawn from the same distribution as the training data
valid_x = np.random.randn(1, data_size // 3)
valid_y = a * valid_x + b + 0.1 * np.random.randn(1, data_size // 3)

epoch = 30
for i in range(epoch):
    print("epoch:{}/{}".format(i, epoch))
    network.train(train_x, train_y)
    print("w:{},b:{},train_loss:{},valid_loss:{}".format(network.w, network.b,
                                                         np.sum(network.loss(train_x, train_y)) / data_size,
                                                         np.sum(network.loss(valid_x, valid_y)) / (data_size // 3)))
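
Since the training data were generated with a = 3 and b = 1, w should approach 3 and b should approach 1 as training proceeds, with both the train and validation losses settling near 0.01, the variance of the added 0.1 * np.random.randn noise.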

MindSpore

This follows the MindSpore tutorial on simple linear function fitting.


import numpy as np
import matplotlib.pyplot as plt
from mindspore import dataset as ds
from mindspore import nn
from mindspore import Tensor
from mindspore import Model


def generate_data(data_size, w=3.0, b=1.0):
    # yield (x, y) samples from y = w*x + b plus Gaussian noise
    for _ in range(data_size):
        x = np.random.randn(1)
        y = w * x + b + 0.1 * np.random.randn(1)
        yield np.array([x]).astype(np.float32), np.array([y]).astype(np.float32)


def create_dataset(data_size, batch_size=16, repeat_size=1):
    # wrap the generator in a MindSpore GeneratorDataset, then batch and repeat it
    input_data = ds.GeneratorDataset(list(generate_data(data_size)), column_names=['data', 'label'])
    input_data = input_data.batch(batch_size)
    input_data = input_data.repeat(repeat_size)
    return input_data


def model_display(net):
    # print the trainable parameters and plot the fitted line over the data
    model_params = net.trainable_params()
    for param in model_params:
        print(param, param.asnumpy())

    x_model_label = np.arange(-10, 10, 0.1)
    y_model_label = (x_model_label * Tensor(model_params[0]).asnumpy()[0][0] +
                     Tensor(model_params[1]).asnumpy()[0])

    x_label, y_label = zip(*generate_data(data_number))

    plt.axis([-10, 10, -20, 25])
    plt.scatter(x_label, y_label, color="red", s=5)
    plt.plot(x_model_label, y_model_label, color="blue")
    plt.show()


class LinearNet(nn.Cell):
    # a single fully connected layer: one input feature, one output
    def __init__(self):
        super(LinearNet, self).__init__()
        self.fc = nn.Dense(1, 1)

    def construct(self, x):
        x = self.fc(x)
        return x


data_number = 100
batch_number = 16
repeat_number = 1

ds_train = create_dataset(data_number, batch_size=batch_number, repeat_size=repeat_number)

net = LinearNet()
model_display(net)
net_loss = nn.loss.MSELoss()
opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)

model = Model(net, net_loss, opt)
epoch = 10
model.train(epoch, ds_train, dataset_sink_mode=False)

for net_param in net.trainable_params():
    print(net_param, net_param.asnumpy())

model_display(net)
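
Because the data are again generated with w = 3.0 and b = 1.0, the Dense layer's weight and bias should likewise end up close to 3.0 and 1.0 after training, and the plotted blue line should pass through the red scatter.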
