python - How do I use the official batch normalization layer in TensorFlow?

I am trying to use batch normalization to train my neural network in TensorFlow, but it is unclear to me how to use the official layer implementation of Batch Normalization (note that this is different from the one in the API).

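After some painful digging through their GitHub issues, it seems that one needs a tf.cond to use it properly, plus a 'reuse=True' flag so that the BN shift and scale variables are properly reused. After figuring that out, I wrote up a short description of what I believe is the right way to use it in another post.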

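I have now written a short script to test it (just a single layer and a ReLU; it is hard to make it any smaller than this). However, I am not 100% sure how to test it. Right now my code runs without error messages but unexpectedly returns NaN, which lowers my confidence that the code I gave in the other post is correct. Or maybe the network I have is just odd. Either way, does anyone know what is wrong? Here is the code: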

import tensorflow as tf
# download and install the MNIST data automatically
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.contrib.layers.python.layers import batch_norm as batch_norm

def batch_norm_layer(x, train_phase, scope_bn):
    # training branch: creates the BN variables inside scope_bn
    bn_train = batch_norm(x, decay=0.999, center=True, scale=True,
                          is_training=True,
                          reuse=None,  # is this right?
                          trainable=True,
                          scope=scope_bn)
    # inference branch: reuses those variables, with the same settings as the training branch
    bn_inference = batch_norm(x, decay=0.999, center=True, scale=True,
                              is_training=False,
                              reuse=True,
                              trainable=True,
                              scope=scope_bn)
    # pick the branch at run time from the boolean placeholder
    z = tf.cond(train_phase, lambda: bn_train, lambda: bn_inference)
    return z

def get_NN_layer(x, input_dim, output_dim, scope, train_phase):
    with tf.name_scope(scope+'vars'):
        W = tf.Variable(tf.truncated_normal(shape=[input_dim, output_dim], mean=0.0, stddev=0.1))
        b = tf.Variable(tf.constant(0.1, shape=[output_dim]))
    with tf.name_scope(scope+'Z'):
        z = tf.matmul(x, W) + b  # (M x D1) = (M x D) * (D x D1)
    with tf.name_scope(scope+'BN'):
        if train_phase is not None:
            z = batch_norm_layer(z, train_phase, scope+'BN_unit')
    with tf.name_scope(scope+'A'):
        a = tf.nn.relu(z)
    return a

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# placeholder for data
x = tf.placeholder(tf.float32, [None, 784])
# placeholder that turns BN on during training or off during inference
train_phase = tf.placeholder(tf.bool, name='phase_train')
# variables for parameters
hidden_units = 25
layer1 = get_NN_layer(x, input_dim=784, output_dim=hidden_units, scope='layer1', train_phase=train_phase)
# create model
W_final = tf.Variable(tf.truncated_normal(shape=[hidden_units, 10], stddev=0.1))
b_final = tf.Variable(tf.constant(0.1, shape=[10]))
y = tf.nn.softmax(tf.matmul(layer1, W_final) + b_final)

### training
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    steps = 3000
    for iter_step in range(steps):
        #feed_dict_batch = get_batch_feed(X_train, Y_train, M, phase_train)
        batch_xs, batch_ys = mnist.train.next_batch(100)
        # Collect model statistics
        if iter_step % 1000 == 0:
            batch_xtrain, batch_ytrain = batch_xs, batch_ys  # simulates train data
            batch_xcv, batch_ycv = mnist.test.next_batch(5000)  # simulates CV data
            batch_xtest, batch_ytest = mnist.test.next_batch(5000)  # simulates test data
            # do inference
            train_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xtrain, y_: batch_ytrain, train_phase: False})
            cv_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xcv, y_: batch_ycv, train_phase: False})
            test_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xtest, y_: batch_ytest, train_phase: False})
            def do_stuff_with_errors(*args):
                print(args)
            do_stuff_with_errors(train_error, cv_error, test_error)
        # Run Train Step
        sess.run(fetches=train_step, feed_dict={x: batch_xs, y_: batch_ys, train_phase: True})
    # List of booleans indicating correct predictions
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    # accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels, train_phase: False}))

When I run it, I get:

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
(2.3474066, 2.3498712, 2.3461707)
(0.49414295, 0.88536006, 0.91152304)
(0.51632041, 0.393666, nan)
0.9296

Previously all of the last values were NaN; now only a few of them are. Is everything fine, or am I being paranoid?

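Solution

I am not sure whether this will solve your problem. The documentation for BatchNorm is not very easy to use or informative, so here is a short recap of how to use a simple BatchNorm: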

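First, you define the BatchNorm layer. If you want to use it after an affine/fully connected layer, you can do this (just an example; the order can differ as needed):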

...
inputs = tf.matmul(inputs, W) + b
inputs = tf.layers.batch_normalization(inputs, training=is_training)
inputs = tf.nn.relu(inputs)
...
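For intuition about what the training flag switches between, here is a minimal pure-Python sketch of the batch normalization computation for a single feature. It is not TensorFlow's implementation: the function name batch_norm_1d and its parameters are made up for illustration, and eps=1e-3 is just a small constant that avoids division by zero.

```python
import math

def batch_norm_1d(values, gamma=1.0, beta=0.0, eps=1e-3,
                  training=True, moving_mean=0.0, moving_var=1.0):
    """Normalize a list of activations belonging to one feature (illustrative only)."""
    if training:
        # training mode: use the statistics of the current batch
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
    else:
        # inference mode: use statistics accumulated during training
        mean, var = moving_mean, moving_var
    # scale (gamma) and shift (beta) are the learned BN parameters
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

batch = [1.0, 2.0, 3.0, 4.0]  # made-up activations
train_out = batch_norm_1d(batch, training=True)
# inference with moving statistics that happen to match this batch
infer_out = batch_norm_1d(batch, training=False, moving_mean=2.5, moving_var=1.25)
# inference with statistics that were never updated (still at 0 and 1)
stale_out = batch_norm_1d(batch, training=False)
```

With well-trained moving statistics the two modes agree; with stale statistics the inference output is simply the input divided by roughly 1, so it is not normalized at all.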

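The function tf.layers.batch_normalization creates internal variables (the moving mean and moving variance). The operations that update them are not run automatically; they are placed in the tf.GraphKeys.UPDATE_OPS collection. Therefore you have to call your optimizer function as follows (after all layers have been defined!):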

...
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    trainer = tf.train.AdamOptimizer()
    updateModel = trainer.minimize(loss, global_step=global_step)
...
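To see why running the update ops matters, the sketch below mimics the exponential moving average that a batch normalization layer keeps for its mean. The helper update_moving, the momentum of 0.99, and the batch mean of 2.5 are illustrative assumptions, not TensorFlow internals.

```python
def update_moving(moving, batch_stat, momentum=0.99):
    """One exponential-moving-average step (illustrative only)."""
    return momentum * moving + (1.0 - momentum) * batch_stat

batch_mean = 2.5      # hypothetical mean observed on every training batch

# Case 1: the update runs together with each training step
moving_mean = 0.0     # moving mean starts at zero
for _ in range(1000):
    moving_mean = update_moving(moving_mean, batch_mean)

# Case 2: the update ops are never run, so the value never changes
stale_mean = 0.0

print(moving_mean, stale_mean)
```

In the first case the moving mean converges toward the batch mean; in the second it stays at its initial value, so inference would normalize with statistics that have nothing to do with the data.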

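You can read more about it in the TensorFlow documentation. I know this answer comes a bit late, but it may help other people who run into BatchNorm problems in TensorFlow!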
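Regarding the NaN values in the question: one possible cause, which is an assumption and not something confirmed in this thread, is the hand-written cross-entropy -sum(y_ * log(y)) rather than batch normalization itself. When a softmax output underflows to exactly 0, log returns -inf in floating-point arithmetic, and 0 * -inf is NaN. The pure-Python sketch below reproduces that with made-up logits; ieee_log, softmax and log_softmax are illustrative helpers, not TensorFlow functions.

```python
import math

def ieee_log(p):
    """log that returns -inf at 0, as floating-point tensor libraries do."""
    return math.log(p) if p > 0.0 else float("-inf")

def softmax(logits):
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(logits):
    # subtract the max before exponentiating so nothing overflows,
    # and never take the log of an individual probability
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

logits = [0.0, -800.0]  # made-up, very confident prediction
labels = [1.0, 0.0]     # one-hot target

probs = softmax(logits)  # the second probability underflows to 0.0
naive_loss = -sum(t * ieee_log(p) for t, p in zip(labels, probs))
stable_loss = -sum(t * lp for t, lp in zip(labels, log_softmax(logits)))

print(naive_loss, stable_loss)
```

In TensorFlow 1.x, tf.nn.softmax_cross_entropy_with_logits computes the loss from the logits directly, which sidesteps this problem.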


Original address: http://outofmemory.cn/langs/1206512.html
