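After some painful digging through their GitHub issues, it seems that using it correctly requires a tf.cond, along with a reuse=True flag so that the BN shift and scale variables are properly reused. Once I had figured that out, I wrote up a short description of what I believe is the right way to use it here.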
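I have now written a short script to test it (just a single layer and a ReLU; it is hard to make it any smaller than this). However, I am not 100% sure how to test it. Right now my code runs without error messages but unexpectedly returns NaN, which lowers my confidence that the code I gave in the other post is correct. Or maybe the network I have is just weird. Either way, does anyone know what is wrong? Here is the code: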
import tensorflow as tf
# download and install the MNIST data automatically
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.contrib.layers.python.layers import batch_norm as batch_norm

def batch_norm_layer(x, train_phase, scope_bn):
    bn_train = batch_norm(x, decay=0.999, center=True, scale=True,
                          is_training=True,
                          reuse=None,  # is this right?
                          trainable=True,
                          scope=scope_bn)
    bn_inference = batch_norm(x, is_training=False, reuse=True, scope=scope_bn)
    z = tf.cond(train_phase, lambda: bn_train, lambda: bn_inference)
    return z

def get_NN_layer(x, input_dim, output_dim, scope, train_phase):
    with tf.name_scope(scope + 'vars'):
        W = tf.Variable(tf.truncated_normal(shape=[input_dim, output_dim], mean=0.0, stddev=0.1))
        b = tf.Variable(tf.constant(0.1, shape=[output_dim]))
    with tf.name_scope(scope + 'Z'):
        z = tf.matmul(x, W) + b
    with tf.name_scope(scope + 'BN'):
        if train_phase is not None:
            z = batch_norm_layer(z, train_phase, scope + 'BN_unit')
    with tf.name_scope(scope + 'A'):
        a = tf.nn.relu(z)  # (M x D1) = (M x D) * (D x D1)
    return a

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# placeholder for data
x = tf.placeholder(tf.float32, [None, 784])
# placeholder that turns BN on during training or off during inference
train_phase = tf.placeholder(tf.bool, name='phase_train')
# variables for parameters
hidden_units = 25
layer1 = get_NN_layer(x, input_dim=784, output_dim=hidden_units, scope='layer1', train_phase=train_phase)
# create model
W_final = tf.Variable(tf.truncated_normal(shape=[hidden_units, 10], mean=0.0, stddev=0.1))
b_final = tf.Variable(tf.constant(0.1, shape=[10]))
y = tf.nn.softmax(tf.matmul(layer1, W_final) + b_final)

### training
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    steps = 3000
    for iter_step in range(steps):
        # feed_dict_batch = get_batch_feed(X_train, Y_train, M, phase_train)
        batch_xs, batch_ys = mnist.train.next_batch(100)
        # Collect model statistics
        if iter_step % 1000 == 0:
            batch_xstrain, batch_ystrain = batch_xs, batch_ys  # simulates train data
            batch_xcv, batch_ycv = mnist.test.next_batch(5000)  # simulates CV data
            batch_xtest, batch_ytest = mnist.test.next_batch(5000)  # simulates test data
            # do inference
            train_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xs, y_: batch_ys, train_phase: False})
            cv_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xcv, y_: batch_ycv, train_phase: False})
            test_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xtest, y_: batch_ytest, train_phase: False})

            def do_stuff_with_errors(*args):
                print(args)
            do_stuff_with_errors(train_error, cv_error, test_error)
        # Run Train Step
        sess.run(fetches=train_step, feed_dict={x: batch_xs, y_: batch_ys, train_phase: True})
    # List of booleans indicating correct predictions
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    # accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels, train_phase: False}))
When I run it, I get:
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
(2.3474066,2.3498712,2.3461707)
(0.49414295,0.88536006,0.91152304)
(0.51632041,0.393666,nan)
0.9296
It used to be that all of the last values were NaN; now only a few of them are. Is everything fine, or am I being paranoid?
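Solution

I am not sure whether this will solve your problem. The documentation for BatchNorm is not very easy to use or informative, so here is a short recap of how to use simple BatchNorm. First, you define your BatchNorm layer. If you want to use it after an affine/fully connected layer, you can do this (just an example; the order can differ as you see fit):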
...
inputs = tf.matmul(inputs, W) + b
inputs = tf.layers.batch_normalization(inputs, training=is_training)
inputs = tf.nn.relu(inputs)
...
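To make concrete what the normalization step between the affine transform and the ReLU computes in training mode, here is a minimal pure-Python sketch for a single feature. It is not TensorFlow's implementation: the function names `batch_norm_train` and `relu` and the `eps` value are illustrative assumptions, and `gamma` and `beta` stand in for the learned scale and shift.

```python
import math


def batch_norm_train(batch, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize one feature over a batch using the batch's own statistics.

    batch: list of floats, one pre-activation per example.
    gamma, beta: scale and shift (learned in a real layer, fixed here).
    eps: small constant for numerical stability (illustrative value).
    Returns (normalized values, batch mean, batch variance).
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((v - mean) ** 2 for v in batch) / n  # population variance
    normed = [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in batch]
    return normed, mean, var


def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


if __name__ == "__main__":
    pre_activations = [2.0, 4.0, 6.0, 8.0]  # stands in for matmul(inputs, W) + b
    normed, mean, var = batch_norm_train(pre_activations)
    print(mean, var)     # statistics of this batch
    print(relu(normed))  # values below the batch mean are zeroed by the ReLU
```

In this ordering the ReLU sees roughly zero-mean, unit-variance inputs, so about half of the activations in a batch pass through.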
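The function tf.layers.batch_normalization creates internal variables (the moving mean and moving variance). The ops that update them are not run automatically; they are placed in the tf.GraphKeys.UPDATE_OPS collection. You therefore have to make your training op depend on them by calling your optimizer as follows (after all layers have been defined!):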
...
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    trainer = tf.train.AdamOptimizer()
    updateModel = trainer.minimize(loss, global_step=global_step)
...
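Why the update ops matter can be sketched without TensorFlow. In inference mode the layer normalizes with stored moving averages rather than with the current batch's statistics, and those averages only change when the update step runs. The class below is a hypothetical stand-in, not the library's code: the name `MovingStats`, the `momentum` and `eps` values, and the initial values of 0 and 1 are assumptions chosen for illustration.

```python
class MovingStats:
    """Tracks a moving mean and variance for one feature (illustrative sketch)."""

    def __init__(self, momentum=0.99):
        self.momentum = momentum
        self.mean = 0.0  # assumed initial moving mean
        self.var = 1.0   # assumed initial moving variance

    def update(self, batch):
        """Plays the role of the update ops: fold batch statistics into the averages."""
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((v - batch_mean) ** 2 for v in batch) / n
        m = self.momentum
        self.mean = m * self.mean + (1.0 - m) * batch_mean
        self.var = m * self.var + (1.0 - m) * batch_var

    def normalize(self, value, eps=1e-3):
        """Inference-mode normalization: uses the stored averages, not the batch."""
        return (value - self.mean) / (self.var + eps) ** 0.5


if __name__ == "__main__":
    batch = [9.0, 10.0, 11.0]  # activations centered around 10

    stale = MovingStats()      # update() is never called, as when UPDATE_OPS is ignored
    fresh = MovingStats()
    for _ in range(1000):      # update() runs on every training step
        fresh.update(batch)

    print(stale.normalize(10.0))  # far from 0: the averages never moved
    print(fresh.normalize(10.0))  # close to 0: the averages match the data
```

If the update step never runs, training can look fine (it uses batch statistics) while inference normalizes with the untouched initial averages, which is the failure the control dependency above is meant to prevent.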
You can read more about it here. I know it is a bit late to answer your question, but it might help other people who run into BatchNorm problems in TensorFlow!