Environment:
Ubuntu 18.04
NVIDIA driver 470
CUDA 10.0
cuDNN 7.4
TensorFlow 1.13.1
1. Version information
CUDA / driver version compatibility
TensorFlow version compatibility
2. Install the NVIDIA driver
2.1 Check the GPU model of this machine [ubuntu-drivers devices]
The recommended version was nvidia-driver-510; I installed nvidia-driver-470.
2.2 Install from the command line
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
ubuntu-drivers devices
sudo ubuntu-drivers autoinstall    # or: sudo apt-get install nvidia-driver-XX  (the recommended version number)
2.3 Check whether the installation succeeded [nvidia-smi]
Reference for the nvidia-smi output fields (found online):
2.4 If needed, set the system to use the NVIDIA driver
My laptop has dual graphics, and after installing the NVIDIA driver it kept freezing, apparently because it was switching between the integrated driver and the NVIDIA driver; forcing it to use the NVIDIA driver fixed it.
[nvidia-settings]
Some common query commands:
List the installed graphics cards:
sudo lshw -c display
Check which card is currently in use:
prime-select query
To use the Intel graphics card, run:
sudo prime-select intel
To switch back to the NVIDIA card, run:
sudo prime-select nvidia
If for some reason you no longer need the proprietary driver, remove it with:
sudo apt purge nvidia-*
sudo apt autoremove
To remove the NVIDIA driver PPA, run:
sudo add-apt-repository --remove ppa:graphics-drivers/ppa
At this point the NVIDIA driver is completely removed.
3. Install CUDA 10.0
3.1 Download: https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda_10.0.130_410.48_linux
3.2 Install
sudo chmod +x cuda_10.0.130_410.48_linux.run
sudo ./cuda_10.0.130_410.48_linux.run
The installer shows several prompts; accept the defaults for most of them. The only exception: when it asks whether to install the graphics driver, answer no (we already installed a newer driver in step 2). Also install the CUDA samples into your home directory to make the later tests easier.
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 ...? ---- [answer n]
After installation, add the environment variables:
vi ~/.bashrc
Then append at the end of the file:
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
export CUDA_HOME=/usr/local/cuda
After saving, reload the variables in the terminal:
source ~/.bashrc
3.3 Test
Check that CUDA is configured correctly with the following commands.
To test the installation, enter the CUDA samples directory and run:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
If the last line says PASS, the installation succeeded and CUDA is usable.
If you get the error: cudaGetDeviceCount returned 30
On a dual-GPU machine, check that the NVIDIA card is actually in use; otherwise the GPU cannot be found. Mine worked before and suddenly showed this error; a reboot fixed it.
4. Install cuDNN 7.4
Download: https://developer.nvidia.com/rdp/cudnn-download (you need to register an account if you do not have one)
4.1 Install cuDNN
Installing cuDNN is simple: it boils down to copying a few files, the library files and the header file.
Copy the cuDNN header into the include directory of the CUDA installation, and the cuDNN libraries into its lib64 directory, as follows:
# unpack the archive
tar -xvf cudnn-10.0-linux-x64-v7.4.2.24.tgz
cd cuda/
# copy the header from include/ (run this from the extracted cuda/ directory)
sudo cp ./include/cudnn.h /usr/local/cuda/include/
# copy the libraries from lib64/ into the CUDA installation's lib64
sudo cp ./lib64/* /usr/local/cuda/lib64/
# set permissions
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
# The following commands refresh the version numbers and symlinks if needed; I did not need them for my install.
# ====== refresh the symlinks ======
cd /usr/local/cuda/lib64/
# remove the existing links; note the version number, which you can check in the cuDNN lib64 folder
sudo rm -rf libcudnn.so libcudnn.so.7
# recreate the symlink (the version must match the cuDNN you downloaded; check your libcudnn version under /usr/local/cuda/lib64)
sudo ln -s libcudnn.so.7.4.2 libcudnn.so.7
# recreate the symlink
sudo ln -s libcudnn.so.7 libcudnn.so
sudo ldconfig -v   # apply immediately
Finally, verify that CUDA is still usable after installing cuDNN:
nvcc --version   # or nvcc -V
4.2 Test
Test with a cuDNN sample (find the matching cuDNN version on https://developer.nvidia.com/rdp/cudnn-archive).
Download: libcudnn7-doc_7.4.2.24-1+cuda10.0_amd64.deb
Install it: sudo dpkg -i libcudnn7-doc_7.4.2.24-1+cuda10.0_amd64.deb
Default install directory: /usr/src/cudnn_samples_v7
# copy the samples to /usr/local
sudo cp -r /usr/src/cudnn_samples_v7/ /usr/local
# enter the sample directory
cd /usr/local/cudnn_samples_v7/mnistCUDNN
# build
make clean && make
# run the test; "Test passed!" means success
./mnistCUDNN
5. Install Anaconda
Anaconda is a scientific-computing Python distribution that bundles hundreds of commonly used Python libraries,
including many machine-learning and data-mining libraries, a lot of which are dependencies of TensorFlow. Installing Anaconda gives us a clean environment in which to install TensorFlow directly.
Download: https://www.anaconda.com/download/
Anaconda3-2021.11-Linux-x86_64.sh
Make the downloaded .sh installer executable and run it, for example:
chmod +x Anaconda3-2021.11-Linux-x86_64.sh
./Anaconda3-2021.11-Linux-x86_64.sh
Install it into your home directory, e.g. ~/anaconda3. When asked whether to modify ~/.bashrc, answer "yes".
Switch conda to a domestic (Chinese) mirror to speed up downloads. Open ~/.condarc (create it if it does not exist) and set its content to:
channels:
- defaults
show_channel_urls: true
channel_alias: https://mirrors.tuna.tsinghua.edu.cn/anaconda
default_channels:
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/pro
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
If the Tsinghua mirrors stop working, you can switch to the SJTU mirrors instead:
channels:
- https://mirrors.sjtug.sjtu.edu.cn/anaconda/pkgs/main/
- https://mirrors.sjtug.sjtu.edu.cn/anaconda/pkgs/free/
- defaults
show_channel_urls: true
5.1 Remove the "base" prefix
Reopen the terminal. Normally the prompt is now prefixed with "(base)", which means you are inside the conda base environment; if not, run "conda init".
If you want to get rid of the "base" prefix, add "conda deactivate" to the end of ~/.bashrc.
6. Install TensorFlow 1.13.1
6.1 Create an environment
conda create -n tensorflow python=3.7
conda activate tensorflow
# optionally install the related libraries as well
#sudo apt-get install python3-pip
#pip install numpy scipy matplotlib pylint
6.2 Install tensorflow-gpu 1.13.1
pip install tensorflow-gpu==1.13.1 -i https://mirror.baidu.com/pypi/simple   # run inside the activated conda env; do not use sudo, or it installs into the system Python
6.3 Test
Start the Python interpreter and check whether the GPU can be used:
python
import tensorflow as tf
# Warnings at this step are normal; if they bother you, follow the hint and change the "1" in the referenced file to "(1,)" - it is caused by the Python version.
# If this prints True, the GPU can be used.
print(tf.test.is_gpu_available())
tf.__version__
I got an error at this point: module 'tensorflow' has no attribute '__version__'
Reinstalling solved it:
pip uninstall tensorflow-gpu
pip install tensorflow-gpu==1.13.1
Then in Python:
>>> tf.__version__
6.4 include and lib paths
include:python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())'
/home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/include
lib:python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())'
/home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow
7. The pointnet++ model
7.1 Download https://github.com/charlesq34/pointnet2 and unpack it.
7.2 Build the tf_ops folder
This folder contains three sub-folders: 3d_interpolation, grouping and sampling. They are modified in the same way; taking sampling as an example, enter the sampling folder and open tf_sampling_compile.sh.
a: Since I installed TensorFlow 1.13.1, comment out the TF1.2 block and enable the TF1.4 block.
b: Change nvcc to the path of your own nvcc.
Before: /usr/local/cuda-8.0/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
After: /usr/local/cuda/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
c: Update the include and library paths in the compile command.
Before: # TF1.4
#g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC -I /usr/local/lib/python2.7/dist-packages/tensorflow/include -I /usr/local/cuda-8.0/include -I /usr/local/lib/python2.7/dist-packages/tensorflow/include/external/nsync/public -lcudart -L /usr/local/cuda-8.0/lib64/ -L/usr/local/lib/python2.7/dist-packages/tensorflow -ltensorflow_framework -O2 -D_GLIBCXX_USE_CXX11_ABI=0
After: # TF1.4
g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC -I /home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/include -I /usr/local/cuda/include -I /home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/include/external/nsync/public -lcudart -L /usr/local/cuda/lib64/ -L /home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow -l:libtensorflow_framework.so -O2 #-D_GLIBCXX_USE_CXX11_ABI=0
If /home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow contains a libtensorflow_framework file with a version suffix, change -ltensorflow_framework to -l: plus the file name, e.g. -l:libtensorflow_framework.so.1. If your GCC is version 5 or newer, simply comment out -D_GLIBCXX_USE_CXX11_ABI=0.
Change the CUDA path and version in the script to those of your own environment. My modified .sh file is as follows:
#!/bin/bash
/usr/local/cuda/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
# TF1.2
#g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC -I /usr/local/lib/python2.7/dist-packages/tensorflow/include -I /usr/local/cuda-8.0/include -lcudart -L /usr/local/cuda-8.0/lib64/ -O2 -D_GLIBCXX_USE_CXX11_ABI=0
# TF1.4
g++ -std=c++11 tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -shared -fPIC -I /home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/include -I /usr/local/cuda/include -I /home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/include/external/nsync/public -lcudart -L /usr/local/cuda/lib64/ -L /home/mooe/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow -l:libtensorflow_framework.so -O2 #-D_GLIBCXX_USE_CXX11_ABI=0
d: Run the script
sh tf_sampling_compile.sh
This produces tf_sampling_so.so in the current folder.
7.3 Because Python 3.7 is used, change xrange to range and wrap the arguments of every print statement in parentheses. A minimal sketch of the change is shown below.
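The names below are only illustrative; the point is the Python 2 to Python 3 syntax change:
# Python 2 style used in the original pointnet2 code:
#   for i in xrange(num_batches):
#       print 'batch', i
# Python 3 equivalent:
num_batches = 4  # hypothetical value, just for illustration
for i in range(num_batches):
    print('batch', i)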
8. Shape classification task (Shape Classification)
8.1 Download the data and unpack it into the data folder
Training data in HDF5 format: https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip
Training data, normal_resampled: https://shapenet.cs.stanford.edu/media/modelnet40_normal_resampled.zip
After unpacking modelnet40_normal_resampled.zip you get 40 sub-folders, one per object category. Each folder contains a large number of txt files, and each txt file is one point cloud of that category. Each line describes one point with six values: the first three are the x, y, z coordinates (normalized to values between -1 and 1), and the last three are the components of the point's normal vector (the vector perpendicular to the tangent plane at that point).
The normal_resampled data is the point cloud data including normals. Rename modelnet40_normal_resampled/modelnet40_shape_names.txt to shape_names.txt. When training with python train.py, if you do not pass the --normal flag, the HDF5-format data is used instead. A small parsing sketch for the txt format follows.
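A minimal sketch of reading one of these txt point clouds (the file path is just an example):
import numpy as np

def load_txt_cloud(path):
    # each row: x, y, z, nx, ny, nz (comma separated)
    data = np.loadtxt(path, delimiter=',')
    points = data[:, 0:3]   # normalized xyz coordinates
    normals = data[:, 3:6]  # per-point normal vector
    return points, normals

points, normals = load_txt_cloud('data/modelnet40_normal_resampled/car/car_0002.txt')
print(points.shape, normals.shape)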
8.2 Train the model
If you get an out-of-GPU-memory error, reduce the batch_size:
python train.py --max_epoch 11 --batch_size 8
Training automatically creates a log folder containing the log file and the network parameters:
The checkpoint file lists all model files in the directory.
model.ckpt.data stores the value of every variable in the TensorFlow program.
model.ckpt.index stores the index of the variables.
model.ckpt.meta stores the structure of the TensorFlow computation graph (roughly, the network architecture); it can be loaded into the current default graph with tf.train.import_meta_graph, as sketched after this list.
log_train.txt stores the training log.
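A minimal sketch of how these files are loaded back into a session (assuming ./log is the training output directory; for pointnet++ the custom ops must be imported first, see 10.5):
import tensorflow as tf

ckpt = tf.train.latest_checkpoint('./log')           # reads the 'checkpoint' file
saver = tf.train.import_meta_graph(ckpt + '.meta')   # rebuilds the graph from model.ckpt.meta
with tf.Session() as sess:
    saver.restore(sess, ckpt)                        # loads the values from .data/.index
    print('restored from', ckpt)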
8.3 Visualize the trained model
a. The log files
tensorboard --logdir ./log
Open the printed address in a browser:
loss decreases and accuracy increases over training
the GRAPHS view shows the computation graph
b. The model files
python show3d_balls.py
Error: ModuleNotFoundError: No module named 'cv2'
Fix: pip3 install opencv-python
8.4 Evaluation
python evaluate.py --num_votes 12
9. Building TensorFlow 1.13.1 for C++ and calling its API
9.1 Install the matching bazel version
Releases: https://github.com/bazelbuild/bazel/releases
Download the deb package bazel_0.21.0-linux-x86_64.deb
sudo dpkg -i bazel_0.21.0-linux-x86_64.deb
9.2 Build TensorFlow 1.13.1 for C++
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.13
./configure
Configuration step: I built the CUDA version, so answer the CUDA and cuDNN questions according to the versions installed earlier; adjust the answers to your own setup.
9.3 Build the C++ .so files
bazel build --config=opt --config=cuda //tensorflow:libtensorflow_cc.so
When the build finishes, two libraries are produced in bazel-bin/tensorflow:
libtensorflow_cc.so and libtensorflow_framework.so.
If libtensorflow_framework.so is missing, additionally run:
bazel build --config=opt --config=cuda //tensorflow:libtensorflow_framework.so
9.4 Download the TensorFlow dependencies
Download the dependency files by running:
./tensorflow/contrib/makefile/download_dependencies.sh
This step is very likely to fail because of network speed or similar issues, with errors like:
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
If you hit these errors, open tensorflow/workspace.bzl in the repository, look up the corresponding packages and download them yourself. Use exactly the versions and URLs recorded in that file; substituting other versions leads to many errors. Unpack each one into tensorflow/tensorflow/contrib/makefile/downloads. The packages are: eigen, gemmlowp, googletest, nsync, protobuf, re2, fft2d, double_conversion, absl, cub. A small helper sketch for listing the URLs follows.
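A minimal, hedged sketch for pulling the download URLs out of workspace.bzl so they can be fetched manually (the path assumes you run it from the repository root; it just greps for http(s) links):
import re

with open('tensorflow/workspace.bzl') as f:
    text = f.read()

urls = sorted(set(re.findall(r'https?://[^"\s]+', text)))
names = ('eigen', 'gemmlowp', 'googletest', 'nsync', 'protobuf',
         're2', 'fft2d', 'double_conversion', 'absl', 'abseil', 'cub')
for u in urls:
    if any(n in u for n in names):
        print(u)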
9.5 Build the dependencies
cd tensorflow/contrib/makefile
gedit build_all_linux.sh
Comment out lines 30 and 33 of that script, otherwise it deletes the files already downloaded in 9.4. The commented lines look like this:
#rm -rf tensorflow/contrib/makefile/downloads
#tensorflow/contrib/makefile/download_dependencies.sh
Run the build with ./build_all_linux.sh; when it finishes, a gen folder is created.
If the build fails with: ./autogen.sh: 4: ./autogen.sh: autoreconf: not found
install the following and rerun ./build_all_linux.sh:
sudo apt-get install autoconf
sudo apt-get install automake
sudo apt-get install libtool
10. Freezing pointnet++'s ckpt.meta into a pb file
TensorFlow provides convert_variables_to_constants(), which fixes the model structure and saves the values of the variables in the graph as constants, so the saved model can be moved to other platforms.
Converting a CKPT into a PB file works as follows:
a. Pass the path of the CKPT model to obtain the graph and the variable data.
b. Import the graph with import_meta_graph.
c. Restore the values of the variables in the graph with saver.restore.
d. Persist the model with graph_util.convert_variables_to_constants.
In freeze_graph, the crucial point is to determine the names of the output nodes, and those names must exist in the original model. Because the network is fairly complex, specifying the output node names lets the freeze operation keep only the subgraph needed to compute those outputs and discard everything else. Since we freeze the model in order to run prediction afterwards, output_node_names is usually the name of the last layer's output node, i.e. the prediction target.
The code for freezing a ckpt model into a pb model is always the same, only the node names differ. The generic code is:
import tensorflow as tf

meta_path = 'model/Module.ckpt.meta'  # your .ckpt.meta file
output_node_names = ["output/Sigmoid"]  # output node names

with tf.Session() as sess:
    # Restore the graph
    saver = tf.train.import_meta_graph(meta_path)
    # Load weights
    saver.restore(sess, tf.train.latest_checkpoint('model/'))
    # Freeze the graph
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess,
        sess.graph_def,
        output_node_names)
    # Save the frozen graph
    with open('Module.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())
The code is straightforward; the main things to get right are the model path (meta_path) and the name of the last layer's output node (output_node_names).
10.1 How to find the output node name
Problems usually arise when determining the output node names. There are three ways to find the output nodes of a network:
10.2 Read the code:
This method relies on the output node names you set in your own code. However, if you forgot to name the node, the network uses a default name and this method does not help. Another pitfall is that you fill in the name you set and it still fails, because the name nesting was not taken into account.
For example, the classifier's output node is named "output", but because the enclosing with tf.name_scope block is named "score", the actual output node name is "score/output", much like a path. Keep this in mind when reading node names out of the code; a small sketch follows.
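A minimal sketch of this name nesting (the names "score" and "output" are only illustrative):
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 10], name='input')
with tf.name_scope('score'):
    logits = tf.layers.dense(x, 2)
    output = tf.nn.softmax(logits, name='output')

print(output.op.name)  # prints 'score/output', the name to pass as output_node_names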
For the pointnet++ model, loss and pred are the outputs and the entries of feed_dict are the inputs; printing them reveals the output node names:
loss_val, pred_val = sess.run([ops['loss'], ops['pred']], feed_dict=feed_dict)
print('')
print('loss:')
print(ops['loss'])
print(loss_val)
print('pred:')
print(ops['pred'])
print('pointclouds_pl:')
print(ops['pointclouds_pl'])
print('labels_pl:')
print(ops['labels_pl'])
print('is_training_pl:')
print(ops['is_training_pl'])
Since I trained the model with batches, I evaluate with the same batch size: python evaluate.py --num_votes 12 --batch_size 8
10.3 Print the tensor names from code:
The code below reads a ckpt model and prints the tensor names of all its nodes so you can search through them.
from tensorflow.python import pywrap_tensorflow

def check_out_pb_name(checkpoint_path):
    reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
    var_to_shape_map = reader.get_variable_to_shape_map()
    for key in var_to_shape_map:
        res = reader.get_tensor(key)
        print('tensor_name: ', key)
        print('a.shape: %s' % [res.shape])

if __name__ == '__main__':
    # path to the ckpt model
    checkpoint_path = './model.ckpt'
    check_out_pb_name(checkpoint_path)
Run it with: python <script>.py
I tried it: it prints all tensor names, which you then have to search one by one. When the network is complex and has many nodes this is not very practical, but the code is handy for checking a tensor name or dumping all of a model's tensor names.
10.4 Determine the node name from the graph in TensorBoard
TensorBoard can display the graph of the trained model, and this is the best way to determine the name of the last layer's output node.
Run tensorboard --logdir=<model path> in a terminal; it prints a URL such as http://DESKTOP-QOOI0L9:6006/. Open that URL in a browser; Chrome is recommended if another browser shows a blank or garbled page, and if Chrome says the site cannot be reached, disconnect from the network and try again. Switch to the GRAPHS tab to see the model's graph. Ellipse-shaped boxes are identifiable tensors; rectangular boxes are grouped operations that you have to click into. The last layer's output node is usually inside the "output" rectangle; click into it and find the final ellipse node. Write that node name into the code above and run it to freeze the ckpt model into a pb model. If no TensorBoard event files exist yet, the sketch below shows how to export the graph.
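A minimal sketch that exports the graph from a checkpoint so TensorBoard can display it (paths are assumptions; for pointnet++ the custom ops must be imported first, see 10.5):
import tensorflow as tf

saver = tf.train.import_meta_graph('./log/model.ckpt.meta')  # hypothetical path
with tf.Session() as sess:
    writer = tf.summary.FileWriter('./graph_log', sess.graph)  # writes an event file containing the graph
    writer.close()
# then run: tensorboard --logdir ./graph_log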
10.5 Code to freeze pointnet++ to pb
Error: KeyError: 'FarthestPointSample'
Fix: the custom op is not registered; to freeze a pointnet++ model you must import the pointnet++ interfaces first:
from pointnet_util import pointnet_sa_module
Error: Cannot assign a device for operation bn_decay: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Fix: wrong session configuration; add the same configuration options used during training:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.allow_soft_placement = True
config.log_device_placement = False
sess = tf.Session(config=config)
When the pb model is loaded from C++ it fails with: err:Invalid argument: Input 0 of node layer1/conv0/bn/cond_1/AssignMovingAvg/Switch was passed float from layer1/conv0/bn/moving_mean:0 incompatible with expected float_ref.
Create session failed.
Fix: after freeze_graph.py runs, the inputs of AssignSub and RefSwitch nodes change type from float_ref to float, so an extra conversion is needed. Run the following before importing the graph:
# fix nodes
for node in input_graph_def.node:
    if node.op == 'RefSwitch':
        node.op = 'Switch'
        for index in range(len(node.input)):
            if 'moving_' in node.input[index]:
                node.input[index] = node.input[index] + '/read'
    elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr:
            del node.attr['use_locking']
Reference: https://github.com/onnx/tensorflow-onnx/issues/77
My own freezing script is as follows:
import tensorflow as tf
from tensorflow.python.framework import graph_util
import os
import sys

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
ROOT_DIR = BASE_DIR
sys.path.append(os.path.join(ROOT_DIR, '../utils'))
from pointnet_util import pointnet_sa_module  # registers the pointnet++ custom ops

def load_graph_def(model_path, sess=None):
    print('load_graph_def start!')
    #with tf.device('/gpu:'+str(0)):
    print('input path:' + model_path + '.meta')
    saver = tf.train.import_meta_graph(model_path + '.meta')
    saver.restore(sess, model_path)
    print('load_graph_def end!')

def freeze_graph(sess, output_layer_name, output_graph):
    print('freeze_graph start!')
    #with tf.device('/gpu:'+str(0)):
    graph = tf.get_default_graph()
    input_graph_def = graph.as_graph_def()
    # fix nodes so the frozen graph can later be loaded from C++ (see the note above)
    for node in input_graph_def.node:
        if node.op == 'RefSwitch':
            node.op = 'Switch'
            for index in range(len(node.input)):
                if 'moving_' in node.input[index]:
                    node.input[index] = node.input[index] + '/read'
        elif node.op == 'AssignSub':
            node.op = 'Sub'
            if 'use_locking' in node.attr:
                del node.attr['use_locking']
    # Exporting the graph
    print("Exporting graph...")
    output_graph_def = graph_util.convert_variables_to_constants(
        sess,
        input_graph_def,
        output_layer_name.split(","))
    with tf.gfile.GFile(output_graph, "wb") as f:
        f.write(output_graph_def.SerializeToString())
    print('freeze_graph end!')

def freeze_from_checkpoint(checkpoint_file, output_layer_name):
    print('freeze_from_checkpoint start!')
    model_folder = os.path.basename(checkpoint_file)
    output_graph = os.path.join(model_folder, checkpoint_file + '.pb')
    print('output_graph:' + output_graph)
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    config.allow_soft_placement = True
    config.log_device_placement = False
    sess = tf.Session(config=config)
    load_graph_def(checkpoint_file, sess)
    freeze_graph(sess, output_layer_name, output_graph)
    print('freeze_from_checkpoint end!')

if __name__ == '__main__':
    freeze_from_checkpoint(
        checkpoint_file='/home/mooe/lidar_obstacle_detection/11/pointnet2_tf/log_11/model.ckpt',
        output_layer_name='total_loss,fc3/BiasAdd,Placeholder,Placeholder_1,Placeholder_2')
    # output nodes: total_loss,fc3/BiasAdd,Placeholder,Placeholder_1,Placeholder_2
10.6 Printing the nodes of the pb file
import tensorflow as tf
from tensorflow.python.framework import graph_util
import os
import sys

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
ROOT_DIR = BASE_DIR
sys.path.append(BASE_DIR)
sys.path.append(os.path.join(ROOT_DIR, '../utils'))
from pointnet_util import pointnet_sa_module  # registers the pointnet++ custom ops

def create_graph(out_pb_path):
    print('create_graph start!')
    # read the trained model and create a graph to hold it
    with tf.gfile.FastGFile(out_pb_path, 'rb') as f:
        # tf.GraphDef() defines an empty graph
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        # fix nodes (same RefSwitch/AssignSub workaround as above)
        for node in graph_def.node:
            if node.op == 'RefSwitch':
                node.op = 'Switch'
                for index in range(len(node.input)):
                    if 'moving_' in node.input[index]:
                        node.input[index] = node.input[index] + '/read'
            elif node.op == 'AssignSub':
                node.op = 'Sub'
                if 'use_locking' in node.attr:
                    del node.attr['use_locking']
        # import the graph from graph_def into the current default graph
        tf.import_graph_def(graph_def, name='')
    print('create_graph end!')

def check_pb_out_name(out_pb_path, result_file):
    create_graph(out_pb_path)
    tensor_name_list = [tensor.name for tensor in tf.get_default_graph().as_graph_def().node]
    with open(result_file, 'w+') as f:
        for tensor_name in tensor_name_list:
            #f.write(tensor_name + '\n')
            print(tensor_name)

if __name__ == '__main__':
    check_pb_out_name('./model.ckpt.pb', './in_out.txt')
You can see that the input, output and intermediate nodes are all printed.
11. Calling the pointnet++ pb model from C++
11.1 Load the model
string model_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2_tf/log_11/model.ckpt.pb";
tensorflow::GraphDef graph_def;
Status load_graph_status = ReadBinaryProto(tensorflow::Env::Default(), model_path, &graph_def);
if (!load_graph_status.ok()) {
std::cout << "err:" << load_graph_status.ToString() << std::endl;
std::cout << "Failed to load compute graph at:" << model_path << std::endl;
return 0;
} else {
std::cout << "Load graph ok!" << std::endl;
}
std::unique_ptr<tensorflow::Session> session;
tensorflow::SessionOptions sess_ops;
// allocate with new and hand ownership to the proto; do not free it yourself, otherwise it crashes
tensorflow::GPUOptions *gpu_options = new tensorflow::GPUOptions();
gpu_options->set_allow_growth(true);
sess_ops.config.set_allocated_gpu_options(gpu_options);
sess_ops.config.set_allow_soft_placement(true);
sess_ops.config.set_log_device_placement(false);
session.reset(tensorflow::NewSession(sess_ops));
Status session_create_status = session->Create(graph_def);
if (!session_create_status.ok()) {
std::cout << "err:" << session_create_status.ToString() << std::endl;
std::cout << "Create session failed." << std::endl;
return 0;
} else {
std::cout << "Create session ok." << std::endl;
}
11.2 Build the input and output parameters; adjust them to the batch size used during training
// build pointcloud start
string pcd_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2-master/data/modelnet40_normal_resampled/car/car_0002.txt";
// string pcd_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2-master/data/modelnet40_normal_resampled/chair/chair_0001.txt";
// string pcd_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2-master/data/modelnet40_normal_resampled/person/person_0017.txt";
pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>());
std::ifstream in(pcd_path);
if (!in.is_open()) {
return 0;
}
std::string line;
while (getline(in, line)) {
double x, y, z, xx, yy, zz;
sscanf(line.c_str(), "%lf,%lf,%lf,%lf,%lf,%lf", &x, &y, &z, &xx, &yy, &zz);
cloud->points.emplace_back(pcl::PointXYZ(x, y, z));
}
std::cout << "Load point size:" << cloud->points.size() << std::endl;
int max_pt_num = cloud->points.size();
Tensor placeholder_0(tensorflow::DT_FLOAT, tensorflow::TensorShape({8, max_pt_num, 3}));
auto data_mapped = placeholder_0.tensor<float, 3>();
for (int idx = 0; idx < 8; ++idx) {
for(unsigned int i = 0; i < cloud->points.size(); ++i) {
data_mapped(idx, i, 0) = cloud->points[i].x;
data_mapped(idx, i, 1) = cloud->points[i].y;
data_mapped(idx, i, 2) = cloud->points[i].z;
}
}
for (int idx = 0; idx < 8; ++idx) {
for(unsigned int j = cloud->points.size(); j < max_pt_num; ++j) {
for(int k = 0; k < 3; ++k) {
data_mapped(idx, j, k) = 0;
}
}
}
tensorflow::Tensor placeholder_1(tensorflow::DT_INT32, tensorflow::TensorShape({8, }));
auto label_mapped = placeholder_1.tensor<tensorflow::int32, 1>();
for (int idx = 0; idx < 8; ++idx) {
label_mapped(idx) = 0;
}
tensorflow::Tensor placeholder_2(tensorflow::DT_BOOL, tensorflow::TensorShape());
placeholder_2.scalar<bool>()() = false;
// cout << "Set phase_tensor ......" << endl;
std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {
{"Placeholder", placeholder_0},
{"Placeholder_1", placeholder_1},
{"Placeholder_2", placeholder_2}
};
// Actually run the point cloud through the model.
std::vector<std::string> output_layer = {"total_loss", "fc3/BiasAdd"};  // output node names from the frozen graph
11.3 Run the model
std::vector<tensorflow::Tensor> outputs;
Status run_status = session->Run(inputs,output_layer, {}, &outputs);
if (!run_status.ok()) {
LOG(ERROR) << "Running model failed: " << run_status;
return 0;
} else {
std::cout << "sess run ok!" << std::endl;
}
11.4 Get the per-class scores from the model's output tensors
std::cout << "output size:" << outputs.size() << std::endl;
const auto& totoal_loss_output = outputs[0];
const auto& fc3_biasadd_output = outputs[1];
for (std::size_t i = 0; i < outputs.size(); i++) {
std::cout << "result: " << outputs[i].DebugString() << std::endl;
}
const auto& total_loss_val = totoal_loss_output.scalar<float>()();
std::cout << "total loss value:" << total_loss_val << std::endl;
auto fc3_biasadd_map = fc3_biasadd_output.tensor<float, 2>();
fc3_biasadd_map(0, 0);
float class_pred[40] = { 0 };
for (int i = 0; i < 1; ++i) {
for (int j = 0; j < 40; ++j) {
std::cout << fc3_biasadd_map(i, j) << ", ";
class_pred[j] += fc3_biasadd_map(i, j);
}
std::cout << std::endl;
}
float max_pred = std::numeric_limits<float>::lowest();
int max_idx = -1;
for (int i = 0; i < 40; ++i) {
if (class_pred[i] > max_pred) {
max_idx = i;
max_pred = class_pred[i];
}
}
std::cout << "max_idx:" << max_idx << ", max_pred:" << max_pred << std::endl;
11.5 The complete cpp file
#include <fstream>
#include <iostream>
#include <vector>
#include "tensorflow/core/framework/op.h"
#include "tensorflow/cc/ops/const_op.h"
#include "tensorflow/cc/ops/image_ops.h"
#include "tensorflow/cc/ops/standard_ops.h"
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/graph/default_device.h"
#include "tensorflow/core/graph/graph_def_builder.h"
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/lib/core/stringpiece.h"
#include "tensorflow/core/lib/core/threadpool.h"
#include "tensorflow/core/lib/io/path.h"
#include "tensorflow/core/lib/strings/str_util.h"
#include "tensorflow/core/lib/strings/stringprintf.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/platform/init_main.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/types.h"
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/util/command_line_flags.h"
#include <limits>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
// These are all common classes it's handy to reference with no namespace.
using tensorflow::Tensor;
using tensorflow::Status;
using tensorflow::string;
using tensorflow::int32;
using namespace std;
int main(int argc, char* argv[])
{
string model_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2_tf/log_11/model.ckpt.pb";
tensorflow::GraphDef graph_def;
Status load_graph_status = ReadBinaryProto(tensorflow::Env::Default(), model_path, &graph_def);
if (!load_graph_status.ok()) {
std::cout << "err:" << load_graph_status.ToString() << std::endl;
std::cout << "Failed to load compute graph at:" << model_path << std::endl;
return 0;
} else {
std::cout << "Load graph ok!" << std::endl;
}
std::unique_ptr<tensorflow::Session> session;
tensorflow::SessionOptions sess_ops;
tensorflow::GPUOptions *gpu_options = new tensorflow::GPUOptions();
gpu_options->set_allow_growth(true);
sess_ops.config.set_allocated_gpu_options(gpu_options);
sess_ops.config.set_allow_soft_placement(true);
sess_ops.config.set_log_device_placement(false);
session.reset(tensorflow::NewSession(sess_ops));
Status session_create_status = session->Create(graph_def);
if (!session_create_status.ok()) {
std::cout << "err:" << session_create_status.ToString() << std::endl;
std::cout << "Create session failed." << std::endl;
return 0;
} else {
std::cout << "Create session ok." << std::endl;
}
// build pointcloud start
string pcd_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2-master/data/modelnet40_normal_resampled/car/car_0002.txt";
// string pcd_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2-master/data/modelnet40_normal_resampled/chair/chair_0001.txt";
// string pcd_path = "/home/mooe/lidar_obstacle_detection/11/pointnet2-master/data/modelnet40_normal_resampled/person/person_0017.txt";
pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>());
std::ifstream in(pcd_path);
if (!in.is_open()) {
return 0;
}
std::string line;
while (getline(in, line)) {
double x, y, z, xx, yy, zz;
sscanf(line.c_str(), "%lf,%lf,%lf,%lf,%lf,%lf", &x, &y, &z, &xx, &yy, &zz);
cloud->points.emplace_back(pcl::PointXYZ(x, y, z));
}
std::cout << "Load point size:" << cloud->points.size() << std::endl;
int max_pt_num = cloud->points.size();
Tensor placeholder_0(tensorflow::DT_FLOAT, tensorflow::TensorShape({8, max_pt_num, 3}));
auto data_mapped = placeholder_0.tensor<float, 3>();
for (int idx = 0; idx < 8; ++idx) {
for(unsigned int i = 0; i < cloud->points.size(); ++i) {
data_mapped(idx, i, 0) = cloud->points[i].x;
data_mapped(idx, i, 1) = cloud->points[i].y;
data_mapped(idx, i, 2) = cloud->points[i].z;
}
}
for (int idx = 0; idx < 8; ++idx) {
for(unsigned int j = cloud->points.size(); j < max_pt_num; ++j) {
for(int k = 0; k < 3; ++k) {
data_mapped(idx, j, k) = 0;
}
}
}
tensorflow::Tensor placeholder_1(tensorflow::DT_INT32, tensorflow::TensorShape({8, }));
auto label_mapped = placeholder_1.tensor<tensorflow::int32, 1>();
for (int idx = 0; idx < 8; ++idx) {
label_mapped(idx) = 0;
}
tensorflow::Tensor placeholder_2(tensorflow::DT_BOOL, tensorflow::TensorShape());
placeholder_2.scalar<bool>()() = false;
// cout << "Set phase_tensor ......" << endl;
std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {
{"Placeholder", placeholder_0},
{"Placeholder_1", placeholder_1},
{"Placeholder_2", placeholder_2}
};
// Actually run the point cloud through the model.
std::vector<std::string> output_layer = {"total_loss", "fc3/BiasAdd"};  // output node names from the frozen graph
std::vector<tensorflow::Tensor> outputs;
Status run_status = session->Run(inputs,output_layer, {}, &outputs);
if (!run_status.ok()) {
LOG(ERROR) << "Running model failed: " << run_status;
return 0;
} else {
std::cout << "sess run ok!" << std::endl;
}
std::cout << "output size:" << outputs.size() << std::endl;
const auto& totoal_loss_output = outputs[0];
const auto& fc3_biasadd_output = outputs[1];
for (std::size_t i = 0; i < outputs.size(); i++) {
std::cout << "result: " << outputs[i].DebugString() << std::endl;
}
const auto& total_loss_val = totoal_loss_output.scalar<float>()();
std::cout << "total loss value:" << total_loss_val << std::endl;
auto fc3_biasadd_map = fc3_biasadd_output.tensor<float, 2>();
fc3_biasadd_map(0, 0);
float class_pred[40] = { 0 };
for (int i = 0; i < 1; ++i) {
for (int j = 0; j < 40; ++j) {
std::cout << fc3_biasadd_map(i, j) << ", ";
class_pred[j] += fc3_biasadd_map(i, j);
}
std::cout << std::endl;
}
float max_pred = std::numeric_limits<float>::lowest();
int max_idx = -1;
for (int i = 0; i < 40; ++i) {
if (class_pred[i] > max_pred) {
max_idx = i;
max_pred = class_pred[i];
}
}
std::cout << "max_idx:" << max_idx << ", max_pred:" << max_pred << std::endl;
int kk = 0;
++kk;
}
11.6 The CMakeLists.txt file
The following four files from pointnet++ need to be copied into the project:
tf_ops/tf_sampling_g.cu.o
tf_ops/tf_sampling.cpp
tf_ops/tf_grouping_g.cu.o
tf_ops/tf_grouping.cpp
cmake_minimum_required (VERSION 3.5.0)
project (pointnet_cls)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -std=c++11 -W")
find_package(PCL 1.8 REQUIRED)
include_directories(${PCL_INCLUDE_DIRS})
link_directories(${PCL_LIBRARY_DIRS})
include_directories(
/home/mooe/lidar_obstacle_detection/14/tensorflow_1.13.1_gpu
/home/mooe/lidar_obstacle_detection/14/tensorflow_1.13.1_gpu/bazel-genfiles
/home/mooe/lidar_obstacle_detection/14/tensorflow_1.13.1_gpu/tensorflow/contrib/makefile/downloads/absl
/home/mooe/lidar_obstacle_detection/14/tensorflow_1.13.1_gpu/tensorflow/contrib/makefile/gen/protobuf/include
)
link_directories(
/home/mooe/lidar_obstacle_detection/14/tensorflow_1.13.1_gpu/bazel-bin/tensorflow
/usr/local/cuda/lib64
)
add_executable(${PROJECT_NAME}
shape_classfication_node.cpp
tf_ops/tf_sampling_g.cu.o
tf_ops/tf_sampling.cpp
tf_ops/tf_grouping_g.cu.o
tf_ops/tf_grouping.cpp
)
ADD_DEFINITIONS(-DROOT_PATH=\"${PROJECT_SOURCE_DIR}\")
target_link_libraries(${PROJECT_NAME}
${PCL_LIBRARIES}
tensorflow_cc
tensorflow_framework
cudart
)
11.7 Run the program. With the car point cloud as input it correctly outputs class index 7, which is the "car" class in ModelNet40; a small index-to-name sketch follows.
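A minimal sketch for mapping the predicted index back to a class name (the path is the shape_names.txt renamed in 8.1):
with open('data/modelnet40_normal_resampled/shape_names.txt') as f:
    shape_names = [line.strip() for line in f]

pred_idx = 7  # index printed by the C++ program
print(pred_idx, '->', shape_names[pred_idx])  # expected: car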