Image Semantic Segmentation in C++ with VS2019 + Libtorch

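Deep-learning-based image semantic segmentation has become a mainstream research direction in recent years. Networks such as the DeepLab series and U-Net achieve good performance on datasets including ImageNet, Cityscapes, and PASCAL VOC 2012. An earlier post on this blog already covered DeepLabV3+ (link: Image segmentation algorithms: DeepLabV3+). In my recent work I wanted to deploy a similar segmentation program directly in C++ and use it inside a project built in Visual Studio. The result was disappointing: most public deep learning code is written in Python, with models built and trained in PyTorch or TensorFlow, and there is almost no ready-made C++ code to reuse. Solutions like OpenVINO give me a headache just looking at them. So, back at the C++ interface level, I decided to learn Libtorch, the C++ version of PyTorch, and try to use a model trained in Python directly in a VS project. This post records the implementation details so that we can learn together.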

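1. Deploying Libtorch

Libtorch can be downloaded directly from the PyTorch website, and configuring it is much like configuring OpenCV.

Reference blog: https://blog.csdn.net/yanfeng1022/article/details/106481312

Download page: Start Locally | PyTorch

In general, the default recommended package is fine. The newest CUDA version offered here was 11.3. That matters little to me, because I do not intend to train with a C++ program. After downloading, do the basic configuration:

1) C++ include directories: your_dir\libtorch\include; your_dir\libtorch\include\torch\csrc\api\include

2) Linker -> General -> Additional Library Directories: your_dir\libtorch\lib

3) Linker -> Input -> Additional Dependencies, add the libs:

asmjit.lib;c10.lib;c10_cuda.lib;caffe2_nvrtc.lib;clog.lib;cpuinfo.lib;dnnl.lib;fbgemm.lib;kineto.lib;libprotobufd.lib;libprotobuf-lited.lib;libprotocd.lib;pthreadpool.lib;torch.lib;torch_cpu.lib;torch_cuda.lib;torch_cuda_cpp.lib;torch_cuda_cu.lib;XNNPACK.lib;

4) Environment variable: add your_dir\libtorch\lib to Path.

That completes the Libtorch deployment in VS. Here is a short test program to check that everything works.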

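#include <torch/torch.h>
#include <iostream>
#include <cstdlib>

int main()
{
    // Create a 5x3 tensor of uniform random values and print it.
    torch::Tensor tensor = torch::rand({ 5, 3 });
    std::cout << tensor << std::endl;
    return EXIT_SUCCESS;
}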

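Printed result: a 5x3 tensor of random values in [0, 1).

2. Loading and Using a Pretrained Model

With deployment done, the important part comes next: loading a pretrained model into the VS project to perform the semantic analysis. According to the Libtorch documentation, the model must first be trained and exported as a .pt file so that the C++ program can load it. An image classification program serves as the example here.

First download the ImageNet label file used with ResNet:

https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt

Next, set up a simple Python 3.8 environment with PyTorch 1.11 and torchvision 0.12. In that environment, load the pretrained resnet34 model and convert it with TorchScript into a file format that C++ can read: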

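from torchvision.models import resnet34
import torch

# Define resnet34 and load the ImageNet-pretrained weights.
model = resnet34(pretrained=True)
model = model.to(torch.device("cpu"))
# Switch to inference mode before tracing so batch norm uses its running statistics.
model.eval()

# Example input: tracing records the operations executed for this tensor.
var = torch.ones((1, 3, 224, 224))
traced_script_module = torch.jit.trace(model, var)
traced_script_module.save("resnet34.pt")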

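PyTorch can create serializable and optimizable models through TorchScript; the exported model can be optimized further and called from C++. The code above converts the pretrained resnet34 model into resnet34.pt, a file that C++ can load. We can then use resnet34.pt directly for classification in C++. The code is as follows: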

#include <torch/script.h>
#include <torch/torch.h>
#include <opencv2/opencv.hpp>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

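// Load the class labels, one per line.
std::vector<std::string> loadResLabel(const std::string& fileName) {
    std::vector<std::string> labelList;
    std::ifstream in(fileName);
    std::string line;
    // imagenet_classes.txt holds 1000 labels; read until end of file.
    while (std::getline(in, line)) {
        labelList.push_back(line);
    }
    return labelList;
}

int main()
{
    // Load the labels.
    std::vector<std::string> Imagelabel = loadResLabel("Model\\imagenet_classes.txt");
    if (Imagelabel.empty()) {
        std::cerr << "cannot read Model\\imagenet_classes.txt" << std::endl;
        return 1;
    }

    // CUDA or CPU; CPU here.
    torch::Device device(torch::kCPU);

    // Read the picture; OpenCV returns 8-bit BGR in HWC layout.
    cv::Mat image = cv::imread("_Input\\1.png");
    if (image.empty()) {
        std::cerr << "cannot read _Input\\1.png" << std::endl;
        return 1;
    }
    // Resize to the 224x224 size used for tracing and convert BGR to RGB.
    cv::resize(image, image, cv::Size(224, 224));
    cv::cvtColor(image, image, cv::COLOR_BGR2RGB);

    // Convert to tensor: HWC uint8 -> NCHW float in [0, 1].
    auto input_tensor = torch::from_blob(image.data, { image.rows, image.cols, 3 }, torch::kByte)
        .permute({ 2, 0, 1 }).unsqueeze(0).to(torch::kFloat32) / 255.0;
    // torchvision's pretrained weights expect ImageNet mean/std normalization.
    auto mean = torch::tensor({ 0.485f, 0.456f, 0.406f }).view({ 1, 3, 1, 1 });
    auto stdv = torch::tensor({ 0.229f, 0.224f, 0.225f }).view({ 1, 3, 1, 1 });
    input_tensor = (input_tensor - mean) / stdv;

    // Load the model.
    auto model = torch::jit::load("Model\\resnet34.pt");
    model.to(device);
    model.eval();

    // Forward the tensor; gradients are not needed for inference.
    torch::NoGradGuard no_grad;
    auto output = model.forward({ input_tensor.to(device) }).toTensor();
    output = torch::softmax(output, 1);
    int64_t labelNum = output.argmax(1).item<int64_t>();
    std::cout << "predicted class: " << labelNum << std::endl;
    std::cout << "predicted label: " << Imagelabel[labelNum] << std::endl;
    std::cout << "prob is: " << output.max().item<float>() << std::endl;
    return 0;
}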

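Note: since I have not configured CUDA, I changed the torch::Device argument and pass kCPU as the device. The program prints the predicted class index, the corresponding label, and its probability.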
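To make the preprocessing chain in the program above (from_blob, permute, scaling to [0, 1]) more concrete, here is a minimal library-free sketch of the same data transformation. The function name hwcBgrToChwRgb is hypothetical and belongs to neither Libtorch nor OpenCV. It assumes an interleaved 8-bit BGR buffer like the one cv::imread produces, and it applies the ImageNet mean and standard deviation that torchvision's pretrained models expect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert an interleaved 8-bit BGR image (HWC layout)
// into a planar float RGB buffer (CHW layout), scaled to [0, 1] and then
// normalized per channel with the ImageNet mean and standard deviation.
std::vector<float> hwcBgrToChwRgb(const std::vector<std::uint8_t>& bgr,
                                  int height, int width) {
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    assert(bgr.size() == plane * 3);
    const float mean[3] = {0.485f, 0.456f, 0.406f};   // R, G, B
    const float stdev[3] = {0.229f, 0.224f, 0.225f};  // R, G, B
    std::vector<float> chw(bgr.size());
    for (std::size_t pixel = 0; pixel < plane; ++pixel) {
        for (int c = 0; c < 3; ++c) {
            // Output channel c (RGB order) reads input channel 2 - c (BGR order).
            const float value = bgr[pixel * 3 + (2 - c)] / 255.0f;
            chw[c * plane + pixel] = (value - mean[c]) / stdev[c];
        }
    }
    return chw;
}
```

The returned buffer has the memory layout of a contiguous 3 x H x W float tensor, which is what the network receives after permute and unsqueeze. Writing the loop out by hand is mainly useful for checking that the channel order and the normalization constants line up.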

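3. Semantic Segmentation

Here I use the popular DeepLabV3 as the segmentation network (torchvision ships deeplabv3_resnet101, which is DeepLabV3 with a ResNet-101 backbone rather than DeepLabV3+); the pretrained weights can be downloaded directly from PyTorch. First we generate the pretrained model for C++, deeplabv3.pt.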

import torch
from torchvision import models


class wrapper(torch.nn.Module):
    """Turn the model's dict output into a tuple so that it can be traced."""

    def __init__(self, model):
        super(wrapper, self).__init__()
        self.model = model

    def forward(self, input):
        results = []
        output = self.model(input)
        # torchvision returns an OrderedDict: "out" first, then "aux".
        for k, v in output.items():
            results.append(v)
        return tuple(results)


model = models.segmentation.deeplabv3_resnet101(pretrained=True)
# Wrap the model so that forward() returns a tuple, then switch to inference mode.
model = wrapper(model)
model.eval()
# An example input you would normally provide to your model's forward() method.
example = torch.rand(1, 3, 224, 224)

# Use torch.jit.trace to generate a torch.jit.ScriptModule via tracing.
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("deeplabv3.pt")

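Note that tracing fails if the wrapper class is not used. The torchvision segmentation models return an OrderedDict (keys "out" and "aux") rather than a single tensor as the classification models do, and torch.jit.trace rejects dict outputs by default; the wrapper converts the dict into a tuple of tensors. With the resulting .pt file, write the C++ program: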

#include <torch/script.h>
#include <torch/torch.h>
#include <opencv2/opencv.hpp>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

int main() {

    std::string fileName = "1";

    // Path to your .pt file.
    torch::jit::script::Module module = torch::jit::load("Model/deeplabv3.pt");
    module.to(torch::kCPU);
    module.eval();
    std::cout << "ok\n";

    // The image you want to process; OpenCV loads it as 8-bit BGR.
    cv::Mat image = cv::imread("_Input\\" + fileName + ".png", cv::IMREAD_COLOR);
    if (image.empty()) {
        std::cerr << "cannot read the input image" << std::endl;
        return 1;
    }

    // Resize to the traced input size, convert BGR to RGB, scale to [0, 1].
    cv::Mat image_resized;
    cv::resize(image, image_resized, cv::Size(224, 224));
    cv::cvtColor(image_resized, image_resized, cv::COLOR_BGR2RGB);
    cv::Mat image_resized_float;
    image_resized.convertTo(image_resized_float, CV_32F, 1.0 / 255);

    // from_blob does not copy, so image_resized_float must stay alive until contiguous() below.
    auto img_tensor = torch::from_blob(image_resized_float.data, { 1, 224, 224, 3 }, torch::kFloat32);
    std::cout << "img tensor loaded..\n";

    // NHWC -> NCHW (copied into its own memory), then ImageNet mean/std per channel.
    img_tensor = img_tensor.permute({ 0, 3, 1, 2 }).contiguous();
    img_tensor[0][0] = img_tensor[0][0].sub(0.485).div(0.229);
    img_tensor[0][1] = img_tensor[0][1].sub(0.456).div(0.224);
    img_tensor[0][2] = img_tensor[0][2].sub(0.406).div(0.225);

    // Gradients are not needed for inference.
    torch::NoGradGuard no_grad;
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(img_tensor);

    // The wrapped model returns a tuple, so toTensor() on the result fails;
    // element 0 of the tuple is the "out" tensor.
    torch::Tensor result = module.forward(inputs).toTuple()->elements()[0].toTensor();
    torch::Tensor result1 = result[0];          // class scores: classes x 224 x 224
    torch::Tensor result2 = result1.argmax(0);  // class index per pixel: 224 x 224

    std::cout << result.sizes() << std::endl;
    std::cout << result1.sizes() << std::endl;
    std::cout << result2.sizes() << std::endl;

    // Copy the label map into a 2D vector.
    auto labels = result2.accessor<int64_t, 2>();
    std::vector<std::vector<int>> imagePred(224, std::vector<int>(224));
    for (int i = 0; i < 224; i++) {
        for (int j = 0; j < 224; j++) {
            imagePred[i][j] = static_cast<int>(labels[i][j]);
        }
    }
    std::cout << "transfer to vector, finished!" << std::endl;

    int imgrow = image.rows;
    int imgcol = image.cols;

    // Map every pixel of the original image to a cell of the 224x224 label map
    // (nearest neighbor) and tint all non-background pixels green.
    for (int i = 0; i < imgrow; i++) {
        int index_row = i * 224 / imgrow;      // always < 224 because i < imgrow
        for (int j = 0; j < imgcol; j++) {
            int index_col = j * 224 / imgcol;  // always < 224 because j < imgcol
            if (imagePred[index_row][index_col] == 0) {
                continue;  // class 0 is background
            }
            cv::Vec3b& px = image.at<cv::Vec3b>(i, j);
            px[2] = px[2] / 2;
            px[1] = cv::saturate_cast<uchar>(px[1] + 50);
            px[0] = px[0] / 2;
        }
    }

    // Write the segmentation result; the _Output directory must already exist.
    std::string savefileName2 = "./_Output/" + fileName + "_Seg" + ".jpg";
    cv::imwrite(savefileName2, image);
    std::cout << "finished" << std::endl;

    return 0;
}

Output: the input image with every non-background pixel tinted green, saved to ./_Output/1_Seg.jpg.
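The post-processing in the program above boils down to two steps: a per-pixel argmax over the class scores, then a nearest-neighbor lookup that maps each pixel of the original image to a cell of the 224x224 label map. The sketch below shows both on plain std::vector buffers. argmaxLabels and labelAt are hypothetical helper names of mine, not Libtorch functions, and the CHW buffer layout is an assumption chosen to match the result[0] tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: per-pixel argmax over class scores stored in CHW order
// (numClasses planes of height * width values each).
std::vector<int> argmaxLabels(const std::vector<float>& scores,
                              int numClasses, int height, int width) {
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    assert(numClasses > 0 && scores.size() == plane * numClasses);
    std::vector<int> labels(plane, 0);
    for (std::size_t p = 0; p < plane; ++p) {
        float best = scores[p];  // score of class 0
        for (int c = 1; c < numClasses; ++c) {
            const float s = scores[c * plane + p];
            if (s > best) {
                best = s;
                labels[p] = c;
            }
        }
    }
    return labels;
}

// Hypothetical helper: nearest-neighbor lookup of the label for pixel
// (row, col) of an imgH x imgW image in a label map of size mapH x mapW.
int labelAt(const std::vector<int>& labels, int mapH, int mapW,
            int row, int col, int imgH, int imgW) {
    assert(row >= 0 && row < imgH && col >= 0 && col < imgW);
    const int r = row * mapH / imgH;  // always < mapH because row < imgH
    const int c = col * mapW / imgW;  // always < mapW because col < imgW
    return labels[static_cast<std::size_t>(r) * mapW + c];
}
```

Integer arithmetic keeps the mapped index inside the label map without any clamping. In the real program, the argmax is done by result1.argmax(0), so only the lookup has to be written by hand.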

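The code above took me a whole day to put together. Material online is really scarce; I read the original documentation and consulted many other people's programs. Please give it a like: getting this code to run was truly not easy.

Here are the blog posts I consulted, for reference:

How to use deeplabv3_resnet101 in C++

Libtorch: using PyTorch classification and semantic segmentation models in C++ projects

Common Tensor operation APIs when deploying libtorch in C++

C2440: "initializing": cannot convert from "torch::jit::script::Module" to ...

Summary:

Libtorch really does provide important help for using deep learning algorithms in C++ programs. Loading a pretrained model and running inference gives a C++ project powerful semantic analysis functions. That said, the documentation is pitifully thin. I could not find a single decent example, and the programs above were pieced together from many sources before they barely ran. I hope to see more complete tutorials and sample code in the future. I also tried an open-source Libtorch project dedicated to segmentation, LibtorchSegmentation, but could not get it working: its documentation only describes how to configure Libtorch in VS, not how to set up the segmentation code itself. Interested readers can give it a try; if it can be made to work, it should be better than my clumsy approach. Project link: LibtorchSegmentation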

Sharing is welcome; when reposting, please credit the source: 内存溢出 (outofmemory.cn)

Original URL: http://outofmemory.cn/langs/1324490.html
