How to do simple classification with Caffe, and which approaches are available (tutorial)

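There are two approaches.

Using C++

Under the Caffe root directory, the examples/cpp_classification/ folder contains a file named classification.cpp, which performs classification. After compilation, the resulting binary is placed under build/examples/cpp_classification/.

Run it directly with this command: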

# sudo ./build/examples/cpp_classification/classification.bin \
    models/bvlc_reference_caffenet/deploy.prototxt \
    models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
    data/ilsvrc12/imagenet_mean.binaryproto \
    data/ilsvrc12/synset_words.txt \
    examples/images/cat.jpg

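The command is long, so backslashes (\) are used to continue it across lines. Everything after the program name is an argument, one per line, and five arguments are required: the network definition (deploy.prototxt), the trained weights (.caffemodel), the mean file, the label file, and the image to classify.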
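
To make the argument order explicit, here is a minimal Python sketch that assembles the same command as an argument list. The helper name build_classification_command is hypothetical and not part of Caffe; the paths are simply the ones used in this article, and the sketch only builds and prints the command instead of running it.

```python
import shlex


def build_classification_command(image_path):
    """Assemble the classification.bin command line as an argument list.

    Hypothetical helper (not part of Caffe). Paths follow the article's
    example and are relative to the Caffe root directory.
    """
    model_dir = "models/bvlc_reference_caffenet"
    return [
        "./build/examples/cpp_classification/classification.bin",
        model_dir + "/deploy.prototxt",                     # network definition
        model_dir + "/bvlc_reference_caffenet.caffemodel",  # trained weights
        "data/ilsvrc12/imagenet_mean.binaryproto",          # mean image
        "data/ilsvrc12/synset_words.txt",                   # class labels
        image_path,                                         # image to classify
    ]


command = build_classification_command("examples/images/cat.jpg")
print(" ".join(shlex.quote(part) for part in command))
```

To actually run it, the list could be passed to subprocess.run(command, check=True) from the Caffe root, assuming the binary has been built.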

When it runs successfully, it prints the top-5 results:

---------- Prediction for examples/images/cat.jpg ----------
0.3134 - "n02123045 tabby, tabby cat"
0.2380 - "n02123159 tiger cat"
0.1235 - "n02124075 Egyptian cat"
0.1003 - "n02119022 red fox, Vulpes vulpes"
0.0715 - "n02127052 lynx, catamount"

That is, the image is a tabby cat with probability 0.3134, a tiger cat with probability 0.2380, and so on.
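
If the results need to be used by another program, the printed lines can be parsed. The following is a minimal sketch, assuming the output format is exactly the one shown in this article (probability, a dash, then the quoted synset ID and label); parse_predictions is a hypothetical helper name.

```python
import re

# One result line looks like: 0.3134 - "n02123045 tabby, tabby cat"
RESULT_LINE = re.compile(r'^(\d+\.\d+) - "(n\d+) (.+)"$')


def parse_predictions(output_text):
    """Return a list of (probability, synset_id, label) tuples."""
    predictions = []
    for line in output_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:  # the "---------- Prediction for ..." header does not match
            probability, synset_id, label = match.groups()
            predictions.append((float(probability), synset_id, label))
    return predictions


sample_output = """---------- Prediction for examples/images/cat.jpg ----------
0.3134 - "n02123045 tabby, tabby cat"
0.2380 - "n02123159 tiger cat"
0.1235 - "n02124075 Egyptian cat"
0.1003 - "n02119022 red fox, Vulpes vulpes"
0.0715 - "n02127052 lynx, catamount"
"""

predictions = parse_predictions(sample_output)
best = max(predictions)  # tuples compare by probability first
print(best)
```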

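Using Python

The script python/classify.py requires two arguments: an input image file and an output file for the results. It must also be run from inside the python directory. Assuming the current directory is the Caffe root, run: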

# cd python
# sudo python classify.py ../examples/images/cat.jpg result.npy
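
Since the results go to result.npy rather than to the screen, a top-5 list has to be derived from the saved scores. The sketch below shows only that selection step on a plain Python list. It assumes the scores have already been loaded, for example with numpy.load("result.npy")[0].tolist() if NumPy is installed and the file holds one row of class scores per image (an assumption about the file layout). The helper top_k and the toy scores are illustrative.

```python
import heapq


def top_k(scores, k=5):
    """Return the k highest (class_index, score) pairs, best first."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])


# Toy scores for a 6-class problem (illustrative values, not real Caffe output).
scores = [0.05, 0.40, 0.10, 0.25, 0.15, 0.05]
for class_index, score in top_k(scores, k=3):
    print(class_index, score)
```

The class indices can then be looked up in data/ilsvrc12/synset_words.txt to get readable labels.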

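How to configure each layer in Caffe

I recently installed Caffe on my computer. A neural network is built from different kinds of layers, and each layer type has its own parameters, so here is a brief summary based on the documentation on the official Caffe website.

1. Vision Layers

1.1 Convolution

Type: CONVOLUTION

Example:

layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}

blobs_lr: learning rate multipliers. In the example above, the weights use the same learning rate that the solver supplies at run time, and the biases use twice the learning rate of the weights.
weight_decay: weight decay multipliers, given for the filters and the biases in the same order.

Important parameters of the convolution layer

Required parameters:
  num_output (c_o): the number of filters
  kernel_size (or kernel_h and kernel_w): the size of each filter

Optional parameters:
  weight_filler [default type: 'constant' value: 0]: how the weights are initialized
  bias_filler: how the biases are initialized
  bias_term [default true]: whether to enable the bias term
  pad (or pad_h and pad_w) [default 0]: the number of pixels to add to each side of the input
  stride (or stride_h and stride_w) [default 1]: the step of the filter
  group (g) [default 1]: If g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the ith output group channels will be only connected to the ith input group channels.

Shape change through the convolution layer:
  Input: n * c_i * h_i * w_i
  Output: n * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and w_o is computed in the same way.

1.2 Pooling

Type: POOLING

Example:

layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3     # pool over a 3x3 region
    stride: 2          # step two pixels (in the bottom blob) between pooling regions
  }
}

Important parameters of the pooling layer

Required parameters:
  kernel_size (or kernel_h and kernel_w): the size of the pooling window

Optional parameters:
  pool [default MAX]: the pooling method; currently MAX, AVE, and STOCHASTIC
  pad (or pad_h and pad_w) [default 0]: the number of pixels to add to each side of the input
  stride (or stride_h and stride_w) [default 1]: the step of the pooling window

Shape change through the pooling layer:
  Input: n * c_i * h_i * w_i
  Output: n * c_o * h_o * w_o, where h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, and w_o is computed in the same way. Pooling leaves the channel count unchanged, so c_o = c_i.

1.3 Local Response Normalization (LRN)

Type: LRN

Local Response Normalization normalizes over local input regions: an activation a is divided by a normalization weight (the denominator) to give a new activation b. It comes in two forms. In one, the local region spans adjacent channels (cross-channel LRN); in the other, it is a spatial region inside a single channel (within-channel LRN).

Formula: each input value is divided by (1 + (alpha / n) * sum_i x_i^2) ^ beta, where n is the size of the local region and the sum runs over that region.

Optional parameters:
  local_size [default 5]: for cross-channel LRN, the number of neighbouring channels to sum over; for within-channel LRN, the side length of the spatial region to sum over
  alpha [default 1]: the scaling parameter
  beta [default 0.75]: the exponent
  norm_region [default ACROSS_CHANNELS]: which form of LRN to use, ACROSS_CHANNELS or WITHIN_CHANNEL

2. Loss Layers

Deep learning drives learning by minimizing a loss that compares the output with the target.

2.1 Softmax

Type: SOFTMAX_LOSS

2.2 Sum-of-Squares / Euclidean

Type: EUCLIDEAN_LOSS

2.3 Hinge / Margin

Type: HINGE_LOSS

Example:

# L1 Norm
layers {
  name: "loss"
  type: HINGE_LOSS
  bottom: "pred"
  bottom: "label"
}

# L2 Norm
layers {
  name: "loss"
  type: HINGE_LOSS
  bottom: "pred"
  bottom: "label"
  top: "loss"
  hinge_loss_param {
    norm: L2
  }
}

Optional parameters:
  norm [default L1]: use the L1 or the L2 norm

Input:
  n * c * h * w Predictions
  n * 1 * 1 * 1 Labels
Output:
  1 * 1 * 1 * 1 Computed Loss

2.4 Sigmoid Cross-Entropy

Type: SIGMOID_CROSS_ENTROPY_LOSS

2.5 Infogain

Type: INFOGAIN_LOSS

2.6 Accuracy and Top-k

Type: ACCURACY

Computes the accuracy of the output against the target. It is not actually a loss and has no backward step.

3. Activation / Neuron Layers

In general, activation layers are element-wise operations, so the input and output have the same size. They are usually nonlinear functions.

3.1 ReLU / Rectified-Linear and Leaky-ReLU

Type: RELU

Example:

layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}

Optional parameters:
  negative_slope [default 0]: determines the output when the input value is less than zero

ReLU is currently the most widely used activation function, mainly because it converges faster while keeping the same accuracy. The standard ReLU function is max(x, 0). In general, the output is x when x > 0 and negative_slope * x when x <= 0. The RELU layer supports in-place computation, meaning the bottom and top blobs can be the same, which saves memory.

3.2 Sigmoid

Type: SIGMOID

Example:

layers {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: SIGMOID
}

The SIGMOID layer computes the output for each input x as sigmoid(x).

3.3 TanH / Hyperbolic Tangent

Type: TANH

Example:

layers {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: TANH
}

The TANH layer computes the output for each input x as tanh(x).

3.4 Absolute Value

Type: ABSVAL

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: ABSVAL
}

The ABSVAL layer computes the output for each input x as abs(x).

3.5 Power

Type: POWER

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: POWER
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}

Optional parameters:
  power [default 1]
  scale [default 1]
  shift [default 0]

The POWER layer computes the output for each input x as (shift + scale * x) ^ power.

3.6 BNLL

Type: BNLL

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: BNLL
}

The BNLL (binomial normal log likelihood) layer computes the output for each input x as log(1 + exp(x)).

4. Data Layers

Data enters Caffe through data layers, which sit at the bottom of the network. Data can come from efficient databases (LevelDB or LMDB), directly from memory, or, when efficiency is not critical, from files on disk in HDF5 or common image formats.

4.1 Database

Type: DATA

Required parameters:
  source: the name of the directory containing the database
  batch_size: the number of inputs to process at one time

Optional parameters:
  rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous stochastic gradient descent (SGD)
  backend [default LEVELDB]: whether to use LEVELDB or LMDB

4.2 In-Memory

Type: MEMORY_DATA

Required parameters:
  batch_size, channels, height, width: specify the size of the input chunks to read from memory

The memory data layer reads data directly from memory, without copying it. In order to use it, one must call MemoryDataLayer::Reset (from C++) or Net.set_input_arrays (from Python) in order to specify a source of contiguous data (as 4D row major array), which is read one batch-sized chunk at a time.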
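
To illustrate what "contiguous 4D row major array, read one batch-sized chunk at a time" means, here is a small pure-Python sketch of the index arithmetic. It does not call Caffe at all. The helper names row_major_offset and iter_batches are hypothetical, and unlike the real layer, the list slicing here copies the data.

```python
def row_major_offset(n, c, h, w, channels, height, width):
    """Flat index of element (n, c, h, w) in a contiguous row-major 4D array."""
    return ((n * channels + c) * height + h) * width + w


def iter_batches(flat_data, batch_size, channels, height, width):
    """Yield consecutive batch-sized chunks of the flat buffer.

    Note: list slicing copies; the real layer is described as reading
    without copying. This only illustrates the chunk boundaries.
    """
    chunk = batch_size * channels * height * width
    for start in range(0, len(flat_data) - chunk + 1, chunk):
        yield flat_data[start:start + chunk]


# Toy buffer: 4 items of shape 1 x 2 x 2, stored contiguously.
channels, height, width = 1, 2, 2
flat_data = list(range(4 * channels * height * width))
batches = list(iter_batches(flat_data, 2, channels, height, width))
print(len(batches), batches[0])
```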
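4.3 HDF5 Input

Type: HDF5_DATA

Required parameters:
  source: the name of the file to read from
  batch_size: the number of inputs to process at one time

4.4 HDF5 Output

Type: HDF5_OUTPUT

Required parameters:
  file_name: the name of the output file

The HDF5 output layer works in the opposite direction from the other layers in this section: it writes its input blobs to disk.

4.5 Images

Type: IMAGE_DATA

Required parameters:
  source: the name of a text file in which each line gives an image filename and its label
  batch_size: the number of images in one batch

Optional parameters:
  rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous stochastic gradient descent (SGD)
  shuffle [default false]
  new_height, new_width: resize all images to this size

4.6 Windows

Type: WINDOW_DATA

4.7 Dummy

Type: DUMMY_DATA

The Dummy layer is used for development and debugging. For its parameters, see DummyDataParameter.

5. Common Layers

5.1 Inner Product (fully connected layer)

Type: INNER_PRODUCT

Example:

layers {
  name: "fc8"
  type: INNER_PRODUCT
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}

Required parameters:
  num_output (c_o): the number of filters

Optional parameters:
  weight_filler [default type: 'constant' value: 0]: how the weights are initialized
  bias_filler: how the biases are initialized
  bias_term [default true]: whether to enable the bias term

Shape change through the fully connected layer:
  Input: n * c_i * h_i * w_i
  Output: n * c_o * 1 * 1

5.2 Splitting

Type: SPLIT

The SPLIT layer splits an input blob into multiple output blobs. It is used when one blob has to be fed into several layers.

5.3 Flattening

Type: FLATTEN

The FLATTEN layer turns an input of shape n * c * h * w into a simple vector of shape n * (c*h*w) * 1 * 1.

5.4 Concatenation

Type: CONCAT

Example:

layers {
  name: "concat"
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: CONCAT
  concat_param {
    concat_dim: 1
  }
}

Optional parameters:
  concat_dim [default 1]: 0 concatenates along num, 1 concatenates along channels

Shape change through the concatenation layer:
  Input: K blobs (k = 1 to K), each of shape n_i * c_i * h * w
  Output:
    concat_dim = 0: (n_1 + n_2 + ... + n_K) * c_1 * h * w, where all inputs must have the same c_i
    concat_dim = 1: n_1 * (c_1 + c_2 + ... + c_K) * h * w, where all inputs must have the same n_i

The CONCAT layer joins multiple blobs into a single blob.

5.5 Slicing

The SLICE layer is a utility layer that slices an input layer to multiple output layers along a given dimension (currently num or channel only) with given slice indices.

5.6 Elementwise Operations

Type: ELTWISE

5.7 Argmax

Type: ARGMAX

5.8 Softmax

Type: SOFTMAX

5.9 Mean-Variance Normalization

Type: MVN

6. Reference

Caffe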
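
Returning to the output-size formula given for the convolution and pooling layers in section 1, it can be checked with a few lines of Python. This is a sketch under stated assumptions: the division is treated as integer (floor) division, and the 227-pixel input side is an assumed value that does not appear in the text. The kernel and stride values are the ones from the conv1 and pool1 examples.

```python
def output_size(input_size, kernel_size, stride=1, pad=0):
    """h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1, with floor division."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


# conv1 from the example: 11x11 filters, stride 4, no padding.
# The 227-pixel input side is an assumption, not a value from the text.
conv1 = output_size(227, kernel_size=11, stride=4)

# pool1 from the example: 3x3 window, stride 2, applied to conv1's output.
pool1 = output_size(conv1, kernel_size=3, stride=2)

print(conv1, pool1)
```

Both sizes here divide evenly, so the choice of rounding does not affect the result.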

1. Download the ImageNet training and validation sets and store them in the following layout:

/path/to/imagenet/train/n01440764/n01440764_10026.JPEG

/path/to/imagenet/val/ILSVRC2012_val_00000001.JPEG

2. Run some preparation steps:

cd $CAFFE_ROOT/data/ilsvrc12/

./get_ilsvrc_aux.sh

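3. The training and validation data are listed in train.txt and val.txt respectively, which give each file together with its label.

4. Finally, the authors number the 1000 classes from 0 to 999; synset_words.txt stores the mapping between these indices and the class names.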
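
A minimal sketch of reading these files in Python. It assumes that the class index is the 0-based line number in synset_words.txt, that each line there is a synset ID followed by the class name (as in the classification output earlier), and that each line of train.txt or val.txt is a path followed by an integer label. The helper names and the sample lines are illustrative.

```python
def parse_synset_words(lines):
    """Map class index (0-based line order) to (synset_id, class_name)."""
    mapping = {}
    for index, line in enumerate(lines):
        synset_id, class_name = line.strip().split(" ", 1)
        mapping[index] = (synset_id, class_name)
    return mapping


def parse_label_line(line):
    """Split one 'path label' line from train.txt or val.txt."""
    path, label = line.rsplit(None, 1)
    return path, int(label)


# Illustrative sample lines, not the full files.
synset_lines = [
    "n01440764 tench, Tinca tinca",
    "n01443537 goldfish, Carassius auratus",
]
classes = parse_synset_words(synset_lines)
path, label = parse_label_line("n01440764/n01440764_10026.JPEG 0")
print(path, label, classes[label])
```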

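5. The authors discuss whether the images should first be resized to 256x256, and mention using MapReduce to speed this up.

It can also be done directly like this:

for name in /path/to/imagenet/val/*.JPEG; do

convert -resize 256x256\! $name $name

done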

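6. In create_imagenet.sh, set the parameters and specify the paths to the training and validation data. If the images were not resized to the same size beforehand, set "RESIZE=true". Setting "GLOG_logtostderr=1" prints more information to look at.

Running ./create_imagenet.sh generates new data files:

ilsvrc12_train_leveldb and ilsvrc12_val_leveldb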

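7. The model requires the image mean to be subtracted from every image, so the mean has to be computed first. The tool

tools/compute_image_mean.cpp implements this operation.

Alternatively, use the script

./make_imagenet_mean.sh to compute the image mean directly. It generates the file

data/ilsvrc12/imagenet_mean.binaryproto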
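
To show what "computing the image mean and subtracting it" amounts to, here is a toy pure-Python sketch on tiny single-channel images stored as nested lists. It is only an illustration of the idea: the real tool reads the database, handles multi-channel images, and writes a binaryproto file, none of which is reproduced here. The function names are hypothetical.

```python
def compute_mean_image(images):
    """Per-pixel mean over equally sized images (nested lists of numbers)."""
    count = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(image[y][x] for image in images) / count for x in range(width)]
        for y in range(height)
    ]


def subtract_mean(image, mean_image):
    """Subtract the mean image from one image, pixel by pixel."""
    return [
        [pixel - mean for pixel, mean in zip(row, mean_row)]
        for row, mean_row in zip(image, mean_image)
    ]


# Two toy 2x2 images with made-up pixel values.
images = [
    [[0, 2], [4, 6]],
    [[2, 4], [6, 8]],
]
mean_image = compute_mean_image(images)
centered = subtract_mean(images[0], mean_image)
print(mean_image, centered)
```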

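8. Define the network structure in imagenet_train_val.prototxt.

Two lines in it specify the paths to the database and to the image mean:

source: "ilsvrc12_train_leveldb"

mean_file: "../../data/ilsvrc12/imagenet_mean.binaryproto"

It also uses include { phase: TRAIN } or include { phase: TEST } to distinguish training from testing.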

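9. Differences in the input layer:

In the training network, the data layer reads from ilsvrc12_train_leveldb and applies random mirroring. In the test network, the data layer reads from ilsvrc12_val_leveldb and applies no random mirroring.

10. Differences in the output layer:

Both networks end in a softmax_loss layer. In the training network it computes the loss function and starts backpropagation. The test network also has a second output layer, accuracy, which reports the accuracy on the test set. During training, the test network is instantiated from time to time and evaluated, producing output lines such as Test score #0: xxx and Test score #1: xxx.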

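11. Run the network with the following settings:

Each batch holds 256 images, and training runs for 450000 iterations, close to 90 epochs.

Every 1000 iterations, the network is tested on the validation set.

The initial learning rate is 0.01, and it is decreased every 100000 iterations, which is roughly every 20 epochs.

Progress information is displayed every 20 iterations.

The network is trained with momentum 0.9 and a weight decay of 0.0005.

Every 10000 iterations, a snapshot of the current state is saved.

These settings live in examples/imagenet/imagenet_solver.prototxt, where the path to the network definition also has to be specified:

net: "imagenet_train_val.prototxt"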
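
The epoch figures above can be reproduced with a little arithmetic. The sketch below assumes 1281167 training images (the size of the ILSVRC2012 training set, which the text does not state) and a step schedule that multiplies the learning rate by 0.1 at each drop. The decay factor GAMMA is an assumption, since the text only says the rate is decreased.

```python
TRAIN_IMAGES = 1281167   # assumed size of the ILSVRC2012 training set
BATCH_SIZE = 256
MAX_ITER = 450000
STEP_SIZE = 100000
BASE_LR = 0.01
GAMMA = 0.1              # assumed decay factor; the text does not state it


def epochs(iterations):
    """Number of passes over the training set after the given iterations."""
    return iterations * BATCH_SIZE / TRAIN_IMAGES


def learning_rate(iteration):
    """Step policy: multiply the rate by GAMMA once per STEP_SIZE iterations."""
    return BASE_LR * GAMMA ** (iteration // STEP_SIZE)


print(round(epochs(MAX_ITER), 1))    # total epochs for the whole run
print(round(epochs(STEP_SIZE), 1))   # epochs between learning-rate drops
print(learning_rate(250000))         # rate after two drops
```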

12. Start training the network:

./train_imagenet.sh

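13. On a K20, every 20 iterations take about 36 s, so one full forward-plus-backward pass (FW+BW) takes roughly 7 ms per image. The forward pass takes about 2.5 ms, and the rest is the backward pass.

Timing can be inspected with examples/net_speed_benchmark.cpp.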
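
The per-image figure follows directly from the numbers above and the batch size of 256. A short sketch of the arithmetic, which also extrapolates the total training time under the assumption that this rate holds for all 450000 iterations:

```python
SECONDS_PER_20_ITERATIONS = 36.0
BATCH_SIZE = 256
FORWARD_MS_PER_IMAGE = 2.5
MAX_ITER = 450000

seconds_per_iteration = SECONDS_PER_20_ITERATIONS / 20
total_ms_per_image = seconds_per_iteration / BATCH_SIZE * 1000
backward_ms_per_image = total_ms_per_image - FORWARD_MS_PER_IMAGE

# Extrapolation only: assumes the same speed for the whole run.
training_days = MAX_ITER * seconds_per_iteration / 86400

print(round(total_ms_per_image, 2), round(backward_ms_per_image, 2))
print(round(training_days, 1))
```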

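14. Because snapshots have been saved, training can be resumed with

./resume_training.sh, where the file caffe_imagenet_train_1000.solverstate is the solver state snapshot that holds all the information needed to resume.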

15. In summary, Caffe makes it easy to set up different network structures through configuration files.

Sharing is welcome; when reposting, please credit the source: 内存溢出

Original article: http://outofmemory.cn/tougao/11307433.html
