Reading the KITTI Detector Code in USIP

1. Data Preparation (KITTI Dataset)

The code for this part is in USIP/data/kitti_detector_loader.py.

1.1 Data Loading

First, the loader reads the list of sequences, the list of folder paths, the number of scans in each sequence, and the prefix sums of those scan counts.

self.seq_list, self.folder_list, self.sample_num_list, self.accumulated_sample_num_list = make_dataset_kitti(root, mode, opt)

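Here, seq_list holds the sequence IDs, folder_list holds the folder path of each sequence, sample_num_list holds the number of scans in each sequence, and accumulated_sample_num_list holds the running totals (prefix sums) of sample_num_list.
The data are then read. Note that src and dst are both loaded with the same index: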

src_pc_np, src_sn_np, src_node_np = self.get_instance_unaugmented_np(index)
dst_pc_np, dst_sn_np, dst_node_np = self.get_instance_unaugmented_np(index)

Specifically, get_instance_unaugmented_np does the following:
(1) Convert the global index into a local index within one sequence:

for i, accumulated_sample_num in enumerate(self.accumulated_sample_num_list):
    if index < accumulated_sample_num:
        break
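folder = self.folder_list[i]  # folder path of the sequence that contains index
seq = self.seq_list[i]  # ID of the sequence that contains index
# position of the sample inside its sequence, i.e. the global index converted to a local index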
if i == 0:
    index_in_seq = index
else:
    index_in_seq = index - self.accumulated_sample_num_list[i-1]
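
The lookup above walks the prefix sums linearly. As a minimal sketch of the same mapping, the snippet below builds the prefix sums with itertools.accumulate and finds the sequence with a binary search. The function name locate_sample and the scan counts are made up for illustration and are not part of USIP.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_sample(index, sample_num_list):
    """Map a global sample index to (sequence position, index inside that sequence)."""
    accumulated = list(accumulate(sample_num_list))  # prefix sums, e.g. [3, 5, 9]
    if not 0 <= index < accumulated[-1]:
        raise IndexError("global index out of range")
    # Position of the first prefix sum that is strictly greater than index.
    i = bisect_right(accumulated, index)
    start = accumulated[i - 1] if i > 0 else 0
    return i, index - start


# Hypothetical scan counts for three sequences.
counts = [3, 2, 4]
print(locate_sample(0, counts))  # first scan of the first sequence
print(locate_sample(3, counts))  # first scan of the second sequence
print(locate_sample(8, counts))  # last scan of the third sequence
```

The binary search only matters when there are many sequences; for the handful of KITTI sequences the linear loop in the loader is just as good.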

(2) Filter the points by the configured radius threshold:

if self.opt.radius_threshold < 90:  # radius threshold
    # camera coordinate
    pc_xz_norm_np = np.linalg.norm(pc_np[:, [0, 2]], axis=1)  # radial distance of each point in the x-z plane
    pc_radius_mask_np = pc_xz_norm_np <= self.opt.radius_threshold
    pc_np = pc_np[pc_radius_mask_np, :]
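
Because the points are in camera coordinates, where the y axis is the vertical one, the norm over columns 0 and 2 is the horizontal range of each point. The standalone sketch below applies the same mask to a tiny hand-made array; the helper name filter_by_radius and the sample points are only for illustration.

```python
import numpy as np


def filter_by_radius(pc_np, radius_threshold):
    """Keep the rows whose distance from the origin in the x-z plane is within the threshold."""
    xz_norm = np.linalg.norm(pc_np[:, [0, 2]], axis=1)
    return pc_np[xz_norm <= radius_threshold]


pc = np.array([
    [3.0, 100.0, 4.0],  # x-z distance 5: kept, the large y value is ignored
    [30.0, 0.0, 40.0],  # x-z distance 50: removed
    [0.0, 1.0, 10.0],   # x-z distance 10: kept, the boundary is inclusive
])
filtered = filter_by_radius(pc, radius_threshold=10.0)
print(filtered.shape)
```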

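(3) Check the number of points: if the scan has at least the required number of input points, sample them randomly; otherwise, pad by duplicating points: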

# random sample: if the scan has at least the required number of input points, sample without replacement
if pc_np.shape[0] >= self.opt.input_pc_num:
    choice_idx = np.random.choice(pc_np.shape[0], self.opt.input_pc_num, replace=False)
else:  # otherwise, repeat the full index range until one more copy would reach the required number, then fill the remainder with a random sample
    fix_idx = np.asarray(range(pc_np.shape[0]))  # fixed indices
    while pc_np.shape[0] + fix_idx.shape[0] < self.opt.input_pc_num:
        fix_idx = np.concatenate((fix_idx, np.asarray(range(pc_np.shape[0]))), axis=0)
    random_idx = np.random.choice(pc_np.shape[0], self.opt.input_pc_num - fix_idx.shape[0], replace=False)
    choice_idx = np.concatenate((fix_idx, random_idx), axis=0)  # final indices of the input points
pc_np = pc_np[choice_idx, :]  # input points, input_pc_num x 8
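
To see what the padding branch produces, here is a small self-contained version of the index selection. It follows the logic of the excerpt, but uses a numpy Generator instead of the global np.random state and adds a guard for empty scans; both are choices made for this sketch, and the name fixed_size_indices is hypothetical.

```python
import numpy as np


def fixed_size_indices(num_points, target_num, rng):
    """Return target_num indices into a scan that has num_points points."""
    if num_points == 0:
        raise ValueError("cannot sample from an empty scan")
    if num_points >= target_num:
        return rng.choice(num_points, target_num, replace=False)
    all_idx = np.arange(num_points)
    fix_idx = all_idx
    # Append whole copies while one more copy still stays below the target.
    while num_points + fix_idx.shape[0] < target_num:
        fix_idx = np.concatenate((fix_idx, all_idx))
    # The remainder is at least 1 and at most num_points, so replace=False is safe.
    random_idx = rng.choice(num_points, target_num - fix_idx.shape[0], replace=False)
    return np.concatenate((fix_idx, random_idx))


rng = np.random.default_rng(0)
padded = fixed_size_indices(5, 12, rng)    # 5 points padded up to 12
sampled = fixed_size_indices(20, 12, rng)  # 20 points sampled down to 12
print(np.bincount(padded, minlength=5))
print(np.unique(sampled).size)
```

With 5 points and a target of 12, every point is used at least twice and two randomly chosen points are used a third time, so the duplication is as even as possible.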

(4) Run farthest point sampling on a randomly selected one third of the points to obtain the node coordinates:

node_np = self.farthest_sampler.sample(
    pc_np[np.random.choice(pc_np.shape[0], int(self.opt.input_pc_num/3), replace=False)],
    self.opt.node_num)
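
The implementation of self.farthest_sampler is not shown in this post. As a rough illustration of the technique, the following is a generic greedy farthest point sampling in numpy, run on a random third of a synthetic point cloud. The function name, the sizes and the random data are assumptions made for the example.

```python
import numpy as np


def farthest_point_sampling(points, num_samples, rng):
    """Greedy farthest point sampling on an (N, 3) array; returns (num_samples, 3)."""
    n = points.shape[0]
    selected = np.empty(num_samples, dtype=np.int64)
    selected[0] = rng.integers(n)  # random starting point
    # Squared distance from every point to its nearest already-selected point.
    min_dist = np.sum((points - points[selected[0]]) ** 2, axis=1)
    for i in range(1, num_samples):
        selected[i] = np.argmax(min_dist)  # the point farthest from the current selection
        new_dist = np.sum((points - points[selected[i]]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return points[selected]


rng = np.random.default_rng(0)
pc = rng.normal(size=(3000, 3))  # synthetic stand-in for one scan
subset = pc[rng.choice(pc.shape[0], pc.shape[0] // 3, replace=False)]
nodes = farthest_point_sampling(subset, num_samples=64, rng=rng)
print(nodes.shape)
```

Each iteration costs one pass over the candidate points, so running the sampler on a third of the points presumably serves to cut that cost while still covering the scan well.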

(5) Return the point coordinates, the surface normals plus curvature, and the node coordinates.

1.2 Data Augmentation

Once loaded, the data are augmented, mainly with rotation, scaling and translation.

if self.mode == 'train':  # data augmentation
    [[src_pc_np, src_sn_np, src_node_np], [dst_pc_np, dst_sn_np, dst_node_np]] = self.augment(
        [[src_pc_np, src_sn_np, src_node_np], [dst_pc_np, dst_sn_np, dst_node_np]])

1.3 Point Cloud Transformation

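dst is transformed with a randomly generated rotation, scale and translation. This yields the new dst point cloud together with the corresponding rotation matrix, scale factor and translation vector.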

dst_pc, dst_sn, dst_node, R, scale, shift = transform_pc_pytorch(
    dst_pc, dst_sn, dst_node,
    rot_type=rot_type, scale_thre=0, shift_thre=0.5,
    rot_perturbation=rot_perturbation)

Finally, the loader returns the src and dst point cloud data together with the transformation between them.

return src_pc, src_sn, src_node, dst_pc, dst_sn, dst_node, R, scale, shift
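
The body of transform_pc_pytorch is not reproduced here, so the following is only a sketch of the idea in numpy, under several assumptions: arrays are laid out as 3 x N, the rotation is about the y axis (taken as the vertical axis in camera coordinates), scaling is switched off, and the shift is uniform in [-shift_thre, shift_thre]. The function name random_transform is hypothetical.

```python
import numpy as np


def random_transform(pc, sn, node, shift_thre, rng):
    """Apply one random rotation and shift to points (3, N), normals (3, N) and nodes (3, M)."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    # Rotation about the y axis (an assumption of this sketch).
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])
    scale = 1.0  # no scaling in this sketch
    shift = rng.uniform(-shift_thre, shift_thre, size=(3, 1))
    pc_t = scale * (R @ pc) + shift
    node_t = scale * (R @ node) + shift
    sn_t = R @ sn  # normals are directions: rotated, but neither scaled nor shifted
    return pc_t, sn_t, node_t, R, scale, shift


rng = np.random.default_rng(0)
pc = rng.normal(size=(3, 100))
sn = pc / np.linalg.norm(pc, axis=0, keepdims=True)  # fake unit normals
node = pc[:, :8]
pc_t, sn_t, node_t, R, scale, shift = random_transform(pc, sn, node, shift_thre=0.5, rng=rng)

# Undoing the transform recovers the original points.
recovered = R.T @ ((pc_t - shift) / scale)
print(np.allclose(recovered, pc))
```

Returning R, scale and shift alongside the transformed cloud is what later allows the src keypoints to be mapped into the dst frame for the loss.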

2. Network Model

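The code for this part is mainly in USIP/models/keypoint_detector.py. It has two main functions, set_input and optimize. set_input mostly moves the data to the GPU and is not discussed further. optimize contains the training procedure and is described in detail below.
Because src and dst share one network for keypoint detection, the two are first concatenated along the batch dimension and passed to the network together.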

node_recomputed, keypoints, sigmas, descriptors = self.detector(
    torch.cat(pc_tuple, dim=0), torch.cat(sn_tuple, dim=0), torch.cat(node_tuple, dim=0),
    is_train, epoch)

self.detector is the RPN_Detector defined in USIP/models/network.py. Its forward function does the following:
(1) Build the correspondence between nodes and points.

mask, mask_row_max, min_idx = som.query_topk(node, x, node.size()[2], k=self.opt.k)
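
query_topk itself is not reproduced in this post. To make the idea of the node-point correspondence concrete, the sketch below assigns each point to its k nearest nodes with plain numpy and builds a boolean mask from the result. It is a simplified stand-in, not the actual som.query_topk, and the names nearest_nodes and assignment_mask are made up.

```python
import numpy as np


def nearest_nodes(points, nodes, k):
    """points: (3, N), nodes: (3, M). Returns (N, k) indices of the k nearest nodes per point."""
    diff = points[:, :, None] - nodes[:, None, :]  # (3, N, M)
    dist2 = np.sum(diff ** 2, axis=0)              # (N, M) squared distances
    return np.argsort(dist2, axis=1)[:, :k]


def assignment_mask(idx, num_nodes):
    """Boolean (N, M) mask: mask[n, m] is True when node m is among point n's k nearest nodes."""
    mask = np.zeros((idx.shape[0], num_nodes), dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask


nodes = np.array([[0.0, 10.0],
                  [0.0, 0.0],
                  [0.0, 0.0]])        # two nodes on the x axis, at x = 0 and x = 10
points = np.array([[1.0, 2.0, 9.0],
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])  # three points on the x axis
idx = nearest_nodes(points, nodes, k=1)
mask = assignment_mask(idx, num_nodes=nodes.shape[1])
print(idx.ravel())  # the first two points go to node 0, the third to node 1
```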

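query_topk is defined in USIP/utils/som.py. Its role was explained in the earlier post on the classifier implementation in SO-Net and is not repeated here.
(2) PointNet feature extraction
The code for this part is in USIP/models/layers.py. first_pointnet is an MLP with output channels (64, 64, 64). After the per-point features are computed, the features of the points assigned to each node are max-pooled. This is done by a CUDA extension, which was also covered in the earlier post on the function and implementation of index_max in SO-Net. The max-pooled features serve as local features and are concatenated with the per-point features before entering the second PointNet. second_pointnet is an MLP with output channels (128, 128), again followed by max pooling, which yields one feature vector per node.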
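
The per-node max pooling and the concatenation with the per-point features can be imitated on the CPU with numpy. This is only an illustration of what the CUDA extension computes, under the assumptions that each point belongs to exactly one node and that empty nodes are filled with zeros; the helper name max_pool_by_node is hypothetical.

```python
import numpy as np


def max_pool_by_node(features, node_idx, num_nodes):
    """features: (C, N); node_idx: (N,) node index of each point. Returns (C, M) per-node maxima."""
    pooled = np.full((features.shape[0], num_nodes), -np.inf)
    # Unbuffered in-place maximum: column n of features is folded into column node_idx[n].
    np.maximum.at(pooled, (slice(None), node_idx), features)
    pooled[np.isinf(pooled)] = 0.0  # nodes without points get zeros (a choice made here)
    return pooled


features = np.array([[1.0, 5.0, 2.0, 7.0],
                     [0.0, -1.0, 3.0, 4.0]])  # C = 2 channels, N = 4 points
node_idx = np.array([0, 0, 1, 1])             # points 0, 1 -> node 0; points 2, 3 -> node 1
pooled = max_pool_by_node(features, node_idx, num_nodes=3)

# Send each node feature back to its points and concatenate with the per-point features.
gathered = pooled[:, node_idx]                            # (C, N)
augmented = np.concatenate((features, gathered), axis=0)  # (2C, N), input of the second PointNet
print(pooled)
print(augmented.shape)
```
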
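(3) KNN feature extraction
The code for this part is in GeneralKNNFusionModule in USIP/models/layers.py. First, the coordinates and features of each node's k nearest neighbouring nodes are gathered, the coordinates are expressed relative to a centre, and the two are concatenated: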

coordinate_tensor = coordinate.data  # Bx3xM
if precomputed_knn_I is not None:
    assert precomputed_knn_I.size()[2] >= K
    knn_I = precomputed_knn_I[:, :, 0:K]
else:
    coordinate_Mx1 = coordinate_tensor.unsqueeze(3)  # Bx3xMx1
    coordinate_1xM = coordinate_tensor.unsqueeze(2)  # Bx3x1xM
    norm = torch.sum((coordinate_Mx1 - coordinate_1xM) ** 2, dim=1)  # BxMxM, each row corresponds to each coordinate - other coordinates
    knn_D, knn_I = torch.topk(norm, k=K, dim=2, largest=False, sorted=True)  # BxMxK

neighbors = operations.knn_gather_wrapper(coordinate_tensor, knn_I)  # Bx3xMxK
if center_type == 'avg':
    neighbors_center = torch.mean(neighbors, dim=3, keepdim=True)  # Bx3xMx1
elif center_type == 'center':
    neighbors_center = coordinate_tensor.unsqueeze(3)  # Bx3xMx1
neighbors_decentered = (neighbors - neighbors_center).detach()
neighbors_center = neighbors_center.squeeze(3).detach()
x_neighbors = operations.knn_gather_by_indexing(x, knn_I)  # BxCxMxK
x_augmented = torch.cat((neighbors_decentered, x_neighbors), dim=1)  # Bx(3+C)xMxK
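
The tensor shapes in this step are easy to lose track of, so here is a numpy re-creation of the neighbour search and the decentering for the 'center' case. It is a sketch for checking shapes, not the module itself; the gather is done with advanced indexing instead of the knn_gather helpers, and the name knn_decentered is made up.

```python
import numpy as np


def knn_decentered(coordinate, K):
    """coordinate: (B, 3, M). Returns knn_I (B, M, K) and decentered neighbours (B, 3, M, K)."""
    diff = coordinate[:, :, :, None] - coordinate[:, :, None, :]  # (B, 3, M, M)
    dist2 = np.sum(diff ** 2, axis=1)                              # (B, M, M)
    knn_I = np.argsort(dist2, axis=2, kind="stable")[:, :, :K]     # (B, M, K), nearest first
    batch_idx = np.arange(coordinate.shape[0])[:, None, None]      # (B, 1, 1)
    # Advanced indexing gives (B, M, K, 3); move the coordinate axis back to position 1.
    neighbors = coordinate.transpose(0, 2, 1)[batch_idx, knn_I]
    neighbors = neighbors.transpose(0, 3, 1, 2)                    # (B, 3, M, K)
    center = coordinate[:, :, :, None]                             # (B, 3, M, 1), 'center' mode
    return knn_I, neighbors - center


rng = np.random.default_rng(0)
coordinate = rng.normal(size=(2, 3, 6))  # B = 2 clouds with M = 6 nodes each
knn_I, decentered = knn_decentered(coordinate, K=3)
print(knn_I.shape, decentered.shape)
```

Since every node is its own nearest neighbour, the first neighbour slot holds a zero offset in 'center' mode.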

A 2D convolutional network followed by max pooling over the K neighbours then produces the new KNN features:

for layer in self.layers_before:
    x_augmented = layer(x_augmented, epoch)
feature, _ = torch.max(x_augmented, dim=3, keepdim=True)  # BxCxMx1

y = torch.cat((feature.expand_as(x_augmented), x_augmented), dim=1)  # Bx2CxMxK
for layer in self.layers_after:
    y = layer(y, epoch)
feature, _ = torch.max(y, dim=3, keepdim=False)  # BxCxM
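
The pool, expand and concatenate pattern in the middle of this excerpt can be checked in isolation. The snippet below repeats it in numpy on random data and leaves out the learned layers in layers_before and layers_after, so it only shows the shape bookkeeping. The sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
B, C, M, K = 2, 4, 5, 3
x_augmented = rng.normal(size=(B, C, M, K))

feature = x_augmented.max(axis=3, keepdims=True)        # (B, C, M, 1): max over the K neighbours
expanded = np.broadcast_to(feature, x_augmented.shape)  # (B, C, M, K): same value for every neighbour
y = np.concatenate((expanded, x_augmented), axis=1)     # (B, 2C, M, K)
pooled = y.max(axis=3)                                  # (B, 2C, M)
print(y.shape, pooled.shape)
```

Without the layers in between, the two halves of pooled are identical, which shows that the second group of layers is what turns the pooled context into new information for each neighbour.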

The KNN features are then concatenated with the PointNet features:

node_feature_aggregated = torch.cat((second_pn_out_masked_max, knn_feature_1), dim=1)

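(4) Keypoint generation
The concatenated features are passed through MLPs, which output the 3D coordinates of M keypoints together with one σ value per keypoint: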

y = self.mlp1(node_feature_aggregated)
point_descriptor = self.mlp2(y)
keypoint_sigma = self.mlp3(point_descriptor)  # Bx(3+1)xM
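
How the Bx(3+1)xM output is split is not shown in the excerpt. The sketch below is one plausible reading, stated as an assumption rather than as the repository's code: the first three channels are treated as an offset added to the node coordinates, and the last channel goes through a softplus so that σ stays positive. The function name and the small constant 1e-3 are hypothetical.

```python
import numpy as np


def split_keypoint_sigma(keypoint_sigma, node):
    """keypoint_sigma: (B, 4, M); node: (B, 3, M). Returns keypoints (B, 3, M) and sigmas (B, M)."""
    offset = keypoint_sigma[:, 0:3, :]
    raw_sigma = keypoint_sigma[:, 3, :]
    keypoints = node + offset                     # assumption: the network predicts an offset from the node
    sigmas = np.logaddexp(0.0, raw_sigma) + 1e-3  # softplus, log(1 + exp(x)), keeps sigma positive (assumption)
    return keypoints, sigmas


rng = np.random.default_rng(0)
node = rng.normal(size=(2, 3, 8))
keypoint_sigma = rng.normal(size=(2, 4, 8))
keypoints, sigmas = split_keypoint_sigma(keypoint_sigma, node)
print(keypoints.shape, sigmas.shape, bool(sigmas.min() > 0))
```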

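(5) Loss computation
The keypoints generated for src are mapped into the dst frame with the known rotation, scale and shift, and the loss is computed between the transformed src keypoints and the dst keypoints: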

self.src_keypoints_transformed = torch.matmul(self.src_R_dst, self.src_keypoints)  # Bx3x3 * Bx3xM -> Bx3xM
self.src_keypoints_transformed = self.src_keypoints_transformed * self.src_scale_dst.unsqueeze(1).unsqueeze(2)  # Bx3xM * Bx1x1 -> Bx3xM
self.src_keypoints_transformed = self.src_keypoints_transformed + self.src_shift_dst  # Bx3xM + Bx3x1 -> Bx3xM
self.loss_chamfer, self.chamfer_pure, self.chamfer_weighted = self.chamfer_criteria(
    self.src_keypoints_transformed, self.dst_keypoints, self.src_sigmas, self.dst_sigmas)
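
The three transformation lines and a plain Chamfer distance can be reproduced in numpy to check the broadcasting. This is a sketch of the standard, unweighted Chamfer distance and not the contents of chamfer_criteria; the function names and the test data are made up.

```python
import numpy as np


def transform_keypoints(keypoints, R, scale, shift):
    """keypoints: (B, 3, M); R: (B, 3, 3); scale: (B,); shift: (B, 3, 1)."""
    out = np.matmul(R, keypoints)     # (B, 3, 3) @ (B, 3, M) -> (B, 3, M)
    out = out * scale[:, None, None]  # (B, 3, M) * (B, 1, 1)
    return out + shift                # (B, 3, M) + (B, 3, 1)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between (B, 3, M) and (B, 3, N) point sets, one value per batch item."""
    diff = a[:, :, :, None] - b[:, :, None, :]  # (B, 3, M, N)
    dist = np.sqrt(np.sum(diff ** 2, axis=1))   # (B, M, N)
    a_to_b = dist.min(axis=2).mean(axis=1)      # nearest b for every a
    b_to_a = dist.min(axis=1).mean(axis=1)      # nearest a for every b
    return a_to_b + b_to_a


rng = np.random.default_rng(0)
src = rng.normal(size=(2, 3, 16))
c, s = np.cos(0.3), np.sin(0.3)
R = np.tile(np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]), (2, 1, 1))
scale = np.ones(2)
shift = rng.uniform(-0.5, 0.5, size=(2, 3, 1))

# A perfectly repeatable detector would output dst keypoints equal to the transformed src keypoints.
dst = transform_keypoints(src, R, scale, shift)
print(chamfer_distance(transform_keypoints(src, R, scale, shift), dst))  # zero for both batch items
print(chamfer_distance(src, dst))  # non-zero when the transform is ignored
```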

The loss functions are in USIP/models/losses.py. The loss has three parts: the probabilistic chamfer loss described in the paper, the original chamfer loss, and a weighted chamfer loss.
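
For orientation, here is a minimal numpy sketch of a probabilistic Chamfer loss, written from my understanding of the formulation in the paper and not taken from losses.py. It assumes that each nearest-neighbour pair with distance d gets the averaged sigma of its two keypoints and contributes ln(sigma) + d / sigma, and that the terms are averaged rather than summed. Treat the details as assumptions and compare with the repository before relying on them.

```python
import numpy as np


def probabilistic_chamfer(src_kp, dst_kp, src_sigma, dst_sigma):
    """src_kp: (3, M), dst_kp: (3, N); src_sigma: (M,), dst_sigma: (N,), all sigmas positive."""
    diff = src_kp[:, :, None] - dst_kp[:, None, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=0))  # (M, N)
    j = dist.argmin(axis=1)                    # nearest dst keypoint for each src keypoint
    i = dist.argmin(axis=0)                    # nearest src keypoint for each dst keypoint
    d_fwd = dist[np.arange(dist.shape[0]), j]
    d_bwd = dist[i, np.arange(dist.shape[1])]
    s_fwd = 0.5 * (src_sigma + dst_sigma[j])   # averaged sigma of each matched pair
    s_bwd = 0.5 * (dst_sigma + src_sigma[i])
    forward = np.mean(np.log(s_fwd) + d_fwd / s_fwd)
    backward = np.mean(np.log(s_bwd) + d_bwd / s_bwd)
    return forward + backward


src_kp = np.array([[0.0, 10.0, 20.0],
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
dst_kp = src_kp + np.array([[1.0], [0.0], [0.0]])  # every keypoint is off by exactly 1
ones = np.ones(3)
print(probabilistic_chamfer(src_kp, dst_kp, ones, ones))
print(probabilistic_chamfer(src_kp, dst_kp, 4.0 * ones, 4.0 * ones))
```

A larger sigma shrinks the distance term but is penalised by the logarithm, so the network cannot lower the loss simply by declaring every keypoint uncertain.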

Original article: http://outofmemory.cn/zaji/5495847.html