深度学习里常见的debug以及调试_java

常见的debug以及调试

**
1.错误一：RuntimeError:Input Type(torch.cuda.FloatTensor) and weight type(torch.FloatTensor) should be the same

class GCN(nn.Module):
    def __init__(self, num_classes, k=15):
        super(GCN, self).__init__()
        self.num_class = num_classes
        self.k = k

        self.layer0 = nn.Sequential(resnet152_pretrained.conv1, resnet152_pretrained.bn1, resnet152_pretrained.relu)
        self.layer1 = nn.Sequential(resnet152_pretrained.maxpool, resnet152_pretrained.layer1)
        self.layer2 = resnet152_pretrained.layer2
        self.layer3 = resnet152_pretrained.layer3
        self.layer4 = resnet152_pretrained.layer4

        self.br = BR(self.num_class)
        self.deconv = nn.ConvTranspose2d(self.num_class, self.num_class, 4, 2, 1, bias=False)

    def forward(self, input):
        x0 = self.layer0(input); print('x0:', x0.size())    # x0: torch.Size([1, 64, 176, 240])
        x1 = self.layer1(x0); print('x1:', x1.size())       # x1: torch.Size([1, 256, 88, 120])
        x2 = self.layer2(x1); print('x2:', x2.size())       # x2: torch.Size([1, 512, 44, 60])
        x3 = self.layer3(x2); print('x3:', x3.size())       # x3: torch.Size([1, 1024, 22, 30])
        x4 = self.layer4(x3); print('x4:', x4.size())       # x4: torch.Size([1, 2048, 11, 15])

        branch4 = GCN_BR_BR_Deconv(x4.shape[1], self.num_class, self.k)
        branch3 = GCN_BR_BR_Deconv(x3.shape[1], self.num_class, self.k)
        branch2 = GCN_BR_BR_Deconv(x2.shape[1], self.num_class, self.k)
        branch1 = GCN_BR_BR_Deconv(x1.shape[1], self.num_class, self.k)

        branch4 = branch4(x4); print('branch4:', branch4.size())    # torch.Size([1, 12, 22, 30])
        branch3 = branch3(x3, branch4); print('branch3:', branch3.size())   # torch.Size([1, 12, 44, 60])
        branch2 = branch2(x2, branch3); print('branch2:', branch2.size())   # torch.Size([1, 12, 88, 120])
        branch1 = branch1(x1, branch2); print('branch1:', branch1.size())   # torch.Size([1, 12, 176, 240])

        x = self.br(branch1)
        x = self.deconv(x)
        x = self.br(x)

        return x

现象描述；单纯对模型调试是显示正常的，但是在train的过程中，出现上面的错误。
搜查问题时：模型一些参数没有挂在GPU上，但是自己已经把模型to(‘‘cuda’’)上了，数据也放在gpu上了，于是不知道怎么解决，其实这个是模型定义的原因导致的。模型在前向传播时候

branch4 = GCN_BR_BR_Deconv(x4.shape[1], self.num_class, self.k)
        branch3 = GCN_BR_BR_Deconv(x3.shape[1], self.num_class, self.k)
        branch2 = GCN_BR_BR_Deconv(x2.shape[1], self.num_class, self.k)
        branch1 = GCN_BR_BR_Deconv(x1.shape[1], self.num_class, self.k)

没有调用了class类里面的定义，就是在self定义的过程中，没有定义，而是在前向传播中直接调用了。才会导致这个错误。
在正确的写法

问题解决：写的比较随意

欢迎分享，转载请注明来源：内存溢出

原文地址: http://outofmemory.cn/langs/736396.html

深度学习里常见的debug以及调试

发表评论

评论列表（0条）