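Combining the ideas of DGCNN and GAT
Problem: address feature contamination in fine-grained point cloud classification based on graph convolution, that is, reduce how much points of a different class contaminate the features of a source point.
Approach: dynamic graph convolutional networks are known to extract semantic features from point clouds well, but they treat all points as roughly equally important. Fine-grained classification depends heavily on local geometric features, so the key task is to aggregate features from the points that matter to the source point while keeping out the features of unimportant points.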
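Related methods:
- DGCNN has two core ideas: the dynamic graph and edge convolution (EdgeConv). The core of EdgeConv is to use the combination [source point feature, neighbor feature - source point feature], i.e. [x_i, x_j - x_i], as the new feature of a neighbor.
- Graph Attention Convolution for Point Cloud Semantic Segmentation: a channel-wise graph attention mechanism. The core is to combine the difference of the raw coordinates with the difference of the hidden features between source point and neighbor as a point's new feature, and to use the same combination to drive channel-wise attention. For each source point, every neighbor has a different attention coefficient for every channel, so each channel of a point's output is the result of aggregating neighbor information with those coefficients.
- Graph Attention Network: for the neighbors of each point, an MLP builds the attention mechanism. The similarity between two hidden features determines the size of the attention coefficient, so this is node-level attention.
For point cloud classification, can items 1 and 3 (DGCNN and GAT) be combined, so that attention measures similarity not only in the hidden feature space but also in the raw 3D space, and also takes differences into account? That is, obtain the attention coefficient from [p, h, ], using the raw 3D coordinates and the differences. The main question is how to combine these; that is the experimental part.
- One issue: attention mechanisms are all built on comparing similarity, so the source feature and the neighbor feature that go into the attention mechanism should be of the same kind. How can this common feature be constructed?
- The edge feature [x_i, x_j - x_i] represents the source point well, but for the attention mechanism, how is this feature computed for the neighbor? GAT simply compares the hidden features for similarity, which is not well suited to data with geometric structure such as point clouds, so how can the raw features of the point cloud be brought in?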
- Build the attention mechanism on forms such as (p_i, h(x_i), p_j, h(x_j))?
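The edge feature and the candidate attention input discussed in this list can be sketched in NumPy as follows. This is only a sketch under assumptions: the function names `knn_indices` and `edge_inputs` are hypothetical, the kNN search is brute force, and concatenating [p_j - p_i, h_j - h_i] is one possible combination, not a settled design.

```python
import numpy as np

def knn_indices(feats, k):
    """Brute-force k nearest neighbours, excluding the point itself."""
    # Pairwise squared distances, shape [N, N].
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)           # never pick the point itself
    return np.argsort(d2, axis=1)[:, :k]   # shape [N, k]

def edge_inputs(pos, h, k):
    """Build EdgeConv-style edge features and a candidate attention input."""
    idx = knn_indices(h, k)                # dynamic graph: neighbours in feature space
    h_i = np.repeat(h[:, None, :], k, axis=1)     # [N, k, C]
    h_j = h[idx]                                  # [N, k, C]
    p_i = np.repeat(pos[:, None, :], k, axis=1)   # [N, k, 3]
    p_j = pos[idx]                                # [N, k, 3]
    # EdgeConv input: [x_i, x_j - x_i]
    edge_feat = np.concatenate([h_i, h_j - h_i], axis=-1)        # [N, k, 2C]
    # Assumed attention input: coordinate difference plus feature difference
    att_input = np.concatenate([p_j - p_i, h_j - h_i], axis=-1)  # [N, k, 3 + C]
    return idx, edge_feat, att_input

rng = np.random.default_rng(0)
pos = rng.normal(size=(16, 3))   # raw coordinates p
h = rng.normal(size=(16, 8))     # hidden features h(x)
idx, edge_feat, att_input = edge_inputs(pos, h, k=4)
```

Because both parts of `att_input` are differences relative to the source point, the source and the neighbor are described in the same frame, which is one way to approach the "same feature" question raised above.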
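A pure PyTorch implementation is hard to extend and tedious to write; PyTorch Geometric (PyG) is more convenient.
Below is the PyG implementation.
Implementation process
Computing the attention coefficients:
Use x_i and x_j - x_i to compute a similarity coefficient. [There are many possible ways to compute this similarity; it should be possible to find a better one.]
Then use a two-layer MLP (in -> out -> out) to map
[x_i | x_j - x_i] to a higher dimension, and aggregate the features using the similarity coefficients.
(The bias has not been added yet.)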
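The steps above can be sketched in NumPy as follows. This is an illustration under assumptions, not the code used in these notes: the names `neighbour_softmax` and `attention_aggregate` are hypothetical, the similarity is a single learned vector `a` applied to the edge feature followed by LeakyReLU (one of many possible choices), and the weights are random.

```python
import numpy as np

def neighbour_softmax(scores):
    """Numerically stable softmax over the neighbour axis (last axis)."""
    scores = scores - scores.max(axis=-1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=-1, keepdims=True)

def attention_aggregate(x, idx, a, w1, w2):
    """x: [N, C] features, idx: [N, k] neighbour indices."""
    x_i = np.repeat(x[:, None, :], idx.shape[1], axis=1)   # [N, k, C]
    x_j = x[idx]                                           # [N, k, C]
    edge = np.concatenate([x_i, x_j - x_i], axis=-1)       # [N, k, 2C]
    # Similarity score per edge, then LeakyReLU with slope 0.2.
    s = edge @ a                                           # [N, k]
    s = np.where(s > 0, s, 0.2 * s)
    alpha = neighbour_softmax(s)                           # rows sum to 1
    # Two-layer MLP (in -> out -> out), no bias, ReLU in between.
    msg = np.maximum(edge @ w1, 0.0) @ w2                  # [N, k, out]
    # Attention-weighted sum over the k neighbours.
    return (alpha[..., None] * msg).sum(axis=1), alpha

rng = np.random.default_rng(0)
N, C, k, out = 10, 4, 3, 6
x = rng.normal(size=(N, C))
idx = (np.arange(N)[:, None] + np.arange(1, k + 1)) % N   # toy neighbour lists
a = rng.normal(size=2 * C)
w1 = rng.normal(size=(2 * C, out))
w2 = rng.normal(size=(out, out))
agg, alpha = attention_aggregate(x, idx, a, w1, w2)
```

Swapping the line that computes `s` is enough to try a different similarity measure while keeping the rest of the pipeline fixed.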
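Finally, max and mean pooling give the graph embedding of the point cloud.
Could this embedding be obtained through attention instead, by constructing a central node,
for example the mean, and computing attention against it?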
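To make the question concrete, here is a NumPy sketch of both readouts. The attention variant is purely an assumption for illustration: it uses the mean feature as a virtual central node and scores each point by a scaled dot product with it. The names `max_mean_readout` and `attention_readout` are hypothetical.

```python
import numpy as np

def max_mean_readout(x):
    """Concatenate max- and mean-pooled features: [N, C] -> [2C]."""
    return np.concatenate([x.max(axis=0), x.mean(axis=0)])

def attention_readout(x):
    """Weight every point by its similarity to a virtual centre node (the mean)."""
    center = x.mean(axis=0)                       # [C]
    scores = x @ center / np.sqrt(x.shape[1])     # [N], scaled dot product
    scores = scores - scores.max()                # numerical stability
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum()                   # softmax over all points
    return alpha @ x, alpha                       # [C], [N]

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))                      # per-point features of one cloud
g_pool = max_mean_readout(x)
g_att, alpha = attention_readout(x)
```

Whether such a readout helps over max and mean pooling is an open question here and would need its own experiment.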
Setting up PyG: the main requirement is a VC++ 14 compiler, so that some of the dependencies can be compiled.
Then git clone the PyG package and install it.
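    # Save the entire network
    torch.save(net, PATH)
    # Save only the network's parameters: faster and smaller on disk
    torch.save(net.state_dict(), PATH)
    # --------------------------------------------------
    # The matching ways to load each of the above:
    model = torch.load(PATH)
    model.load_state_dict(torch.load(PATH))  # loads in place; `model` must already be constructed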
Source: https://zhuanlan.zhihu.com/p/38056115
There is a bug on Windows: with num_workers > 0 the DataLoader always raises an error and cannot run.
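ModelNet40:
Experiments:
Can the original network structure with SGD reach 92.9 when every setting matches the paper?
The network structure is mainly 64, 64, 128, 256 -> concatenated to 512 dimensions
Then 512 dimensions -> 1024
Two different pools: max and avg
This gives a 2048-dimensional vector
Passing it through 2048 -> 512 -> 256 -> out gives the output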
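The dimensions listed above can be traced with a small NumPy sketch. It only checks shapes: the helper `linear` is hypothetical, the weights are random, and the four inputs are stand-ins for the EdgeConv layer outputs rather than real features.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, out_dim):
    """Random bias-free linear layer followed by ReLU, used only to trace shapes."""
    w = rng.normal(scale=0.1, size=(x.shape[-1], out_dim))
    return np.maximum(x @ w, 0.0)

n_points, n_classes = 1024, 40
# Stand-ins for the outputs of the four EdgeConv layers.
layer_outs = [rng.normal(size=(n_points, c)) for c in (64, 64, 128, 256)]
cat = np.concatenate(layer_outs, axis=1)                        # [1024, 512]
emb = linear(cat, 1024)                                         # [1024, 1024]
pooled = np.concatenate([emb.max(axis=0), emb.mean(axis=0)])    # [2048]
hidden = linear(linear(pooled, 512), 256)                       # [256]
logits = hidden @ rng.normal(scale=0.1, size=(256, n_classes))  # [40]
```

Note that the `Net` class recorded in these notes differs from this layout: it uses 64, 64, 64, 128 channels (320 after concatenation) and max pooling only.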
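Experiments:
ModelNet10:
ModelNet40:
Original paper: 92.9, but my best reproduction so far is 92.0%
(My reproduction uses no data augmentation; I have not yet figured out how data augmentation works in PyG.)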
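As a starting point for the missing augmentation, here is a library-independent NumPy sketch of a common point cloud augmentation: random per-axis scaling, a random shift, and clipped Gaussian jitter. The function name `augment` and every parameter value are assumptions for illustration; they are not taken from these notes and have not been checked against the paper's settings. PyG also ships its own transforms in `torch_geometric.transforms`, which are not shown here.

```python
import numpy as np

def augment(points, rng, scale_range=(2 / 3, 3 / 2), shift=0.2,
            sigma=0.01, clip=0.02):
    """Return an augmented copy of a [N, 3] point cloud (input is not modified)."""
    scale = rng.uniform(*scale_range, size=(1, 3))    # per-axis scaling
    offset = rng.uniform(-shift, shift, size=(1, 3))  # per-axis translation
    # Small per-point Gaussian jitter, clipped to [-clip, clip].
    noise = np.clip(sigma * rng.normal(size=points.shape), -clip, clip)
    return points * scale + offset + noise

rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1024, 3))
original = cloud.copy()
augmented = augment(cloud, rng)
```

Such a function would be applied per sample at training time only, so the test set stays unchanged.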
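Reproduction with the original GAT added:
A GAT layer is added after every layer; the batch size has to be changed to 16
Results with the improved GAT:
Experiment log: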
class Net(torch.nn.Module):
    def __init__(self, out_channels, k=20, aggr='max'):
        super().__init__()
        self.edgeconv1 = DynamicEdgeConv(MLP([2 * 3, 64]), k, aggr)
        self.edgeconv2 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv3 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv4 = DynamicEdgeConv(MLP([2 * 64, 128]), k, aggr)
        self.pointGATconv1 = PointGATConv(64, 64)
        self.pointGATconv2 = PointGATConv(64, 64)
        self.pointGATconv3 = PointGATConv(64, 64)
        self.pointGATconv4 = PointGATConv(128, 128)
        self.lin1 = MLP([128 + 64 + 64 + 64, 1024])
        self.mlp = Seq(
            MLP([1024, 512]), Dropout(0.5), MLP([512, 256]), Dropout(0.5),
            Lin(256, out_channels))

    def forward(self, data):
        pos, batch = data.pos, data.batch
        # print(batch[0], batch[sample_points-1], batch[sample_points+1], batch[sample_points*2-1], batch[sample_points*2])
        # print('pos shape is :{}'.format(pos.shape))
        x1, edge_index = self.edgeconv1(pos, batch)
        # print('layer1 out shape :{}'.format(x1.shape))
        x1 = self.pointGATconv1(x1, pos, edge_index)
        x2, edge_index = self.edgeconv2(x1, batch)
        x2 = self.pointGATconv2(x2, pos, edge_index)
        x3, edge_index = self.edgeconv3(x2, batch)
        x3 = self.pointGATconv3(x3, pos, edge_index)
        x4, edge_index = self.edgeconv4(x3, batch)
        x4 = self.pointGATconv4(x4, pos, edge_index)
        # x2 = self.conv2(x1, batch)
        out = self.lin1(torch.cat([x1, x2, x3, x4], dim=1))
        out = global_max_pool(out, batch)
        out = self.mlp(out)
        return F.log_softmax(out, dim=1)
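ModelNet10:
This network structure was trained with Adam, lr = 0.001
Adding attention gives a clear improvement over leaving it out, about 2%
However, the original network did not reach 92.9, only about 90.2%; with attention added it reached 92.2%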
import os.path as osp
import torch
import torch.nn.functional as F
from torch.nn import Sequential as Seq, Dropout, Linear as Lin
from torch_geometric.datasets import ModelNet
import torch_geometric.transforms as T
from torch_geometric.data import DataLoader
from torch_geometric.nn import global_max_pool #DynamicEdgeConv,
from pointnet2_classification import MLP
from tqdm import tqdm
from torch_scatter import scatter_add
from torch_geometric.nn.conv import MessagePassing
from torch_geometric.utils import add_remaining_self_loops
from torch_geometric.nn.inits import glorot, zeros,reset
from torch_cluster import knn_graph
from torch.nn import Parameter
from torch_geometric.utils import remove_self_loops, add_self_loops, softmax
class EdgeConv(MessagePassing):
    r"""The edge convolutional operator from the `"Dynamic Graph CNN for
    Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper

    .. math::
        \mathbf{x}^{\prime}_i = \sum_{j \in \mathcal{N}(i)}
        h_{\mathbf{\Theta}}(\mathbf{x}_i \, \Vert \,
        \mathbf{x}_j - \mathbf{x}_i),

    where :math:`h_{\mathbf{\Theta}}` denotes a neural network, *i.e.* a MLP.

    Args:
        nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
            maps pair-wise concatenated node features :obj:`x` of shape
            :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
            *e.g.*, defined by :class:`torch.nn.Sequential`.
        aggr (string, optional): The aggregation scheme to use
            (:obj:`"add"`, :obj:`"mean"`, :obj:`"max"`).
            (default: :obj:`"max"`)
        **kwargs (optional): Additional arguments of
            :class:`torch_geometric.nn.conv.MessagePassing`.
    """

    def __init__(self, nn, aggr='max', **kwargs):
        super(EdgeConv, self).__init__(aggr=aggr, **kwargs)
        self.nn = nn
        self.reset_parameters()

    def reset_parameters(self):
        reset(self.nn)

    def forward(self, x, edge_index):
        """"""
        x = x.unsqueeze(-1) if x.dim() == 1 else x
        # print(x.shape)
        return self.propagate(edge_index, x=x)

    def message(self, x_i, x_j):
        # Edge feature [x_i, x_j - x_i], mapped by the MLP.
        return self.nn(torch.cat([x_i, x_j - x_i], dim=1))

    def __repr__(self):
        return '{}(nn={})'.format(self.__class__.__name__, self.nn)
class DynamicEdgeConv(EdgeConv):
    r"""The dynamic edge convolutional operator from the `"Dynamic Graph CNN
    for Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper
    (see :class:`torch_geometric.nn.conv.EdgeConv`), where the graph is
    dynamically constructed using nearest neighbors in the feature space.

    Unlike the PyG original, :meth:`forward` also returns the computed
    :obj:`edge_index` so that a following attention layer can reuse it.

    Args:
        nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
            maps pair-wise concatenated node features :obj:`x` of shape
            :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
            *e.g.* defined by :class:`torch.nn.Sequential`.
        k (int): Number of nearest neighbors.
        aggr (string): The aggregation operator to use (:obj:`"add"`,
            :obj:`"mean"`, :obj:`"max"`). (default: :obj:`"max"`)
        **kwargs (optional): Additional arguments of
            :class:`torch_geometric.nn.conv.MessagePassing`.
    """

    def __init__(self, nn, k, aggr='max', **kwargs):
        super(DynamicEdgeConv, self).__init__(nn=nn, aggr=aggr, **kwargs)
        self.k = k

    def forward(self, x, batch=None):
        """"""
        edge_index = knn_graph(x, self.k, batch, loop=False, flow=self.flow)
        # print(edge_index)
        return super(DynamicEdgeConv, self).forward(x, edge_index), edge_index

    def __repr__(self):
        return '{}(nn={}, k={})'.format(self.__class__.__name__, self.nn,
                                        self.k)
class PointGATConv(MessagePassing):
    r"""The graph attentional operator from the `"Graph Attention Networks"
    <https://arxiv.org/abs/1710.10903>`_ paper

    .. math::
        \mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} +
        \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j},

    where the attention coefficients :math:`\alpha_{i,j}` are computed as

    .. math::
        \alpha_{i,j} =
        \frac{
        \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
        [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j]
        \right)\right)}
        {\sum_{k \in \mathcal{N}(i) \cup \{ i \}}
        \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
        [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k]
        \right)\right)}.

    Note: :meth:`forward` accepts :obj:`pos`, but it is not used yet, so the
    attention is still computed from the hidden features only.

    Args:
        in_channels (int): Size of each input sample.
        out_channels (int): Size of each output sample.
        heads (int, optional): Number of multi-head-attentions.
            (default: :obj:`1`)
        concat (bool, optional): If set to :obj:`False`, the multi-head
            attentions are averaged instead of concatenated.
            (default: :obj:`True`)
        negative_slope (float, optional): LeakyReLU angle of the negative
            slope. (default: :obj:`0.2`)
        dropout (float, optional): Dropout probability of the normalized
            attention coefficients which exposes each node to a stochastically
            sampled neighborhood during training. (default: :obj:`0`)
        bias (bool, optional): If set to :obj:`False`, the layer will not learn
            an additive bias. (default: :obj:`True`)
        **kwargs (optional): Additional arguments of
            :class:`torch_geometric.nn.conv.MessagePassing`.
    """

    def __init__(self, in_channels, out_channels, heads=1, concat=True,
                 negative_slope=0.2, dropout=0, bias=True, **kwargs):
        super(PointGATConv, self).__init__(aggr='add', **kwargs)
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.heads = heads
        self.concat = concat
        self.negative_slope = negative_slope
        self.dropout = dropout
        self.weight = Parameter(
            torch.Tensor(in_channels, heads * out_channels))
        self.att = Parameter(torch.Tensor(1, heads, 2 * out_channels))
        if bias and concat:
            self.bias = Parameter(torch.Tensor(heads * out_channels))
        elif bias and not concat:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        glorot(self.weight)
        glorot(self.att)
        zeros(self.bias)

    def forward(self, x, pos, edge_index, size=None):
        """"""
        # `pos` is currently unused; it is kept for the planned geometric attention.
        if size is None and torch.is_tensor(x):
            edge_index, _ = remove_self_loops(edge_index)
            edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))
        if torch.is_tensor(x):
            x = torch.matmul(x, self.weight)
        else:
            x = (None if x[0] is None else torch.matmul(x[0], self.weight),
                 None if x[1] is None else torch.matmul(x[1], self.weight))
        return self.propagate(edge_index, size=size, x=x)

    def message(self, edge_index_i, x_i, x_j, size_i):
        # Compute attention coefficients.
        x_j = x_j.view(-1, self.heads, self.out_channels)
        if x_i is None:
            alpha = (x_j * self.att[:, :, self.out_channels:]).sum(dim=-1)
        else:
            x_i = x_i.view(-1, self.heads, self.out_channels)
            alpha = (torch.cat([x_i, x_j], dim=-1) * self.att).sum(dim=-1)
        alpha = F.leaky_relu(alpha, self.negative_slope)
        alpha = softmax(alpha, edge_index_i, size_i)
        # Sample attention coefficients stochastically.
        alpha = F.dropout(alpha, p=self.dropout, training=self.training)
        return x_j * alpha.view(-1, self.heads, 1)

    def update(self, aggr_out):
        if self.concat is True:
            aggr_out = aggr_out.view(-1, self.heads * self.out_channels)
        else:
            aggr_out = aggr_out.mean(dim=1)
        if self.bias is not None:
            aggr_out = aggr_out + self.bias
        return aggr_out

    def __repr__(self):
        return '{}({}, {}, heads={})'.format(self.__class__.__name__,
                                             self.in_channels,
                                             self.out_channels, self.heads)
class Net(torch.nn.Module):
    def __init__(self, out_channels, k=20, aggr='max'):
        super().__init__()
        self.edgeconv1 = DynamicEdgeConv(MLP([2 * 3, 64]), k, aggr)
        self.edgeconv2 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv3 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv4 = DynamicEdgeConv(MLP([2 * 64, 128]), k, aggr)
        self.pointGATconv1 = PointGATConv(64, 64)
        self.pointGATconv2 = PointGATConv(64, 64)
        self.pointGATconv3 = PointGATConv(64, 64)
        self.pointGATconv4 = PointGATConv(128, 128)
        self.lin1 = MLP([128 + 64 + 64 + 64, 1024])
        self.mlp = Seq(
            MLP([1024, 512]), Dropout(0.5), MLP([512, 256]), Dropout(0.5),
            Lin(256, out_channels))

    def forward(self, data):
        pos, batch = data.pos, data.batch
        # print(batch[0], batch[sample_points-1], batch[sample_points+1], batch[sample_points*2-1], batch[sample_points*2])
        # print('pos shape is :{}'.format(pos.shape))
        x1, edge_index = self.edgeconv1(pos, batch)
        # print('layer1 out shape :{}'.format(x1.shape))
        x1 = self.pointGATconv1(x1, pos, edge_index)
        x2, edge_index = self.edgeconv2(x1, batch)
        x2 = self.pointGATconv2(x2, pos, edge_index)
        x3, edge_index = self.edgeconv3(x2, batch)
        x3 = self.pointGATconv3(x3, pos, edge_index)
        x4, edge_index = self.edgeconv4(x3, batch)
        x4 = self.pointGATconv4(x4, pos, edge_index)
        # x2 = self.conv2(x1, batch)
        out = self.lin1(torch.cat([x1, x2, x3, x4], dim=1))
        out = global_max_pool(out, batch)
        out = self.mlp(out)
        return F.log_softmax(out, dim=1)
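class PointAttentionNet(torch.nn.Module):
    def __init__(self, out_channels, k):
        super().__init__()
        self.k = k
        self.in_channels = 3
        self.conv = PointGATConv(self.in_channels, out_channels)

    def forward(self, data):
        # Takes a batch object like Net does, since train() calls model(data).
        # The raw coordinates are the input features (in_channels = 3).
        pos, batch = data.pos, data.batch
        # torch.nn.Module has no `flow` attribute, so pass the direction explicitly.
        edge_index = knn_graph(pos, self.k, batch, loop=False,
                               flow='source_to_target')
        return self.conv(pos, pos, edge_index)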
def train():
    model.train()
    total_loss = 0
    for data in tqdm(train_loader, ascii=True):
        data = data.to(device)
        optimizer.zero_grad()
        out = model(data)
        # break  # debugging leftover: stops after the first forward pass
        loss = F.nll_loss(out, data.y)
        loss.backward()
        total_loss += loss.item() * data.num_graphs
        optimizer.step()
    return total_loss / len(train_dataset)
def test(loader):
    model.eval()
    correct = 0
    for data in tqdm(loader, ascii=True):
        data = data.to(device)
        with torch.no_grad():
            pred = model(data).max(dim=1)[1]
        correct += pred.eq(data.y).sum().item()
    return correct / len(loader.dataset)
if __name__ == '__main__':
    sample_points = 1024
    path = osp.join(osp.dirname(osp.realpath(__file__)), '..', 'data/ModelNet10')
    pre_transform, transform = T.NormalizeScale(), T.SamplePoints(sample_points)
    train_dataset = ModelNet(path, '10', True, transform, pre_transform)
    test_dataset = ModelNet(path, '10', False, transform, pre_transform)
    # num_workers stays at its default of 0; values > 0 fail on Windows.
    train_loader = DataLoader(
        train_dataset, batch_size=8, shuffle=True)  # , num_workers=6
    test_loader = DataLoader(
        test_dataset, batch_size=8, shuffle=False)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # Forward-pass smoke test of the attention layer alone (no optimizer needed
    # if train() stops right after the forward pass):
    # model = PointAttentionNet(train_dataset.num_classes, k=20).to(device)
    # train()

    # Full training run, matching the Adam / lr = 0.001 setting in the notes.
    model = Net(train_dataset.num_classes, k=20).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
    for epoch in range(1, 201):
        loss = train()
        test_acc = test(test_loader)
        print('Epoch {:03d}, Loss: {:.4f}, Test: {:.4f}'.format(
            epoch, loss, test_acc))
        scheduler.step()