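Combining the ideas of DGCNN and GAT
Problem: address feature contamination in fine-grained point cloud classification based on graph convolution, that is, reduce how much points of a different class contaminate the features of the source point.
Method: dynamic graph convolution networks are known to extract the semantic features of a point cloud well, but they treat all points as roughly equally important. For fine-grained classification the local geometric features matter a great deal, so the key task is to aggregate the features of the points that are important to the source point rather than those of the unimportant ones.
Related methods: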
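- DGCNN has two core ideas: the dynamic graph and edge convolution (EdgeConv). The core of EdgeConv is to use the combination [source point feature, neighbor feature - source point feature], i.e. [x_i, x_j - x_i], as the new feature of a neighbor.
- Graph Attention Convolution for Point Cloud Semantic Segmentation: a channel-wise graph attention mechanism. Its core is to combine the difference of raw coordinates and the difference of hidden features between the source point and a neighbor into the new feature of a point, and to use the same combination to drive a channel-wise attention mechanism.
For each source point, its neighborhood points have a different attention coefficient for every channel, so every channel of a point's output is the result of aggregating neighbor information with those coefficients.
- Graph Attention Network: for the neighbors of each node, an attention mechanism is built with an MLP, and the similarity between two hidden features determines the size of the attention coefficient. This is node-level attention.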
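As a rough illustration of the first idea in the list, the sketch below builds a kNN graph in feature space and forms the EdgeConv edge feature [x_i, x_j - x_i] in plain PyTorch. The helper names `knn_indices` and `edge_features` are hypothetical, and a single point cloud stored as a dense tensor is assumed; PyG expresses the same thing with sparse edge indices.

```python
import torch


def knn_indices(feat, k):
    """Return the indices of the k nearest neighbors of every point."""
    dist = torch.cdist(feat, feat)               # [N, N] pairwise Euclidean distances
    dist.fill_diagonal_(float("inf"))            # a point is not its own neighbor
    return dist.topk(k, largest=False).indices   # [N, k]


def edge_features(x, idx):
    """Concatenate [x_i, x_j - x_i] for every (source, neighbor) pair."""
    x_i = x.unsqueeze(1).expand(-1, idx.size(1), -1)  # [N, k, C] source feature
    x_j = x[idx]                                      # [N, k, C] neighbor feature
    return torch.cat([x_i, x_j - x_i], dim=-1)        # [N, k, 2C]


points = torch.randn(32, 3)
idx = knn_indices(points, k=4)
edges = edge_features(points, idx)
print(edges.shape)  # torch.Size([32, 4, 6])
```

The graph becomes "dynamic" when `knn_indices` is called again on the output features of each layer instead of reusing the neighbors found in coordinate space.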
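For point cloud classification, can ideas 1 and 3 be combined, so that the attention mechanism measures similarity not only in the hidden feature space but also in the original 3D space, while also taking differences into account? That is, obtain the attention coefficients from [p, h, ], using the raw 3D coordinates as well as the differences. The main open question is how to combine these terms, which is left to the experiments.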
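- One issue: attention mechanisms are all built on comparing similarity, which means the source point feature and the neighbor feature fed to the attention mechanism should be of the same kind. How can such a common feature be constructed?
- The edge feature [x_i, x_j - x_i] represents the source point well, but how would the corresponding feature be obtained for a neighbor inside the attention mechanism? GAT simply compares the similarity of hidden features, which is not well suited to data with geometric structure such as point clouds, so how can the raw point cloud features be brought in?
- Could the attention mechanism be built from forms such as (p_i, h(x_i), p_j, h(x_j))?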
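One possible answer to the questions above, offered only as an assumption to experiment with, is to score each (source, neighbor) pair from quantities that exist for both points: the coordinates, the hidden features, and their differences. The class name `GeoAttentionScore` and the single linear scoring layer are hypothetical, and the neighbor indices are random stand-ins for a real kNN graph.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeoAttentionScore(nn.Module):
    """Attention weights from raw coordinates and hidden features (a sketch)."""

    def __init__(self, hidden_channels):
        super().__init__()
        # Input per edge: [p_i, p_j - p_i, h_i, h_j - h_i].
        self.score = nn.Linear(2 * 3 + 2 * hidden_channels, 1, bias=False)

    def forward(self, pos, h, idx):
        k = idx.size(1)
        p_i = pos.unsqueeze(1).expand(-1, k, -1)   # [N, k, 3]
        h_i = h.unsqueeze(1).expand(-1, k, -1)     # [N, k, C]
        pair = torch.cat([p_i, pos[idx] - p_i, h_i, h[idx] - h_i], dim=-1)
        logits = F.leaky_relu(self.score(pair).squeeze(-1), 0.2)  # [N, k]
        return logits.softmax(dim=-1)              # normalized over the k neighbors


pos, h = torch.randn(10, 3), torch.randn(10, 16)
idx = torch.randint(0, 10, (10, 5))   # stand-in neighbor indices
alpha = GeoAttentionScore(16)(pos, h, idx)
```

Which terms to keep in `pair` (only differences, only absolute values, or both) is exactly the combination question the notes leave to the experiments.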
A pure PyTorch implementation is hard to extend and tedious to write; PyTorch Geometric (PyG) is more convenient.
The PyG implementation follows below.
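Implementation process
Computing the attention coefficients:
Use x_i and x_j - x_i to compute a similarity coefficient. [This similarity can be computed in many ways; it should be possible to work out a good one.]
Then a two-layer MLP (in -> out -> out) maps [x_i | x_j - x_i] to a higher dimension, and the similarity coefficients are used to aggregate the features.
(The bias has not been added yet.)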
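The steps above could be sketched as follows in plain PyTorch. This is a minimal sketch under stated assumptions, not the PyG layer used later in these notes: the class name `AttentiveEdgeConv` is hypothetical, the similarity is one assumed choice (a single linear layer followed by LeakyReLU and a softmax over the neighbors), and neighbors are given as a dense [N, k] index tensor.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveEdgeConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # Step 1: similarity score from [x_i, x_j - x_i] (assumed scoring form).
        self.score = nn.Linear(2 * in_channels, 1, bias=False)
        # Step 2: two-layer MLP (in -> out -> out), bias left out for now.
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_channels, out_channels, bias=False),
            nn.ReLU(),
            nn.Linear(out_channels, out_channels, bias=False),
        )

    def forward(self, x, idx):
        x_i = x.unsqueeze(1).expand(-1, idx.size(1), -1)
        edge = torch.cat([x_i, x[idx] - x_i], dim=-1)                # [N, k, 2C]
        alpha = F.leaky_relu(self.score(edge), 0.2).softmax(dim=1)   # [N, k, 1]
        # Step 3: attention-weighted sum over the k neighbors.
        return (alpha * self.mlp(edge)).sum(dim=1)                   # [N, out]


x = torch.randn(12, 3)
idx = torch.randint(0, 12, (12, 6))   # stand-in neighbor indices
out = AttentiveEdgeConv(3, 64)(x, idx)
```

Swapping the body of step 1 for a dot product or a negative distance is an easy way to compare different similarity functions while keeping the rest fixed.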
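Finally, max and mean pooling give the graph embedding of the point cloud.
Could the embedding instead be obtained with attention, by constructing a central node
(for example the mean) and computing attention with respect to it?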
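The attention-based readout raised here might look like the sketch below. It assumes the virtual central node is the mean of the per-point features and that the score is a scaled dot product; both are illustrative choices, and the function name `attention_readout` is hypothetical.

```python
import torch


def attention_readout(x):
    """Pool per-point features [B, N, C] into one vector per cloud, [B, C]."""
    query = x.mean(dim=1, keepdim=True)                    # [B, 1, C] virtual central node
    logits = (x * query).sum(dim=-1) / x.size(-1) ** 0.5   # [B, N] scaled dot product
    alpha = logits.softmax(dim=-1).unsqueeze(-1)           # [B, N, 1] weights over points
    return (alpha * x).sum(dim=1)                          # [B, C]


feats = torch.randn(2, 1024, 64)
embedding = attention_readout(feats)
```

The result has the same shape as a single max or mean pool, so it could be concatenated with them or used in their place.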
Setting up PyG mainly requires installing a VC++ 14 compiler so that some of the dependencies can be compiled.
Then git clone the PyG package and install it.
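# Save the whole network
torch.save(net, PATH)
# Save only the network's parameters (faster, takes less space)
torch.save(net.state_dict(), PATH)
# --------------------------------------------------
# The matching ways to load, for the two saving methods above:
model = torch.load(PATH)
model.load_state_dict(torch.load(PATH))
Source: https://zhuanlan.zhihu.com/p/38056115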
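A small round trip makes the parameter-only variant concrete. This is a sketch with a throwaway `nn.Linear` standing in for the real network and a temporary file standing in for `PATH`; note that `load_state_dict` needs an already constructed model and returns a report of missing and unexpected keys, not the model.

```python
import os
import tempfile

import torch
import torch.nn as nn

net = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "net_params.pt")
    # Save only the parameters.
    torch.save(net.state_dict(), path)

    # Rebuild the same architecture, then load the parameters into it.
    restored = nn.Linear(4, 2)
    result = restored.load_state_dict(torch.load(path))

print(result)  # key-matching report, not a model
```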
There is a bug on Windows: with num_workers > 0 the DataLoader always raises an error and the script cannot run.
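The cause was not tracked down in these notes. Windows starts worker processes with spawn rather than fork, which is a frequent source of such failures, so a simple workaround (not a fix) is to fall back to single-process loading there. The helper name `safe_num_workers` is hypothetical.

```python
import platform


def safe_num_workers(requested):
    """Fall back to single-process data loading on Windows."""
    return 0 if platform.system() == "Windows" else requested


num_workers = safe_num_workers(6)
# e.g. DataLoader(train_dataset, batch_size=8, shuffle=True, num_workers=num_workers)
```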
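ModelNet40:
Experiments:
Can the original network structure with SGD reach 92.9, with every setting identical to the paper?
The network structure is mainly 64, 64, 128, 256 -> concatenated into 512 dimensions,
then 512 -> 1024 dimensions,
two different pools (max and avg)
give a 2048-dimensional vector,
and 2048 -> 512 -> 256 -> out produces the output.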
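The dimension bookkeeping above can be checked with a small dense PyTorch sketch of the classification head. It assumes the four EdgeConv outputs are already available as [B, N, C] tensors; the class name `DGCNNHead` is hypothetical, and batch normalization and the exact activations of the paper are left out for brevity.

```python
import torch
import torch.nn as nn


class DGCNNHead(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.embed = nn.Linear(64 + 64 + 128 + 256, 1024)   # 512 -> 1024
        self.classifier = nn.Sequential(
            nn.Linear(2 * 1024, 512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, 256), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, layer_feats):
        x = torch.cat(layer_feats, dim=-1)    # [B, N, 512]
        x = self.embed(x)                     # [B, N, 1024]
        # Max pool and average pool over the points, then concatenate.
        pooled = torch.cat([x.max(dim=1).values, x.mean(dim=1)], dim=-1)  # [B, 2048]
        return self.classifier(pooled)        # [B, num_classes]


feats = [torch.randn(2, 128, c) for c in (64, 64, 128, 256)]
logits = DGCNNHead(num_classes=40)(feats)
```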
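Experiments:
ModelNet10:
ModelNet40:
The original paper reports 92.9, but my best reproduction so far is 92.0%.
(My reproduction does not use data augmentation; I have not yet worked out how data augmentation is done in PyG.)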
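PyG transforms are callables that take a data object and return it, so an augmentation can be written as a small class and chained after `T.SamplePoints` with `T.Compose`. The sketch below applies a random per-axis scale and a random shift to `data.pos`; the class name `RandomScaleTranslate` is hypothetical and the ranges are assumptions, chosen to resemble augmentation commonly used when training DGCNN, not values confirmed by these notes.

```python
import types

import torch


class RandomScaleTranslate:
    """Randomly scale and shift the point coordinates stored in data.pos."""

    def __init__(self, scale=(2.0 / 3.0, 3.0 / 2.0), shift=0.2):
        self.scale = scale
        self.shift = shift

    def __call__(self, data):
        low, high = self.scale
        scale = torch.empty(1, 3).uniform_(low, high)                  # one factor per axis
        offset = torch.empty(1, 3).uniform_(-self.shift, self.shift)   # one shift per axis
        data.pos = data.pos * scale + offset                           # same transform for all points
        return data


# A SimpleNamespace stands in for a PyG data object here.
sample = types.SimpleNamespace(pos=torch.ones(100, 3))
sample = RandomScaleTranslate()(sample)
```

Because it runs as the dataset `transform` (not `pre_transform`), a fresh scale and shift would be drawn every time a sample is loaded.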
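Reproduction with the original GAT added:
A GAT is added after every layer; the batch size has to be changed to 16.
Results with the improved GAT:
Experiment log: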
class Net(torch.nn.Module):
  def __init__(self, out_channels, k=20, aggr='max'):
    super().__init__()
    
    self.edgeconv1 = DynamicEdgeConv(MLP([2 * 3, 64]), k, aggr)
    self.edgeconv2 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
    self.edgeconv3 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
    self.edgeconv4 = DynamicEdgeConv(MLP([2 * 64, 128]), k, aggr)
    
    self.pointGATconv1 = PointGATConv(64,64)
    self.pointGATconv2 = PointGATConv(64,64)
    self.pointGATconv3 = PointGATConv(64,64)
    self.pointGATconv4 = PointGATConv(128,128)
    
    self.lin1 = MLP([128 + 64 + 64 + 64, 1024])
 
    self.mlp = Seq(
      MLP([1024, 512]), Dropout(0.5), MLP([512, 256]), Dropout(0.5),
      Lin(256, out_channels))
    
 
  def forward(self, data):
    pos, batch = data.pos, data.batch
    #print(batch[0], batch[sample_points-1], batch[sample_points+1], batch[sample_points*2-1], batch[sample_points*2])
    #print('pos shape is :{}'.format(pos.shape))
    x1,edge_index = self.edgeconv1(pos, batch)
    #print('layer1 out shape :{}'.format(x1.shape))
    x1 = self.pointGATconv1(x1, pos, edge_index)
    
    x2,edge_index = self.edgeconv2(x1, batch)
    x2 = self.pointGATconv2(x2,pos, edge_index)
    
    x3,edge_index = self.edgeconv3(x2, batch)
    x3 = self.pointGATconv3(x3, pos,edge_index)
    
    x4,edge_index = self.edgeconv4(x3, batch)
    x4 = self.pointGATconv4(x4, pos,edge_index)
    
    #x2 = self.conv2(x1, batch)
    out = self.lin1(torch.cat([x1, x2, x3, x4], dim=1))
    out = global_max_pool(out, batch)
    out = self.mlp(out)
    return F.log_softmax(out, dim=1)
 
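ModelNet10:
This network structure was trained with Adam, lr = 0.001.
Adding attention gives a clear improvement of about 2% over the network without it.
However, the original network did not reach 92.9, only about 90.2%; with attention added it reached 92.2%.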
 
import os.path as osp
 
import torch
import torch.nn.functional as F
from torch.nn import Sequential as Seq, Dropout, Linear as Lin
from torch_geometric.datasets import ModelNet
import torch_geometric.transforms as T
from torch_geometric.data import DataLoader
from torch_geometric.nn import global_max_pool #DynamicEdgeConv, 
 
from pointnet2_classification import MLP
from tqdm import tqdm
 
from torch_scatter import scatter_add
from torch_geometric.nn.conv import MessagePassing
from torch_geometric.utils import add_remaining_self_loops
from torch_geometric.nn.inits import glorot, zeros,reset
from torch_cluster import knn_graph
from torch.nn import Parameter
from torch_geometric.utils import remove_self_loops, add_self_loops, softmax
class EdgeConv(MessagePassing):
  r"""The edge convolutional operator from the `"Dynamic Graph CNN for
  Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper
 
  .. math::
    \mathbf{x}^{\prime}_i = \sum_{j \in \mathcal{N}(i)}
    h_{\mathbf{\Theta}}(\mathbf{x}_i \, \Vert \,
    \mathbf{x}_j - \mathbf{x}_i),
 
  where :math:`h_{\mathbf{\Theta}}` denotes a neural network, *.i.e.* a MLP.
 
  Args:
    nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
      maps pair-wise concatenated node features :obj:`x` of shape
      :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
      *e.g.*, defined by :class:`torch.nn.Sequential`.
    aggr (string, optional): The aggregation scheme to use
      (:obj:`"add"`, :obj:`"mean"`, :obj:`"max"`).
      (default: :obj:`"max"`)
    **kwargs (optional): Additional arguments of
      :class:`torch_geometric.nn.conv.MessagePassing`.
  """
 
  def __init__(self, nn, aggr='max', **kwargs):
    super(EdgeConv, self).__init__(aggr=aggr, **kwargs)
    self.nn = nn
    self.reset_parameters()
 
  def reset_parameters(self):
    reset(self.nn)
 
  def forward(self, x, edge_index):
    """"""
    x = x.unsqueeze(-1) if x.dim() == 1 else x
    #print(x.shape)
    return self.propagate(edge_index, x=x)
 
  def message(self, x_i, x_j):
    return self.nn(torch.cat([x_i, x_j - x_i], dim=1))
 
  def __repr__(self):
    return '{}(nn={})'.format(self.__class__.__name__, self.nn)
 
 
class DynamicEdgeConv(EdgeConv):
  r"""The dynamic edge convolutional operator from the `"Dynamic Graph CNN
  for Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper
  (see :class:`torch_geometric.nn.conv.EdgeConv`), where the graph is
  dynamically constructed using nearest neighbors in the feature space.
 
  Args:
    nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
      maps pair-wise concatenated node features :obj:`x` of shape
      :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
      *e.g.* defined by :class:`torch.nn.Sequential`.
    k (int): Number of nearest neighbors.
    aggr (string): The aggregation operator to use (:obj:`"add"`,
      :obj:`"mean"`, :obj:`"max"`). (default: :obj:`"max"`)
    **kwargs (optional): Additional arguments of
      :class:`torch_geometric.nn.conv.MessagePassing`.
  """
 
  def __init__(self, nn, k, aggr='max', **kwargs):
    super(DynamicEdgeConv, self).__init__(nn=nn, aggr=aggr, **kwargs)
    self.k = k
 
  def forward(self, x, batch=None):
    """"""
    edge_index = knn_graph(x, self.k, batch, loop=False, flow=self.flow)
    #print(edge_index)
    return super(DynamicEdgeConv, self).forward(x, edge_index),edge_index
 
  def __repr__(self):
    return '{}(nn={}, k={})'.format(self.__class__.__name__, self.nn,
                    self.k)
 
class PointGATConv(MessagePassing):
  r"""The graph attentional operator from the `"Graph Attention Networks"
  <https://arxiv.org/abs/1710.10903>`_ paper
 
  .. math::
    \mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} +
    \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j},
 
  where the attention coefficients :math:`\alpha_{i,j}` are computed as
 
  .. math::
    \alpha_{i,j} =
    \frac{
    \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
    [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j]
    \right)\right)}
    {\sum_{k \in \mathcal{N}(i) \cup \{ i \}}
    \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
    [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k]
    \right)\right)}.
 
  Args:
    in_channels (int): Size of each input sample.
    out_channels (int): Size of each output sample.
    heads (int, optional): Number of multi-head-attentions.
      (default: :obj:`1`)
    concat (bool, optional): If set to :obj:`False`, the multi-head
      attentions are averaged instead of concatenated.
      (default: :obj:`True`)
    negative_slope (float, optional): LeakyReLU angle of the negative
      slope. (default: :obj:`0.2`)
    dropout (float, optional): Dropout probability of the normalized
      attention coefficients which exposes each node to a stochastically
      sampled neighborhood during training. (default: :obj:`0`)
    bias (bool, optional): If set to :obj:`False`, the layer will not learn
      an additive bias. (default: :obj:`True`)
    **kwargs (optional): Additional arguments of
      :class:`torch_geometric.nn.conv.MessagePassing`.
  """
 
  def __init__(self, in_channels, out_channels, heads=1, concat=True,
         negative_slope=0.2, dropout=0, bias=True, **kwargs):
    super(PointGATConv, self).__init__(aggr='add', **kwargs)
 
    self.in_channels = in_channels
    self.out_channels = out_channels
    self.heads = heads
    self.concat = concat
    self.negative_slope = negative_slope
    self.dropout = dropout
 
    self.weight = Parameter(
      torch.Tensor(in_channels, heads * out_channels))
    self.att = Parameter(torch.Tensor(1, heads, 2 * out_channels))
 
    if bias and concat:
      self.bias = Parameter(torch.Tensor(heads * out_channels))
    elif bias and not concat:
      self.bias = Parameter(torch.Tensor(out_channels))
    else:
      self.register_parameter('bias', None)
 
    self.reset_parameters()
 
  def reset_parameters(self):
    glorot(self.weight)
    glorot(self.att)
    zeros(self.bias)
 
  def forward(self, x, pos, edge_index, size=None):
    """"""
    if size is None and torch.is_tensor(x):
      edge_index, _ = remove_self_loops(edge_index)
      edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))
 
    if torch.is_tensor(x):
      x = torch.matmul(x, self.weight)
    else:
      x = (None if x[0] is None else torch.matmul(x[0], self.weight),
         None if x[1] is None else torch.matmul(x[1], self.weight))
 
    return self.propagate(edge_index, size=size, x=x)
 
  def message(self, edge_index_i, x_i, x_j, size_i):
    # Compute attention coefficients.
    x_j = x_j.view(-1, self.heads, self.out_channels)
    if x_i is None:
      alpha = (x_j * self.att[:, :, self.out_channels:]).sum(dim=-1)
    else:
      x_i = x_i.view(-1, self.heads, self.out_channels)
      alpha = (torch.cat([x_i, x_j], dim=-1) * self.att).sum(dim=-1)
 
    alpha = F.leaky_relu(alpha, self.negative_slope)
    alpha = softmax(alpha, edge_index_i, size_i)
 
    # Sample attention coefficients stochastically.
    alpha = F.dropout(alpha, p=self.dropout, training=self.training)
 
    return x_j * alpha.view(-1, self.heads, 1)
 
  def update(self, aggr_out):
    if self.concat is True:
      aggr_out = aggr_out.view(-1, self.heads * self.out_channels)
    else:
      aggr_out = aggr_out.mean(dim=1)
 
    if self.bias is not None:
      aggr_out = aggr_out + self.bias
    return aggr_out
 
  def __repr__(self):
    return '{}({}, {}, heads={})'.format(self.__class__.__name__,
                       self.in_channels,
                       self.out_channels, self.heads)
           
           
           
class Net(torch.nn.Module):
  def __init__(self, out_channels, k=20, aggr='max'):
    super().__init__()
    
    self.edgeconv1 = DynamicEdgeConv(MLP([2 * 3, 64]), k, aggr)
    self.edgeconv2 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
    self.edgeconv3 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
    self.edgeconv4 = DynamicEdgeConv(MLP([2 * 64, 128]), k, aggr)
    
    self.pointGATconv1 = PointGATConv(64,64)
    self.pointGATconv2 = PointGATConv(64,64)
    self.pointGATconv3 = PointGATConv(64,64)
    self.pointGATconv4 = PointGATConv(128,128)
    
    self.lin1 = MLP([128 + 64 + 64 + 64, 1024])
 
    self.mlp = Seq(
      MLP([1024, 512]), Dropout(0.5), MLP([512, 256]), Dropout(0.5),
      Lin(256, out_channels))
    
 
  def forward(self, data):
    pos, batch = data.pos, data.batch
    #print(batch[0], batch[sample_points-1], batch[sample_points+1], batch[sample_points*2-1], batch[sample_points*2])
    #print('pos shape is :{}'.format(pos.shape))
    x1,edge_index = self.edgeconv1(pos, batch)
    #print('layer1 out shape :{}'.format(x1.shape))
    x1 = self.pointGATconv1(x1, pos, edge_index)
    
    x2,edge_index = self.edgeconv2(x1, batch)
    x2 = self.pointGATconv2(x2,pos, edge_index)
    
    x3,edge_index = self.edgeconv3(x2, batch)
    x3 = self.pointGATconv3(x3, pos,edge_index)
    
    x4,edge_index = self.edgeconv4(x3, batch)
    x4 = self.pointGATconv4(x4, pos,edge_index)
    
    #x2 = self.conv2(x1, batch)
    out = self.lin1(torch.cat([x1, x2, x3, x4], dim=1))
    out = global_max_pool(out, batch)
    out = self.mlp(out)
    return F.log_softmax(out, dim=1)
 
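class PointAttentionNet(torch.nn.Module):
  def __init__(self, out_channels, k):
    super().__init__()
    self.k = k
    self.in_channels = 3
    self.conv = PointGATConv(self.in_channels, out_channels)

  def forward(self, data):
    # Takes the batch object so that it matches the model(data) call in train().
    pos, batch = data.pos, data.batch
    # torch.nn.Module has no `flow` attribute, so knn_graph's default flow is used.
    edge_index = knn_graph(pos, self.k, batch, loop=False)
    # The raw coordinates serve as the input features of this single layer.
    return self.conv(pos, pos, edge_index)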
    
 
 
 
def train():
  model.train()
 
  total_loss = 0
  for data in tqdm(train_loader,ascii=True):
    data = data.to(device)
    optimizer.zero_grad()
    out = model(data)
    # break  # debug only: uncomment to stop after the first forward pass
    loss = F.nll_loss(out, data.y)
    loss.backward()
    total_loss += loss.item() * data.num_graphs
    optimizer.step()
  return total_loss / len(train_dataset)
 
 
def test(loader):
  model.eval()
 
  correct = 0
  for data in tqdm(loader,ascii=True):
    data = data.to(device)
    with torch.no_grad():
      pred = model(data).max(dim=1)[1]
    correct += pred.eq(data.y).sum().item()
  return correct / len(loader.dataset)
 
if __name__ == '__main__':
  sample_points = 1024
  path = osp.join(osp.dirname(osp.realpath(__file__)), '..', 'data/ModelNet10')
  pre_transform, transform = T.NormalizeScale(), T.SamplePoints(sample_points)
  train_dataset = ModelNet(path, '10', True, transform, pre_transform)
  test_dataset = ModelNet(path, '10', False, transform, pre_transform)
  train_loader = DataLoader(
    train_dataset, batch_size=8, shuffle=True)#, num_workers=6
  test_loader = DataLoader(
    test_dataset, batch_size=8, shuffle=False)
  
  device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
  
  # Debug run of the standalone attention layer (forward pass only):
  # model = PointAttentionNet(train_dataset.num_classes, k=20).to(device)

  model = Net(train_dataset.num_classes, k=20).to(device)
  optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
  # Halve the learning rate every 20 epochs.
  scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
  for epoch in range(1, 201):
    loss = train()
    test_acc = test(test_loader)
    print('Epoch {:03d}, Loss: {:.4f}, Test: {:.4f}'.format(
      epoch, loss, test_acc))
    scheduler.step()