DGCNN + GAT

Posted by Packy on December 9, 2019

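Combining the ideas of DGCNN and GAT

Problem: in fine-grained point cloud classification based on graph convolution, features get contaminated. The goal is to reduce how much points of a different class contaminate the features of the source point.

Approach: dynamic graph convolution networks are known to extract the semantic features of point clouds well, but they treat all points as roughly equally important. Fine-grained classification depends heavily on local geometric features, so the key task is to aggregate the features of the points that matter to the source point while keeping the features of unimportant points out.

Related methods: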

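  1. DGCNN has two core ideas: the dynamic graph and edge convolution (EdgeConv). The core of EdgeConv is to use the combination [source point feature, neighbor feature − source point feature], i.e. [x_i, x_j − x_i], as the new feature of each neighbor.
  2. Graph Attention Convolution for Point Cloud Semantic Segmentation (GAC): a channel-wise graph attention mechanism. Its core is to combine the difference of the original coordinates and the difference of the hidden features between the source point and a neighbor into a new feature of that point, and to use this combination to drive a channel-wise attention mechanism.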

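For every source point, each neighbor has a different attention coefficient for each channel, so every channel of a point's output is the result of aggregating neighbor information with that channel's attention coefficients.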


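$$\tilde{a}_{ij,k} = \frac{\exp(a_{ij,k})}{\sum_{l \in N(i)} \exp(a_{il,k})} \qquad (3)$$

where $\tilde{a}_{ij,k}$ is the attentional weight of vertex $j$ to vertex $i$ at the $k$-th feature channel. Therefore, the final output of the proposed GAC can be formulated as follows:

$$h'_i = \sum_{j \in N(i)} \tilde{a}_{ij} \odot M_g(h_j) + b_i \qquad (4)$$

Here $a_{ij,k}$ is the unnormalized score, $M_g$ is the feature mapping applied to the neighbor feature $h_j$, $\odot$ is the element-wise product, and $b_i$ is a learnable bias.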
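The per-channel normalization and aggregation described above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the function name `channelwise_attention` is made up for this note, the scores are passed in directly instead of being produced by a learned network, and the bias term is left out.

```python
import math


def channelwise_attention(scores, feats):
    """Aggregate neighbor features with one softmax per channel.

    scores[j][k]: unnormalized attention of neighbor j at channel k.
    feats[j][k]:  (already transformed) feature of neighbor j at channel k.
    Returns one aggregated value per channel.
    """
    num_channels = len(feats[0])
    out = []
    for k in range(num_channels):
        column = [s[k] for s in scores]
        top = max(column)  # subtract the max for numerical stability
        exps = [math.exp(v - top) for v in column]
        total = sum(exps)
        weights = [e / total for e in exps]  # softmax over neighbors, channel k only
        out.append(sum(w * f[k] for w, f in zip(weights, feats)))
    return out


# Two neighbors, two channels: channel 0 is dominated by neighbor 0,
# channel 1 weighs both neighbors equally.
scores = [[10.0, 0.0], [-10.0, 0.0]]
feats = [[1.0, 2.0], [3.0, 4.0]]
aggregated = channelwise_attention(scores, feats)
print(aggregated)
```

The point of the example is that the same neighbor can be nearly ignored in one channel and fully used in another, which a single scalar weight per neighbor cannot express.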

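  3. Graph Attention Network (GAT): for the neighbors of each point, an attention mechanism is built with an MLP. The size of the attention coefficient is determined by the similarity between the two hidden features, so this is node-level attention.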
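For comparison with the channel-wise variant, node-level attention in the GAT style produces a single scalar weight per neighbor. The sketch below assumes the features have already been linearly transformed and uses a fixed attention vector `a`; the helper names are illustrative, not taken from any library.

```python
import math


def leaky_relu(value, slope=0.2):
    return value if value > 0 else slope * value


def gat_attention(h_i, neighbors, a):
    """Return one normalized attention weight per neighbor.

    h_i:       transformed feature of the source point (list of floats).
    neighbors: transformed features of the neighbors.
    a:         attention vector of length 2 * len(h_i).
    """
    logits = []
    for h_j in neighbors:
        concat = h_i + h_j  # list concatenation, i.e. [h_i || h_j]
        logits.append(leaky_relu(sum(w * v for w, v in zip(a, concat))))
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


weights = gat_attention(
    h_i=[1.0, 0.0],
    neighbors=[[1.0, 0.0], [0.0, 1.0]],
    a=[0.0, 0.0, 1.0, -1.0],
)
print(weights)
```

Every channel of a neighbor is scaled by the same weight here, which is the difference from the channel-wise mechanism of item 2.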

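For point cloud classification, can ideas 1 and 3 be combined, so that the attention mechanism measures similarity not only in the hidden feature space but also in the original 3D space, while also taking differences into account? That is, obtain the attention coefficients from [p,h, ], so that both the original 3D coordinates and the differences are considered. The main open question is how to combine these terms, which is left to the experiments.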

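  • One problem: attention mechanisms are built on comparing similarity, so the features of the source point and of the neighbor that enter the attention mechanism should be of the same kind. How can such a common feature be constructed?
  • The edge feature [x_i, x_j − x_i] represents the source point well, but for the attention mechanism, how is the corresponding feature of the neighbor obtained? GAT simply compares hidden features for similarity, which does not suit data with geometric structure such as point clouds. How, then, can the original point cloud features be brought in?
  • Build the attention mechanism on forms such as (p_i, h(x_i)) and (p_j, h(x_j))?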
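One possible reading of the questions above is to feed the attention function a per-edge vector that contains both the absolute terms and the differences. The following is a hypothetical sketch, not a tested design: the layout [p_i, h_i, p_j − p_i, h_j − h_i] and both function names are assumptions made for illustration.

```python
def edge_attention_input(p_i, h_i, p_j, h_j):
    """Build [p_i, h_i, p_j - p_i, h_j - h_i] for one (source, neighbor) pair."""
    dp = [b - a for a, b in zip(p_i, p_j)]  # coordinate difference
    dh = [b - a for a, b in zip(h_i, h_j)]  # hidden-feature difference
    return list(p_i) + list(h_i) + dp + dh


def attention_logit(edge_input, a):
    """Linear score a^T e; a learned MLP could replace this dot product."""
    return sum(w * v for w, v in zip(a, edge_input))


e = edge_attention_input(
    p_i=[0.0, 0.0, 0.0], h_i=[1.0, 2.0],
    p_j=[1.0, 0.0, 0.0], h_j=[1.0, 5.0],
)
print(e, attention_logit(e, [0.1] * len(e)))
```

Because the difference terms vanish when j equals i, a self-loop contributes only the absolute part, which is one way of treating source and neighbor with the same kind of feature.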

A pure PyTorch implementation is hard to extend and tedious to write; PyTorch Geometric (PyG) is more convenient.

The PyG implementation follows below.

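Implementation

Computing the attention coefficients:

Use x_i and x_j − x_i to compute a similarity coefficient. [There are many possible ways to compute this similarity; finding a good one is worth investigating.]

Then a two-layer MLP (in → out → out) maps [x_i, x_j − x_i] to a higher dimension, and the features are aggregated using the similarity coefficients.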

(The bias has not been added yet.)
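The steps above can be sketched without any framework. The similarity used here (negative squared length of x_j − x_i, normalized with a softmax) is just one assumed choice among the many the note mentions, and the weight matrices are tiny hand-written placeholders rather than trained parameters. As in the note, no bias is used.

```python
import math


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def relu(v):
    return [max(0.0, x) for x in v]


def softmax(values):
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_edge_conv(x_i, neighbors, W1, W2):
    """Attention-weighted EdgeConv for one source point.

    W1 has shape [out, 2 * in], W2 has shape [out, out] (MLP: in -> out -> out).
    """
    # Edge feature [x_i, x_j - x_i] for every neighbor.
    edge_feats = [x_i + [b - a for a, b in zip(x_i, x_j)] for x_j in neighbors]
    # Assumed similarity: closer neighbors in feature space get larger scores.
    scores = [-sum(d * d for d in e[len(x_i):]) for e in edge_feats]
    alpha = softmax(scores)
    # Two-layer MLP without bias, applied to each edge feature.
    messages = [matvec(W2, relu(matvec(W1, e))) for e in edge_feats]
    out_dim = len(W2)
    return [sum(a * m[k] for a, m in zip(alpha, messages)) for k in range(out_dim)]


W1 = [[1.0, 1.0], [0.0, 1.0]]
W2 = [[1.0, 0.0], [0.0, 1.0]]
print(attentive_edge_conv([0.0], [[1.0], [3.0]], W1, W2))
```

In the example the far neighbor at 3.0 contributes almost nothing, so the output stays close to the message of the near neighbor; with max or mean aggregation it would dominate or shift the result.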

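Finally, max and mean pooling give the graph embedding of the point cloud.

Could this embedding be obtained with attention instead, by constructing a central node, for example the mean, and computing attention against it?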
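A rough sketch of what such an attention-based readout might look like is given below. Everything in it is an assumption made for illustration: the central node is taken to be the mean of the point features, the score is a plain dot product with that mean, and the name `attention_readout` is invented here.

```python
import math


def attention_readout(points):
    """Pool per-point features into one vector using attention to the mean."""
    n, dim = len(points), len(points[0])
    # Hypothetical central node: the mean feature over all points.
    center = [sum(p[k] for p in points) / n for k in range(dim)]
    # Score each point by its dot product with the central node.
    scores = [sum(c * v for c, v in zip(center, p)) for p in points]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Attention-weighted sum replaces max / mean pooling.
    return [sum(a * p[k] for a, p in zip(alpha, points)) for k in range(dim)]


print(attention_readout([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]))
```

Whether this helps compared with max and mean pooling would have to be checked experimentally.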

Setting up PyG mainly requires installing a VC++ 14 compiler so that some of the dependencies can be compiled.

Then git clone the PyG repository and install it.

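# Save the entire network
torch.save(net, PATH)
# Save only the network's parameters (faster, takes less space)
torch.save(net.state_dict(), PATH)
# --------------------------------------------------
# The matching ways to load, for the two save methods above:
model = torch.load(PATH)
model.load_state_dict(torch.load(PATH))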

Source: https://zhuanlan.zhihu.com/p/38056115

On Windows there is a bug: with num_workers > 0 an error is always raised and the script cannot run.

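ModelNet40:

Experiments:

Can the original network structure with SGD reach 92.9, with every setting identical to the paper?

The main network structure is 64, 64, 128, 256 -> 512 dimensions (concatenated).

Then 512 dimensions -> 1024.

Two different pools: max and avg.

This gives a 2048-dimensional vector.

The output is obtained through 2048 -> 512 -> 256 -> out.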
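The dimension bookkeeping above is easy to check with a few lines of Python. The helper names are made up for this note, and `num_classes=40` is assumed from the ModelNet40 setting.

```python
def dgcnn_head_dims(edge_conv_widths=(64, 64, 128, 256), emb_dim=1024, num_classes=40):
    """Trace the feature sizes from the EdgeConv outputs to the classifier."""
    concat_dim = sum(edge_conv_widths)  # all EdgeConv outputs concatenated
    pooled_dim = 2 * emb_dim            # max-pool and avg-pool concatenated
    return {
        "concat": concat_dim,
        "embedding": emb_dim,
        "pooled": pooled_dim,
        "classifier": [pooled_dim, 512, 256, num_classes],
    }


def max_avg_pool(point_feats):
    """Concatenate the per-channel max and mean over all points."""
    n, dim = len(point_feats), len(point_feats[0])
    max_part = [max(p[k] for p in point_feats) for k in range(dim)]
    avg_part = [sum(p[k] for p in point_feats) / n for k in range(dim)]
    return max_part + avg_part


print(dgcnn_head_dims())
print(max_avg_pool([[1.0, 4.0], [3.0, 2.0]]))
```

Note that the `Net` class further down uses a smaller variant (64, 64, 64, 128 concatenated to 320, then 1024, with max pooling only).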

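Experiments:

ModelNet10:

ModelNet40:

Original paper: 92.9, but my best reproduction so far is 92.0%.

(My reproduction does not use data augmentation; I have not yet figured out how data augmentation works in PyG.)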

[Figure: DGCNN, 1024 points, accuracy over epochs; best value 0.920178]

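Reproduction with the original GAT added:

A GAT layer is added at every layer; the batch size has to be changed to 16.

Results with the improved GAT added:

Experiment log: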

class Net(torch.nn.Module):
    def __init__(self, out_channels, k=20, aggr='max'):
        super().__init__()

        # Four dynamic EdgeConv layers; each one rebuilds the kNN graph from its input.
        self.edgeconv1 = DynamicEdgeConv(MLP([2 * 3, 64]), k, aggr)
        self.edgeconv2 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv3 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv4 = DynamicEdgeConv(MLP([2 * 64, 128]), k, aggr)

        # One attention layer after each EdgeConv, with matching width.
        self.pointGATconv1 = PointGATConv(64, 64)
        self.pointGATconv2 = PointGATConv(64, 64)
        self.pointGATconv3 = PointGATConv(64, 64)
        self.pointGATconv4 = PointGATConv(128, 128)

        # Concatenated features of all four layers (320) -> 1024.
        self.lin1 = MLP([128 + 64 + 64 + 64, 1024])

        self.mlp = Seq(
            MLP([1024, 512]), Dropout(0.5), MLP([512, 256]), Dropout(0.5),
            Lin(256, out_channels))

    def forward(self, data):
        pos, batch = data.pos, data.batch
        # print(batch[0], batch[sample_points-1], batch[sample_points+1], batch[sample_points*2-1], batch[sample_points*2])
        # print('pos shape is :{}'.format(pos.shape))
        x1, edge_index = self.edgeconv1(pos, batch)
        # print('layer1 out shape :{}'.format(x1.shape))
        x1 = self.pointGATconv1(x1, pos, edge_index)

        x2, edge_index = self.edgeconv2(x1, batch)
        x2 = self.pointGATconv2(x2, pos, edge_index)

        x3, edge_index = self.edgeconv3(x2, batch)
        x3 = self.pointGATconv3(x3, pos, edge_index)

        x4, edge_index = self.edgeconv4(x3, batch)
        x4 = self.pointGATconv4(x4, pos, edge_index)

        # x2 = self.conv2(x1, batch)
        out = self.lin1(torch.cat([x1, x2, x3, x4], dim=1))
        out = global_max_pool(out, batch)
        out = self.mlp(out)
        return F.log_softmax(out, dim=1)

 

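ModelNet10

This network structure is trained with Adam, lr = 0.001.

Adding attention gives a clear improvement of about 2% over the network without it.

However, the original network did not reach 92.9, only about 90.2%; with attention added it reached 92.2%.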

 

import os.path as osp

import torch
import torch.nn.functional as F
from torch.nn import Sequential as Seq, Dropout, Linear as Lin
from torch.nn import Parameter
from torch_geometric.datasets import ModelNet
import torch_geometric.transforms as T
from torch_geometric.data import DataLoader
from torch_geometric.nn import global_max_pool  # DynamicEdgeConv is redefined below
from torch_geometric.nn.conv import MessagePassing
from torch_geometric.nn.inits import glorot, zeros, reset
from torch_geometric.utils import add_remaining_self_loops
from torch_geometric.utils import remove_self_loops, add_self_loops, softmax
from torch_scatter import scatter_add
from torch_cluster import knn_graph
from tqdm import tqdm

# MLP is imported from the local script pointnet2_classification.py
from pointnet2_classification import MLP


class EdgeConv(MessagePassing):
    r"""The edge convolutional operator from the `"Dynamic Graph CNN for
    Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper

    .. math::
        \mathbf{x}^{\prime}_i = \sum_{j \in \mathcal{N}(i)}
        h_{\mathbf{\Theta}}(\mathbf{x}_i \, \Vert \,
        \mathbf{x}_j - \mathbf{x}_i),

    where :math:`h_{\mathbf{\Theta}}` denotes a neural network, *i.e.* a MLP.

    Args:
        nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
            maps pair-wise concatenated node features :obj:`x` of shape
            :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
            *e.g.*, defined by :class:`torch.nn.Sequential`.
        aggr (string, optional): The aggregation scheme to use
            (:obj:`"add"`, :obj:`"mean"`, :obj:`"max"`).
            (default: :obj:`"max"`)
        **kwargs (optional): Additional arguments of
            :class:`torch_geometric.nn.conv.MessagePassing`.
    """

    def __init__(self, nn, aggr='max', **kwargs):
        super(EdgeConv, self).__init__(aggr=aggr, **kwargs)
        self.nn = nn
        self.reset_parameters()

    def reset_parameters(self):
        reset(self.nn)

    def forward(self, x, edge_index):
        """"""
        x = x.unsqueeze(-1) if x.dim() == 1 else x
        # print(x.shape)
        return self.propagate(edge_index, x=x)

    def message(self, x_i, x_j):
        # Edge feature [x_i, x_j - x_i], mapped by the MLP.
        return self.nn(torch.cat([x_i, x_j - x_i], dim=1))

    def __repr__(self):
        return '{}(nn={})'.format(self.__class__.__name__, self.nn)

 

 

class DynamicEdgeConv(EdgeConv):
    r"""The dynamic edge convolutional operator from the `"Dynamic Graph CNN
    for Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper
    (see :class:`torch_geometric.nn.conv.EdgeConv`), where the graph is
    dynamically constructed using nearest neighbors in the feature space.

    Args:
        nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
            maps pair-wise concatenated node features :obj:`x` of shape
            :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
            *e.g.* defined by :class:`torch.nn.Sequential`.
        k (int): Number of nearest neighbors.
        aggr (string): The aggregation operator to use (:obj:`"add"`,
            :obj:`"mean"`, :obj:`"max"`). (default: :obj:`"max"`)
        **kwargs (optional): Additional arguments of
            :class:`torch_geometric.nn.conv.MessagePassing`.
    """

    def __init__(self, nn, k, aggr='max', **kwargs):
        super(DynamicEdgeConv, self).__init__(nn=nn, aggr=aggr, **kwargs)
        self.k = k

    def forward(self, x, batch=None):
        """"""
        edge_index = knn_graph(x, self.k, batch, loop=False, flow=self.flow)
        # print(edge_index)
        # Unlike the PyG original, the graph is returned as well so that the
        # attention layer can reuse it.
        return super(DynamicEdgeConv, self).forward(x, edge_index), edge_index

    def __repr__(self):
        return '{}(nn={}, k={})'.format(self.__class__.__name__, self.nn,
                                        self.k)

 

class PointGATConv(MessagePassing):
    r"""The graph attentional operator from the `"Graph Attention Networks"
    <https://arxiv.org/abs/1710.10903>`_ paper

    .. math::
        \mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} +
        \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j},

    where the attention coefficients :math:`\alpha_{i,j}` are computed as

    .. math::
        \alpha_{i,j} =
        \frac{
        \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
        [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j]
        \right)\right)}
        {\sum_{k \in \mathcal{N}(i) \cup \{ i \}}
        \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
        [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k]
        \right)\right)}.

    Args:
        in_channels (int): Size of each input sample.
        out_channels (int): Size of each output sample.
        heads (int, optional): Number of multi-head-attentions.
            (default: :obj:`1`)
        concat (bool, optional): If set to :obj:`False`, the multi-head
            attentions are averaged instead of concatenated.
            (default: :obj:`True`)
        negative_slope (float, optional): LeakyReLU angle of the negative
            slope. (default: :obj:`0.2`)
        dropout (float, optional): Dropout probability of the normalized
            attention coefficients which exposes each node to a stochastically
            sampled neighborhood during training. (default: :obj:`0`)
        bias (bool, optional): If set to :obj:`False`, the layer will not learn
            an additive bias. (default: :obj:`True`)
        **kwargs (optional): Additional arguments of
            :class:`torch_geometric.nn.conv.MessagePassing`.
    """

    def __init__(self, in_channels, out_channels, heads=1, concat=True,
                 negative_slope=0.2, dropout=0, bias=True, **kwargs):
        super(PointGATConv, self).__init__(aggr='add', **kwargs)

        self.in_channels = in_channels
        self.out_channels = out_channels
        self.heads = heads
        self.concat = concat
        self.negative_slope = negative_slope
        self.dropout = dropout

        self.weight = Parameter(
            torch.Tensor(in_channels, heads * out_channels))
        self.att = Parameter(torch.Tensor(1, heads, 2 * out_channels))

        if bias and concat:
            self.bias = Parameter(torch.Tensor(heads * out_channels))
        elif bias and not concat:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)

        self.reset_parameters()

    def reset_parameters(self):
        glorot(self.weight)
        glorot(self.att)
        zeros(self.bias)

    def forward(self, x, pos, edge_index, size=None):
        """"""
        # NOTE: pos is accepted so that coordinates can enter the attention
        # later, but it is not used yet; the layer currently behaves like a
        # standard GAT layer.
        if size is None and torch.is_tensor(x):
            edge_index, _ = remove_self_loops(edge_index)
            edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        if torch.is_tensor(x):
            x = torch.matmul(x, self.weight)
        else:
            x = (None if x[0] is None else torch.matmul(x[0], self.weight),
                 None if x[1] is None else torch.matmul(x[1], self.weight))

        return self.propagate(edge_index, size=size, x=x)

    def message(self, edge_index_i, x_i, x_j, size_i):
        # Compute attention coefficients.
        x_j = x_j.view(-1, self.heads, self.out_channels)
        if x_i is None:
            alpha = (x_j * self.att[:, :, self.out_channels:]).sum(dim=-1)
        else:
            x_i = x_i.view(-1, self.heads, self.out_channels)
            alpha = (torch.cat([x_i, x_j], dim=-1) * self.att).sum(dim=-1)

        alpha = F.leaky_relu(alpha, self.negative_slope)
        alpha = softmax(alpha, edge_index_i, size_i)

        # Sample attention coefficients stochastically.
        alpha = F.dropout(alpha, p=self.dropout, training=self.training)

        return x_j * alpha.view(-1, self.heads, 1)

    def update(self, aggr_out):
        if self.concat is True:
            aggr_out = aggr_out.view(-1, self.heads * self.out_channels)
        else:
            aggr_out = aggr_out.mean(dim=1)

        if self.bias is not None:
            aggr_out = aggr_out + self.bias
        return aggr_out

    def __repr__(self):
        return '{}({}, {}, heads={})'.format(self.__class__.__name__,
                                             self.in_channels,
                                             self.out_channels, self.heads)

           

           

           

class Net(torch.nn.Module):
    def __init__(self, out_channels, k=20, aggr='max'):
        super().__init__()

        # Four dynamic EdgeConv layers; each one rebuilds the kNN graph from its input.
        self.edgeconv1 = DynamicEdgeConv(MLP([2 * 3, 64]), k, aggr)
        self.edgeconv2 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv3 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
        self.edgeconv4 = DynamicEdgeConv(MLP([2 * 64, 128]), k, aggr)

        # One attention layer after each EdgeConv, with matching width.
        self.pointGATconv1 = PointGATConv(64, 64)
        self.pointGATconv2 = PointGATConv(64, 64)
        self.pointGATconv3 = PointGATConv(64, 64)
        self.pointGATconv4 = PointGATConv(128, 128)

        # Concatenated features of all four layers (320) -> 1024.
        self.lin1 = MLP([128 + 64 + 64 + 64, 1024])

        self.mlp = Seq(
            MLP([1024, 512]), Dropout(0.5), MLP([512, 256]), Dropout(0.5),
            Lin(256, out_channels))

    def forward(self, data):
        pos, batch = data.pos, data.batch
        # print(batch[0], batch[sample_points-1], batch[sample_points+1], batch[sample_points*2-1], batch[sample_points*2])
        # print('pos shape is :{}'.format(pos.shape))
        x1, edge_index = self.edgeconv1(pos, batch)
        # print('layer1 out shape :{}'.format(x1.shape))
        x1 = self.pointGATconv1(x1, pos, edge_index)

        x2, edge_index = self.edgeconv2(x1, batch)
        x2 = self.pointGATconv2(x2, pos, edge_index)

        x3, edge_index = self.edgeconv3(x2, batch)
        x3 = self.pointGATconv3(x3, pos, edge_index)

        x4, edge_index = self.edgeconv4(x3, batch)
        x4 = self.pointGATconv4(x4, pos, edge_index)

        # x2 = self.conv2(x1, batch)
        out = self.lin1(torch.cat([x1, x2, x3, x4], dim=1))
        out = global_max_pool(out, batch)
        out = self.mlp(out)
        return F.log_softmax(out, dim=1)

 

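class PointAttentionNet(torch.nn.Module):
    def __init__(self, out_channels, k):
        super().__init__()
        self.k = k
        self.in_channels = 3
        self.conv = PointGATConv(self.in_channels, out_channels)

    def forward(self, data):
        # Takes a batch object like Net does, since train() calls model(data).
        pos, batch = data.pos, data.batch
        # A plain torch.nn.Module has no `flow` attribute, so the direction is
        # given explicitly; it matches the MessagePassing default.
        edge_index = knn_graph(pos, self.k, batch, loop=False,
                               flow='source_to_target')
        # Per-point output of shape [num_points, out_channels]; no pooling yet.
        return self.conv(pos, pos, edge_index)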

    

 

 

 

def train():
    model.train()

    total_loss = 0
    for data in tqdm(train_loader, ascii=True):
        data = data.to(device)
        optimizer.zero_grad()
        out = model(data)
        # The debugging `break` that used to follow the forward pass is
        # removed; it made the loss and optimizer steps unreachable.
        loss = F.nll_loss(out, data.y)
        loss.backward()
        total_loss += loss.item() * data.num_graphs
        optimizer.step()
    return total_loss / len(train_dataset)

 

 

def test(loader):
    model.eval()

    correct = 0
    for data in tqdm(loader, ascii=True):
        data = data.to(device)
        with torch.no_grad():
            pred = model(data).max(dim=1)[1]
        correct += pred.eq(data.y).sum().item()
    return correct / len(loader.dataset)

 

if __name__ == '__main__':
    sample_points = 1024
    path = osp.join(osp.dirname(osp.realpath(__file__)), '..', 'data/ModelNet10')
    pre_transform, transform = T.NormalizeScale(), T.SamplePoints(sample_points)
    train_dataset = ModelNet(path, '10', True, transform, pre_transform)
    test_dataset = ModelNet(path, '10', False, transform, pre_transform)
    # num_workers is left at its default; num_workers=6 failed on Windows.
    train_loader = DataLoader(
        train_dataset, batch_size=8, shuffle=True)
    test_loader = DataLoader(
        test_dataset, batch_size=8, shuffle=False)

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    # To try the standalone attention layer instead, swap in:
    # model = PointAttentionNet(train_dataset.num_classes, k=20).to(device)

    # train() needs an optimizer, so the full training setup is enabled here.
    model = Net(train_dataset.num_classes, k=20).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
    for epoch in range(1, 201):
        loss = train()
        test_acc = test(test_loader)
        print('Epoch {:03d}, Loss: {:.4f}, Test: {:.4f}'.format(
            epoch, loss, test_acc))
        scheduler.step()
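DynamicEdgeConv rebuilds the neighbor graph from the current features at every layer by calling `knn_graph`. As a rough, brute-force illustration of what that neighbor search does, here is a pure-Python sketch. The helper `knn_edges` is hypothetical and is not the torch_cluster implementation; it returns edges as (neighbor j, center i) pairs, so that messages flow from the neighbor to the center.

```python
def knn_edges(points, k):
    """Return (j, i) pairs where j is one of the k nearest neighbors of i.

    Distances are squared Euclidean; self-loops are excluded.
    """
    edges = []
    for i, p in enumerate(points):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        edges.extend((j, i) for _, j in dists[:k])
    return edges


# The same three points yield different graphs once their features change,
# which is what makes the graph "dynamic" from layer to layer.
coords = [[0.0], [1.0], [10.0]]
hidden = [[0.0], [9.0], [10.0]]
print(knn_edges(coords, k=1))
print(knn_edges(hidden, k=1))
```

In the first graph point 1 attaches to point 0, in the second to point 2, even though the point set is the same.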