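Combining the ideas of DGCNN and GAT

Problem: in fine-grained point cloud classification based on graph convolution, features get contaminated. The goal is to reduce how much points of a different class contaminate the features of a source point.

Motivation: dynamic graph convolutional networks are known to extract the semantic features of point clouds well. However, they treat all neighboring points as roughly equally important, whereas fine-grained classification depends heavily on local geometric features. The key task for fine-grained classification is therefore to aggregate the features of the points that matter to the source point and to keep the features of unimportant points out.

Method: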

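1. DGCNN has two core ideas: the dynamic graph and edge convolution (EdgeConv). The core of EdgeConv is to use the combination [source point feature, neighbor feature - source point feature], i.e. [x_i, x_j - x_i], as the new feature of each neighbor.
2. Graph Attention Convolution for Point Cloud Semantic Segmentation (GAC): a channel-wise graph attention mechanism. The core is to combine the difference of the raw coordinates of the source point and a neighbor with the difference of their hidden features, use this combination as a new feature, and compute a channel-wise attention from the same combination.

For each source point, every neighbor has its own attention coefficient for every channel, so each channel of a point's output is the result of aggregating neighbor information with that channel's attention coefficients.

$$\tilde{a}_{ij,k} = \frac{\exp(a_{ij,k})}{\sum_{l \in \mathcal{N}(i)} \exp(a_{il,k})} \tag{3}$$

where $\tilde{a}_{ij,k}$ is the attentional weight of vertex $j$ to vertex $i$ at the $k$-th feature channel. Therefore, the final output of the proposed GAC can be formulated as follows:

$$h'_i = \sum_{j \in \mathcal{N}(i)} \tilde{a}_{ij} * M_g(h_j) + b_i \tag{4}$$

Here, following the GAC paper, $*$ is the element-wise product, $M_g$ is the feature-transform MLP and $b_i$ is a learnable bias.

3. Graph Attention Network (GAT): for the neighbors of each point, an attention mechanism is built with a small MLP. The size of the attention coefficient is determined by the similarity between two hidden features, so this is node-level attention.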
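The channel-wise attention described for GAC above can be sketched in plain PyTorch. This is a minimal sketch, not the GAC authors' code: the class name `ChannelwiseNeighborAttention`, the dense `[N, k]` neighbor layout, the single linear layers standing in for the MLPs, and the shared (not per-vertex) bias are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelwiseNeighborAttention(nn.Module):
    """Per-channel attention over k neighbours (hypothetical helper, dense layout)."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.feat = nn.Linear(in_channels, out_channels)      # stands in for M_g
        self.att = nn.Linear(3 + out_channels, out_channels)  # scores from [dp || dh]
        self.bias = nn.Parameter(torch.zeros(out_channels))   # shared bias (assumption)

    def forward(self, pos, h, nbr_idx):
        # pos: [N, 3], h: [N, C_in], nbr_idx: [N, k] neighbour indices of each point
        g = self.feat(h)                                # [N, K]
        dp = pos[nbr_idx] - pos.unsqueeze(1)            # coordinate differences [N, k, 3]
        dh = g[nbr_idx] - g.unsqueeze(1)                # feature differences    [N, k, K]
        scores = self.att(torch.cat([dp, dh], dim=-1))  # [N, k, K]
        alpha = F.softmax(scores, dim=1)                # normalise over neighbours, per channel
        return (alpha * g[nbr_idx]).sum(dim=1) + self.bias


torch.manual_seed(0)
pos = torch.rand(32, 3)
h = torch.rand(32, 16)
# Brute-force 4 nearest neighbours in coordinate space, excluding the point itself.
dist = torch.cdist(pos, pos)
dist.fill_diagonal_(float('inf'))
nbr_idx = dist.topk(4, largest=False).indices
layer = ChannelwiseNeighborAttention(16, 8)
out = layer(pos, h, nbr_idx)
print(out.shape)
```

Because the softmax runs over the neighbor dimension separately for each of the K channels, one neighbor can dominate one channel and be nearly ignored in another, which is the difference from node-level GAT attention.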

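For point cloud classification, can items 1 and 3 (DGCNN and GAT) be combined, so that the attention mechanism measures similarity not only in the hidden feature space but also in the original 3D space, while also taking differences into account? That is, obtain the attention coefficient from [p,h, ], considering the raw 3D coordinates as well as the differences. The main open question is how to combine these terms; that is left to the experiments.

One problem: attention mechanisms are all built on comparing similarity, which means the source-point feature and the neighbor feature fed to the attention mechanism should be of the same kind. How can such a common feature be built?
The EdgeConv feature [x_i, x_j - x_i] represents the source point well, but how should the matching feature for a neighbor be computed in an attention mechanism? GAT simply compares the similarity of hidden features, which does not suit data with geometric structure such as point clouds, so how can the raw point cloud features be brought in?
Build the attention mechanism from a form such as (p_i, h(x_i), p_j, h(x_j))?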

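A pure PyTorch implementation is hard to extend and tedious to write; PyTorch Geometric (PyG) is more convenient.

Below is the PyG implementation.

Implementation procedure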

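Computing the attention coefficient:

Use x_i and x_j - x_i to compute a similarity coefficient. [Many kinds of similarity measure are possible here; finding a good one is worth studying.]

Then use a two-layer MLP (in -> out -> out) to map [x_i, x_j - x_i] to a higher dimension, and aggregate the features using the similarity coefficients.

(The bias has not been added yet.)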
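The procedure above can be sketched on an edge list without PyG. This is a sketch under assumptions rather than the layer used in the experiments: the similarity coefficient is taken to be a learned linear score on [x_i, x_j - x_i] followed by a softmax over each point's neighbors, and `segment_softmax` and `EdgeAttentionAggregate` are hypothetical names. All bias terms are switched off to match the note.

```python
import torch
import torch.nn as nn


def segment_softmax(scores, index, num_nodes):
    """Softmax of per-edge scores over all edges that share the same target node."""
    exp = (scores - scores.max()).exp()  # constant shift keeps the result unchanged
    denom = torch.zeros(num_nodes, dtype=exp.dtype).index_add_(0, index, exp)
    return exp / denom[index]


class EdgeAttentionAggregate(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # Two-layer MLP (in -> out -> out) on the edge feature, no bias.
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_channels, out_channels, bias=False),
            nn.ReLU(),
            nn.Linear(out_channels, out_channels, bias=False),
        )
        # Assumed similarity measure: a learned linear score on the edge feature.
        self.score = nn.Linear(2 * in_channels, 1, bias=False)

    def forward(self, x, edge_index):
        src, dst = edge_index  # messages flow from neighbour j (src) to centre i (dst)
        x_i, x_j = x[dst], x[src]
        edge_feat = torch.cat([x_i, x_j - x_i], dim=1)  # [E, 2 * C_in]
        alpha = segment_softmax(self.score(edge_feat).squeeze(-1), dst, x.size(0))
        msg = self.mlp(edge_feat) * alpha.unsqueeze(-1)  # weighted messages [E, C_out]
        out = torch.zeros(x.size(0), msg.size(1), dtype=msg.dtype)
        return out.index_add_(0, dst, msg), alpha


torch.manual_seed(0)
x = torch.rand(5, 3)
edge_index = torch.tensor([[1, 2, 0, 2, 0, 1, 4, 3],
                           [0, 0, 1, 1, 2, 2, 3, 4]])
layer = EdgeAttentionAggregate(3, 8)
out, alpha = layer(x, edge_index)
print(out.shape, alpha.shape)
```

Swapping `self.score` for another measure (cosine similarity, a deeper MLP, a term on the raw coordinates) changes only the line that computes `alpha`, which makes it easy to compare similarity choices.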

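Finally, the graph embedding of the point cloud is obtained with max and mean pooling.

Could it be obtained with attention instead, by constructing a central node,

for example the mean of all points, and computing attention with respect to it?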
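One way the attention readout asked about above could look is sketched here next to the max and mean readout. It is only an illustration of the idea: `MeanQueryReadout` is a hypothetical name, and the scaled dot-product score between each point and the mean "central node" is an assumed choice, not something tested in these notes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def max_mean_readout(x):
    """Baseline graph embedding: global max and global mean, concatenated."""
    return torch.cat([x.max(dim=0).values, x.mean(dim=0)], dim=0)


class MeanQueryReadout(nn.Module):
    """Attention pooling that uses the mean feature as a virtual central node."""

    def __init__(self, channels):
        super().__init__()
        self.query = nn.Linear(channels, channels, bias=False)
        self.key = nn.Linear(channels, channels, bias=False)

    def forward(self, x):
        # x: [N, C] point features of a single point cloud
        center = x.mean(dim=0, keepdim=True)                     # [1, C]
        scores = (self.key(x) * self.query(center)).sum(dim=-1)  # [N]
        alpha = F.softmax(scores / x.size(1) ** 0.5, dim=0)      # weights over the points
        return (alpha.unsqueeze(-1) * x).sum(dim=0), alpha


torch.manual_seed(0)
x = torch.rand(100, 16)
baseline = max_mean_readout(x)
readout = MeanQueryReadout(16)
emb, alpha = readout(x)
print(baseline.shape, emb.shape)
```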

Setting up PyG mainly requires installing a VC++ 14 compiler so that some of the dependencies can be compiled.

Then git clone the PyG repository and install it.

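    # Save the entire network
    torch.save(net, PATH)
    # Save only the network's parameters (faster, takes less space)
    torch.save(net.state_dict(), PATH)
    # --------------------------------------------------
    # The matching ways to load are, respectively:
    model = torch.load(PATH)
    model.load_state_dict(torch.load(PATH))  # loads in place; the return value is not the model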
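The parameter-only route can be checked end to end with a small round trip. This is a minimal sketch: the `nn.Linear` model and the temporary file name are placeholders, not the network from these notes.

```python
import os
import tempfile

import torch
import torch.nn as nn

net = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'net_params.pt')
    torch.save(net.state_dict(), path)  # parameters only

    restored = nn.Linear(4, 2)          # the architecture must be rebuilt first
    result = restored.load_state_dict(torch.load(path))

# load_state_dict fills `restored` in place and returns a report of
# missing and unexpected keys, not the model itself.
print(result)
print(torch.equal(net.weight, restored.weight))
```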

Source: https://zhuanlan.zhihu.com/p/38056115

There is a bug on Windows: with num_workers > 0 the DataLoader always raises an error and cannot run.
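A workaround for the Windows problem above is to fall back to zero workers on that platform. The helper below is hypothetical (`safe_num_workers` is not part of PyTorch or PyG). A frequent cause of this kind of failure is that Windows starts worker processes by re-importing the script, so DataLoader code has to sit under the `__main__` guard; whether that is the cause here has not been verified.

```python
import platform


def safe_num_workers(requested, system=None):
    """Return 0 on Windows, otherwise the requested worker count (hypothetical helper)."""
    system = platform.system() if system is None else system
    return 0 if system == 'Windows' else requested


if __name__ == '__main__':
    # Worker processes on Windows re-import this file, so anything that
    # builds a DataLoader with num_workers > 0 belongs under this guard.
    print(safe_num_workers(6))
```

The result would then be passed as `num_workers=safe_num_workers(6)` when constructing the loaders.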

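ModelNet40:

Experiments:

Can the original network structure with SGD reach 92.9%, with every setting identical to the paper?

The main network structure is 64, 64, 128, 256 -> concatenated to 512 dimensions

then 512 -> 1024 dimensions

two different poolings, max and avg

give a 2048-dimensional vector

which goes through 2048 -> 512 -> 256 -> out to produce the output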
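The dimension bookkeeping of the structure above can be checked with a few lines of arithmetic. The variable names are illustrative only.

```python
edge_conv_out = [64, 64, 128, 256]   # output width of each EdgeConv layer
concat_dim = sum(edge_conv_out)      # per-point features of all layers, concatenated
embed_dim = 1024                     # width after the shared 512 -> 1024 layer
pooled_dim = 2 * embed_dim           # global max pool and global avg pool, concatenated
classifier = [pooled_dim, 512, 256]  # followed by the final layer to the class scores

print(concat_dim, pooled_dim)  # 512 2048
```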

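Experiments:

ModelNet10:

ModelNet40:

The original paper reports 92.9%, but my best reproduction so far is 92.0%.

(My reproduction does not use data augmentation; I have not yet figured out how data augmentation works in PyG.)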
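One way to add the missing augmentation is a small callable transform applied to each sample. The sketch below assumes the augmentation in question is the per-axis random scale and shift used in the DGCNN reference code; the ranges (scale in [2/3, 3/2], shift in [-0.2, 0.2]) are that assumption, and `RandomScaleShift` is a hypothetical class, not a PyG transform.

```python
from types import SimpleNamespace

import torch


class RandomScaleShift:
    """Callable transform: per-axis random scale and shift of `data.pos`."""

    def __init__(self, scale=(2.0 / 3.0, 3.0 / 2.0), shift=0.2):
        self.scale = scale
        self.shift = shift

    def __call__(self, data):
        lo, hi = self.scale
        scale = torch.empty(3).uniform_(lo, hi)                   # one factor per axis
        shift = torch.empty(3).uniform_(-self.shift, self.shift)  # one offset per axis
        data.pos = data.pos * scale + shift
        return data


torch.manual_seed(0)
sample = SimpleNamespace(pos=torch.rand(1024, 3))  # stand-in for a PyG Data object
original = sample.pos.clone()
augmented = RandomScaleShift()(sample)
print(augmented.pos.shape)
```

Because it only reads and writes `data.pos`, such a callable could be chained after `T.SamplePoints` with `T.Compose` and passed as the `transform` of the training set only, leaving the test set unaugmented.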

[Figure: accuracy curve of the DGCNN reproduction, 1024 points, plotted against epochs; highest reading 0.920178]

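Reproduction with the original GAT added:

A GAT layer is added after every layer; the batch size has to be changed to 16.

Results with the improved GAT:

Experiment log: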


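ModelNet10:

This network structure (the `Net` class in the script below) is trained with Adam, lr = 0.001.

Adding attention gives a clear improvement of about 2 percentage points over the network without it.

However, the original network did not reach 92.9%, only about 90.2%; with attention added it reached 92.2%.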

    import os.path as osp

    import torch
    import torch.nn.functional as F
    from torch.nn import Sequential as Seq, Dropout, Linear as Lin
    from torch.nn import Parameter
    from torch_geometric.datasets import ModelNet
    import torch_geometric.transforms as T
    from torch_geometric.data import DataLoader
    from torch_geometric.nn import global_max_pool  # DynamicEdgeConv is redefined below
    from torch_geometric.nn.conv import MessagePassing
    from torch_geometric.nn.inits import glorot, zeros, reset
    from torch_geometric.utils import add_remaining_self_loops
    from torch_geometric.utils import remove_self_loops, add_self_loops, softmax
    from torch_scatter import scatter_add
    from torch_cluster import knn_graph
    from pointnet2_classification import MLP
    from tqdm import tqdm

    class EdgeConv(MessagePassing):
        r"""The edge convolutional operator from the `"Dynamic Graph CNN for
        Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper

        .. math::
            \mathbf{x}^{\prime}_i = \sum_{j \in \mathcal{N}(i)}
            h_{\mathbf{\Theta}}(\mathbf{x}_i \, \Vert \,
            \mathbf{x}_j - \mathbf{x}_i),

        where :math:`h_{\mathbf{\Theta}}` denotes a neural network, *i.e.* a MLP.

        Args:
            nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
                maps pair-wise concatenated node features :obj:`x` of shape
                :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
                *e.g.*, defined by :class:`torch.nn.Sequential`.
            aggr (string, optional): The aggregation scheme to use
                (:obj:`"add"`, :obj:`"mean"`, :obj:`"max"`).
                (default: :obj:`"max"`)
            **kwargs (optional): Additional arguments of
                :class:`torch_geometric.nn.conv.MessagePassing`.
        """

        def __init__(self, nn, aggr='max', **kwargs):
            super(EdgeConv, self).__init__(aggr=aggr, **kwargs)
            self.nn = nn
            self.reset_parameters()

        def reset_parameters(self):
            reset(self.nn)

        def forward(self, x, edge_index):
            """"""
            x = x.unsqueeze(-1) if x.dim() == 1 else x
            return self.propagate(edge_index, x=x)

        def message(self, x_i, x_j):
            # Edge feature: centre point feature and its difference to the neighbour.
            return self.nn(torch.cat([x_i, x_j - x_i], dim=1))

        def __repr__(self):
            return '{}(nn={})'.format(self.__class__.__name__, self.nn)

    class DynamicEdgeConv(EdgeConv):
        r"""The dynamic edge convolutional operator from the `"Dynamic Graph CNN
        for Learning on Point Clouds" <https://arxiv.org/abs/1801.07829>`_ paper
        (see :class:`torch_geometric.nn.conv.EdgeConv`), where the graph is
        dynamically constructed using nearest neighbors in the feature space.

        Args:
            nn (torch.nn.Module): A neural network :math:`h_{\mathbf{\Theta}}` that
                maps pair-wise concatenated node features :obj:`x` of shape
                :obj:`[-1, 2 * in_channels]` to shape :obj:`[-1, out_channels]`,
                *e.g.* defined by :class:`torch.nn.Sequential`.
            k (int): Number of nearest neighbors.
            aggr (string): The aggregation operator to use (:obj:`"add"`,
                :obj:`"mean"`, :obj:`"max"`). (default: :obj:`"max"`)
            **kwargs (optional): Additional arguments of
                :class:`torch_geometric.nn.conv.MessagePassing`.
        """

        def __init__(self, nn, k, aggr='max', **kwargs):
            super(DynamicEdgeConv, self).__init__(nn=nn, aggr=aggr, **kwargs)
            self.k = k

        def forward(self, x, batch=None):
            """"""
            edge_index = knn_graph(x, self.k, batch, loop=False, flow=self.flow)
            # Unlike the PyG version, also return the kNN graph so that the
            # following attention layer can reuse it.
            return super(DynamicEdgeConv, self).forward(x, edge_index), edge_index

        def __repr__(self):
            return '{}(nn={}, k={})'.format(self.__class__.__name__, self.nn,
                                            self.k)

    class PointGATConv(MessagePassing):
        r"""The graph attentional operator from the `"Graph Attention Networks"
        <https://arxiv.org/abs/1710.10903>`_ paper

        .. math::
            \mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} +
            \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j},

        where the attention coefficients :math:`\alpha_{i,j}` are computed as

        .. math::
            \alpha_{i,j} =
            \frac{
            \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
            [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j]
            \right)\right)}
            {\sum_{k \in \mathcal{N}(i) \cup \{ i \}}
            \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}
            [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k]
            \right)\right)}.

        Args:
            in_channels (int): Size of each input sample.
            out_channels (int): Size of each output sample.
            heads (int, optional): Number of multi-head-attentions.
                (default: :obj:`1`)
            concat (bool, optional): If set to :obj:`False`, the multi-head
                attentions are averaged instead of concatenated.
                (default: :obj:`True`)
            negative_slope (float, optional): LeakyReLU angle of the negative
                slope. (default: :obj:`0.2`)
            dropout (float, optional): Dropout probability of the normalized
                attention coefficients which exposes each node to a stochastically
                sampled neighborhood during training. (default: :obj:`0`)
            bias (bool, optional): If set to :obj:`False`, the layer will not learn
                an additive bias. (default: :obj:`True`)
            **kwargs (optional): Additional arguments of
                :class:`torch_geometric.nn.conv.MessagePassing`.
        """

        def __init__(self, in_channels, out_channels, heads=1, concat=True,
                     negative_slope=0.2, dropout=0, bias=True, **kwargs):
            super(PointGATConv, self).__init__(aggr='add', **kwargs)

            self.in_channels = in_channels
            self.out_channels = out_channels
            self.heads = heads
            self.concat = concat
            self.negative_slope = negative_slope
            self.dropout = dropout

            self.weight = Parameter(
                torch.Tensor(in_channels, heads * out_channels))
            self.att = Parameter(torch.Tensor(1, heads, 2 * out_channels))

            if bias and concat:
                self.bias = Parameter(torch.Tensor(heads * out_channels))
            elif bias and not concat:
                self.bias = Parameter(torch.Tensor(out_channels))
            else:
                self.register_parameter('bias', None)

            self.reset_parameters()

        def reset_parameters(self):
            glorot(self.weight)
            glorot(self.att)
            zeros(self.bias)

        def forward(self, x, pos, edge_index, size=None):
            """"""
            # Note: `pos` is accepted for the planned geometry-aware attention
            # but is not used yet, so this layer currently behaves like plain GAT.
            if size is None and torch.is_tensor(x):
                edge_index, _ = remove_self_loops(edge_index)
                edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

            if torch.is_tensor(x):
                x = torch.matmul(x, self.weight)
            else:
                x = (None if x[0] is None else torch.matmul(x[0], self.weight),
                     None if x[1] is None else torch.matmul(x[1], self.weight))

            return self.propagate(edge_index, size=size, x=x)

        def message(self, edge_index_i, x_i, x_j, size_i):
            # Compute attention coefficients.
            x_j = x_j.view(-1, self.heads, self.out_channels)
            if x_i is None:
                alpha = (x_j * self.att[:, :, self.out_channels:]).sum(dim=-1)
            else:
                x_i = x_i.view(-1, self.heads, self.out_channels)
                alpha = (torch.cat([x_i, x_j], dim=-1) * self.att).sum(dim=-1)

            alpha = F.leaky_relu(alpha, self.negative_slope)
            alpha = softmax(alpha, edge_index_i, size_i)

            # Sample attention coefficients stochastically.
            alpha = F.dropout(alpha, p=self.dropout, training=self.training)

            return x_j * alpha.view(-1, self.heads, 1)

        def update(self, aggr_out):
            if self.concat is True:
                aggr_out = aggr_out.view(-1, self.heads * self.out_channels)
            else:
                aggr_out = aggr_out.mean(dim=1)

            if self.bias is not None:
                aggr_out = aggr_out + self.bias
            return aggr_out

        def __repr__(self):
            return '{}({}, {}, heads={})'.format(self.__class__.__name__,
                                                 self.in_channels,
                                                 self.out_channels, self.heads)

    class Net(torch.nn.Module):
        def __init__(self, out_channels, k=20, aggr='max'):
            super().__init__()

            self.edgeconv1 = DynamicEdgeConv(MLP([2 * 3, 64]), k, aggr)
            self.edgeconv2 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
            self.edgeconv3 = DynamicEdgeConv(MLP([2 * 64, 64]), k, aggr)
            self.edgeconv4 = DynamicEdgeConv(MLP([2 * 64, 128]), k, aggr)

            self.pointGATconv1 = PointGATConv(64, 64)
            self.pointGATconv2 = PointGATConv(64, 64)
            self.pointGATconv3 = PointGATConv(64, 64)
            self.pointGATconv4 = PointGATConv(128, 128)

            self.lin1 = MLP([128 + 64 + 64 + 64, 1024])
            self.mlp = Seq(
                MLP([1024, 512]), Dropout(0.5), MLP([512, 256]), Dropout(0.5),
                Lin(256, out_channels))

        def forward(self, data):
            pos, batch = data.pos, data.batch
            # print('pos shape is :{}'.format(pos.shape))

            # Each stage: EdgeConv on a fresh kNN graph, then attention on the same graph.
            x1, edge_index = self.edgeconv1(pos, batch)
            # print('layer1 out shape :{}'.format(x1.shape))
            x1 = self.pointGATconv1(x1, pos, edge_index)

            x2, edge_index = self.edgeconv2(x1, batch)
            x2 = self.pointGATconv2(x2, pos, edge_index)

            x3, edge_index = self.edgeconv3(x2, batch)
            x3 = self.pointGATconv3(x3, pos, edge_index)

            x4, edge_index = self.edgeconv4(x3, batch)
            x4 = self.pointGATconv4(x4, pos, edge_index)

            out = self.lin1(torch.cat([x1, x2, x3, x4], dim=1))
            out = global_max_pool(out, batch)
            out = self.mlp(out)
            return F.log_softmax(out, dim=1)

    class PointAttentionNet(torch.nn.Module):
        """Single PointGATConv layer on a kNN graph of the raw coordinates.

        It returns per-point features, so it only serves as a smoke test of the
        layer and cannot be trained with the graph-level loss in train().
        """

        def __init__(self, out_channels, k):
            super().__init__()
            self.k = k
            self.in_channels = 3
            self.conv = PointGATConv(self.in_channels, out_channels)

        def forward(self, data):
            pos, batch = data.pos, data.batch
            # torch.nn.Module has no `flow` attribute, so rely on knn_graph's default.
            edge_index = knn_graph(pos, self.k, batch, loop=False)
            return self.conv(pos, pos, edge_index)


    def train():
        model.train()

        total_loss = 0
        for data in tqdm(train_loader, ascii=True):
            data = data.to(device)
            optimizer.zero_grad()
            out = model(data)
            # The debugging `break` that stopped here before the loss was removed.
            loss = F.nll_loss(out, data.y)
            loss.backward()
            total_loss += loss.item() * data.num_graphs
            optimizer.step()
        return total_loss / len(train_dataset)


    def test(loader):
        model.eval()

        correct = 0
        for data in tqdm(loader, ascii=True):
            data = data.to(device)
            with torch.no_grad():
                pred = model(data).max(dim=1)[1]
            correct += pred.eq(data.y).sum().item()
        return correct / len(loader.dataset)


    if __name__ == '__main__':
        sample_points = 1024
        path = osp.join(osp.dirname(osp.realpath(__file__)), '..', 'data/ModelNet10')
        pre_transform, transform = T.NormalizeScale(), T.SamplePoints(sample_points)
        train_dataset = ModelNet(path, '10', True, transform, pre_transform)
        test_dataset = ModelNet(path, '10', False, transform, pre_transform)
        # num_workers is left at its default because num_workers > 0 fails on Windows.
        train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
        test_loader = DataLoader(test_dataset, batch_size=8, shuffle=False)

        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

        # Training run restored from the commented-out lines of the original script;
        # the one-batch PointAttentionNet debug call was dropped because it had no optimizer.
        model = Net(train_dataset.num_classes, k=20).to(device)
        optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)

        for epoch in range(1, 201):
            loss = train()
            test_acc = test(test_loader)
            print('Epoch {:03d}, Loss: {:.4f}, Test: {:.4f}'.format(
                epoch, loss, test_acc))
            scheduler.step()