Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition (2s-AGCN)

This paper mainly addresses two problems:

  1. In previous work the topology of the skeleton graph is fixed by hand and only reflects the physical structure of the human body → adaptive graph convolution
  2. In ST-GCN the feature vector of each vertex contains only the joint's 2D or 3D coordinates (first-order information); second-order information about the body (the direction and length of the bones) is not fully exploited → two-stream architecture (a rough sketch of the bone features follows this list)
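
As a rough illustration of where the bone stream's input comes from, the sketch below derives second-order (bone) features from first-order joint coordinates by subtracting each joint's parent joint. The parent pairs and array layout here are illustrative assumptions, not the dataset's actual skeleton definition.

```python
import numpy as np

# Illustrative child -> parent pairs; the real pairs come from the dataset's
# skeleton definition (e.g. NTU RGB+D defines 25 joints and their bones).
PARENTS = {1: 0, 2: 1, 3: 2, 4: 3}

def joints_to_bones(joints: np.ndarray, parents: dict) -> np.ndarray:
    """Second-order (bone) features: the vector from a joint's parent to the
    joint itself, which encodes both the direction and the length of the bone.

    joints: array of shape (C, T, V) -- C coordinates, T frames, V joints.
    """
    bones = np.zeros_like(joints)
    for child, parent in parents.items():
        bones[:, :, child] = joints[:, :, child] - joints[:, :, parent]
    return bones
```

In the two-stream setup, the joint coordinates and these bone features feed two networks of the same structure, and their softmax scores are fused to produce the final prediction.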

The authors demonstrate several points:

  1. The graph topology should change with the action (for example, when touching the head, the connection between the head and the hand should be stronger; for other actions this is not necessarily the case)
  2. Different samples need different graph topologies, and different layers need graphs with different topologies (see the adaptive adjacency sketch below)
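
To make the second point concrete, here is a minimal sketch of an adaptive spatial graph convolution in the spirit of 2s-AGCN: for each neighbor subset k, the fixed physical graph A_k, a freely learned per-layer graph B_k, and a per-sample data-dependent graph C_k (computed from embedded similarities) are summed before aggregation. The module names, embedding size, and exact normalization are assumptions of this sketch, not the paper's reference implementation, whose full block also includes batch normalization and a residual connection.

```python
import torch
import torch.nn as nn


class AdaptiveGraphConv(nn.Module):
    """Adaptive spatial graph convolution sketch: effective adjacency per
    subset is A_k (fixed physical graph) + B_k (learned per layer, shared
    across samples) + C_k (data-dependent, computed per sample)."""

    def __init__(self, in_channels, out_channels, A, embed_channels=16):
        super().__init__()
        K, V, _ = A.shape
        self.register_buffer("A", torch.as_tensor(A, dtype=torch.float32))  # fixed graph
        self.B = nn.Parameter(torch.zeros(K, V, V))                         # learned graph
        self.theta = nn.ModuleList([nn.Conv2d(in_channels, embed_channels, 1) for _ in range(K)])
        self.phi = nn.ModuleList([nn.Conv2d(in_channels, embed_channels, 1) for _ in range(K)])
        self.conv = nn.ModuleList([nn.Conv2d(in_channels, out_channels, 1) for _ in range(K)])

    def forward(self, x):                                    # x: (N, C, T, V)
        N, C, T, V = x.shape
        out = 0
        for k in range(len(self.conv)):
            # Data-dependent graph C_k: softmax-normalized embedded dot products.
            q = self.theta[k](x).permute(0, 3, 1, 2).reshape(N, V, -1)      # (N, V, C'*T)
            key = self.phi[k](x).reshape(N, -1, V)                          # (N, C'*T, V)
            Ck = torch.softmax(torch.bmm(q, key) / q.shape[-1], dim=-1)     # (N, V, V)
            Ak = self.A[k] + self.B[k] + Ck                                 # broadcast to (N, V, V)
            agg = torch.einsum("nctv,nvw->nctw", x, Ak)                     # aggregate over joints
            out = out + self.conv[k](agg)                                   # 1x1 conv acts as W_k
        return out
```

Because B_k is a free parameter it can differ from layer to layer, and because C_k is computed from the input it can differ from sample to sample, which is exactly what the two observations above call for.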

The skeleton data of each frame is a sequence of vectors (each joint is represented by a vector, and there are multiple joints). The paper follows ST-GCN's (spatio-temporal graph convolution) way of modeling the human skeleton: spatial edges (connections between joints) plus temporal edges (each joint connected to itself in adjacent frames).
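
A small sketch of how such a spatio-temporal graph can be represented in code: the spatial edges of one frame form an adjacency matrix over the joints, while the temporal edges (each joint linked to itself in neighboring frames) are realized simply as a convolution along the time axis. The edge list below is a toy example, not a real skeleton layout.

```python
import numpy as np

# Toy bone list for a 5-joint skeleton; a real edge list comes from the
# dataset's skeleton layout (e.g. 25 joints in NTU RGB+D).
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
NUM_JOINTS = 5

def build_spatial_adjacency(edges, num_joints):
    """Adjacency matrix of the spatial edges within a single frame
    (physical bone connections plus self-loops). Temporal edges are not
    materialized as a matrix: each joint connects only to itself in
    adjacent frames, which amounts to an ordinary convolution over T."""
    A = np.eye(num_joints)                  # self-connections
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0             # undirected bone connections
    return A
```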

(Spatial graph convolution formula) In the spatial dimension, for each vertex $v_i$, its feature vector at the next layer is computed as a weighted sum over all vertices $v_j$ whose distance to $v_i$ is at most 1 (the 1-neighborhood $B(v_i)$):

$$f_{out}(v_i)=\sum_{v_j\in B(v_i)}\frac{1}{Z_{ij}}\,f_{in}(v_j)\cdot w\bigl(l_i(v_j)\bigr)$$

where $Z_{ij}$ is a normalizing term (the cardinality of the subset that $v_j$ falls into) and $w(\cdot)$ is the weight function indexed by the label $l_i(v_j)$ assigned by the partitioning strategy.
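
In implementation the vertex-wise formula is usually rewritten in matrix form; the hypothetical helper below follows that form, assuming the K neighbor subsets from the partitioning strategy are given as adjacency matrices with self-loops already included, so that degree normalization takes the place of $1/Z_{ij}$.

```python
import numpy as np

def spatial_graph_conv(f_in, A_subsets, W_subsets):
    """Matrix form of the spatial graph convolution,
        f_out = sum_k W_k f_in (D_k^{-1/2} A_k D_k^{-1/2}),
    with one adjacency matrix A_k and one weight matrix W_k per neighbor subset.

    f_in:      (C_in, V)     joint features of one frame
    A_subsets: (K, V, V)     adjacency per subset, self-loops included
    W_subsets: (K, C_out, C_in)
    """
    f_out = 0
    for A_k, W_k in zip(A_subsets, W_subsets):
        deg = A_k.sum(axis=1)
        d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-6)))  # guard empty rows
        A_norm = d_inv_sqrt @ A_k @ d_inv_sqrt                      # plays the role of 1/Z_ij
        f_out = f_out + W_k @ f_in @ A_norm
    return f_out
```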
