We present detailed performance results on various datasets and modalities.
All checkpoints and training logs are provided on Google Drive. We sincerely hope this repo is helpful for your research.
The detailed results (accuracy, %) for the pretrained models are shown below:
| Modality | NTU 60 X-Sub | NTU 60 X-View | NTU 120 X-Sub | NTU 120 X-Set | Kinetics-Skeleton | FineGYM |
|---|---|---|---|---|---|---|
| Joint | 91.54 | 96.33 | 85.52 | 88.35 | 48.02 | 93.28 |
| Bone | 91.98 | 96.15 | 88.96 | 90.01 | 47.06 | 94.84 |
| K-Bone | 91.59 | 96.61 | 88.30 | 89.65 | 45.86 | 94.44 |
| 2-ensemble | 92.96 | 97.23 | 89.75 | 91.23 | 49.85 | 95.35 |
| 4-ensemble | 93.53 | 97.49 | 90.43 | 91.86 | 51.33 | 95.62 |
| 6-ensemble | 93.81 | 97.76 | 90.92 | 92.16 | 51.85 | 95.94 |
We adopt the widely used six-stream ensemble strategy introduced in InfoGCN, where K-Bone denotes the new skeleton representation proposed by InfoGCN. Interestingly, we find that the improvement from multi-stream ensembling mainly comes from complementarity and stochasticity: for well-performing models, stochastic boosting within a single modality is more effective than complementary boosting with motion modalities. Detailed comparisons across datasets are provided in {dataset}_ensemble.py.
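The multi-stream ensemble above amounts to fusing per-stream class scores before taking the argmax. A minimal sketch, assuming each stream's scores are stored as a `(num_samples, num_classes)` array (the stream names, weights, and toy numbers below are illustrative, not the repo's actual data):

```python
import numpy as np

def ensemble_scores(stream_scores, weights=None):
    """Weighted sum of per-stream score matrices; argmax gives the fused prediction."""
    if weights is None:
        weights = [1.0] * len(stream_scores)
    fused = sum(w * s for w, s in zip(weights, stream_scores))
    return fused.argmax(axis=1)

# Toy example: two streams (say Joint and Bone), three samples, four classes.
joint = np.array([[0.6, 0.2, 0.1, 0.1],
                  [0.1, 0.5, 0.3, 0.1],
                  [0.2, 0.2, 0.3, 0.3]])
bone = np.array([[0.5, 0.3, 0.1, 0.1],
                 [0.2, 0.2, 0.5, 0.1],
                 [0.1, 0.1, 0.1, 0.7]])
pred = ensemble_scores([joint, bone])
```

With equal weights, the second sample is decided by the Bone stream's stronger score on class 2, illustrating the complementarity effect; a 6-ensemble simply extends the list of score matrices.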
In addition, we use two augmentation techniques, Flip and Part Drop, which can provide performance gains. Note that, due to their randomness, they may also cause performance fluctuations. You can choose whether to use these augmentations based on your actual needs.
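As a rough sketch of what these two augmentations typically do on skeleton tensors of shape `(C, T, V)` (coordinates, frames, joints): Flip mirrors the x-axis and swaps left/right joint pairs, and Part Drop zeroes out all joints of one randomly chosen body part. The joint indices and part groups below are illustrative assumptions for a hypothetical 8-joint skeleton; the real definitions live in the repo's augmentation code.

```python
import numpy as np

# Illustrative joint groups (assumption, not the repo's actual layout).
LEFT = [1, 2, 3]
RIGHT = [4, 5, 6]
PARTS = {"left_arm": [1, 2, 3], "right_arm": [4, 5, 6], "trunk": [0, 7]}

def flip(skeleton):
    """Flip augmentation: mirror x coordinates, then swap left/right joints.
    skeleton shape: (C, T, V) = (coordinates, frames, joints)."""
    out = skeleton.copy()
    out[0] = -out[0]                                 # mirror along the x-axis
    out[..., LEFT + RIGHT] = out[..., RIGHT + LEFT]  # swap paired joints
    return out

def part_drop(skeleton, rng):
    """Part Drop augmentation: zero out all joints of one random body part."""
    part = rng.choice(list(PARTS))
    out = skeleton.copy()
    out[:, :, PARTS[part]] = 0.0
    return out

rng = np.random.default_rng(0)
skel = np.arange(3 * 2 * 8, dtype=float).reshape(3, 2, 8)
flipped = flip(skel)
dropped = part_drop(skel, rng)
```

Because Part Drop picks the dropped part at random per call, repeated runs see different inputs, which is the source of the fluctuation mentioned above.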