We release the code of our ACM MM 2020 paper MGMA. If you find this code useful, please cite:
@inproceedings{shi2020multi,
  title={Multi-Group Multi-Attention: Towards Discriminative Spatiotemporal Representation},
  author={Shi, Zhensheng and Cao, Liangjie and Guan, Cheng and Liang, Ju and Li, Qianqian and Gu, Zhaorui and Zheng, Haiyong and Zheng, Bing},
  booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
  pages={2057--2066},
  year={2020}
}
MGMA (Multi-Group Multi-Attention) is proposed for learning discriminative spatiotemporal representations in videos by paying more attention to "where and when" the action happens. We devise a lightweight, flexible, yet effective MGMA module for 3D-CNNs, which models spatiotemporal separable attention (T-ATTN and S-ATTN) and adopts a multi-group structure to capture multi-attention information.
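For intuition, below is a minimal PyTorch sketch of this idea. It is an illustration only, not the exact module in lib/models/mgma_builder.py: the class name `MGMASketch`, the kernel sizes, and the sigmoid gating are assumptions made for the example.

```python
import torch
import torch.nn as nn

class MGMASketch(nn.Module):
    """Illustrative multi-group spatiotemporal separable attention (assumed design)."""
    def __init__(self, channels, num_groups=8):
        super().__init__()
        assert channels % num_groups == 0
        # T-ATTN: temporal-only kernel (3x1x1); S-ATTN: spatial-only kernel (1x3x3).
        # groups=num_groups gives each channel group its own attention branch.
        self.t_attn = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1),
                                padding=(1, 0, 0), groups=num_groups, bias=False)
        self.s_attn = nn.Conv3d(channels, channels, kernel_size=(1, 3, 3),
                                padding=(0, 1, 1), groups=num_groups, bias=False)

    def forward(self, x):  # x: (N, C, T, H, W)
        attn = torch.sigmoid(self.s_attn(self.t_attn(x)))
        return x * attn    # re-weight features by "where and when" attention

x = torch.randn(2, 64, 8, 56, 56)
print(MGMASketch(64, num_groups=8)(x).shape)  # torch.Size([2, 64, 8, 56, 56])
```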
This code is based on the PySlowFast codebase. The core MGMA implementations are `lib/models/mgma_builder.py` and `lib/models/mgma_helper.py`, with an efficient variant in `lib/models/mgma_shufflenet_builder.py`.
- Python >= 3.7
- Numpy
- PyTorch 1.8
- fvcore:
pip install 'git+https://github.com/facebookresearch/fvcore'
- torchvision that matches the PyTorch installation. You can install them together at pytorch.org to make sure of this, or install torchvision separately via pip install torchvision or conda install torchvision -c pytorch
- simplejson:
pip install simplejson
- GCC >= 4.9
- PyAV:
conda install av -c conda-forge
- ffmpeg (4.0 is preferred; will be installed along with PyAV)
- PyYaml: (will be installed along with fvcore)
- tqdm: (will be installed along with fvcore)
- iopath:
pip install -U iopath
or conda install -c iopath iopath
- psutil:
pip install psutil
- OpenCV:
pip install opencv-python
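With the dependencies above installed, the following optional snippet (our suggestion, not part of the repository) verifies that the key packages import and prints their versions:

```python
# Sanity-check the environment: all imports below should succeed.
import torch, torchvision, av, cv2, simplejson, fvcore, iopath, psutil

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torchvision:", torchvision.__version__, "| PyAV:", av.__version__)
```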
- Download the dataset and annotations from the dataset provider.
- Prepare the csv files for the training, validation, and testing sets as `train.csv`, `val.csv`, and `test.csv`. The format of each csv file is:
path_to_video_1 label_1
path_to_video_2 label_2
path_to_video_3 label_3
...
path_to_video_N label_N
- Put all csv files in the same folder, and set `DATA.PATH_TO_DATA_DIR` to that path. A minimal sketch for generating these csv files follows this list.
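Below is a minimal sketch (an assumed helper, not part of this repository) that writes a csv in the `path_to_video label` format above from a dataset laid out as `root/class_name/video_file`; the alphabetical class-to-index mapping is an assumption and should be checked against the dataset's official annotations.

```python
import os

def write_csv(root, out_csv):
    # Assign integer labels to class folders in alphabetical order (assumption).
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    label = {c: i for i, c in enumerate(classes)}
    with open(out_csv, "w") as f:
        for c in classes:
            for v in sorted(os.listdir(os.path.join(root, c))):
                # One line per video: space-separated path and integer label.
                f.write(f"{os.path.join(root, c, v)} {label[c]}\n")

write_csv("path_to_videos/train", "train.csv")
```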
- To train and test an MGMA-ResNet-34 model from scratch on UCF-101, run the command below. You can build MGMA-Net variants by setting `MGMA.TYPE` and `MGMA.NUM_GROUPS`.
python tools/run_net.py \
  --cfg configs/Ucf101/UCF101_MGMA.yaml \
  DATA.PATH_TO_DATA_DIR path_to_csv_files \
  MGMA.TYPE TSA \
  MGMA.NUM_GROUPS 8

You can also set the variables (`DATA_PATH`, `MGMA_TYPE`, `MGMA_NUM_GROUPS`) in `scripts/run_ucf101_r3d_34.sh`, and then run the script:

bash scripts/run_ucf101_r3d_34.sh
We will provide the models and results later.
We sincerely appreciate the contributors of the following codebases.