
SemGeoMo

Official implementation of the CVPR 2025 paper SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance.



📖 Project Page | 📄 arXiv Paper

Table of Contents

  • Installation
  • Data Preparation
  • Training
  • Testing
  • Citation
  • License
  • Acknowledgments

Installation

Prerequisites

  • Python 3.8+, CUDA 11.0+, PyTorch 1.8+
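
A quick way to confirm these prerequisites is a short check such as the sketch below; the version thresholds simply mirror the list above and are not an exhaustive compatibility matrix.

# Minimal environment sanity check; thresholds mirror the prerequisites above.
import sys
import torch

assert sys.version_info >= (3, 8), "Python 3.8+ is required"
print("PyTorch:", torch.__version__)            # expect 1.8 or newer
print("CUDA available:", torch.cuda.is_available())
print("CUDA (build):", torch.version.cuda)      # expect 11.0 or newer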

Environment Setup 🛠️

  1. Clone the repository:
git clone https://github.com/4DVLab/SemGeoMo.git
cd SemGeoMo
  2. Create and activate the environment:
conda env create -f environment.yml

Download the Long_CLIP and PyTorch3D packages.
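
To confirm these extra dependencies are usable, a quick check like the sketch below can help. It assumes the layout shown under Data Structure, with Long_CLIP/ checked out in the repository root, and that it is run from that root.

# Sanity check for the extra packages; run from the SemGeoMo repository root.
from pathlib import Path

import pytorch3d  # raises ImportError if the PyTorch3D build is missing

assert Path("Long_CLIP").is_dir(), "Long_CLIP checkout not found in the repo root"
print("PyTorch3D version:", pytorch3d.__version__)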

Data Preparation

  • Preprocessed Data (.pkl Files)

We provide preprocessed .pkl files for the FullBodyManipulation dataset. Each .pkl file contains a dictionary with the following structure:

motion_dict[name] = {
    "motion": T x 263,           # Motion features
    "length": len(motion),        # Number of frames
    "text": str,                  # Annotated text description
    "fine_text": str,             # Fine-grained texts from LLM
    "joint": T x 22 x 3,         # Joint positions
    "seq_name": str,              # Sequence name
    "id": int,                    # Sequence ID
    "obj_name": str,              # Object name
    "pc": T x 1024 x 3,          # Point cloud
    "root_trans": T x 3,         # Root translation
    "dis": T x 1024 x 2,         # Distance features
    "pc_bps": T x 1024 x 3,      # BPS features
    "obj_rot_mat": T x 3 x 3,    # Object rotation matrix
    "obj_trans": T x 3,          # Object translation
    "obj_scale": T                # Object scale
}

Note: If this is your first time training Stage 1, manip/data/hand_contact_data.py will automatically process the original data into .pkl files. Preprocessed .pkl files are already provided in the data_pkl/ folder.
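
As a usage example, the sketch below loads one preprocessed file and prints a few of the fields listed above. It assumes standard pickle serialization, and the file path is purely illustrative; substitute the actual .pkl file you downloaded or generated.

# Inspect one preprocessed sequence (the file name below is hypothetical).
import pickle

with open("data_pkl/omomo_fps15/train.pkl", "rb") as f:
    motion_dict = pickle.load(f)

name, sample = next(iter(motion_dict.items()))
print("sequence:", name, "| text:", sample["text"])
print("motion:", sample["motion"].shape)   # (T, 263)
print("joint:", sample["joint"].shape)     # (T, 22, 3)
print("pc:", sample["pc"].shape)           # (T, 1024, 3)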

  • Download the required packages and models from Google Drive.

  • Data Structure

The project expects the following directory structure:

SemGeoMo/
├── data_pkl/          # Preprocessed .pkl files
│   └── omomo_fps15/
├── pretrain/          # Pretrained models
├── bps/
├── glove/
├── body_models/
│   └── smpl/
├── smpl_all_models/
│   └── smplx/
├── Long_CLIP/
├── pytorch3d/
├── exp/               # Training outputs
└── semgeomo/
    └── t2m/

Training

Stage 1: Geometric Guidance Training

Train the first stage using hand-object contact data:

# Option 1: Use the provided script
bash scripts/train_stage1_omomo.sh

# Option 2: Run directly
python trainer_contact.py \
  --window=100 \
  --batch_size=64 \
  --project="./exp" \
  --exp_name="omomo-stage1" \
  --dataset_name="omomo" \
  --text=True

Stage 2: Full-Body Motion Training

Train the second stage using pretrained MDM models:

# Option 1: Use the provided script
bash scripts/train_stage2_omomo.sh

# Option 2: Run directly
python trainer_fullbody.py \
  --window=100 \
  --batch_size=16 \
  --project="./exp" \
  --exp_name="omomo-stage2" \
  --dataset_name="omomo" \
  --save_dir="path/to/save" \
  --pretrained_path="path/to/pretrain/models"

Note: Ensure you have the correct path for --pretrained_path, which should point to the pretrained MDM models stored in the /pretrain folder.

Testing

Stage 1

# Option 1: Use the provided script
bash scripts/test_stage1.sh

# Option 2: Run directly
python sample_stage1.py \
  --window=100 \
  --batch_size=64 \
  --project="./exp" \
  --exp_name="omomo-stage1" \
  --add_hand_processing \
  --test_sample_res \
  --dataset_name="omomo" \
  --checkpoint="path/to/stage1/checkpoint"\
  --for_quant_eval \
  --text=True \
  --joint_together=True

Full Pipeline Testing

# Option 1: Use the provided script
bash scripts/test_pipeline.sh

# Option 2: Run directly
python sample_pipeline.py \
  --window=100 \
  --batch_size=1 \
  --project="./exp" \
  --exp_name="omomo-test" \
  --dataset_name="omomo" \
  --run_whole_pipeline \
  --test_sample_res \
  --checkpoint="path/to/stage1/checkpoint" \
  --model_path="path/to/model" \
  --text=True \
  --for_quant_eval \
  --use_posterior

Note:

  • Fill in the appropriate paths for --checkpoint and --model_path when running the test pipeline
  • The test pipeline evaluates 100 randomly selected samples
  • For evaluation on the entire test set, run bash scripts/test_all.sh

Citation 🖊️

If you find this work useful, please cite our paper:

@inproceedings{cong2025semgeomo,
  title={Semgeomo: Dynamic contextual human motion generation with semantic and geometric guidance},
  author={Cong, Peishan and Wang, Ziyi and Ma, Yuexin and Yue, Xiangyu},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={17561--17570},
  year={2025}
}

License 🎫

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

We thank the authors of MDM, OMOMO, InterControl and other related works for their contributions to the field.
