Official implementation of "SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance" (CVPR 2025).
📖 Project Page | 📄 arXiv Paper
- Python 3.8+, CUDA 11.0+, PyTorch 1.8+
- Clone the repository:

```bash
git clone https://github.com/your-repo/SemGeoMo.git
cd SemGeoMo
```

- Create and activate the environment:

```bash
conda env create -f environment.yml
```

- Download the Long_CLIP and PyTorch3D packages.
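As an optional sanity check, the minimal sketch below verifies that PyTorch and PyTorch3D are importable and that the Long_CLIP folder is in place; the folder location is an assumption based on the layout described later, so adjust it if your setup differs.

```python
# Minimal environment check; the Long_CLIP location is an assumption, adjust as needed.
import os

import torch
print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

import pytorch3d  # assumes PyTorch3D is installed or on PYTHONPATH
print("PyTorch3D:", pytorch3d.__version__)

# Long_CLIP is used as a local repository; check that the folder exists at the repo root.
print("Long_CLIP found:", os.path.isdir("Long_CLIP"))
```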
- Preprocessed Data (.pkl Files)
We provide preprocessed .pkl files for the FullBodyManipulation dataset. Each .pkl file contains a dictionary with the following structure (a minimal loading sketch follows the note below):
```
motion_dict[name] = {
    "motion": T x 263,         # Motion features
    "length": len(motion),     # Number of frames
    "text": str,               # Annotated text description
    "fine_text": str,          # Fine-grained texts from LLM
    "joint": T x 22 x 3,       # Joint positions
    "seq_name": str,           # Sequence name
    "id": int,                 # Sequence ID
    "obj_name": str,           # Object name
    "pc": T x 1024 x 3,        # Point cloud
    "root_trans": T x 3,       # Root translation
    "dis": T x 1024 x 2,       # Distance features
    "pc_bps": T x 1024 x 3,    # BPS features
    "obj_rot_mat": T x 3 x 3,  # Object rotation matrix
    "obj_trans": T x 3,        # Object translation
    "obj_scale": T             # Object scale
}
```

Note: If this is your first time training Stage 1, the `manip/data/hand_contact_data.py` script will automatically process the original data into .pkl files. Preprocessed .pkl files are already provided in the `data_pkl/` folder.
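For reference, here is a minimal loading sketch. The file name below is a placeholder, so point it at an actual file under `data_pkl/`; the printed shapes assume the arrays are stored as NumPy arrays or tensors.

```python
import pickle

# Placeholder path; replace with an actual .pkl file under data_pkl/.
pkl_path = "data_pkl/omomo_fps15/train.pkl"

with open(pkl_path, "rb") as f:
    motion_dict = pickle.load(f)

# Inspect the first sequence entry.
name = next(iter(motion_dict))
entry = motion_dict[name]
print("sequence:", entry["seq_name"], "| object:", entry["obj_name"])
print("frames:", entry["length"])
print("motion:", entry["motion"].shape)   # expected (T, 263)
print("joints:", entry["joint"].shape)    # expected (T, 22, 3)
print("point cloud:", entry["pc"].shape)  # expected (T, 1024, 3)
print("text:", entry["text"])
```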
- Download the related packages and models from Google Drive.
Data Structure
The project expects the following directory structure (a quick layout check follows the tree):
```
SemGeoMo/
├── data_pkl/              # Preprocessed .pkl files
│   └── omomo_fps15/
├── pretrain/              # Pretrained models
│   ├── bps/
│   ├── glove/
│   └── body_models/
│       ├── smpl/
│       ├── smpl_all_models/
│       └── smplx/
├── Long_CLIP/
├── pytorch3d/
├── exp/                   # Training outputs
├── semgeomo/
└── t2m/
```
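A quick way to confirm the layout before training is a small check like the one below. The folder list is taken from the tree above and is an assumption about your local setup; `exp/` is omitted since it is created during training.

```python
import os

# Input-side folders expected at the repository root (assumed from the tree above).
expected = ["data_pkl", "pretrain", "Long_CLIP", "pytorch3d", "semgeomo", "t2m"]

missing = [d for d in expected if not os.path.isdir(d)]
if missing:
    print("Missing directories:", ", ".join(missing))
else:
    print("Directory layout looks complete.")
```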
Train the first stage using hand-object contact data:
```bash
# Option 1: Use the provided script
bash scripts/train_stage1_omomo.sh

# Option 2: Run directly
python trainer_contact.py \
--window=100 \
--batch_size=64 \
--project="./exp" \
--exp_name="omomo-stage1" \
--dataset_name="omomo" \
--text=True
```

Train the second stage using pretrained MDM models:
```bash
# Option 1: Use the provided script
bash scripts/train_stage2_omomo.sh

# Option 2: Run directly
python trainer_fullbody.py \
--window=100 \
--batch_size=16 \
--project="./exp" \
--exp_name="omomo-stage2" \
--dataset_name="omomo" \
--save_dir="path/to/save" \
--pretrained_path="path/to/pretrain/models"
```

Note: Ensure you set the correct path for `--pretrained_path`; it should point to the pretrained MDM models stored in the `pretrain/` folder.
Test the first stage:

```bash
# Option 1: Use the provided script
bash scripts/test_stage1.sh

# Option 2: Run directly
python sample_stage1.py \
--window=100 \
--batch_size=64 \
--project="./exp" \
--exp_name="omomo-stage1" \
--add_hand_processing \
--test_sample_res \
--dataset_name="omomo" \
--checkpoint="path/to/stage1/checkpoint" \
--for_quant_eval \
--text=True \
--joint_together=True
```

Test the whole pipeline:

```bash
# Option 1: Use the provided script
bash scripts/test_pipeline.sh
# Option 2: Run directly
python sample_pipeline.py \
--window=100 \
--batch_size=1 \
--project="./exp" \
--exp_name="omomo-test" \
--dataset_name="omomo" \
--run_whole_pipeline \
--test_sample_res \
--checkpoint="path/to/stage1/checkpoint" \
--model_path="path/to/model" \
--text=True \
--for_quant_eval \
--use_posterior
```

Note:
- Fill in the appropriate paths for `--checkpoint` and `--model_path` when running the test pipeline.
- The test pipeline evaluates 100 randomly selected samples.
- For evaluation on the entire test set, run `bash scripts/test_all.sh`.
If you find this work useful, please cite our paper:
```bibtex
@inproceedings{cong2025semgeomo,
  title={Semgeomo: Dynamic contextual human motion generation with semantic and geometric guidance},
  author={Cong, Peishan and Wang, Ziyi and Ma, Yuexin and Yue, Xiangyu},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={17561--17570},
  year={2025}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
We thank the authors of MDM, OMOMO, InterControl and other related works for their contributions to the field.