Chen-Hsuan Lin,
Oliver Wang,
Bryan C. Russell,
Eli Shechtman,
Vladimir G. Kim,
Matthew Fisher,
and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Project page: https://chenhsuanlin.bitbucket.io/photometric-mesh-optim
Paper: https://chenhsuanlin.bitbucket.io/photometric-mesh-optim/paper.pdf
arXiv preprint: https://arxiv.org/abs/1903.08642
We provide PyTorch code for the following experiments:
- ShapeNet+SUN360 sequences
- (coming soon!) Real-world videos
This code was developed with Python 3 (python3) and requires PyTorch 1.0+.
(If you wish to run using PyTorch 0.4, please switch to the pytorch-0.4 branch.)
First, create a Python virtual environment by running
virtualenv -p python3 PMO
Before installing dependencies and/or running the code, activate the virtual environment by running
source PMO/bin/activate
The dependencies can be installed by running (within the virtual environment)
pip3 install --upgrade -r requirements.txt
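As a quick sanity check after installing the dependencies, the version requirement above can be verified from Python. This is a minimal sketch and not part of the repository; the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical.

```python
import re
import sys


def parse_version(text):
    """Return the (major, minor) components of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return (int(match.group(1)), int(match.group(2)))


def meets_minimum(text, minimum=(1, 0)):
    """True if the version string is at least `minimum` (PyTorch 1.0 by default)."""
    return parse_version(text) >= minimum


def check_environment():
    """Collect human-readable problems; an empty list means the setup looks fine."""
    problems = []
    if sys.version_info[0] < 3:
        problems.append("Python 3 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed in this environment")
    else:
        if not meets_minimum(torch.__version__):
            problems.append("PyTorch 1.0+ is required, found " + torch.__version__)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it inside the activated virtual environment; it prints nothing when both requirements are met.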
The test sequences are composited from the leave-out set of ShapeNet and SUN360.
To download the dataset (64GB), run the script file download_sequences.sh under the directory data.
After downloading, run tar -zxf sequences.tar.gz under the directory data. The files will be extracted to a directory sequences.
We render images from ShapeNet at a higher quality (224x224 resolution).
To download the dataset (33GB), run the script file download_rendering.sh under the directory data.
After downloading, run tar -xf rendering.tar. The files will be extracted to a directory rendering.
Please follow the instructions in the AtlasNet repository to download the ground-truth point clouds.
The directory customShapeNet should be placed under the directory data.
The cropped background images from SUN360 (92GB) can be downloaded by running the script file download_background.sh under the directory data.
After downloading, run tar -xf background.tar. The files will be extracted to a directory background.
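Before running any experiments, it can help to confirm that the extracted data directories are where the code is likely to look for them. The sketch below assumes all four directories named above (sequences, rendering, customShapeNet, background) sit directly under data; the helper `missing_data_dirs` is hypothetical and not part of the repository.

```python
import os

# Directory names taken from the download instructions above.
# ASSUMPTION: each one is extracted or placed directly under data/.
EXPECTED_DIRS = ("sequences", "rendering", "customShapeNet", "background")


def missing_data_dirs(data_root="data"):
    """Return the expected dataset directories that do not exist under data_root."""
    return [
        name
        for name in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(data_root, name))
    ]


if __name__ == "__main__":
    missing = missing_data_dirs("data")
    if missing:
        print("Missing under data/: " + ", ".join(missing))
    else:
        print("All expected dataset directories are present.")
```

Not every experiment needs every dataset, so a missing directory is only a problem if the script you run depends on it.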
The pretrained models (626MB each) can be downloaded by running the commands
wget https://cmu.box.com/shared/static/oryysitkhn2eldgb90qkr3lh7j469sj1.npz # airplane
wget https://cmu.box.com/shared/static/jgif23ytibtektwwcji8wiv0jbubzs08.npz # car
wget https://cmu.box.com/shared/static/zakir5pi9xma4l3d5c2g74i8r0lggp36.npz # chair
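The plain wget commands save each file under its hashed URL name, while the demo command in this README loads pretrained/02958343_atl25.npz. The sketch below builds wget commands that use the standard `-O` option to choose the output path. It assumes every checkpoint follows the `<synset>_atl25.npz` pattern under pretrained/; that pattern is only confirmed for the car model, so treat the other two file names as guesses. The helper `wget_command` is hypothetical.

```python
# ShapeNet synset IDs: airplane and car appear in commands in this README;
# chair uses the standard ShapeNet ID.
CHECKPOINTS = {
    "airplane": ("02691156", "https://cmu.box.com/shared/static/oryysitkhn2eldgb90qkr3lh7j469sj1.npz"),
    "car": ("02958343", "https://cmu.box.com/shared/static/jgif23ytibtektwwcji8wiv0jbubzs08.npz"),
    "chair": ("03001627", "https://cmu.box.com/shared/static/zakir5pi9xma4l3d5c2g74i8r0lggp36.npz"),
}


def wget_command(category, out_dir="pretrained"):
    """Build a wget argument list that saves the checkpoint under a readable name."""
    synset, url = CHECKPOINTS[category]
    # ASSUMPTION: the <synset>_atl25.npz naming is inferred from the demo command.
    target = "{}/{}_atl25.npz".format(out_dir, synset)
    return ["wget", "-O", target, url]


if __name__ == "__main__":
    for category in sorted(CHECKPOINTS):
        print(" ".join(wget_command(category)))
```

The output directory must exist before wget runs (for example, create it with mkdir -p pretrained).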
The meshrender library can be compiled by running python3 setup.py install under the directory meshrender.
The chamfer library can be compiled by running python3 setup.py install under the directory chamfer.
The source code is taken from the AtlasNet repository.
When compiling CUDA code, you may need to modify CUDA_PATH accordingly.
To try a demo of the photometric mesh optimization, download our pretrained model for cars.
Then run (setting the model variable to the downloaded checkpoint)
model=pretrained/02958343_atl25.npz
python3 main.py --load=${model} --code=5e-2 --scale=2e-2 --lr-pmo=3e-3 --noise=0.1 --video
This will create the following output files:
- the optimized object mesh (saved into the directory optimized_mesh),
- the input video sequence with the overlaid 3D mesh (saved to video), and
- (coming soon!) a 3D mesh model (in .obj format) with textures estimated from the input RGB sequence.
The flags --log-tb and --log-vis toggle visualization of the optimization process.
More optional arguments can be found by running python3 main.py --help.
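If you prefer to launch the demo from a Python script (for example, to loop over several checkpoints), the command above can be wrapped with the standard `subprocess` module. This is a minimal sketch using only the flags shown in this README; the helpers `build_demo_command` and `run_demo` are hypothetical and not part of the repository.

```python
import os
import subprocess


def build_demo_command(model_path):
    """Return the demo invocation from this README as an argument list."""
    return [
        "python3", "main.py",
        "--load={}".format(model_path),
        "--code=5e-2",
        "--scale=2e-2",
        "--lr-pmo=3e-3",
        "--noise=0.1",
        "--video",
    ]


def run_demo(model_path):
    """Run the demo, failing early if the checkpoint file is missing."""
    if not os.path.isfile(model_path):
        raise FileNotFoundError("checkpoint not found: " + model_path)
    # check=True raises CalledProcessError if main.py exits with a nonzero status.
    subprocess.run(build_demo_command(model_path), check=True)
```

Calling `run_demo("pretrained/02958343_atl25.npz")` from the repository root is equivalent to the shell command above.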
To pretrain AtlasNet with our new dataset (high-resolution ShapeNet rendering + SUN360 cropped backgrounds), run the following command (taking the airplane category as an example):
cat=02691156
python3 main_pretrain.py --category=${cat} --name=${cat}_pretrain \
--imagenet-enc --pretrained-dec=pretrained/ae_atlasnet_25.pth
By default, we initialize the encoder with an ImageNet-pretrained ResNet-18 and the decoder with the pretrained AtlasNet (please refer to the AtlasNet repository for downloading their pretrained models).
More optional arguments can be found by running python3 main_pretrain.py --help.
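To pretrain more than one category, the same command can be generated per ShapeNet synset ID. The sketch below prints one shell line for each of the three categories that have released checkpoints. It assumes the car and chair categories accept the same flags as the airplane example; the helper `build_pretrain_command` is hypothetical.

```python
import shlex

# Synset IDs: airplane and car appear in this README; chair is the standard ShapeNet ID.
CATEGORIES = {"airplane": "02691156", "car": "02958343", "chair": "03001627"}


def build_pretrain_command(synset, decoder="pretrained/ae_atlasnet_25.pth"):
    """Return the pretraining invocation for one category as an argument list."""
    return [
        "python3", "main_pretrain.py",
        "--category={}".format(synset),
        "--name={}_pretrain".format(synset),
        "--imagenet-enc",
        "--pretrained-dec={}".format(decoder),
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for label, synset in sorted(CATEGORIES.items()):
        print("# " + label)
        print(" ".join(shlex.quote(arg) for arg in build_pretrain_command(synset)))
```

Printing the commands first makes it easy to review them before launching long training runs.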
We've included code to visualize the training over TensorBoard(X). To execute, run
tensorboard --logdir=summary/GROUP --port=6006
where GROUP is specified in the pretraining arguments.
For pretraining, we provide the following types of data visualization:
- SCALARS: training and test loss curves over epochs
- IMAGES: sample input images
If you find our code useful for your research, please cite
@inproceedings{lin2019photometric,
title={Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction},
author={Lin, Chen-Hsuan and Wang, Oliver and Russell, Bryan C and Shechtman, Eli and Kim, Vladimir G and Fisher, Matthew and Lucey, Simon},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
year={2019}
}
Please contact me ([email protected]) if you have any questions!
