Hi MuJoCo Playground team,
Thanks for bringing such an exciting GPU-accelerated MuJoCo environment to us! I hope policies learned in mujoco_playground will have better sim2real performance. However, when I tried out the examples in the learning folder, both train_rsl_rl.py and train_jax_ppo.py gave me "bug/unexpected loss of precision" warnings.
I am using Ubuntu 22.04 + NVIDIA-SMI 550 (CUDA 12.4) + CUDA toolkit 12.1. This is how I set up the virtual environment with Conda:
conda create -n mjx python=3.10 -y
conda activate mjx
conda install nvidia/label/cuda-12.4.0::cuda
conda install "jaxlib=*=*cuda*" jax -c conda-forge
pip install torch==2.3.1 torchvision==0.18.1
pip install mujoco
pip install mujoco_mjx
pip install brax
cd mujoco_playground
pip install -e .
After running python learning/train_rsl_rl.py --env_name=G1JoystickRoughTerrain --use_wandb, the most concerning messages are pasted below, and the xla_dump_to folder is attached to this issue as well.
report_bug_rsrl_rl.zip
E0116 21:53:35.782328 160376 buffer_comparator.cc:157] Difference at 2: 0.857876, expected 1.0906
E0116 21:53:35.782335 160376 buffer_comparator.cc:157] Difference at 3: 1.05171, expected 0.555048
E0116 21:53:35.782336 160376 buffer_comparator.cc:157] Difference at 4: 1.00491, expected 0.702288
E0116 21:53:35.782338 160376 buffer_comparator.cc:157] Difference at 5: 0.839983, expected 1.15234
E0116 21:53:35.782339 160376 buffer_comparator.cc:157] Difference at 8: 0.697264, expected 0.893167
E0116 21:53:35.782340 160376 buffer_comparator.cc:157] Difference at 9: 0.751796, expected 0.312863
E0116 21:53:35.782341 160376 buffer_comparator.cc:157] Difference at 10: 0.914679, expected 0.354761
E0116 21:53:35.782343 160376 buffer_comparator.cc:157] Difference at 13: 0.676378, expected 0.405283
E0116 21:53:35.782344 160376 buffer_comparator.cc:157] Difference at 15: 0.317772, expected 1.22127
E0116 21:53:35.782345 160376 buffer_comparator.cc:157] Difference at 16: 0.297798, expected 1.3303
2025-01-16 21:53:35.783104: E external/xla/xla/service/gpu/autotuning/gemm_fusion_autotuner.cc:1180] Results do not match the reference. This is likely a bug/unexpected loss of precision.
E0116 21:53:35.786688 160376 buffer_comparator.cc:157] Difference at 16: -nan, expected 0
E0116 21:53:35.786699 160376 buffer_comparator.cc:157] Difference at 17: -nan, expected 0
E0116 21:53:35.786701 160376 buffer_comparator.cc:157] Difference at 18: -nan, expected 0
E0116 21:53:35.786702 160376 buffer_comparator.cc:157] Difference at 19: -nan, expected 0
E0116 21:53:35.786703 160376 buffer_comparator.cc:157] Difference at 20: -nan, expected 0
E0116 21:53:35.786704 160376 buffer_comparator.cc:157] Difference at 21: -nan, expected 0
E0116 21:53:35.786705 160376 buffer_comparator.cc:157] Difference at 22: -nan, expected 0
E0116 21:53:35.786706 160376 buffer_comparator.cc:157] Difference at 23: -nan, expected 0
E0116 21:53:35.786708 160376 buffer_comparator.cc:157] Difference at 24: -nan, expected 0
E0116 21:53:35.786709 160376 buffer_comparator.cc:157] Difference at 25: -nan, expected 0
2025-01-16 21:53:35.786711: E external/xla/xla/service/gpu/autotuning/gemm_fusion_autotuner.cc:1180] Results do not match the reference. This is likely a bug/unexpected loss of precision.
Similarly, the main error messages after executing python learning/train_jax_ppo.py --env_name=G1JoystickRoughTerrain --use_wandb are pasted below, and the related xla_dump_to folder is attached.
report_bug_jax_ppo.zip
2025-01-16 21:59:15.164089: E external/xla/xla/service/gpu/autotuning/gemm_fusion_autotuner.cc:1180] Results do not match the reference. This is likely a bug/unexpected loss of precision.
E0116 21:59:15.166167 165887 buffer_comparator.cc:157] Difference at 16: 0.806099, expected 8.41456
E0116 21:59:15.166175 165887 buffer_comparator.cc:157] Difference at 17: 0.669318, expected 10.7305
E0116 21:59:15.166177 165887 buffer_comparator.cc:157] Difference at 18: 0.618111, expected 8.46751
E0116 21:59:15.166179 165887 buffer_comparator.cc:157] Difference at 19: 1.59979, expected 11.3751
E0116 21:59:15.166181 165887 buffer_comparator.cc:157] Difference at 20: 1.41715, expected 9.13166
E0116 21:59:15.166182 165887 buffer_comparator.cc:157] Difference at 21: 1.41268, expected 9.03136
E0116 21:59:15.166184 165887 buffer_comparator.cc:157] Difference at 22: 0.627852, expected 9.2793
E0116 21:59:15.166186 165887 buffer_comparator.cc:157] Difference at 23: 0.627606, expected 9.35567
E0116 21:59:15.166189 165887 buffer_comparator.cc:157] Difference at 24: 0.660828, expected 9.69482
E0116 21:59:15.166191 165887 buffer_comparator.cc:157] Difference at 25: 1.15741, expected 10.8501
2025-01-16 21:59:15.166194: E external/xla/xla/service/gpu/autotuning/gemm_fusion_autotuner.cc:1180] Results do not match the reference. This is likely a bug/unexpected loss of precision.
E0116 21:59:15.168032 165887 buffer_comparator.cc:157] Difference at 16: 0.806099, expected 8.41456
E0116 21:59:15.168036 165887 buffer_comparator.cc:157] Difference at 17: 0.669318, expected 10.7305
E0116 21:59:15.168039 165887 buffer_comparator.cc:157] Difference at 18: 0.618111, expected 8.46751
E0116 21:59:15.168040 165887 buffer_comparator.cc:157] Difference at 19: 1.59979, expected 11.3751
E0116 21:59:15.168042 165887 buffer_comparator.cc:157] Difference at 20: 1.41715, expected 9.13166
E0116 21:59:15.168044 165887 buffer_comparator.cc:157] Difference at 21: 1.41268, expected 9.03136
E0116 21:59:15.168045 165887 buffer_comparator.cc:157] Difference at 22: 0.627852, expected 9.2793
E0116 21:59:15.168047 165887 buffer_comparator.cc:157] Difference at 23: 0.627606, expected 9.35567
E0116 21:59:15.168048 165887 buffer_comparator.cc:157] Difference at 24: 0.660828, expected 9.69482
E0116 21:59:15.168051 165887 buffer_comparator.cc:157] Difference at 25: 1.15741, expected 10.8501
2025-01-16 21:59:15.168053: E external/xla/xla/service/gpu/autotuning/gemm_fusion_autotuner.cc:1180] Results do not match the reference. This is likely a bug/unexpected loss of precision.
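To make the logs above easier to compare across the two runs, here is a minimal sketch that counts the buffer_comparator.cc mismatch lines, the NaN entries among them, and the largest relative error. It is not part of mujoco_playground or XLA: the function name summarize and the regular expression are my own assumptions, based only on the line format shown in the logs above.

```python
import math
import re

# Assumed line format, copied from the logs above:
#   E0116 21:53:35.782328 160376 buffer_comparator.cc:157] Difference at 2: 0.857876, expected 1.0906
DIFF_RE = re.compile(
    r"buffer_comparator\.cc:\d+\] Difference at (\d+): ([^,\s]+), expected (\S+)"
)


def summarize(log_text):
    """Summarize mismatch lines in an XLA autotuner log (hypothetical helper)."""
    mismatches = 0
    nan_count = 0
    max_rel_err = 0.0
    for match in DIFF_RE.finditer(log_text):
        actual = float(match.group(2))    # float() also parses "-nan"
        expected = float(match.group(3))
        mismatches += 1
        if math.isnan(actual) or math.isnan(expected):
            nan_count += 1
            continue
        # Guard against division by zero when the reference value is 0.
        denom = max(abs(expected), 1e-12)
        max_rel_err = max(max_rel_err, abs(actual - expected) / denom)
    return {
        "mismatches": mismatches,
        "nan_count": nan_count,
        "max_rel_err": max_rel_err,
    }


# Three lines taken from the logs in this issue.
sample = """\
E0116 21:53:35.782328 160376 buffer_comparator.cc:157] Difference at 2: 0.857876, expected 1.0906
E0116 21:53:35.786688 160376 buffer_comparator.cc:157] Difference at 16: -nan, expected 0
E0116 21:59:15.166167 165887 buffer_comparator.cc:157] Difference at 16: 0.806099, expected 8.41456
"""
summary = summarize(sample)
print(summary)
```

Running it on the full stderr output of each training script should show whether the NaN mismatches appear only in the train_rsl_rl.py run, as the excerpts above suggest.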