-
Notifications
You must be signed in to change notification settings - Fork 412
Closed
Description
I am conducting reinforcement learning for a robot using rsl_rl and isaac lab. While it works fine with simple settings, when I switch to more complex settings (such as Domain Randomization), the following error occurs during training(After some progress in training), indicating that the actor's standard deviation does not meet the condition of being ≥ 0. Has anyone experienced a similar error?
num_env is 3600
Traceback (most recent call last):
File "/root/IsaacLab/source/standalone/workflows/rsl_rl/train.py", line 131, in <module>
main()
File "/root/IsaacLab/source/standalone/workflows/rsl_rl/train.py", line 123, in main
runner.learn(num_learning_iterations=agent_cfg.max_iterations, init_at_random_ep_len=True)
File "/isaac-sim/kit/python/lib/python3.10/site-packages/rsl_rl/runners/on_policy_runner.py", line 153, in learn
mean_value_loss, mean_surrogate_loss = self.alg.update()
File "/isaac-sim/kit/python/lib/python3.10/site-packages/rsl_rl/algorithms/ppo.py", line 121, in update
self.actor_critic.act(obs_batch, masks=masks_batch, hidden_states=hid_states_batch[0])
File "/isaac-sim/kit/python/lib/python3.10/site-packages/rsl_rl/modules/actor_critic.py", line 105, in act
File "/isaac-sim/exts/omni.isaac.ml_archive/pip_prebundle/torch/distributions/normal.py", line 74, in sample
return torch.normal(self.loc.expand(shape), self.scale.expand(shape))
RuntimeError: normal expects all elements of std >= 0.0
I investigated the value of std(self.scale) and found that the std value in a certain environment appears to be nan. (The number of columns represents the action dimensions for the robot.)
self.scale: tensor([[0.1926, 0.2051, 0.1785, ..., 0.7033, 0.8655, 0.8500],
[0.1926, 0.2051, 0.1785, ..., 0.7033, 0.8655, 0.8500],
[0.1926, 0.2051, 0.1785, ..., 0.7033, 0.8655, 0.8500],
...,
[0.1926, 0.2051, 0.1785, ..., 0.7033, 0.8655, 0.8500],
[0.1926, 0.2051, 0.1785, ..., 0.7033, 0.8655, 0.8500],
[0.1926, 0.2051, 0.1785, ..., 0.7033, 0.8655, 0.8500]],
device='cuda:0')
env_id: 1111, row: tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],
device='cuda:0')
Metadata
Metadata
Assignees
Labels
No labels