Fix a bug of unintentionally using same process indices #455

muupan · 2019-05-08T16:25:21Z

I noticed that some of the batch training examples have a bug: they unintentionally use the same process index (and thus the same random seed!) for every env process. This PR fixes the bug.

You can see the current and new behaviors by running the code below.

import functools
import chainerrl
import gym

num_envs = 4


def make_env(process_idx, test):
    print(process_idx, test)
    return gym.make('Pendulum-v0')


def make_batch_env_old(test):
    return chainerrl.envs.MultiprocessVectorEnv(
        [(lambda: make_env(idx, test))
         for idx, env in enumerate(range(num_envs))])


def make_batch_env_new(test):
    return chainerrl.envs.MultiprocessVectorEnv(
        [functools.partial(make_env, idx, test)
         for idx, env in enumerate(range(num_envs))])


print('make_batch_env_old')
make_batch_env_old(test=False)
make_batch_env_old(test=True)

print('make_batch_env_new')
make_batch_env_new(test=False)
make_batch_env_new(test=True)

This code outputs:

make_batch_env_old
3 False
3 False
3 False
3 False
3 True
3 True
3 True
3 True
make_batch_env_new
0 False
1 False
2 False
3 False
0 True
1 True
2 True
3 True

Even when the same random seed is used in every env process, the actions sent by the agent usually differ because of the stochasticity of the policy or explorer, so this may not make a noticeable difference in learning results. It is still definitely a bug that needs to be fixed.
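For readers unfamiliar with the cause: the old behavior comes from Python's late-binding closures, where a lambda looks up `idx` when it is called rather than when it is created. The sketch below reproduces this without chainerrl or gym; the `make_env` here is a stand-in that only returns its arguments, and the default-argument variant is an alternative idiom, not something this PR uses.

```python
import functools

num_envs = 4


def make_env(process_idx, test):
    # Stand-in for the real env constructor: just report what it was given.
    return (process_idx, test)


# Late binding: each lambda reads `idx` when called, after the loop has ended.
late = [(lambda: make_env(idx, False)) for idx in range(num_envs)]

# Early binding with functools.partial: `idx` is evaluated when the partial is built.
bound = [functools.partial(make_env, idx, False) for idx in range(num_envs)]

# Early binding with a default argument (alternative idiom, not used in this PR).
default = [(lambda idx=idx: make_env(idx, False)) for idx in range(num_envs)]

late_result = [f() for f in late]
bound_result = [f() for f in bound]
default_result = [f() for f in default]

print(late_result)     # every entry carries the last index
print(bound_result)    # each entry carries its own index
print(default_result)  # same as bound_result
```

`functools.partial` keeps the callables free of hidden state from the enclosing scope, which is presumably why it reads more clearly here than the default-argument trick.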

toslunar

LGTM

toslunar

Could you also fix the tests that have the same kind of bug?

chainerrl/tests/wrappers_tests/test_vector_frame_stack.py

Lines 69 to 80 in a63eb14

# Wrap by FrameStack and MultiprocessVectorEnv
fs_env = chainerrl.envs.MultiprocessVectorEnv(
    [(lambda: FrameStack(
        make_env(idx), k=self.k, channel_order='chw'))
     for idx, env in enumerate(range(self.num_envs))])

# Wrap by MultiprocessVectorEnv and VectorFrameStack
vfs_env = VectorFrameStack(
    chainerrl.envs.MultiprocessVectorEnv(
        [(lambda: make_env(idx))
         for idx, env in enumerate(range(self.num_envs))]),
    k=self.k, stack_axis=0)

muupan · 2019-05-09T03:05:21Z

Thanks for the review. I fixed tests/wrappers_tests/test_vector_frame_stack.py as well.
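One wrinkle in the test is that the lambda wraps a nested call, `FrameStack(make_env(idx), ...)`, which a single `functools.partial` cannot express directly. A minimal sketch of one way to handle that, assuming a small named helper, is shown below. This is not the actual diff: `FrameStack` and `make_env` are hypothetical stand-ins that only record their arguments, and `make_fs_env` is a name invented for illustration.

```python
import functools


class FrameStack:
    """Hypothetical stand-in for the real wrapper; it only records its arguments."""

    def __init__(self, env, k, channel_order='hwc'):
        self.env = env
        self.k = k
        self.channel_order = channel_order


def make_env(idx):
    # Hypothetical stand-in: the "env" is just a label carrying its index.
    return 'env-{}'.format(idx)


def make_fs_env(idx, k):
    # Named helper so the nested construction can be bound with partial.
    return FrameStack(make_env(idx), k=k, channel_order='chw')


num_envs = 3
k = 4

# Each partial fixes its own idx at creation time.
env_fns = [functools.partial(make_fs_env, idx, k) for idx in range(num_envs)]

wrapped = [fn() for fn in env_fns]
wrapped_names = [w.env for w in wrapped]
print(wrapped_names)
```

The same effect could be had with `lambda idx=idx: ...`; the helper just keeps the binding explicit.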

Fix a bug of unintentionally using same process idx

4f80b76

muupan added the bug label May 8, 2019

toslunar approved these changes May 9, 2019

toslunar requested changes May 9, 2019

Fix a bug of unintentionally using same process idx in tests

3b3c16d

toslunar approved these changes May 9, 2019

toslunar merged commit 940ae01 into chainer:master May 9, 2019

muupan deleted the fix-same-process-idx branch May 9, 2019 04:06

muupan added this to the v0.7 milestone Jun 28, 2019
