Fix bootstrapping on episode timeout #574

awjuliani · 2018-04-04T00:38:17Z

Value estimate bootstrapping upon episode timeout (i.e., when max_steps is reached) was incorrectly using the first observation of the new episode rather than the final observation of the current episode. This PR fixes that, and improves performance in environments where the "end of the episode" is an arbitrary timeout (such as 3DBall, Reacher, and Crawler).
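For readers unfamiliar with the issue, here is a minimal sketch of the idea, not the actual trainer code. The names `bootstrap_value_for_episode_end`, `discounted_returns`, `value_fn`, `final_obs`, and `timed_out` are hypothetical and chosen for illustration only. When an episode ends by timeout, the tail of the return is estimated from the last observation of that episode; when it ends in a true terminal state, the tail is zero.

```python
from typing import Callable, List, Sequence


def bootstrap_value_for_episode_end(
    value_fn: Callable[[float], float],
    final_obs: float,
    timed_out: bool,
) -> float:
    """Return the value used to bootstrap the end of an episode.

    Hypothetical helper: on timeout the episode was cut off arbitrarily,
    so the remaining return is estimated from the final observation of the
    current episode (not the first observation after the reset).
    On a true terminal state there is no future reward, so it is 0.
    """
    return value_fn(final_obs) if timed_out else 0.0


def discounted_returns(
    rewards: Sequence[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> List[float]:
    """Compute discounted returns, seeded with a bootstrap value."""
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk backwards so each step adds its reward to the discounted tail.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Toy value function for illustration: the value equals the observation.
toy_value_fn = lambda obs: float(obs)

rewards = [1.0, 1.0]
final_obs = 2.0  # last observation of the episode that timed out

timeout_returns = discounted_returns(
    rewards,
    bootstrap_value_for_episode_end(toy_value_fn, final_obs, timed_out=True),
    gamma=0.5,
)
terminal_returns = discounted_returns(
    rewards,
    bootstrap_value_for_episode_end(toy_value_fn, final_obs, timed_out=False),
    gamma=0.5,
)
```

In this toy setup, passing the post-reset observation to `value_fn` instead of `final_obs` would seed the returns with the value of an unrelated state, which is the kind of error the description above refers to.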

vincentpierre

Not happy about changing process_experiences; it might break trainers that other people have implemented.

Fix bootstrapping on episode timeout

ab63caf

awjuliani requested review from eshvk, mmattar and vincentpierre April 4, 2018 00:38

Use correct BrainInfo in behavioral cloning

cb67bd2

vincentpierre approved these changes Apr 4, 2018

mmattar approved these changes Apr 6, 2018

awjuliani merged commit da56ddd into hotfix-0.3.0c Apr 6, 2018

awjuliani deleted the hotfix-bootstrapping branch April 6, 2018 23:25

github-actions bot locked as resolved and limited conversation to collaborators May 20, 2021
