-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add stochastic taxi (rainy+fickle) #1315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stochastic taxi (rainy+fickle) #1315
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @foreverska, thanks for the PR.
Generally the PR looks good.
Could you change the np.random
to self.np_random
and revert the environment version increment (new features that don't affect default behaviour shouldn't require version bumps)
and a couple of questions before approving and merging
- Is this backward compatible with default parameters? If it does currently, could we make it backward compatible.
- Is the probability of the behaviour reported in
step
andreset
correctly?
Addressed Comments.
Yes, default values for rainy and fickle are both False. This aligns functionality with pre-commit default behavior.
It matches the other ToyText environments (Cliff/Frozen) with step returning the probability of the taken transition and reset always returning 1. I added lines in the unit test to guard against regression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the tests and documentation updates, then should be good to merge.
Thanks for making the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me,
@pseudo-rnd-thoughts I think this needs one more review since there was a commit after your last one. Thanks. |
Hey @foreverska, sorry I'm on holiday currently. Looking over the PR again, I'm a tad worried about the |
Please enjoy vacation and ignore this until you're good and rested. Is the ask to restore the original code and switch to it when it's not rainy and switch to this new code if it is? Or is there a more pointed change you'd like to see? |
hi @foreverska, I'm back. Looking over the PR with a week or two rest, I agree that I would prefer if the new code was "disabled" by default and wouldn't run at some, unlike the current solution. |
@pseudo-rnd-thoughts Not a problem at all. Pushed up a change that restores the old code when dry. Let me know what you think. |
gymnasium/envs/toy_text/taxi.py
Outdated
@@ -220,11 +212,148 @@ def __init__(self, render_mode: Optional[str] = None): | |||
self.P[state][action].append( | |||
(1.0, new_state, reward, terminated) | |||
) | |||
|
|||
def __build_rainy_transitions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To minimise changes, could this function, take row, col, pass_idx, dest_idx, action
as arguments that we run the original code unless is_rainy
then we call this function.
Then return the data.
It just means that we don't need to copy and paste the massive for loop and is clear what the differences between the current functions is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I have done a reasonable job at making the code as reusable as possible. Please let me know if you had something else in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @foreverska.
One last request is to change the function names to _{name}
rather than double underscore to make the style of the rest of the project.
Then we should be good to merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roger, adjusted the function names, ready for final review.
Description
Adds rainy transition probabilities and fickle passenger to align environment with paper.
Fixes #161
Type of change
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)