From 7a5b74536287d7e363ae0a181ab458228fe40b18 Mon Sep 17 00:00:00 2001
From: Prabhat
Date: Tue, 3 Sep 2019 15:15:29 +0900
Subject: [PATCH 1/2] adds new training times and scores for rainbow

---
 examples/atari/reproduction/rainbow/README.md | 135 +++++++++---------
 1 file changed, 69 insertions(+), 66 deletions(-)

diff --git a/examples/atari/reproduction/rainbow/README.md b/examples/atari/reproduction/rainbow/README.md
index 55c4456cc..0da2fb219 100644
--- a/examples/atari/reproduction/rainbow/README.md
+++ b/examples/atari/reproduction/rainbow/README.md
@@ -22,80 +22,82 @@ python train_rainbow.py [options]
 To view the full list of options, either view the code or run the example with the `--help` option.
 
 ## Results
-These results reflect ChainerRL `v0.6.0`.
+These results reflect ChainerRL `v0.7.0`.
 
 | Results Summary ||
 | ------------- |:-------------:|
 | Number of seeds | 1 |
-| Number of common domains | 49 |
-| Number of domains where paper scores higher | 20 |
-| Number of domains where ChainerRL scores higher | 27 |
-| Number of ties between paper and ChainerRL | 2 |
+| Number of common domains | 51 |
+| Number of domains where paper scores higher | 21 |
+| Number of domains where ChainerRL scores higher | 29 |
+| Number of ties between paper and ChainerRL | 1 |
 
 
 | Game | ChainerRL Score | Original Reported Scores |
 | ------------- |:-------------:|:-------------:|
-| AirRaid | 6926.1| N/A|
-| Alien | 9376.0| **9491.7**|
-| Amidar | N/A| 5131.2|
-| Assault | **16203.2**| 14198.5|
-| Asterix | **674122.5**| 428200.3|
-| Asteroids | **20008.5**| 2712.8|
-| Atlantis | **938895.5**| 826659.5|
-| BankHeist | 1114.3| **1358.0**|
-| BattleZone | **103190.0**| 62010.0|
-| BeamRider | **20029.4**| 16850.2|
-| Berzerk | **6461.2**| 2545.6|
-| Bowling | **80.8**| 30.0|
-| Boxing | 99.4| **99.6**|
-| Breakout | 360.6| **417.5**|
-| Carnival | 6050.1| N/A|
-| Centipede | **8429.7**| 8167.3|
-| ChopperCommand | **19403.5**| 16654.0|
-| CrazyClimber | **177331.0**| 168788.5|
+| AirRaid | 8447.9| N/A|
+| Alien | **12163.2**| 9491.7|
+| Amidar | 4697.4| **5131.2**|
+| Assault | **18425.9**| 14198.5|
+| Asterix | 298025.0| **428200.3**|
+| Asteroids | **5131.2**| 2712.8|
+| Atlantis | **851950.0**| 826659.5|
+| BankHeist | **1630.5**| 1358.0|
+| BattleZone | **98923.1**| 62010.0|
+| BeamRider | **19279.4**| 16850.2|
+| Berzerk | **3757.2**| 2545.6|
+| Bowling | **45.0**| 30.0|
+| Boxing | **99.8**| 99.6|
+| Breakout | 351.8| **417.5**|
+| Carnival | 4446.0| N/A|
+| Centipede | **8337.6**| 8167.3|
+| ChopperCommand | 9068.4| **16654.0**|
+| CrazyClimber | 163036.0| **168788.5**|
 | Defender | N/A| 55105.0|
-| DemonAttack | 109342.0| **111185.2**|
-| DoubleDunk | -6.8| **-0.3**|
-| Enduro | 2125.8| **2125.9**|
-| FishingDerby | **57.3**| 31.3|
-| Freeway | 31.9| **34.0**|
-| Frostbite | **10288.5**| 9590.5|
-| Gopher | 69889.0| **70354.6**|
-| Gravitar | **2437.3**| 1419.3|
-| Hero | 37921.8| **55887.4**|
-| IceHockey | **6.2**| 1.1|
-| Jamesbond | 20242.0| N/A|
-| Kangaroo | **14825.0**| 14637.5|
-| Krull | 7896.7| **8741.5**|
-| KungFuMaster | 32833.5| **52181.0**|
+| DemonAttack | 104041.3| **111185.2**|
+| DoubleDunk | **0.0**| -0.3|
+| Enduro | **2311.3**| 2125.9|
+| FishingDerby | **40.9**| 31.3|
+| Freeway | 33.3| **34.0**|
+| Frostbite | **10497.6**| 9590.5|
+| Gopher | **98084.0**| 70354.6|
+| Gravitar | 1302.5| **1419.3**|
+| Hero | 30907.1| **55887.4**|
+| IceHockey | **2.9**| 1.1|
+| Jamesbond | 21323.2| N/A|
+| JourneyEscape | -185.3| N/A|
+| Kangaroo | **15500.0**| 14637.5|
+| Krull | 6761.6| **8741.5**|
+| KungFuMaster | 39858.3| **52181.0**|
 | MontezumaRevenge | 0.0| **384.0**|
-| MsPacman | 5223.1| **5380.4**|
-| NameThisGame | N/A| 13136.0|
-| Phoenix | **280612.8**| 108528.6|
-| Pitfall | -2.2| N/A|
+| MsPacman | **6015.4**| 5380.4|
+| NameThisGame | 13092.1| **13136.0**|
+| Phoenix | **223676.3**| 108528.6|
+| Pitfall | -3.5| N/A|
 | Pitfall! | N/A| 0.0|
-| Pong | **20.9**| **20.9**|
-| Pooyan | 20962.1| N/A|
+| Pong | **21.0**| 20.9|
+| Pooyan | 7946.6| N/A|
 | PrivateEye | 100.0| **4234.0**|
-| Qbert | **39152.5**| 33817.5|
-| Riverraid | 18084.6| N/A|
-| RoadRunner | **68956.5**| 62041.0|
-| Robotank | **74.3**| 61.4|
-| Seaquest | 1836.7| **15898.9**|
-| Skiing | **-9714.6**| -12957.8|
-| Solaris | **7086.3**| 3560.3|
-| SpaceInvaders | 9352.0| **18789.0**|
-| StarGunner | **211851.5**| 127029.0|
+| Qbert | **38605.7**| 33817.5|
+| Riverraid | 22309.8| N/A|
+| RoadRunner | **64002.0**| 62041.0|
+| Robotank | **74.5**| 61.4|
+| Seaquest | 1843.6| **15898.9**|
+| Skiing | **-11093.2**| -12957.8|
+| Solaris | 911.8| **3560.3**|
+| SpaceInvaders | 2812.9| **18789.0**|
+| StarGunner | **202136.4**| 127029.0|
 | Surround | N/A| 9.7|
-| Tennis | **-0.0**| **0.0**|
-| TimePilot | **27177.0**| 12926.0|
-| Tutankham | 161.1| **241.0**|
-| UpNDown | 260453.0| N/A|
-| Venture | **1359.5**| 5.5|
-| VideoPinball | 465601.0| **533936.5**|
-| WizardOfWor | **22575.0**| 17862.5|
-| YarsRevenge | 80853.9| **102557.0**|
-| Zaxxon | **25779.5**| 22209.5|
+| Tennis | **0.0**| **0.0**|
+| TimePilot | **23123.8**| 12926.0|
+| Tutankham | **250.7**| 241.0|
+| UpNDown | 27630.0| N/A|
+| Venture | 0.0| **5.5**|
+| VideoPinball | 438907.9| **533936.5**|
+| WizardOfWor | **20770.2**| 17862.5|
+| YarsRevenge | 101023.5| **102557.0**|
+| Zaxxon | 14635.1| **22209.5**|
+
 
 ## Evaluation Protocol
 
@@ -115,9 +117,10 @@ Our evaluation protocol is designed to mirror the evaluation protocol of the ori
 
 Time statistics...
 
-| Statistic | | |
-| ------------- |:-------------:|:-------------:|
-| Mean time (in days) across all domains | 11.8778333366 |
-| Fastest Domain | Assault | 11.1949926343 |
-| Slowest Domain | Krull | 12.4206600419 |
+| Training time (in days) across all domains | |
+| ------------- |:-------------:|
+| Mean | 13.2241181406 |
+| Fastest Domain |11.6815262606 (Phoenix)|
+| Slowest Domain | 16.8376358549 (Berzerk)|
+
 
From 7b29e5120c48d2eb2b8fcd45f07ab829fc6846f8 Mon Sep 17 00:00:00 2001
From: Prabhat
Date: Tue, 3 Sep 2019 15:20:57 +0900
Subject: [PATCH 2/2] improves formatting of training times

---
 examples/atari/reproduction/rainbow/README.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/examples/atari/reproduction/rainbow/README.md b/examples/atari/reproduction/rainbow/README.md
index 0da2fb219..f66796b74 100644
--- a/examples/atari/reproduction/rainbow/README.md
+++ b/examples/atari/reproduction/rainbow/README.md
@@ -117,10 +117,11 @@ Our evaluation protocol is designed to mirror the evaluation protocol of the ori
 
 Time statistics...
 
-| Training time (in days) across all domains | |
+| Training time (in days) across all domains | |
 | ------------- |:-------------:|
-| Mean | 13.2241181406 |
-| Fastest Domain |11.6815262606 (Phoenix)|
-| Slowest Domain | 16.8376358549 (Berzerk)|
+| Mean | 13.224 |
+| Fastest Domain |11.682 (Phoenix)|
+| Slowest Domain | 16.838 (Berzerk)|
+