
Commit db97bf6
Author: Chris Elion
Academy singleton docs (#3218)
1 parent 6d70c0d

15 files changed: +23 -57 lines

docs/FAQ.md

Lines changed: 1 addition & 1 deletion

@@ -94,7 +94,7 @@ UnityEnvironment(file_name=filename, worker_id=X)
 
 If you receive a message `Mean reward : nan` when attempting to train a model
 using PPO, this is due to the episodes of the Learning Environment not
-terminating. In order to address this, set `Max Steps` for either the Academy or
+terminating. In order to address this, set `Max Steps` for the
 Agents within the Scene Inspector to a value greater than 0. Alternatively, it
 is possible to manually set `done` conditions for episodes from within scripts
 for custom episode-terminating events.
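For reference, a minimal sketch of such a manual `done` condition, assuming this release's `Agent.Done()` and `Agent.SetReward()` methods; the fall check is a hypothetical episode-terminating event:

```csharp
using MLAgents;
using UnityEngine;

public class MyAgent : Agent
{
    public override void AgentAction(float[] vectorAction)
    {
        // Hypothetical terminating event: end the episode manually
        // instead of waiting for Max Steps to elapse.
        if (transform.position.y < 0f)
        {
            SetReward(-1.0f);
            Done();  // marks this Agent's episode as finished
        }
    }
}
```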

docs/Getting-Started-with-Balance-Ball.md

Lines changed: 0 additions & 3 deletions

@@ -48,9 +48,6 @@ it contains not one, but several agent cubes. Each agent cube in the scene is a
 independent agent, but they all share the same Behavior. 3D Balance Ball does this
 to speed up training since all twelve agents contribute to training in parallel.
 
-### Academy
-
-The Academy object for the scene is placed on the Ball3DAcademy GameObject.
 
 ### Agent
 

docs/Glossary.md

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 # ML-Agents Toolkit Glossary
 
-* **Academy** - Unity Component which controls timing, reset, and
+* **Academy** - Singleton object which controls timing, reset, and
   training/inference settings of the environment.
 * **Action** - The carrying-out of a decision on the part of an agent within the
   environment.
@@ -12,7 +12,7 @@
   carried out given an observation.
 * **Editor** - The Unity Editor, which may include any pane (e.g. Hierarchy,
   Scene, Inspector).
-* **Environment** - The Unity scene which contains Agents and the Academy.
+* **Environment** - The Unity scene which contains Agents.
 * **FixedUpdate** - Unity method called each time the game engine is
   stepped. ML-Agents logic should be placed here.
 * **Frame** - An instance of rendering the main camera for the display.

docs/Learning-Environment-Create-New.md

Lines changed: 2 additions & 26 deletions

@@ -17,13 +17,11 @@ steps:
 1. Create an environment for your agents to live in. An environment can range
    from a simple physical simulation containing a few objects to an entire game
    or ecosystem.
-2. Add an Academy MonoBehaviour to a GameObject in the Unity scene
-   containing the environment.
-3. Implement your Agent subclasses. An Agent subclass defines the code an Agent
+2. Implement your Agent subclasses. An Agent subclass defines the code an Agent
    uses to observe its environment, to carry out assigned actions, and to
    calculate the rewards used for reinforcement training. You can also implement
    optional methods to reset the Agent when it has finished or failed its task.
-4. Add your Agent subclasses to appropriate GameObjects, typically, the object
+3. Add your Agent subclasses to appropriate GameObjects, typically, the object
    in the scene that represents the Agent in the simulation.
 
 **Note:** If you are unfamiliar with Unity, refer to
@@ -103,27 +101,6 @@ different material from the list of all materials currently in the project.)
 Note that we will create an Agent subclass to add to this GameObject as a
 component later in the tutorial.
 
-### Add an Empty GameObject to Hold the Academy
-
-1. Right click in Hierarchy window, select Create Empty.
-2. Name the GameObject "Academy"
-
-![The scene hierarchy](images/mlagents-NewTutHierarchy.png)
-
-You can adjust the camera angles to give a better view of the scene at runtime.
-The next steps will be to create and add the ML-Agent components.
-
-## Add an Academy
-The Academy object coordinates the ML-Agents in the scene and drives the
-decision-making portion of the simulation loop. Every ML-Agent scene needs one
-(and only one) Academy instance.
-
-First, add an Academy component to the Academy GameObject created earlier:
-
-1. Select the Academy GameObject to view it in the Inspector window.
-2. Click **Add Component**.
-3. Select **Academy** in the list of components.
-
 ## Implement an Agent
 
 To create the Agent:
@@ -524,7 +501,6 @@ to use Unity ML-Agents: an Academy and one or more Agents.
 
 Keep in mind:
 
-* There can only be one Academy game object in a scene.
 * If you are using multiple training areas, make sure all the Agents have the same `Behavior Name`
   and `Behavior Parameters`
 
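To make steps 2 and 3 of the revised list concrete, a minimal Agent subclass sketch in the style of this tutorial's roller-ball example (class name, observations, and reward logic are illustrative):

```csharp
using MLAgents;
using UnityEngine;

// Illustrative Agent subclass: gathers observations, reacts to actions,
// and resets itself between episodes. Attach it to the GameObject that
// represents the agent in the scene.
public class RollerAgent : Agent
{
    Rigidbody rBody;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void CollectObservations()
    {
        // What the Policy sees at each decision (values illustrative).
        AddVectorObs(transform.position);
        AddVectorObs(rBody.velocity.x);
        AddVectorObs(rBody.velocity.z);
    }

    public override void AgentAction(float[] vectorAction)
    {
        // Apply the received continuous actions as a force.
        var force = new Vector3(vectorAction[0], 0f, vectorAction[1]);
        rBody.AddForce(force * 10f);
    }

    public override void AgentReset()
    {
        // Optional: reset state when the Agent finishes or fails its task.
        transform.position = new Vector3(0f, 0.5f, 0f);
        rBody.velocity = Vector3.zero;
    }
}
```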

docs/Learning-Environment-Design.md

Lines changed: 5 additions & 10 deletions

@@ -51,7 +51,7 @@ The ML-Agents Academy class orchestrates the agent simulation loop as follows:
 an Agent to restart if it finishes before the end of an episode. In this
 case, the Academy calls the `AgentReset()` function.
 
-To create a training environment, extend the Academy and Agent classes to
+To create a training environment, extend the Agent class to
 implement the above methods. The `Agent.CollectObservations()` and
 `Agent.AgentAction()` functions are required; the other methods are optional —
 whether you need to implement them or not depends on your specific scenario.
@@ -64,14 +64,13 @@ information.
 
 ## Organizing the Unity Scene
 
-To train and use the ML-Agents toolkit in a Unity scene, the scene must contain
-a single Academy and as many Agent subclasses as you need.
+To train and use the ML-Agents toolkit in a Unity scene, the scene must contain as many Agent subclasses as you need.
 Agent instances should be attached to the GameObject representing that Agent.
 
 ### Academy
 
-The Academy object orchestrates Agents and their decision making processes. Only
-place a single Academy object in a scene.
+The Academy is a singleton which orchestrates Agents and their decision making processes. Only
+a single Academy exists at a time.
 
 #### Academy resetting
 To alter the environment at the start of each episode, add your method to the Academy's OnEnvironmentReset action.
@@ -81,9 +80,7 @@ public class MySceneBehavior : MonoBehaviour
 {
     public void Awake()
     {
-        var academy = FindObjectOfType<Academy>();
-        academy.LazyInitialization();
-        academy.OnEnvironmentReset += EnvironmentReset;
+        Academy.Instance.OnEnvironmentReset += EnvironmentReset;
    }
 
     void EnvironmentReset()
@@ -144,8 +141,6 @@ training and for testing trained agents. Or, you may be training agents to
 operate in a complex game or simulation. In this case, it might be more
 efficient and practical to create a purpose-built training scene.
 
-Both training and testing (or normal game) scenes must contain an Academy object
-to control the agent decision making process.
 When you create a training environment in Unity, you must set up the scene so
 that it can be controlled by the external training process. Considerations
 include:
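Pieced together, a runnable version of the snippet changed above (a sketch: the `MLAgents` namespace matches this release, and the reset body is illustrative):

```csharp
using MLAgents;
using UnityEngine;

public class MySceneBehavior : MonoBehaviour
{
    public void Awake()
    {
        // Accessing Academy.Instance initializes the singleton lazily,
        // so the old FindObjectOfType/LazyInitialization calls are gone.
        Academy.Instance.OnEnvironmentReset += EnvironmentReset;
    }

    void EnvironmentReset()
    {
        // Illustrative: reposition scene objects, regenerate obstacles,
        // or reset counters at the start of each episode.
    }
}
```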

docs/Limitations.md

Lines changed: 4 additions & 3 deletions

@@ -16,12 +16,13 @@ making. See
 [Execution Order of Event Functions](https://docs.unity3d.com/Manual/ExecutionOrder.html)
 for more information.
 
+You can control the frequency of Academy stepping by calling
+`Academy.Instance.DisableAutomaticStepping()`, and then calling
+`Academy.Instance.EnvironmentStep()`.
+
 ## Python API
 
 ### Python version
 
 As of version 0.3, we no longer support Python 2.
 
-### TensorFlow support
-
-Currently the Ml-Agents toolkit uses TensorFlow 1.7.1 only.
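A minimal sketch of manual stepping built on the two calls added above; the every-other-FixedUpdate rate is just an illustrative choice:

```csharp
using MLAgents;
using UnityEngine;

public class ManualStepper : MonoBehaviour
{
    int fixedUpdateCount;

    void Awake()
    {
        // Stop the Academy from stepping itself during FixedUpdate.
        Academy.Instance.DisableAutomaticStepping();
    }

    void FixedUpdate()
    {
        // Step the simulation at half the physics rate (illustrative).
        fixedUpdateCount++;
        if (fixedUpdateCount % 2 == 0)
        {
            Academy.Instance.EnvironmentStep();
        }
    }
}
```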

docs/ML-Agents-Overview.md

Lines changed: 2 additions & 4 deletions

@@ -131,17 +131,15 @@ components:
 
 _Simplified block diagram of ML-Agents._
 
-The Learning Environment contains two additional components that help
+The Learning Environment contains an additional component that helps
 organize the Unity scene:
 
 - **Agents** - which is attached to a Unity GameObject (any character within a
   scene) and handles generating its observations, performing the actions it
   receives and assigning a reward (positive / negative) when appropriate. Each
   Agent is linked to a Policy.
-- **Academy** - which orchestrates the observation and decision making process.
-  The External Communicator lives within the Academy.
 
-Every Learning Environment will always have one global Academy and one Agent for
+Every Learning Environment will always have one Agent for
 every character in the scene. While each Agent must be linked to a Policy, it is
 possible for Agents that have similar observations and actions to have
 the same Policy type. In our sample game, we have two teams each with their own medic.

docs/Migrating.md

Lines changed: 2 additions & 2 deletions

@@ -11,14 +11,14 @@ The versions can be found in
 ## Migrating from 0.13 to latest
 
 ### Important changes
-* The Academy class was changed to be sealed and its virtual methods were removed.
+* The Academy class was changed to a singleton, and its virtual methods were removed.
 * Trainer steps are now counted per-Agent, not per-environment as in previous versions. For instance, if you have 10 Agents in the scene, 20 environment steps now corresponds to 200 steps as printed in the terminal and in Tensorboard.
 * Curriculum config files are now YAML formatted and all curricula for a training run are combined into a single file.
 * The `--num-runs` command-line option has been removed.
 
 ### Steps to Migrate
 * If you have a class that inherits from Academy:
-  * If the class didn't override any of the virtual methods and didn't store any additional data, you can just replace the instance of it in the scene with an Academy.
+  * If the class didn't override any of the virtual methods and didn't store any additional data, you can just remove the old script from the scene.
   * If the class had additional data, create a new MonoBehaviour and store the data on this instead.
   * If the class overrode the virtual methods, create a new MonoBehaviour and move the logic to it:
     * Move the InitializeAcademy code to MonoBehaviour.OnAwake
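A hedged before/after sketch of those migration steps; `MyAcademy` and its overrides are hypothetical, and the replacement uses the `OnEnvironmentReset` hook described in Learning-Environment-Design.md:

```csharp
using MLAgents;
using UnityEngine;

// Before (0.13): a hypothetical subclass overriding the virtual methods.
//
// public class MyAcademy : Academy
// {
//     public override void InitializeAcademy() { /* one-time setup */ }
//     public override void AcademyReset()      { /* per-episode reset */ }
// }

// After: a plain MonoBehaviour hooking the Academy singleton instead.
public class MyAcademyBehavior : MonoBehaviour
{
    void Awake()
    {
        // InitializeAcademy logic moves into Awake.
        Academy.Instance.OnEnvironmentReset += EnvironmentReset;
    }

    void EnvironmentReset()
    {
        // AcademyReset logic moves here.
    }
}
```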

docs/Python-API.md

Lines changed: 1 addition & 2 deletions

@@ -263,7 +263,6 @@ i = env.reset()
 Once a property has been modified in Python, you can access it in C# after the next call to `step` as follows:
 
 ```csharp
-var academy = FindObjectOfType<Academy>();
-var sharedProperties = academy.FloatProperties;
+var sharedProperties = Academy.Instance.FloatProperties;
 float property1 = sharedProperties.GetPropertyWithDefault("parameter_1", 0.0f);
 ```

docs/Training-Curriculum-Learning.md

Lines changed: 2 additions & 2 deletions

@@ -41,8 +41,8 @@ the same environment.
 In order to define the curricula, the first step is to decide which parameters of
 the environment will vary. In the case of the Wall Jump environment,
 the height of the wall is what varies. We define this as a `Shared Float Property`
-that can be accessed in `Academy.FloatProperties`, and by doing so it becomes
-adjustable via the Python API.
+that can be accessed in `Academy.Instance.FloatProperties`, and by doing
+so it becomes adjustable via the Python API.
 Rather than adjusting it by hand, we will create a YAML file which
 describes the structure of the curricula. Within it, we can specify which
 points in the training process our wall height will change, either based on the
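As a sketch of the C# side of that change, reading the shared property when configuring the scene; the `big_wall_height` key and `wall` field are hypothetical:

```csharp
using MLAgents;
using UnityEngine;

public class WallHeightController : MonoBehaviour
{
    public Transform wall;  // hypothetical wall object in the scene

    void ConfigureWall()
    {
        // Curriculum-controlled value set via the Python API; the
        // default applies when no trainer has set it yet.
        var height = Academy.Instance.FloatProperties
            .GetPropertyWithDefault("big_wall_height", 4.0f);
        wall.localScale = new Vector3(wall.localScale.x, height, wall.localScale.z);
    }
}
```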
