Commit 0a98680

chenkinsaiAdrianSergeCroise
authored
Update transition map details. (#6)
* Update basic actions.
* Add Flatland 3 interplay action and position/direction.
* Add graph demo.
* Update graph demo notebook.
* Rename basic_railway_elements.drawio.png.
* Update getting-started/env.md Co-authored-by: Serge Croisé <[email protected]>
* Update getting-started/env.md Co-authored-by: Serge Croisé <[email protected]>
* Update getting-started/env.md Co-authored-by: Serge Croisé <[email protected]>
* Update getting-started/env.md Co-authored-by: Serge Croisé <[email protected]>
* Update getting-started/env.md Co-authored-by: Serge Croisé <[email protected]>
* Update getting-started/env.md Co-authored-by: Serge Croisé <[email protected]>
* Update getting-started/env.md Co-authored-by: Serge Croisé <[email protected]>

---------

Co-authored-by: Adrian Egli <[email protected]>
Co-authored-by: Serge Croisé <[email protected]>
1 parent 5f5c75d commit 0a98680

6 files changed

+792
-29
lines changed

_toc.yml

+1
Original file line number | Diff line number | Diff line change
@@ -29,6 +29,7 @@ parts:
2929
- file: environment/state_machine
3030
- file: environment/pettingzoo
3131
- file: environment/Agent-Close-Following
32+
- file: environment/graph_demo
3233
# - file: _sources/flatland/docs/10_interface_toc.rst
3334

3435
# - caption: Environment Documentation
Binary image files (67.5 KB, 246 KB) are not rendered.

environment/graph_demo.ipynb

+726
Large diffs are not rendered by default.

getting-started/env.md

+65-29
Original file line number | Diff line number | Diff line change
@@ -3,7 +3,7 @@ Flatland Environment
33

44
The goal in Flatland is simple:
55

6-
> **We seek to minimize the time it takes to bring all the agents to their respective target.**
6+
> **We seek to minimize the time it takes to bring all the agents to their respective target.**
77
88
This raises a number of questions:
99

@@ -12,50 +12,81 @@ This raises a number of questions:
1212
- [**Observations:**](#observations) what can each agent "see"?
1313
- [**Rewards:**](#rewards) what is the metric used to evaluate the agents?
1414

15-
1615
🗺️ Environment
1716
---
1817

19-
Flatland is a 2D rectangular grid environment of arbitrary width and height, where the most primitive unit is a cell. Each cell has the capacity to hold a single agent (train).
18+
Flatland is a 2D rectangular grid environment of arbitrary width and height, where the most primitive unit is a cell. Each cell has the capacity to hold a
19+
single agent (train).
20+
21+
An agent in a cell can have a discrete orientation direction which represents the cardinal direction the agent is pointing to. An agent can move to a subset of
22+
adjacent cells. The subset of adjacent cells that an agent is allowed to transition to is defined by a 4-bit transition map representing possible transitions in
23+
4 different directions.
2024

21-
An agent in a cell can have a discrete orientation direction which represents the cardinal direction the agent is pointing to. An agent can move to a subset of adjacent cells. The subset of adjacent cells that an agent is allowed to transition to is defined by a 4-bit transition map representing possible transitions in 4 different directions.
25+
![basic_railway_elements.drawio.png](../assets/images/basic_railway_elements.drawio.png)
2226

23-
![Imgur](https://i.imgur.com/Q72tAI8.png)
24-
*8 unique cells enable us to implement any realworld railway network in the flatland env*
27+
*10 basic cells modulo rotation enable us to implement any real-world railway network in the flatland env*
28+
This gives a set of 30 valid transitions in total (see `#` giving number of rotations).
2529

26-
Agents can only travel in the direction they are currently facing. Hence, the permitted transitions for any given agent depend both on its position and on its direction. Transition maps define the railway network in the flatland world. One can implement any real world railway network within the Flatland environment by manupulating the transition maps of cells.
27-
28-
For more information on transtion maps checkout [environment information](../environment/environment_information)!
30+
Agents can only travel in the direction they are currently facing. Hence, the permitted transitions for any given agent depend both on its position and on its
31+
direction. Transition maps define the railway network in the flatland world. One can implement any real world railway network within the Flatland environment by
32+
manipulating the transition maps of cells.
33+
34+
For more information on transition maps, check out [environment information](../environment/environment_information)!
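
To make the 4-bit transition map concrete, here is a minimal sketch of decoding one such map into the exits it allows. The bit ordering (N, E, S, W, most significant bit first) and the helper name `allowed_exits` are assumptions made for illustration; they are not taken from the Flatland code base.

```python
from typing import List

# Assumed ordering of the four cardinal directions (illustrative only).
DIRECTIONS = ["N", "E", "S", "W"]


def allowed_exits(transition_bits: int) -> List[str]:
    """Decode a 4-bit transition map into the list of allowed exit directions.

    Bit 3 (most significant) stands for N, bit 0 for W. This layout is an
    assumption for the sketch, not the library's documented encoding.
    """
    return [
        name
        for index, name in enumerate(DIRECTIONS)
        if (transition_bits >> (3 - index)) & 1
    ]


# Hypothetical switch: an agent with a given heading may exit towards N or W.
print(allowed_exits(0b1001))
```

Because the permitted transitions depend on the agent's direction as well as its position, a full cell would carry one such 4-bit map per incoming heading.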
2935

3036

3137
↔️ Actions
3238
---
3339

34-
The trains in Flatland have strongly limited movements, as you would expect from a railway simulation. This means that only a few actions are valid in most cases.
40+
The trains in Flatland have strongly limited movements, as you would expect from a railway simulation. This means that only a few actions are valid in most
41+
cases.
3542

3643
Here are the possible actions:
3744

38-
- **`DO_NOTHING`**: If the agent is already moving, it continues moving. If it is stopped, it stays stopped. Special case: if the agent is at a dead-end, this action will result in the train turning around.
39-
- **`MOVE_LEFT`**: This action is only valid at cells where the agent can change direction towards the left. If chosen, the left transition and a rotation of the agent orientation to the left is executed. If the agent is stopped, this action will cause it to start moving in any cell where forward or left is allowed!
40-
- **`MOVE_FORWARD`**: The agent will move forward. This action will start the agent when stopped. At switches, this will chose the forward direction.
45+
- **`DO_NOTHING`**: If the agent is already moving, it continues moving. If it is stopped, it stays stopped. Special case: if the agent is at a dead-end, this
46+
action will result in the train turning around.
47+
- **`MOVE_LEFT`**: This action is only valid at cells where the agent can change direction towards the left. If chosen, the left transition and a rotation of
48+
the agent orientation to the left is executed. If the agent is stopped, this action will cause it to start moving in any cell where forward or left is
49+
allowed!
50+
- **`MOVE_FORWARD`**: The agent will move forward. This action will start the agent when stopped. At switches, this will choose the forward direction.
4151
- **`MOVE_RIGHT`**: The same as `MOVE_LEFT`, but for right turns.
4252
- **`STOP_MOVING`**: This action causes the agent to stop.
4353

44-
Flatland is a discrete time simulation, i.e. it performs all actions with constant time step. A single simulation step synchronously moves the time forward by a constant increment, thus enacting exactly one action per agent per timestep.
54+
Flatland is a discrete time simulation, i.e. it performs all actions with constant time step. A single simulation step synchronously moves the time forward by a
55+
constant increment, thus enacting exactly one action per agent per timestep.
56+
4557
```{admonition} Code reference
4658
The actions are defined in [flatland.envs.rail_env.RailEnvActions](https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/envs/rail_env.py#L69).
4759
4860
You can refer to the actions in your code using e.g. `RailEnvActions.MOVE_FORWARD`, `RailEnvActions.MOVE_RIGHT`...
4961
```
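
The action semantics and the "one action per agent per timestep" rule can be sketched as follows. This is a simplified stand-in, not the `RailEnvActions` implementation: the `Action` enum, its integer values, the `step` helper, and the fallback to `DO_NOTHING` for agents without an action are all assumptions, and the sketch ignores whether the cell actually permits the requested move.

```python
from enum import IntEnum
from typing import Dict


class Action(IntEnum):
    """Illustrative stand-in for RailEnvActions; integer values are assumed."""
    DO_NOTHING = 0
    MOVE_LEFT = 1
    MOVE_FORWARD = 2
    MOVE_RIGHT = 3
    STOP_MOVING = 4


def next_moving_state(moving: bool, action: Action) -> bool:
    """Return whether the agent is moving after the action is applied."""
    if action == Action.STOP_MOVING:
        return False
    if action == Action.DO_NOTHING:
        return moving  # a moving agent keeps moving, a stopped one stays stopped
    return True  # MOVE_* starts or keeps the agent moving (cell validity ignored)


def step(moving: Dict[int, bool], actions: Dict[int, Action]) -> Dict[int, bool]:
    """Apply exactly one action per agent in a single synchronous time step.

    Agents without an entry in `actions` fall back to DO_NOTHING (assumption).
    """
    return {
        handle: next_moving_state(state, actions.get(handle, Action.DO_NOTHING))
        for handle, state in moving.items()
    }


state = {0: False, 1: True, 2: True}
state = step(state, {0: Action.MOVE_FORWARD, 1: Action.STOP_MOVING})
print(state)
```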
62+
63+
The following diagram shows the interplay of agent position/direction and actions.
64+
65+
The agent (red triangle) is in a left switch cell with direction `W`. The left neighbor cell is a left switch, too.
66+
Upon entering the new cell, the `MOVE_LEFT` action will update the agent's direction to `S`, and the `MOVE_FORWARD` action will keep the agent's direction at
67+
`W`.
68+
69+
![Flatland_3_Update.drawio.png](../assets/images/Flatland_3_Update.drawio.png)
70+
71+
> *Pro memoria*
72+
>
73+
> **current position and direction** determine **next cell**
74+
>
75+
> **action** determines **next direction**
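
The rule of thumb above can be sketched in a few lines. The direction encoding (`N=0, E=1, S=2, W=3`), the `(row, column)` grid convention with row 0 at the top, and the helper names are assumptions for illustration; the sketch also assumes the cell permits the chosen transition.

```python
from typing import Tuple

# Assumed direction encoding and grid convention (row 0 is the top row).
N, E, S, W = 0, 1, 2, 3
DELTA = {N: (-1, 0), E: (0, 1), S: (1, 0), W: (0, -1)}
TURN = {"MOVE_LEFT": -1, "MOVE_FORWARD": 0, "MOVE_RIGHT": 1}


def next_cell(position: Tuple[int, int], direction: int) -> Tuple[int, int]:
    """Current position and direction determine the next cell."""
    d_row, d_col = DELTA[direction]
    return (position[0] + d_row, position[1] + d_col)


def next_direction(direction: int, action: str) -> int:
    """The action determines the direction the agent has in the next cell."""
    return (direction + TURN[action]) % 4


# Agent facing W, as in the diagram: the next cell is the left neighbor.
print(next_cell((5, 5), W))
print(next_direction(W, "MOVE_LEFT") == S)
print(next_direction(W, "MOVE_FORWARD") == W)
```

Keeping the two lookups separate mirrors the text: the move into the neighboring cell does not depend on the action, only the resulting orientation does.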
76+
5077
### 💥 Agent Malfunctions
51-
Malfunctions are implemented to simulate delays by stopping agents at random times for random durations. Train that malfunction can’t move for a random, but known, number of steps. They of course block the trains following them 😬.
78+
79+
Malfunctions are implemented to simulate delays by stopping agents at random times for random durations. Trains that malfunction can’t move for a random, but
80+
known, number of steps. They of course block the trains following them 😬.
5281

5382
👀 Observations
5483
---
5584

56-
In Flatland, you have full control over the observations that your agents will work with. Three observations are provided as starting point. However, you are encouraged to implement your own.
85+
In Flatland, you have full control over the observations that your agents will work with. Three observations are provided as a starting point. However, you are
86+
encouraged to implement your own.
5787

5888
The three provided observations are:
89+
5990
- Global grid observation
6091
- Local grid observation
6192
- Tree observation
@@ -70,7 +101,8 @@ The three provided observations are:
70101
The provided observations are defined in [envs/observations.py](https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/envs/observations.py)
71102
```
72103

73-
Each of the provided observation has its strengths and weaknesses. However, it is unlikely that you will be able to solve the problem by using any single one of them directly. Instead you will need to design your own observation, which can be a combination of the existing ones or which could be radically different.
104+
Each of the provided observations has its strengths and weaknesses. However, it is unlikely that you will be able to solve the problem by using any single one of
105+
them directly. Instead you will need to design your own observation, which can be a combination of the existing ones or which could be radically different.
74106

75107
**[🔗 Create your own observations](../environment/custom_observations)**
76108

@@ -80,37 +112,41 @@ Each of the provided observation has its strengths and weaknesses. However, it i
80112

81113
In **Flat**land 3, rewards are only provided at the end of an episode by default, making it a sparse-reward setting.
82114

83-
The episodes finish when all the trains have reached their target, or when the maximum number of time steps is reached.
115+
The episodes finish when all the trains have reached their target, or when the maximum number of time steps is reached.
84116

85117
The actual reward structure has the following cases:
86118

87-
- **Train has arrived at it's target**: The agent will be given a reward of 0 for arriving on time or before the expected time. For arriving at the target later than the specified time, the agent is given a negative reward proportional to the delay.
88-
`min(latest_arrival - actual_arrival, 0 )`
119+
- **Train has arrived at its target**: The agent will be given a reward of 0 for arriving on time or before the expected time. For arriving at the target later
120+
than the specified time, the agent is given a negative reward proportional to the delay.
121+
`min(latest_arrival - actual_arrival, 0 )`
89122

90-
- **The train did not reach it's target yet**: The reward is negative and equal to the estimated amount of time needed by the agent to reach its target from it's current position, if it travels on the shortest path to the target, while accounting for it's latest arrival time.
91-
`agent.get_current_delay()` *refer to it in detail [here](../environment/timetables)*
92-
The value returned will be positive if the expected arrival time is projected before latest arrival and negative if the expected arrival time is projected after latest arrival. Since it is called at the end of the episode, the agent is already past it's deadline and so the value will always be negative.
123+
- **The train did not reach its target yet**: The reward is negative and equal to the estimated amount of time needed by the agent to reach its target from
124+
its current position, if it travels on the shortest path to the target, while accounting for its latest arrival time.
125+
`agent.get_current_delay()` *described in detail [here](../environment/timetables)*
126+
The value returned will be positive if the expected arrival time is projected before latest arrival and negative if the expected arrival time is projected
127+
after latest arrival. Since it is called at the end of the episode, the agent is already past its deadline and so the value will always be negative.
93128

94-
- **The train never departed**: If the agent hasn't departed (i.e. status is `READY_TO_DEPART`) at the end of the episode, it is considered to be cancelled and the following reward is provided.
95-
`-1 * cancellation_factor * (travel_time_on_shortest_path + cancellation_time_buffer)`
129+
- **The train never departed**: If the agent hasn't departed (i.e. status is `READY_TO_DEPART`) at the end of the episode, it is considered to be cancelled and
130+
the following reward is provided.
131+
`-1 * cancellation_factor * (travel_time_on_shortest_path + cancellation_time_buffer)`
96132

97133
```{admonition} Code reference
98134
The reward is calculated in [envs/rail_env.py](https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/envs/rail_env.py)
99135
```
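
The two closed-form reward cases above can be written out directly. The formulas come from the text; the function names and the example numbers are hypothetical, and the case of a train still en route is left out because it relies on `agent.get_current_delay()`.

```python
def arrival_reward(latest_arrival: int, actual_arrival: int) -> int:
    """Reward for a train that reached its target: 0 if on time, else minus the delay."""
    return min(latest_arrival - actual_arrival, 0)


def cancellation_reward(
    travel_time_on_shortest_path: int,
    cancellation_factor: float,
    cancellation_time_buffer: int,
) -> float:
    """Reward for a train that never departed and therefore counts as cancelled."""
    return -1 * cancellation_factor * (
        travel_time_on_shortest_path + cancellation_time_buffer
    )


# Hypothetical numbers for illustration only.
print(arrival_reward(latest_arrival=50, actual_arrival=40))
print(arrival_reward(latest_arrival=50, actual_arrival=57))
print(cancellation_reward(20, cancellation_factor=2, cancellation_time_buffer=5))
```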
100136

101-
102-
103137
🚉 Other concepts
104138
-----------------
105139

106140
### Stochasticity
107141

108-
An important aspect of these levels will be their **stochasticity**, which means how often and for how long trains will malfunction. Malfunctions force the agents the reconsider their plans which can be costly.
142+
An important aspect of these levels will be their **stochasticity**, which means how often and for how long trains will malfunction. Malfunctions force the
143+
agents to reconsider their plans, which can be costly.
109144

110145
**[🔗 Adjust stochasticity](../environment/stochasticity)**
111146

112147
### Speed profiles
113148

114-
Finally, trains in real railway networks don't all move at the same speed. A freight train will for example be slower than a passenger train. This is an important consideration, as you want to avoid scheduling a fast train behind a slow train!
149+
Finally, trains in real railway networks don't all move at the same speed. A freight train will for example be slower than a passenger train. This is an
150+
important consideration, as you want to avoid scheduling a fast train behind a slow train!
115151

116152
**[🔗 Tune speed profiles](../environment/speed_profiles)**
