tactics2d.envs
ParkingEnv
Bases: Env
This class provides an environment to train a vehicle to park in a parking lot that has no dynamic traffic participants, such as pedestrians or other vehicles.
Observation
`ParkingEnv` provides two types of observations:
- Camera: A top-down semantic segmentation image of the agent vehicle and its surroundings. The perception range is 20 meters. The image is returned as a 3D numpy array with the shape of (200, 200, 3).
- LiDAR: A single-line lidar sensor that scans a full 360-degree view with a range of 20 meters. The lidar data is returned as a 1D numpy array with the shape of (120,).
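A 120-beam scan over 360 degrees works out to one beam every 3 degrees. The sketch below converts such a scan into 2D points in the sensor frame. The helper name `lidar_to_points`, the starting angle, and the counter-clockwise beam ordering are assumptions made for illustration; the source does not state how the beams are ordered.

```python
import math

NUM_BEAMS = 120   # from the documented lidar shape (120,)
MAX_RANGE = 20.0  # documented lidar range in meters


def lidar_to_points(ranges, start_angle=0.0):
    """Convert range readings to (x, y) points in the sensor frame.

    Assumption: beam i points at start_angle + i * (2 * pi / NUM_BEAMS),
    counter-clockwise. The environment's actual ordering may differ.
    """
    if len(ranges) != NUM_BEAMS:
        raise ValueError(f"expected {NUM_BEAMS} readings, got {len(ranges)}")
    angle_step = 2.0 * math.pi / NUM_BEAMS  # 3 degrees per beam
    points = []
    for i, r in enumerate(ranges):
        r = min(max(float(r), 0.0), MAX_RANGE)  # keep readings within range
        angle = start_angle + i * angle_step
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points
```

Under these assumptions, beam 0 lies on the positive x-axis and beam 30 on the positive y-axis.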
Action
`ParkingEnv` accepts either a continuous or a discrete action command for the agent vehicle.
- The continuous action is a numpy array with the shape of (2,) representing the steering angle and the acceleration.
- The discrete action is an integer from 1 to 5, representing the following actions:
    - Do nothing: (0, 0)
    - Turn left: (-0.5, 0)
    - Turn right: (0.5, 0)
    - Move forward: (0, 1)
    - Move backward: (0, -1)
Each action tuple has the form (steering angle, acceleration). The steering angle is given in radians and should be in the range of [-0.75, 0.75]. The acceleration is given in m/s$^2$ and should be in the range of [-2.0, 2.0].
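The mapping from a discrete command to a (steering, acceleration) pair, together with the stated ranges, can be sketched as follows. The helper `to_continuous` and its `first_index` parameter are hypothetical; indexing from 1 follows the text above, and clipping out-of-range values (rather than rejecting them) is an assumption, not the library's confirmed behavior.

```python
STEER_RANGE = (-0.75, 0.75)  # radians, as documented
ACCEL_RANGE = (-2.0, 2.0)    # m/s^2, as documented

# Listed in the documented order: nothing, left, right, forward, backward.
DISCRETE_ACTIONS = [
    (0.0, 0.0),
    (-0.5, 0.0),
    (0.5, 0.0),
    (0.0, 1.0),
    (0.0, -1.0),
]


def clip(value, low, high):
    """Clamp value into [low, high]."""
    return max(low, min(high, value))


def to_continuous(action, first_index=1):
    """Return a (steering, acceleration) tuple for either action type.

    Assumption: discrete actions are numbered starting at first_index.
    """
    if isinstance(action, int):
        idx = action - first_index
        if not 0 <= idx < len(DISCRETE_ACTIONS):
            raise ValueError(f"discrete action {action} is out of range")
        steer, accel = DISCRETE_ACTIONS[idx]
    else:
        steer, accel = action
    return (clip(float(steer), *STEER_RANGE), clip(float(accel), *ACCEL_RANGE))
```

For example, `to_continuous(2)` yields the "turn left" pair, while `to_continuous((1.0, -3.0))` is clamped to the limits of both ranges.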
Status and Reward
The environment provides a reward for each step. The status check and the reward are determined by the following rules:
- Check time exceed: If the time step exceeds the maximum time step (20000 steps by default), the scenario status will be set to `TIME_EXCEEDED` and a negative reward of -1 will be given.
- Check no action: If the agent vehicle does not move for over 100 steps, the scenario status will be set to `NO_ACTION` and a negative reward of -1 will be given.
- Check out bound: If the agent vehicle goes out of the boundary of the map, the scenario status will be set to `OUT_BOUND` and a negative reward of -5 will be given.
- Check collision: If the agent vehicle collides with the static obstacles, the traffic status will be set to `COLLISION_STATIC` and a negative reward of -5 will be given.
- Check completed: If the agent vehicle successfully parks in the target area, the scenario status will be set to `COMPLETED` and a positive reward of 5 will be given.
- Otherwise, the reward is calculated as the sum of the time penalty and the IoU reward. The time penalty is calculated as -tanh(t / T) * 0.1, where t is the current time step and T is the maximum time step. The IoU reward is calculated as the difference between the current IoU and the maximum IoU.
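The non-terminal reward described in the last rule can be written out as a small sketch. The function names are hypothetical, and treating "the maximum IoU" as the best IoU reached before the current step is an assumption; the source does not say whether the maximum is updated before or after the difference is taken.

```python
import math


def time_penalty(t, max_step):
    """Time penalty -tanh(t / T) * 0.1, with T the maximum time step."""
    return -math.tanh(t / max_step) * 0.1


def iou_reward(current_iou, best_iou_so_far):
    """Difference between the current IoU and the maximum IoU.

    Assumption: best_iou_so_far is the maximum over previous steps.
    """
    return current_iou - best_iou_so_far


def step_reward(t, max_step, current_iou, best_iou_so_far):
    """Reward for a step on which no terminal status was triggered."""
    return time_penalty(t, max_step) + iou_reward(current_iou, best_iou_so_far)
```

Because tanh is bounded by 1, the time penalty never falls below -0.1, so it stays small next to the terminal rewards of -1, -5, and 5.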
If the agent has successfully completed the scenario, the environment will set the terminated flag to True. If the scenario status or the traffic status becomes abnormal, the environment will set the truncated flag to True.
The status information is returned as a dictionary with the following keys:
- `lidar`: A 1D numpy array with the shape of (120,) representing the lidar data.
- `state`: The current state of the agent vehicle.
- `target_area`: The coordinates of the target area.
- `target_heading`: The heading of the target area.
- `traffic_status`: The status of the traffic scenario.
- `scenario_status`: The status of the scenario.
__init__(type_proportion=0.5, render_mode='human', render_fps=60, max_step=int(20000.0), continuous=True)
Initialize the parking environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`type_proportion` | `float` | The proportion of "bay" parking scenarios among all generated scenarios. It should be in the range of [0, 1]. If the input is out of the range, it will be clipped to the range. When it is 0, the generator only generates "parallel" parking scenarios. When it is 1, the generator only generates "bay" parking scenarios. | `0.5` |
`render_mode` | `str` | The mode of the rendering. It can be "human" or "rgb_array". | `'human'` |
`render_fps` | `int` | The frame rate of the rendering. | `60` |
`max_step` | `int` | The maximum time step of the scenario. | `int(20000.0)` |
`continuous` | `bool` | Whether to use continuous action space. | `True` |
Raises:
Type | Description |
---|---|
`NotImplementedError` | If the render mode is not supported. |
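The effect of `type_proportion` can be illustrated with a short sketch: the value is clipped to [0, 1] and then used as the probability of a "bay" scenario. The helper `pick_scenario_type` is hypothetical, and per-scenario random sampling is an assumption about how the proportion is applied; the source only describes the clipping and the two extremes.

```python
import random


def pick_scenario_type(type_proportion, rng=None):
    """Pick "bay" with probability type_proportion, else "parallel".

    Out-of-range inputs are clipped to [0, 1], as documented.
    Sampling per scenario is an illustrative assumption.
    """
    rng = rng or random.Random()
    proportion = min(1.0, max(0.0, type_proportion))
    # rng.random() is in [0, 1), so 0 never yields "bay" and 1 always does.
    return "bay" if rng.random() < proportion else "parallel"
```

Passing a seeded `random.Random` instance makes the sequence of scenario types reproducible.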
reset(seed=None, options=None)
This function resets the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`seed` | `int` | The random seed. | `None` |
`options` | `dict` | The options for the environment. | `None` |
Returns:
Name | Type | Description |
---|---|---|
`observation` | `np.array` | The BEV image observation of the environment. |
`infos` | `dict` | The information of the environment. |
step(action)
This function takes a step in the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`action` | `Union[Tuple[float], int]` | The action command for the agent vehicle. | required |
Raises:
Type | Description |
---|---|
`InvalidAction` | If the action is not in the action space. |
Returns:
Name | Type | Description |
---|---|---|
`observation` | `array` | The BEV image observation of the environment. |
`reward` | `float` | The reward of the environment. |
`terminated` | `bool` | Whether the scenario is terminated. If the agent has completed the scenario, the scenario is terminated. |
`truncated` | `bool` | Whether the scenario is truncated. If the scenario status goes abnormal or the traffic status goes abnormal, the scenario is truncated. |
`infos` | `dict` | The information of the environment. |
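The `reset` and `step` return values documented above fit a standard episode loop. The sketch below is a generic helper that works with any object exposing those two methods; the name `run_episode`, the `policy` callable, and the `max_steps` safety cap are assumptions for illustration and are not part of the library.

```python
def run_episode(env, policy, max_steps=1000):
    """Run one episode and return (total_reward, steps, terminated, truncated).

    env must provide reset() -> (observation, infos) and
    step(action) -> (observation, reward, terminated, truncated, infos).
    policy is any callable mapping (observation, infos) to an action.
    """
    observation, infos = env.reset()
    total_reward = 0.0
    steps = 0
    terminated = truncated = False
    while not (terminated or truncated) and steps < max_steps:
        action = policy(observation, infos)
        observation, reward, terminated, truncated, infos = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps, terminated, truncated


# Hypothetical usage with the documented constructor (requires tactics2d):
# from tactics2d.envs import ParkingEnv
# env = ParkingEnv(render_mode="rgb_array", continuous=True)
# print(run_episode(env, lambda obs, infos: (0.0, 1.0)))
```

The loop stops on either flag, so a caller can tell a successful episode (`terminated`) from an abnormal one (`truncated`) by inspecting the returned values.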
RacingEnv
Bases: Env
This class provides an environment to train a racing car to drive on a racing track.
Observation
`RacingEnv` provides a top-down semantic segmentation image of the agent vehicle and its surroundings. The observation is represented as a
Action
`RacingEnv` accepts either a continuous or a discrete action command for the agent vehicle.
- The continuous action is a numpy array with the shape of (2,) representing the steering angle and the acceleration.
- The discrete action is an integer from 1 to 5, representing the following actions:
    - Do nothing: (0, 0)
    - Turn left: (-0.5, 0)
    - Turn right: (0.5, 0)
    - Move forward: (0, 1)
    - Move backward: (0, -1)
Each action tuple has the form (steering angle, acceleration). The steering angle is given in radians and should be in the range of [-0.75, 0.75]. The acceleration is given in m/s$^2$ and should be in the range of [-2.0, 2.0].
Status and Reward
The environment provides a reward for each step. The status check and the reward are determined by the following rules:
- Check time exceed: If the time step exceeds the maximum time step (100000 steps by default), the scenario status will be set to `TIME_EXCEEDED` and a negative reward of -1 will be given.
- Check no action: If the agent vehicle does not move for over 100 steps, the scenario status will be set to `NO_ACTION` and a negative reward of -1 will be given.
- Check out bound: If the agent vehicle goes out of the boundary of the map, the scenario status will be set to `OUT_BOUND` and a negative reward of -5 will be given.
- Check off road: If the agent vehicle drives off the road, the scenario status will be set to `OFF_ROAD` and a negative reward of -5 will be given.
- Check arrived: If the agent vehicle arrives at the destination, the scenario status will be set to `ARRIVED` and a positive reward of 10 will be given.
- Otherwise,
If the agent has successfully completed the scenario, the environment will set the terminated flag to True. If the scenario status or the traffic status becomes abnormal, the environment will set the truncated flag to True.
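The racing rules above amount to a lookup from status to reward and episode flags, sketched below. The plain strings stand in for the library's status values, `resolve_status` is a hypothetical helper, and treating every listed failure status as "abnormal" (and therefore truncating) is an assumption. The reward for an ordinary step is not specified in the text, so the sketch returns `None` for it instead of guessing.

```python
# status -> (reward, terminated, truncated); strings are stand-ins
# for the library's status values.
TERMINAL_OUTCOMES = {
    "TIME_EXCEEDED": (-1.0, False, True),
    "NO_ACTION": (-1.0, False, True),
    "OUT_BOUND": (-5.0, False, True),
    "OFF_ROAD": (-5.0, False, True),
    "ARRIVED": (10.0, True, False),
}


def resolve_status(status):
    """Return (reward, terminated, truncated), or None for an ordinary step.

    None means no rule above fired; the per-step reward in that case
    is not given in the source.
    """
    return TERMINAL_OUTCOMES.get(status)
```

Keeping the rules in one table makes it easy to check that exactly one status, `ARRIVED`, ends the episode with the terminated flag.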
The status information is returned as a dictionary with the following keys:
- `state`: The current state of the agent vehicle.
- `traffic_status`: The status of the traffic scenario.
- `scenario_status`: The status of the scenario.
__init__(render_mode='human', render_fps=60, max_step=int(100000.0), continuous=True)
Initialize the racing environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`render_mode` | `str` | The mode of the rendering. It can be "human" or "rgb_array". | `'human'` |
`render_fps` | `int` | The frame rate of the rendering. | `60` |
`max_step` | `int` | The maximum time step of the scenario. | `int(100000.0)` |
`continuous` | `bool` | Whether to use continuous action space. | `True` |
Raises:
Type | Description |
---|---|
`NotImplementedError` | If the render mode is not supported. |
step(action)
This function takes a step in the environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`action` | `Union[array, int]` | The action command for the agent vehicle. | required |
Raises:
Type | Description |
---|---|
`InvalidAction` | If the action is not in the action space. |
Returns:
Name | Type | Description |
---|---|---|
`observation` | `array` | The BEV image observation of the environment. |
`reward` | `float` | The reward of the environment. |
`terminated` | `bool` | Whether the scenario is terminated. If the agent has completed the scenario, the scenario is terminated. |
`truncated` | `bool` | Whether the scenario is truncated. If the scenario status goes abnormal or the traffic status goes abnormal, the scenario is truncated. |
`infos` | `dict` | The information of the environment. |