Waymo Open Motion Dataset (WOMD)

About

Waymo Open Motion Dataset (WOMD) is a large-scale motion forecasting benchmark built from real autonomous driving logs collected in San Francisco, Phoenix, Mountain View, Los Angeles, Detroit, and Seattle. Each scenario in the standard splits contains approximately 9 seconds of 10 Hz agent trajectories (the sample scenario parsed later on this page spans 0 to 8997 ms), together with 3D map features and dynamic traffic signal states.

The dataset is designed for motion forecasting, interaction reasoning, map-aware behaviour analysis, and simulation-ready traffic understanding.

How to Obtain the Dataset

Download the Motion Dataset shards from the official Waymo Open Dataset portal:

Waymo Open Dataset download page
Recommended format for the current parser: Motion Dataset v1.3.1 -> uncompressed/scenario/{training,validation,validation_interactive}/

For local experiments with Tactics2D, place a shard under:

./tactics2d/data/trajectory_sample/WOMD/

You may keep the original shard name, or rename a local sample shard to motion_data_one_scenario.tfrecord.
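
A quick way to confirm that a shard sits where you expect is to list the directory. The sketch below uses only the Python standard library. The helper name find_shards is hypothetical and not part of Tactics2D, and matching on the substring "tfrecord" is an assumption based on the two file names mentioned on this page.

```python
from pathlib import Path


def find_shards(folder):
    """Return the sorted names of files in `folder` that look like TFRecord shards."""
    root = Path(folder)
    if not root.is_dir():
        return []
    # Official shard names end in something like ".tfrecord-00000-of-00150",
    # so match on the substring instead of the file suffix.
    return sorted(
        path.name
        for path in root.iterdir()
        if path.is_file() and "tfrecord" in path.name
    )


shards = find_shards("./tactics2d/data/trajectory_sample/WOMD/")
```

An empty list means either the directory does not exist or no shard has been copied into it yet.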

Supported Map and Trajectory Elements in `Tactics2D`

The current WOMDParser supports:

tracked participants and time-indexed trajectories
lane boundaries and lane relations
road edges and road lines
crosswalks and speed bumps
driveway polygons, stored in Tactics2D as drivable_area
stop signs and dynamic lane signal states, exposed as traffic_light regulations

Data Analysis

This analysis is based on official WOMD sample shards available in the local test workspace. A full-dataset statistical study can be added later when we have sufficient local storage and processing resources.

%matplotlib inline
import warnings

warnings.filterwarnings("ignore")

import logging

logging.basicConfig(level=logging.WARNING)

import matplotlib as mpl
import seaborn as sns

mpl.rcParams.update(
    {
        "figure.dpi": 200,
        "font.family": "DejaVu Sans Mono",
        "font.size": 6,
        "font.stretch": "semi-expanded",
        "animation.html": "html5",
        "animation.embed_limit": 5000,
        "axes.edgecolor": "black",
        "axes.linewidth": 0.8,
        "axes.grid": True,
        "grid.color": "#cccccc",
        "axes.facecolor": "white",
    }
)
sns.set_palette("Set2")

From the local sample shard, we can already observe several characteristic WOMD patterns:

heterogeneous traffic participants (vehicles, pedestrians, cyclists)
lane-centric topology with explicit left/right boundaries
polygonal semantic areas such as crosswalks, speed bumps, and drivable areas
dynamic lane signal states that can be consumed by downstream traffic-light-aware reasoning

Tactics2D Integration

%matplotlib notebook

from shapely.geometry import Point
from matplotlib.animation import FuncAnimation
import numpy as np

from tactics2d.dataset_parser import WOMDParser
from tactics2d.sensor import BEVCamera
from tactics2d.renderer import MatplotlibRenderer

Dataset Preparation

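You can place the WOMD shard in any local directory. The example cells below read it from ../../data/WOMD, a path relative to this notebook; change folder and file_name to match the location and name of your own shard, for example ./tactics2d/data/trajectory_sample/WOMD/.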

Class Mapping

The current WOMDParser maps Waymo object classes into Tactics2D participant classes such as Vehicle, Pedestrian, and Cyclist. It also converts driveway polygons into drivable_area, and dynamic lane signal states into traffic_light regulations.
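
To make the idea of the class mapping concrete, here is a minimal sketch. It is not the WOMDParser implementation. The integer values follow the Track.ObjectType enum in Waymo's scenario.proto, the class names are returned as plain strings, and the helper class_name_for is hypothetical. How the parser itself treats TYPE_UNSET and TYPE_OTHER is not stated on this page, so the sketch simply returns None for them.

```python
# Integer values follow the Track.ObjectType enum in Waymo's scenario.proto.
WAYMO_OBJECT_TYPE_TO_CLASS_NAME = {
    1: "Vehicle",     # TYPE_VEHICLE
    2: "Pedestrian",  # TYPE_PEDESTRIAN
    3: "Cyclist",     # TYPE_CYCLIST
}


def class_name_for(object_type):
    """Hypothetical helper: map a Waymo object type to a participant class name.

    Returns None for TYPE_UNSET (0), TYPE_OTHER (4), and any unknown value.
    """
    return WAYMO_OBJECT_TYPE_TO_CLASS_NAME.get(object_type)


names = [class_name_for(value) for value in (1, 2, 3, 4)]
```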

Print Information of a Scenario

parser = WOMDParser()
folder = "../../data/WOMD"
file_name = "uncompressed_scenario_validation_interactive_validation_interactive.tfrecord-00000-of-00150"
scenario_id = "234dfbe99b740c80"

participants, actual_time_range = parser.parse_trajectory(
    scenario_id, file=file_name, folder=folder
)
map_ = parser.parse_map(scenario_id, file=file_name, folder=folder)

class_counts = {}
for participant in participants.values():
    class_name = type(participant).__name__
    class_counts[class_name] = class_counts.get(class_name, 0) + 1

area_counts = {}
for area in map_.areas.values():
    area_counts[area.subtype] = area_counts.get(area.subtype, 0) + 1

reg_counts = {}
for regulation in map_.regulations.values():
    reg_counts[regulation.subtype] = reg_counts.get(regulation.subtype, 0) + 1

print("scenario_id:", scenario_id)
print("time_range:", actual_time_range)
print("participant_classes:", class_counts)
print("map_counts:", {
    "lanes": len(map_.lanes),
    "areas": len(map_.areas),
    "roadlines": len(map_.roadlines),
    "regulations": len(map_.regulations),
})
print("area_types:", area_counts)
print("regulation_types:", reg_counts)

scenario_id: 234dfbe99b740c80
time_range: (0, 8997)
participant_classes: {'Vehicle': 52, 'Pedestrian': 2, 'Cyclist': 1}
map_counts: {'lanes': 664, 'areas': 86, 'roadlines': 143, 'regulations': 16}
area_types: {'crosswalk': 32, 'speed_bump': 18, 'drivable_area': 36}
regulation_types: {'stop_sign': 10, 'traffic_light': 6}
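
The counting loops above all follow the same pattern, which collections.Counter from the standard library expresses more compactly. The sketch below is self-contained and parses nothing: the three classes are empty stand-ins for the Tactics2D participant classes, and the counts are copied from the output above purely for illustration.

```python
from collections import Counter


# Empty stand-ins for the real participant classes (illustration only).
class Vehicle: ...
class Pedestrian: ...
class Cyclist: ...


# Rebuild a participant dictionary with the same class counts as the output above.
participants = {i: Vehicle() for i in range(52)}
participants.update({52: Pedestrian(), 53: Pedestrian(), 54: Cyclist()})

# Count participants per class name in a single expression.
class_counts = Counter(type(p).__name__ for p in participants.values())

# Convert the counts to fractions of all participants.
total = sum(class_counts.values())
shares = {name: round(count / total, 3) for name, count in class_counts.items()}
```

The same Counter expression works for map_.areas and map_.regulations if type(p).__name__ is replaced by the subtype attribute used in the loops above.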

Parse and Replay a Scenario

The following cells parse an official WOMD shard and replay one scenario using BEVCamera and MatplotlibRenderer.

participants, actual_time_range = parser.parse_trajectory(
    scenario_id, file=file_name, folder=folder
)
map_ = parser.parse_map(scenario_id, file=file_name, folder=folder)

print(len(participants), actual_time_range)
print(len(map_.lanes), len(map_.areas), len(map_.roadlines), len(map_.regulations))
print(sorted({type(p).__name__ for p in participants.values()}))

55 (0, 8997)
664 86 143 16
['Cyclist', 'Pedestrian', 'Vehicle']

def render_scenario_animation(
    file_name: str,
    scenario_id: str,
    folder=None,
    resolution=(1200, 800),
    window_ms: int = 4000,
    fill_invalid_gaps: bool = True,
    max_gap_frames: int = 2,
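):
    """Parse one WOMD scenario and return a FuncAnimation that replays it.

    Only the busiest window is replayed: the run of consecutive frames, about
    window_ms long, with the highest total count of active participants. The
    fill_invalid_gaps and max_gap_frames options are passed through to
    WOMDParser.parse_trajectory unchanged.
    """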
    parser = WOMDParser()
    participants, _ = parser.parse_trajectory(
        scenario_id,
        file=file_name,
        folder=folder,
        fill_invalid_gaps=fill_invalid_gaps,
        max_gap_frames=max_gap_frames,
    )
    map_ = parser.parse_map(scenario_id, file=file_name, folder=folder)

    for roadline in map_.roadlines.values():
        if roadline.type_ is None:
            roadline.type_ = "roadline"

    all_frames = sorted({frame for p in participants.values() for frame in p.trajectory.history_states.keys()})
    if len(all_frames) == 0:
        raise ValueError("No valid frames were parsed from the WOMD scenario.")

    # Assumes 100 ms between frames (10 Hz), so window_ms / 100 is the window length in frames.
    frames_per_window = max(1, int(round(window_ms / 100)))
    counts = [
        (frame, sum(1 for p in participants.values() if frame in p.trajectory.history_states))
        for frame in all_frames
    ]
    best_start = 0
    best_score = None
    if len(counts) > frames_per_window:
        for i in range(len(counts) - frames_per_window + 1):
            score = sum(c for _, c in counts[i : i + frames_per_window])
            if best_score is None or score > best_score:
                best_score = score
                best_start = i
        frames = [frame for frame, _ in counts[best_start : best_start + frames_per_window]]
    else:
        frames = all_frames

    x_min, x_max, y_min, y_max = map_.boundary
    camera_position = np.array([(x_min + x_max) / 2, (y_min + y_max) / 2])

    camera = BEVCamera(id_=0, map_=map_, perception_range=120)
    prev_road_id_set = set()
    prev_participant_id_set = set()

    renderer = MatplotlibRenderer(
        xlim=(x_min, x_max), ylim=(y_min, y_max), resolution=resolution, auto_scale=True
    )
    fig = renderer.fig

    def update(frame):
        nonlocal prev_road_id_set, prev_participant_id_set
        participant_ids = [pid for pid, p in participants.items() if frame in p.trajectory.history_states]
        geometry_data, prev_road_id_set, prev_participant_id_set = camera.update(
            frame,
            participants,
            participant_ids,
            prev_road_id_set,
            prev_participant_id_set,
            Point(camera_position),
        )
        renderer.update(geometry_data)
        renderer.ax.set_title(
            f"scenario={scenario_id} | t = {frame} ms | active: {len(participant_ids)}",
            fontsize=6,
        )

    ani = FuncAnimation(fig, update, frames=frames, interval=100, repeat=True)
    return ani
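
The window selection inside render_scenario_animation re-sums every candidate window from scratch. That is fine for a scenario of this size, but the same start index can be found in one pass with a running sum. The sketch below is standalone and uses toy counts. The helper name busiest_window_start is hypothetical and not part of Tactics2D.

```python
def busiest_window_start(counts, window):
    """Return the start index of the length-`window` slice with the largest sum.

    Ties go to the earliest start, matching the strict `>` comparison used in
    render_scenario_animation. If `counts` is not longer than `window`, the
    whole sequence is the window and the start index is 0.
    """
    if len(counts) <= window:
        return 0
    current = sum(counts[:window])
    best_start, best_score = 0, current
    for start in range(1, len(counts) - window + 1):
        # Slide right by one frame: add the entering count, drop the leaving one.
        current += counts[start + window - 1] - counts[start - 1]
        if current > best_score:
            best_start, best_score = start, current
    return best_start


# Toy data: number of active participants in each of seven frames.
active_per_frame = [3, 5, 9, 9, 8, 2, 1]
start = busiest_window_start(active_per_frame, window=3)
window_counts = active_per_frame[start : start + 3]
```

With the toy data the window sums are 17, 23, 26, 19, and 11, so the selected window starts at index 2 and covers the counts 9, 9, 8.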

render_scenario_animation(
    file_name=file_name,
    scenario_id=scenario_id,
    folder=folder,
    resolution=(1000, 1000),
    window_ms=4000,
    fill_invalid_gaps=True,
    max_gap_frames=2,
)
