Commands#
Commands specify what the policy should achieve at each moment: a target velocity, a reference trajectory, a goal position. The command manager generates these signals, resamples them at configurable intervals, and passes them to the policy through the observation system.
Registration#
Commands are registered in ManagerBasedRlEnvCfg as a dictionary
mapping string names to CommandTermCfg instances. Unlike the
function-based terms used by other managers, every command term is a
class that inherits from CommandTerm.
The resampling_time_range field controls how often the command
changes. After each resample the term draws a new timer value uniformly
from the given (min, max) range in seconds. Commands are also
resampled unconditionally on every episode reset.
    commands = {
        "twist": UniformVelocityCommandCfg(
            entity_name="robot",
            resampling_time_range=(3.0, 8.0),
            ranges=UniformVelocityCommandCfg.Ranges(
                lin_vel_x=(-1.0, 1.0),
                lin_vel_y=(-1.0, 1.0),
                ang_vel_z=(-0.5, 0.5),
            ),
        ),
    }
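The resampling schedule described above can be sketched in a few lines of plain Python. This is only an illustration of the timer behavior: the class name `ResampleTimer`, its attributes, and the use of Python lists in place of tensors are assumptions made for the example and are not part of the library.

```python
import random


class ResampleTimer:
    """Toy per-environment resampling timer (illustrative, not the library's code)."""

    def __init__(self, num_envs, time_range, seed=0):
        self.time_range = time_range
        self.rng = random.Random(seed)
        # Seconds remaining until each environment's next resample.
        self.time_left = [0.0] * num_envs
        # Number of resamples each environment has seen so far.
        self.resample_count = [0] * num_envs

    def resample(self, env_ids):
        """Draw a fresh timer uniformly from (min, max) for the given environments."""
        lo, hi = self.time_range
        for i in env_ids:
            self.time_left[i] = self.rng.uniform(lo, hi)
            self.resample_count[i] += 1

    def reset(self, env_ids):
        """Episode reset: resample unconditionally, whatever the timer says."""
        self.resample(env_ids)

    def step(self, dt):
        """Advance time by dt seconds and resample environments whose timer expired."""
        self.time_left = [t - dt for t in self.time_left]
        expired = [i for i, t in enumerate(self.time_left) if t <= 0.0]
        self.resample(expired)
        return expired


timer = ResampleTimer(num_envs=4, time_range=(3.0, 8.0))
timer.reset(range(4))  # every episode starts with a fresh command
for _ in range(500):  # 500 steps of 0.02 s, about 10 s of simulated time
    timer.step(dt=0.02)
```

With a range of (3.0, 8.0), each environment keeps a command for between three and eight seconds before a new one is drawn, and the environments drift out of sync with one another because each draws its own timer.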
The generated_commands observation function reads the current
command tensor by name and passes it to the policy:
    ObservationTermCfg(
        func=mdp.generated_commands,
        params={"command_name": "twist"},
    )
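The lookup-by-name pattern behind this observation term can be illustrated with a toy version. Everything here is a stand-in written for the example: `ToyCommandManager`, `ToyEnv`, the `get_command` method, and `toy_generated_commands` are hypothetical names, and the real function may be implemented differently.

```python
class ToyCommandManager:
    """Hypothetical stand-in that stores one command vector per registered name."""

    def __init__(self, commands):
        self._commands = dict(commands)

    def get_command(self, name):
        # A misspelled name raises KeyError instead of silently returning nothing.
        return self._commands[name]


class ToyEnv:
    """Hypothetical environment exposing only a command manager."""

    def __init__(self, command_manager):
        self.command_manager = command_manager


def toy_generated_commands(env, command_name):
    """Return the current command registered under command_name."""
    return env.command_manager.get_command(command_name)


env = ToyEnv(ToyCommandManager({"twist": [0.5, 0.0, -0.2]}))
obs = toy_generated_commands(env, command_name="twist")
```

The point of the pattern is that the observation term only needs the string key used at registration, so the same observation function works for any command term.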
If the environment defines no commands, every command manager operation is a no-op and returns empty tensors, so no special handling is required.
Included command terms#
Each task ships with its own command terms tailored to its objective.
| Term | Description |
|---|---|
| | Generates planar velocity commands |
| | Generates a 3D target position for a manipulated object. Supports fixed and dynamic difficulty modes. Tracks metrics including position error and episode success rate. Used by the manipulation task. |
| | Streams reference joint positions, velocities, and body poses from a pre-recorded |
Each term can render debug visualizations in the interactive viewer
when debug_vis=True is set in the configuration. The image below
shows the ghost visualization from MotionCommand, which renders a
translucent copy of the robot at the reference pose alongside the
actual robot.
Viser visualization of the commanded reference motion for the G1 tracking task.
Writing custom command terms#
A custom command term is a class inheriting from CommandTerm paired
with a configuration dataclass inheriting from CommandTermCfg. The
term must implement three methods and one property:
_resample_command(env_ids) to sample new goals, _update_command() for
per-step updates, _update_metrics() for logging, and a command property
returning the current goal tensor. The base class manages the resampling
timer and reset logic automatically.
The configuration must implement a build(env) method that
constructs the paired term instance.
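The shape of such a term and its configuration can be sketched as follows. To keep the example runnable without the library, it defines simplified stand-ins for CommandTerm and CommandTermCfg; their internals (the `compute` and `reset` methods, the call order, the dict used as an environment, and Python lists in place of tensors) are assumptions for illustration only. `GoalHeightCommand`, `GoalHeightCommandCfg`, and `height_range` are hypothetical names.

```python
import random
from dataclasses import dataclass


class CommandTerm:
    """Simplified stand-in for the library base class (not the real implementation)."""

    def __init__(self, cfg, env):
        self.cfg = cfg
        self.num_envs = env["num_envs"]
        self.time_left = [0.0] * self.num_envs

    def reset(self, env_ids):
        # Commands are resampled unconditionally on episode reset.
        self._resample(list(env_ids))

    def compute(self, dt):
        # Assumed order: log metrics, tick timers, resample expired, update.
        self._update_metrics()
        self.time_left = [t - dt for t in self.time_left]
        self._resample([i for i, t in enumerate(self.time_left) if t <= 0.0])
        self._update_command()

    def _resample(self, env_ids):
        lo, hi = self.cfg.resampling_time_range
        for i in env_ids:
            self.time_left[i] = random.uniform(lo, hi)
        if env_ids:
            self._resample_command(env_ids)


@dataclass
class CommandTermCfg:
    """Simplified stand-in for the library configuration base class."""

    resampling_time_range: tuple = (1.0, 1.0)


class GoalHeightCommand(CommandTerm):
    """Hypothetical term: one target height per environment."""

    def __init__(self, cfg, env):
        super().__init__(cfg, env)
        self._env = env
        self._goal = [0.0] * self.num_envs
        self.metrics = {"height_error": [0.0] * self.num_envs}

    @property
    def command(self):
        # Current goal, one entry per environment (a tensor in the real library).
        return self._goal

    def _resample_command(self, env_ids):
        lo, hi = self.cfg.height_range
        for i in env_ids:
            self._goal[i] = random.uniform(lo, hi)

    def _update_command(self):
        pass  # the goal stays constant between resamples

    def _update_metrics(self):
        heights = self._env["heights"]
        self.metrics["height_error"] = [abs(g - h) for g, h in zip(self._goal, heights)]


@dataclass
class GoalHeightCommandCfg(CommandTermCfg):
    height_range: tuple = (0.5, 1.0)

    def build(self, env):
        # Construct the term instance paired with this configuration.
        return GoalHeightCommand(self, env)


env = {"num_envs": 3, "heights": [0.7, 0.7, 0.7]}  # toy environment
term = GoalHeightCommandCfg(resampling_time_range=(2.0, 4.0)).build(env)
term.reset(range(3))
term.compute(dt=0.02)
```

The subclass only describes what a goal is and how to measure progress toward it, while the timer and reset bookkeeping live in the base class, which mirrors the split of responsibilities described above.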