Metrics#
The metrics manager logs per-step scalar values as episode averages. Unlike rewards, metrics carry no weight and are not scaled by the step duration. They exist purely for diagnostics: they let you monitor quantities such as tracking error, contact forces, or energy consumption alongside reward curves without influencing the optimization.
Metrics are computed every environment step, accumulated per environment,
and averaged over the episode length when the environment resets. The
resulting averages are written to the training logger (TensorBoard or
Weights & Biases) under the Episode_Metrics/ prefix.
If the metrics dictionary on ManagerBasedRlEnvCfg is empty, the
environment substitutes a lightweight no-op manager with zero overhead.
Registration#
Each metric term is registered by name in the metrics dictionary of
ManagerBasedRlEnvCfg. The configuration is minimal: a callable and an
optional params dictionary.
```python
from mjlab.managers.metrics_manager import MetricsTermCfg

metrics = {
    "base_height": MetricsTermCfg(
        func=base_height,
        params={"asset_cfg": SceneEntityCfg("robot")},
    ),
}
```
The callable receives env as its first argument and any entries in
params as keyword arguments. It must return a tensor of shape
[num_envs], one scalar per environment per step.
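The contract amounts to roughly `func(env, **params)`. A toy sketch (the term name and the `env.num_envs` attribute are illustrative assumptions, not mjlab API):

```python
import torch


def constant_metric(env, value: float = 1.0) -> torch.Tensor:
    """Toy term: returns one scalar per environment, shape [num_envs].

    Assumes `env` exposes a `num_envs` attribute, matching the
    calling convention described above.
    """
    return torch.full((env.num_envs,), value)


# The manager calls each registered term as func(env, **params):
# values = constant_metric(env, value=2.0)  # shape [num_envs]
```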
How metrics are computed#
The manager maintains a running sum and a step counter for each
environment. On every call to compute():
1. The step counter increments for all environments.
2. Each term function is called with the current environment state.
3. The returned per-environment values are added to the running sums.
When an environment resets, the manager divides each term’s accumulated sum
by the step count for that environment, averages the result across all
resetting environments, and returns the scalar under the key
Episode_Metrics/<term_name>. The sums and counters are then zeroed for
the reset environments. Division is per-environment, so environments that
terminated early are not diluted by longer-running ones.
These scalars flow through env.extras["log"] into the training runner,
which writes them to the configured logger. In a typical training run they
appear as:
```
Episode_Metrics/base_height
Episode_Metrics/contact_force
```
alongside the Episode_Reward/ entries produced by the reward manager.
Writing custom metric functions#
A metric function follows the same pattern as reward and observation
functions. It takes the environment as its first argument, reads whatever
state it needs, and returns a [num_envs] tensor.
```python
import torch

from mjlab.envs import ManagerBasedRlEnv
from mjlab.managers.scene_entity_config import SceneEntityCfg


def base_height(
    env: ManagerBasedRlEnv,
    asset_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
) -> torch.Tensor:
    robot = env.scene[asset_cfg.name]
    return robot.data.root_link_pos_w[:, 2]
```
For metrics that require cached setup or per-episode state, implement the
term as a class with __init__(self, cfg, env) and a __call__
method. If the class defines a reset(env_ids) method, the manager
calls it automatically on episode resets.
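For example, a term that reports the running per-episode maximum of the base height could be written as a class. This is a hedged sketch: the class name is illustrative, and `env.num_envs` and `env.device` are assumed attributes of the environment, not confirmed mjlab API.

```python
import torch


class max_base_height:
    """Metric term tracking the per-episode maximum base height.

    Assumes `env.num_envs`, `env.device`, and `env.scene[...]` exist,
    matching the function-style term shown above.
    """

    def __init__(self, cfg, env):
        self._max = torch.zeros(env.num_envs, device=env.device)

    def __call__(self, env, asset_cfg) -> torch.Tensor:
        height = env.scene[asset_cfg.name].data.root_link_pos_w[:, 2]
        self._max = torch.maximum(self._max, height)
        return self._max

    def reset(self, env_ids) -> None:
        # Called automatically by the manager when episodes reset.
        self._max[env_ids] = 0.0
```

Because the term returns the running maximum every step, the episode average of this metric is not the raw maximum; for an exact per-episode extremum you would log it only at reset or post-process the average.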