Conversation
Pull request overview
This PR refactors the LeRobot dataset recorder to improve its flexibility and data collection capabilities. The main goal is to remove the hardcoded observation configuration from robot_meta and instead directly use the environment's observation space, enabling automatic recording of additional data types like object poses, velocities, masses, and robot joint velocities and forces.
Changes:
- Refactored dataset feature building to use environment observation space directly instead of robot_meta configuration
- Added support for recording additional observation types including object velocities (new function), robot qvel, and qf
- Enhanced observation function `get_rigid_object_pose` with an optional matrix/quaternion return format
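The matrix form of the new `to_matrix` option is not shown in the diff; as a rough illustration (the conversion math is standard, but the function name `pose_to_matrix` and the wxyz quaternion layout are assumptions, not the PR's actual API), converting a 7-element pose into a 4x4 homogeneous matrix might look like:

```python
import numpy as np

def pose_to_matrix(pos, quat_wxyz):
    """Hypothetical sketch: convert an (x, y, z) position plus a unit
    quaternion (qw, qx, qy, qz) into a 4x4 homogeneous transform."""
    w, x, y, z = quat_wxyz
    # Standard quaternion-to-rotation-matrix formula for a unit quaternion.
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = pos
    return T
```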
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| embodichain/lab/gym/utils/gym_utils.py | Added torch.uint8 and torch.uint16 type mappings to support additional image/data formats |
| embodichain/lab/gym/envs/managers/observations.py | Added to_matrix parameter to get_rigid_object_pose and new get_rigid_object_velocity function for extended data recording |
| embodichain/lab/gym/envs/managers/datasets.py | Major refactoring: removed robot_meta observation config dependency, rebuilt feature construction to use env observation space, added extra episode metadata support, improved progress feedback with tqdm |
| embodichain/lab/gym/envs/embodied_env.py | Moved dataset_manager initialization to init after observation space is available, removed unnecessary warning message |
| embodichain/lab/gym/envs/base_env.py | Simplified observation space properties by removing num_envs==1 special case, adjusted reset order to set _elapsed_steps after episode initialization |
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 9 comments.
```diff
  color_data = obs["sensor"][sensor_name]["color"]
  color_img = color_data[:, :, :3].cpu()
  frame[f"{sensor_name}.color"] = color_img

  if is_stereo:
-     color_right_data = obs["sensor"][camera_name]["color_right"]
-     if isinstance(color_right_data, torch.Tensor):
-         color_right_img = color_right_data[:, :, :3].cpu().numpy()
-     else:
-         color_right_img = np.array(color_right_data)[:, :, :3]
-
-     if color_right_img.dtype in [np.float32, np.float64]:
-         color_right_img = (color_right_img * 255).astype(np.uint8)
-
-     frame[get_right_name(camera_name)] = color_right_img
+     color_right_data = obs["sensor"][sensor_name]["color_right"]
+     color_right_img = color_right_data[:, :, :3].cpu()
+     frame[f"{sensor_name}.color_right"] = color_right_img
```
Images are now stored as tensors on CPU instead of numpy arrays. LeRobot datasets typically expect numpy uint8 arrays for images. The tensors returned from `.cpu()` will be torch tensors, which may not be compatible with LeRobot's internal image processing. Consider converting to numpy and ensuring uint8 dtype: `color_img.cpu().numpy().astype(np.uint8)` if the tensor is normalized float, or just `.cpu().numpy()` if already uint8.
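A minimal sketch of the dtype normalization this comment suggests, using numpy arrays in place of torch tensors for brevity; the helper name `to_uint8_image` and the assumption that float images are normalized to [0, 1] are illustrative, not part of the PR:

```python
import numpy as np

def to_uint8_image(img):
    """Normalize an image array to HxWx3 uint8, the layout LeRobot expects."""
    img = np.asarray(img)[:, :, :3]  # drop alpha channel if present
    if img.dtype in (np.float32, np.float64):
        # Assumed: float images are normalized to [0, 1].
        img = (img * 255).astype(np.uint8)
    return img.astype(np.uint8)
```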
```python
names = key
if "vel" in key:
    names = ["lin_x", "lin_y", "lin_z", "ang_x", "ang_y", "ang_z"]
elif "pose" in key:
    names = ["x", "y", "z", "qw", "qx", "qy", "qz"]

features[f"observation.{key}"] = {
    "dtype": str(space.dtype),
    "shape": space.shape,
    "names": names,
}
```
The feature names assignment uses a simple string for observation keys (line 376), but specific patterns for 'vel' and 'pose' keys (lines 377-380). This heuristic approach could lead to incorrect feature names if observation keys don't follow these exact naming patterns. Consider documenting the expected naming conventions or making this more robust by checking the shape of the observation space to determine appropriate names.
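One way to make the naming more robust, as the comment suggests, is to key off the observation shape instead of key substrings. This sketch is hypothetical (the helper `feature_names_for` is not in the PR, and the shape-to-layout mapping is an assumption):

```python
def feature_names_for(key, shape):
    """Derive LeRobot feature names from the observation shape rather than
    substring matching on the key; fall back to the key itself."""
    if shape == (7,):
        # Assumed layout: position + wxyz quaternion.
        return ["x", "y", "z", "qw", "qx", "qy", "qz"]
    if shape == (6,):
        # Assumed layout: linear + angular velocity.
        return ["lin_x", "lin_y", "lin_z", "ang_x", "ang_y", "ang_z"]
    return key
```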
```python
for frame_name, space in value.items():
    # TODO: Support depth (uint16) and mask (also uint16 or uint8)
    if frame_name not in ["color", "color_right"]:
        logger.log_error(
            f"Only support 'color' frame for vision sensors, but got '{frame_name}' in sensor '{sensor_name}'"
        )

    features[f"{sensor_name}.{frame_name}"] = {
        "dtype": "video" if self.use_videos else "image",
        "shape": (sensor.cfg.height, sensor.cfg.width, 3),
        "names": ["height", "width", "channel"],
    }

    if is_stereo:
        features[f"{sensor_name}.{frame_name}_right"] = {
            "dtype": "video" if self.use_videos else "image",
            "shape": (sensor.cfg.height, sensor.cfg.width, 3),
            "names": ["height", "width", "channel"],
        }
```
The stereo camera handling logic appears incorrect. When iterating over frames at line 342, for stereo cameras the observation space will contain both 'color' and 'color_right'. The code then checks for stereo at line 355 and tries to add a '_right' suffix, which would create 'sensor_name.color_right' again (redundant) and also create an incorrect 'sensor_name.color_right_right' when frame_name is already 'color_right'. The stereo check should skip the _right suffix addition if frame_name already contains '_right', or the loop should only process 'color' frames and handle right frames separately.
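A sketch of the fix the comment proposes, skipping frames that already carry the `_right` suffix so stereo cameras never produce a `color_right_right` key; the helper and its signature are illustrative only:

```python
def stereo_feature_keys(sensor_name, frame_names, is_stereo):
    """Build feature keys, emitting each right frame alongside its left
    counterpart instead of iterating over it directly."""
    keys = []
    for frame_name in frame_names:
        if frame_name.endswith("_right"):
            continue  # already covered by the stereo branch below
        keys.append(f"{sensor_name}.{frame_name}")
        if is_stereo:
            keys.append(f"{sensor_name}.{frame_name}_right")
    return keys
```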
```diff
  action = action[self._joint_ids]
- if isinstance(action, torch.Tensor):
-     action_data = action.cpu().numpy()
- elif isinstance(action, np.ndarray):
-     action_data = action
+ action_data = action.cpu()
```
The action indexing on line 435 assumes action is a tensor, but the following code checks if action is a torch.Tensor or dict. If action is already a dict, attempting to index it with self._joint_ids will raise a TypeError. The indexing should only happen after verifying action is a tensor, or should be done after extracting the tensor from the dict.
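The reordering the comment asks for can be sketched as follows, extracting the tensor from a dict action before any joint indexing happens; numpy stands in for torch here, and `extract_action` is a hypothetical helper, not code from the PR:

```python
import numpy as np

def extract_action(action, joint_ids):
    """Index joints only after the array is extracted, so dict actions
    don't raise a TypeError on fancy indexing."""
    if isinstance(action, dict):
        # Assumed: take the first entry as the action payload.
        action = next(iter(action.values()))
    return np.asarray(action)[joint_ids]
```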
```diff
@@ -372,13 +448,7 @@ def _convert_frame_to_lerobot(
              break

-         if isinstance(action_tensor, torch.Tensor):
-             action_data = action_tensor.cpu().numpy()
-         elif isinstance(action_tensor, np.ndarray):
-             action_data = action_tensor
-         else:
-             action_data = np.array(action_tensor)
-     else:
-         action_data = np.array(action)
+         action_data = action_tensor.cpu()

      frame["action"] = action_data
```
The action_data variable is used in line 453 but may be undefined if action_tensor is None or not a torch.Tensor. If the action is a dict and no valid tensor is found, or if action_tensor is not a torch.Tensor, the code will raise an UnboundLocalError. Add an else clause or error handling to ensure action_data is always defined before use.
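A sketch of the guard the comment asks for, making sure `action_data` is always bound before use; numpy stands in for torch, and the helper name is illustrative:

```python
import numpy as np

def safe_action_data(action):
    """Return an array for the action, failing loudly instead of hitting
    an UnboundLocalError when a dict contains no array-like entry."""
    if isinstance(action, dict):
        for value in action.values():
            if isinstance(value, np.ndarray):
                return value
        # Every path defines a result: raise rather than fall through.
        raise TypeError(f"No array-like action found in dict keys: {list(action)}")
    return np.asarray(action)
```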
```diff
  frame["observation.qpos"] = obs["robot"]["qpos"][self._joint_ids].cpu()
  frame["observation.qvel"] = obs["robot"]["qvel"][self._joint_ids].cpu()
  frame["observation.qf"] = obs["robot"]["qf"][self._joint_ids].cpu()

  # Add extra observation features if they exist
  for key in obs:
      if key in ["robot", "sensor"]:
          continue

- frame["observation.state"] = state_data
+     frame[f"observation.{key}"] = obs[key].cpu()
```
Robot observation data (qpos, qvel, qf) and extra observations are stored as torch tensors on CPU instead of numpy arrays. LeRobot datasets expect numpy arrays for all data. These tensors should be converted to numpy arrays using .cpu().numpy() instead of just .cpu() to ensure compatibility with LeRobot's dataset format.
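A small duck-typed helper along the lines this comment suggests, converting tensor-like values to numpy before they reach LeRobot; the `to_numpy` name and the duck typing are assumptions for illustration:

```python
import numpy as np

def to_numpy(value):
    """Convert a torch-tensor-like value (anything exposing .cpu().numpy())
    to a numpy array; pass plain array-likes through np.asarray."""
    if hasattr(value, "cpu"):
        value = value.cpu().numpy()
    return np.asarray(value)
```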
```diff
@@ -405,8 +406,6 @@ def _initialize_episode(
              mode="save",
              env_ids=successful_env_ids_tensor,
          )
```
The removed log message "No successful episodes to save" provided useful debugging information when save operations are triggered but no data needs to be saved. Consider adding equivalent logging in the parent if-statement to maintain visibility into save operations, or document why this logging was removed.
Suggested change:

```diff
      )
+ else:
+     # Log when a save is requested but there are no successful episodes
+     logger.debug(
+         "Save requested but no successful episodes to save for env_ids=%s",
+         env_ids_to_process,
+     )
```
```python
]

if successful_env_ids:
```
The extra blank line at line 400 appears to be an unintentional formatting change. Consider removing it to maintain consistent code formatting.
```python
# Reset hook for user to perform any custom reset logic.
self._initialize_episode(reset_ids, **options)
self._elapsed_steps[reset_ids] = 0
```
The reset of _elapsed_steps was moved to after _initialize_episode. This changes the semantics: previously, elapsed_steps would be 0 during _initialize_episode for the reset environments. Now it retains the old value during _initialize_episode and only gets reset afterwards. This could affect any logic in _initialize_episode that depends on elapsed_steps being 0 for newly reset environments.
Suggested change:

```diff
- # Reset hook for user to perform any custom reset logic.
- self._initialize_episode(reset_ids, **options)
- self._elapsed_steps[reset_ids] = 0
+ # Reset elapsed steps for the environments being reset before initialization.
+ self._elapsed_steps[reset_ids] = 0
+ # Reset hook for user to perform any custom reset logic.
+ self._initialize_episode(reset_ids, **options)
```
Description
This PR updates the LeRobotDataset recorder in the following aspects:
Type of change
Checklist
Run the `black .` command to format the code base.