Add valve-detection and valve-subtype-resolver to vortex-cv/mission/tacc/visual_inspection#11
Conversation
3 rounds of review or something?

Assume we always have depth camera properties and remove unused code. Also, most of the functions here are very large; try to split them up, or use helper functions.

This review only applies to valve_pose_estimation. Also, rename to match the package name and folders.
```yaml
# Color camera intrinsics – fallback values used when color_image_info_topic is not published.
# Source: /camera/camera/color/camera_info
color_fx: 452.121521
color_fy: 451.737976
color_cx: 438.254059
color_cy: 248.703217
color_image_width: 896
color_image_height: 504

# Distortion coefficients [k1, k2, p1, p2, k3] (plumb_bob model)
color_d1: -0.0548105500638485
color_d2: 0.0597584918141365
color_d3: 0.000486430013552308
color_d4: 0.00031599000794813
color_d5: -0.0192314591258764

# Depth camera intrinsics (fallback when depth camera_info is not published)
# Source: /camera/camera/depth/camera_info
depth_fx: 449.449707
depth_fy: 449.449707
depth_cx: 444.819946
depth_cy: 248.226578
depth_image_width: 896
depth_image_height: 504

# ── Depth-to-color extrinsic ──────────────────────────────────────────────
# Translation (metres) of the depth origin expressed in the color camera
# frame. Rotation is assumed to be identity (both optical frames share the
# same orientation for the RealSense D555).
# Source: ros2 topic echo /realsense/extrinsics/depth_to_color
depth_to_color_tx: -0.059
depth_to_color_ty: 0.0
depth_to_color_tz: 0.0
```
All of these can be discovered at runtime using tf2 and the camera_info topics. Since they are fallback values, we should include another config param, something along the lines of `use_config`, for the different params.
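The reviewer's suggestion could look roughly like this; the param names below are hypothetical, not from the PR:

```yaml
# Hypothetical toggles (names are illustrative): when false, the node would
# discover intrinsics from the camera_info topics and the extrinsic via tf2,
# and the values in this file would only be used as fallbacks.
use_fallback_intrinsics: false
use_fallback_extrinsics: false
```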
```cpp
vortex_msgs::msg::Landmark lm;
lm.header = header;
lm.id = static_cast<int32_t>(i);
lm.type.value = type;
```
Here you can do something like `lm.type.value = vortex_msgs::msg::LandmarkType::VALVE;`.

Also, for this package I don't see us needing more than one type, so maybe just hardcode it and remove the type from the function arg.
```cpp
if (depth->encoding == sensor_msgs::image_encodings::TYPE_16UC1 ||
    depth->encoding == "16UC1") {
    cv_bridge::CvImageConstPtr cv_depth =
        cv_bridge::toCvShare(depth, "16UC1");
    cv_depth->image.convertTo(depth_img, CV_32FC1, 0.001);
} else {
    cv_bridge::CvImageConstPtr cv_depth =
        cv_bridge::toCvShare(depth, "32FC1");
    depth_img = cv_depth->image.clone();
}
return depth_img;
```
Does the RealSense publish using 16UC1? See https://github.com/vortexntnu/perception-auv/blob/f7ea0eafb049f5f82e37bccbde95c707f5983238/perception_setup/config/cameras/realsense_d555.yaml

Maybe we should add a flag, like `convert_from_mm_to_m`?
Might be that it still publishes as 16UC1, but the driver uses a different name for it.
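A sketch of the suggested flag, using the name from the comment above; the exact wiring into the decode function is an assumption:

```yaml
# Hypothetical: force the 16UC1 (millimetre) → 32FC1 (metre) conversion even
# when the driver reports an unexpected encoding name for the depth stream.
convert_from_mm_to_m: true
```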
```cpp
void project_pixel_to_point(const int u,
                            const int v,
                            const float depth,
                            const double fx,
                            const double fy,
                            const double cx,
                            const double cy,
                            pcl::PointXYZ& out);
```
Should add a doxygen comment explaining that this is used on depth images, and therefore we assume no distortion.
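A sketch of the requested doxygen comment, paired with a dependency-free stand-in implementation (a plain struct instead of `pcl::PointXYZ`) so the pinhole math is visible; the body is an assumption based on the standard pinhole model:

```cpp
#include <cassert>

// Stand-in for pcl::PointXYZ so this sketch compiles without PCL.
struct Point3 {
    float x, y, z;
};

/**
 * @brief Back-project a pixel (u, v) with a known depth to a 3D point.
 *
 * Intended for depth images: depth pixels are assumed to be already
 * rectified, so no distortion model is applied (pure pinhole projection).
 *
 * @param u,v    pixel coordinates
 * @param depth  depth in metres at (u, v)
 * @param fx,fy  focal lengths in pixels
 * @param cx,cy  principal point in pixels
 * @param out    resulting point in the camera optical frame
 */
void project_pixel_to_point(const int u, const int v, const float depth,
                            const double fx, const double fy,
                            const double cx, const double cy, Point3& out) {
    out.x = static_cast<float>((u - cx) * depth / fx);
    out.y = static_cast<float>((v - cy) * depth / fy);
    out.z = depth;
}
```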
```cpp
void extract_annulus_pcl(
    const cv::Mat& depth_image,        // CV_32FC1 meters
    const BoundingBox& bbox,           // in ORIGINAL image pixels
    const ImageProperties& img_props,
    const float annulus_radius_ratio,  // inner radius = outer*ratio
    pcl::PointCloud<pcl::PointXYZ>::Ptr& out);
```
Should include a comment that this is currently not in use.
```cpp
const auto info_qos =
    rclcpp::QoS(rclcpp::KeepLast(1))
        .reliability(RMW_QOS_POLICY_RELIABILITY_RELIABLE)
        .durability(RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL);
```
Should not use transient local. Ask @kluge7 what is currently in use.
```cpp
sync_ = std::make_shared<message_filters::Synchronizer<SyncPolicy>>(
    SyncPolicy(10), depth_sub_, det_sub_);
sync_->registerCallback(std::bind(&ValvePoseNode::sync_cb, this, _1, _2));
```
Should set the max allowed sync tolerance. Ask @kluge7 about specific values, but the time offset was really low.
```cpp
output_frame_id_ = frame_base.empty() ? "" : drone + "/" + frame_base;
landmark_type_ = declare_parameter<int>("landmark_type");

setup_estimator();
```
I would say we do not initialize the detector before we have received both camera intrinsics.
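The gating the reviewer asks for could be sketched like this, with pure C++ stand-ins instead of ROS types; `try_setup_estimator` and the member names are hypothetical:

```cpp
#include <cassert>
#include <optional>

// Minimal stand-in for the intrinsics carried by a CameraInfo message.
struct Intrinsics {
    double fx, fy, cx, cy;
};

class EstimatorGate {
public:
    // Called from the color camera_info subscription callback.
    void on_color_info(const Intrinsics& intr) {
        color_intr_ = intr;
        try_setup_estimator();
    }
    // Called from the depth camera_info subscription callback.
    void on_depth_info(const Intrinsics& intr) {
        depth_intr_ = intr;
        try_setup_estimator();
    }
    bool estimator_ready() const { return estimator_ready_; }

private:
    // Only construct the detector once BOTH intrinsics have arrived.
    void try_setup_estimator() {
        if (estimator_ready_ || !color_intr_ || !depth_intr_)
            return;
        estimator_ready_ = true;  // the real node would call setup_estimator() here
    }
    std::optional<Intrinsics> color_intr_;
    std::optional<Intrinsics> depth_intr_;
    bool estimator_ready_ = false;
};
```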
```cpp
void ValvePoseNode::sync_cb(
    const sensor_msgs::msg::Image::ConstSharedPtr& depth,
    const vision_msgs::msg::Detection2DArray::ConstSharedPtr& det) {
    if (!depth || !det)
        return;

    cv::Mat depth_color;
    const bool publish_colormap =
        debug_visualize_ && depth_colormap_pub_ &&
        depth_colormap_pub_->get_subscription_count() > 0;
    if (publish_colormap) {
        cv_bridge::CvImageConstPtr cv_raw =
            cv_bridge::toCvShare(depth, "16UC1");
        cv::Mat depth_f;
        cv_raw->image.convertTo(depth_f, CV_32FC1);
        const float scale =
            255.0f / (depth_colormap_vmax_ - depth_colormap_vmin_);
        cv::Mat depth_8u;
        depth_f.convertTo(depth_8u, CV_8UC1, scale,
                          -depth_colormap_vmin_ * scale);
        cv::applyColorMap(depth_8u, depth_color, cv::COLORMAP_TURBO);
    }

    if (det->detections.empty()) {
        // Publish empty arrays to clear stale data from previous detections.
        if (debug_visualize_ && pose_pub_)
            pose_pub_->publish(make_pose_array({}, depth->header));
        landmark_pub_->publish(
            make_landmark_array({}, depth->header, landmark_type_));
        if (publish_colormap) {
            depth_colormap_pub_->publish(
                *cv_bridge::CvImage(depth->header, "bgr8", depth_color)
                     .toImageMsg());
        }
        return;
    }

    std::vector<std::pair<float, BoundingBox>> scored_boxes;
    scored_boxes.reserve(det->detections.size());
    for (const auto& d : det->detections) {
        float score = 0.0f;
        if (!d.results.empty()) {
            score = static_cast<float>(d.results[0].hypothesis.score);
        } else {
            score = static_cast<float>(d.bbox.size_x * d.bbox.size_y);
        }
        scored_boxes.emplace_back(score, to_bbox(d.bbox));
    }
    const std::vector<size_t> kept =
        filter_duplicate_detections(scored_boxes, iou_duplicate_threshold_);

    const cv::Mat depth_img = decode_depth_to_float(depth);

    pcl::PointCloud<pcl::PointXYZ>::Ptr ann_dbg(
        new pcl::PointCloud<pcl::PointXYZ>);
    pcl::PointCloud<pcl::PointXYZ>::Ptr pln_dbg(
        new pcl::PointCloud<pcl::PointXYZ>);

    std::vector<BoundingBox> raw_boxes;
    std::vector<Pose> poses;

    for (size_t idx : kept) {
        const BoundingBox& yolo_box = scored_boxes[idx].second;
        BoundingBox org_box = undistort_bbox(yolo_box, color_props_.intr);
        raw_boxes.push_back(yolo_box);

        const auto pose_result = detector_->compute_pose_from_depth(
            depth_img, org_box, ann_dbg, pln_dbg, true);
        if (pose_result.result_valid)
            poses.push_back(pose_result.result);
    }

    if (debug_visualize_ && depth_cloud_pub_ &&
        depth_cloud_pub_->get_subscription_count() > 0) {
        sensor_msgs::msg::PointCloud2 cloud_msg;
        pcl::toROSMsg(*ann_dbg, cloud_msg);
        cloud_msg.header = depth->header;
        depth_cloud_pub_->publish(cloud_msg);
    }

    if (publish_colormap) {
        for (size_t i = 0; i < raw_boxes.size(); ++i) {
            const auto& box = raw_boxes[i];
            const float Z = (i < poses.size() && poses[i].z > 0.0)
                                ? static_cast<float>(poses[i].z)
                                : 0.0f;

            const float angle_deg =
                box.theta * 180.0f / static_cast<float>(M_PI);
            cv::RotatedRect rrect(cv::Point2f(box.center_x, box.center_y),
                                  cv::Size2f(box.size_x, box.size_y),
                                  angle_deg);
            cv::Point2f corners[4];
            rrect.points(corners);

            if (Z > 0.0f) {
                for (auto& c : corners) {
                    c = project_color_pixel_to_depth(c.x, c.y, Z, color_props_,
                                                     depth_props_,
                                                     depth_color_extrinsic_);
                }
            }
            for (int j = 0; j < 4; ++j) {
                cv::line(depth_color, corners[j], corners[(j + 1) % 4],
                         cv::Scalar(0, 255, 0), 2);
            }
        }
        depth_colormap_pub_->publish(
            *cv_bridge::CvImage(depth->header, "bgr8", depth_color)
                 .toImageMsg());
    }

    if (debug_visualize_ && annulus_pub_ && plane_pub_) {
        sensor_msgs::msg::PointCloud2 ann_msg, pln_msg;
        pcl::toROSMsg(*ann_dbg, ann_msg);
        pcl::toROSMsg(*pln_dbg, pln_msg);
        ann_msg.header = depth->header;
        pln_msg.header = depth->header;
        annulus_pub_->publish(ann_msg);
        plane_pub_->publish(pln_msg);
    }

    std_msgs::msg::Header pose_header = depth->header;
    if (!output_frame_id_.empty())
        pose_header.frame_id = output_frame_id_;

    if (debug_visualize_ && pose_pub_)
        pose_pub_->publish(make_pose_array(poses, pose_header));
    landmark_pub_->publish(
        make_landmark_array(poses, pose_header, landmark_type_));
}
```
This function is currently too large. Split it into separate helper functions and maybe move something into ros_utils. Functions for visualization etc. are good candidates to move to utils.
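One candidate extraction, sketched with a stand-in type instead of the real `vision_msgs` detection (the helper name `detection_score` is hypothetical): the scoring loop collapses into a pure function that is trivially unit-testable.

```cpp
#include <cassert>

// Stand-in for a vision_msgs::msg::Detection2D: either a classification
// hypothesis score is available, or we fall back to bbox area as a proxy.
struct DetectionStub {
    bool has_hypothesis;
    float hypothesis_score;
    float size_x, size_y;
};

// Extracted from the scoring loop in sync_cb: score = hypothesis score if
// present, otherwise the bounding-box area.
float detection_score(const DetectionStub& d) {
    return d.has_hypothesis ? d.hypothesis_score : d.size_x * d.size_y;
}
```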
```cpp
const Eigen::Matrix3f R_dc = depth_color_extrinsic_.R.transpose();
const Eigen::Vector3f O_d = -(R_dc * depth_color_extrinsic_.t);
```
These variable names are not descriptive.
Subtype resolver looks good.
```yaml
depth_image_info_topic: "/camera/camera/depth/camera_info"

# Color camera-info topic (optional – overrides the fallback params below if published).
color_image_info_topic: "/moby/front_camera/camera_info"
```
What is `color_image_info_topic` needed for? It gets declared but never used.
```yaml
valve_handle_offset: 0.065

# ── Behaviour toggles ─────────────────────────────────────────────────────
debug_visualize: false  # enable all debug visualization topics
```
Why is this placed outside the debug block?
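Moving it under the debug block might look something like this; the grouping and nested key names are illustrative, not from the PR:

```yaml
# Hypothetical grouping: keep all debug-related params under one block.
debug:
  visualize: false  # enable all debug visualization topics
```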
```cpp
struct BoundingBox {
    float center_x{0};
    float center_y{0};
    float size_x{0};
    float size_y{0};
    float theta{0};  // radians
};
```
A better name might be OrientedBoundingBox, since BoundingBox implies it's only x, y, width, height.
Actually 🤓 this is fine imo. It follows the ROS definition of including theta in BoundingBox2D:
https://github.com/ros-perception/vision_msgs/blob/ros2/vision_msgs/msg/BoundingBox2D.msg
```cpp
        if (iou > iou_duplicate_threshold || iom > 0.7f)
            suppressed[j] = true;
    }
}
```
`0.7f` is a magic number; explain how you got it with a comment.
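A sketch of the requested fix; the constant name and the justification comment are placeholders, since the actual rationale has to come from the author:

```cpp
#include <cassert>

// Intersection-over-minimum threshold above which two detections are treated
// as duplicates. NOTE: placeholder comment - the author should document how
// the value 0.7 was chosen (e.g. tuned on recorded valve footage).
constexpr float kIomDuplicateThreshold = 0.7f;

// Suppression test extracted from the duplicate-filtering loop.
bool is_duplicate(float iou, float iom, float iou_duplicate_threshold) {
    return iou > iou_duplicate_threshold || iom > kIomDuplicateThreshold;
}
```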
- Updated PoseEstimator class to improve depth image handling and ray-plane intersection logic.
- Renamed methods for clarity, including changing `calculate_letterbox_padding` to `compute_letterbox_transform`.
- Enhanced the `compute_pose_from_depth` method to return a structured `DetectionResult` instead of `PoseResult`.
- Simplified the handling of bounding boxes and their transformations between color and depth images.
- Improved the ValvePoseNode class by restructuring parameter declarations and initialization.
- Added robust error handling for TF lookups and improved logging for extrinsic transformations.
- Streamlined the synchronization of depth and detection messages, ensuring better performance and clarity in the processing pipeline.
- Updated debug visualization features to provide clearer outputs during development.
No description provided.