merlresearch/radar-bbox-diffusion

PyTorch implementation of REXO (multi-view Radar object dEtection with 3D bounding boX diffusiOn), a radar-based pipeline that takes multi-view radar heatmaps as input and estimates 3D bounding boxes (BBox) of human objects.

REXO performs a BBox diffusion process directly in 3D radar space and uses the noisy 3D BBoxes to guide an explicit cross-view radar feature association. At each diffusion timestep, the noisy 3D BBoxes are projected into every radar view, where RoI-aligned feature cropping extracts view-specific radar features. These multi-view-associated radar features are then aggregated to condition the 3D BBox denoising process. The denoised 3D BBoxes are transformed into the 3D camera coordinate system and projected onto the 2D image plane.
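The per-timestep loop above can be sketched roughly as follows. Everything in this snippet is an illustrative stand-in (toy feature shapes, a toy projection, a toy "denoiser" in place of the learned network), not REXO's actual components:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 2 radar views, each with a 2D feature map (C, H, W).
num_views, C, H, W = 2, 8, 32, 32
feature_maps = [rng.standard_normal((C, H, W)) for _ in range(num_views)]
# Toy projection: each view keeps two of the three 3D axes
# (e.g. a horizontal and a vertical radar view).
proj_axes = [(0, 1), (0, 2)]

def crop_roi(fmap, center_xy, half=4):
    """Crude stand-in for RoI-aligned cropping: a fixed window around the
    projected box center, average-pooled to one feature vector."""
    x = int(np.clip(center_xy[0], half, W - half - 1))
    y = int(np.clip(center_xy[1], half, H - half - 1))
    return fmap[:, y - half:y + half, x - half:x + half].mean(axis=(1, 2))

noisy_centers = rng.uniform(8.0, 24.0, size=(5, 3))  # 5 noisy 3D BBox centers

for t in (2, 1, 0):  # a few reverse-diffusion timesteps
    # Project each noisy box into every view and pool its local features.
    cond = np.stack([
        np.mean([crop_roi(feature_maps[v], c[list(proj_axes[v])])
                 for v in range(num_views)], axis=0)
        for c in noisy_centers
    ])  # (5, C): cross-view-aggregated conditioning features
    # In REXO, `cond` conditions a learned denoising network. As a stand-in,
    # nudge each center toward the middle of the radar volume.
    noisy_centers = noisy_centers + 0.1 * (16.0 - noisy_centers)

print(noisy_centers.shape)  # (5, 3)
```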

REXO pipeline overview (figure)

[Paper] [Code] [Dataset] [BibTeX]

Environment setup [Linux/Windows (WSL)]

Follow the steps below:

conda create -n rexo python=3.13 -y
conda activate rexo
pip3 install torch==2.9.1 torchvision==0.24.1 --index-url https://download.pytorch.org/whl/cu128
pip3 install -r requirements.txt

Download and organize datasets

Download and prepare two indoor radar datasets, HIBER and MMVR, which provide multi-view radar heatmaps along with 2D/3D BBox annotations for training and evaluation.

1. MMVR (CC BY-SA 4.0 License)

  • Download the full MMVR dataset from Zenodo:

    • P1.zip
    • P2_00.zip, P2_01.zip, P2_02.zip
  • Create a directory (e.g. ./MMVR) and extract the archives:

    • unzip P1.zip -d ./MMVR/P1
    • unzip P2_00.zip -d ./MMVR/P2 && unzip P2_01.zip -d ./MMVR/P2 && unzip P2_02.zip -d ./MMVR/P2
  • Ensure the final directory layout matches the structure below:

    MMVR/
    ├── P1/
    │   ├── README.md
    │   ├── data_split.npz
    │   ├── load_sample.ipynb
    │   ├── d1s1/
    │   │   ├── 000/
    │   │   │   ├── 00000_meta.npz
    │   │   │   ├── 00000_radar.npz
    │   │   │   ├── 00000_bbox.npz
    │   │   │   ├── 00000_pose.npz
    │   │   │   └── 00000_mask.npz
    │   │   ├── 001/
    │   │   └── ...
    │   ├── ...
    │   └── d4s1/
    └── P2/
        ├── d5s1/
        ├── ...
        └── d9s6/
    

2. HIBER (MIT License)

  • Refer to the official HIBER repository for dataset access instructions.
  • After downloading, place the dataset in a directory named HIBER while maintaining the original folder structure:
    HIBER/
    ├── test/
    ├── train/
    └── val/
    

3. Set the dataset root directory

Place both radar datasets directly under <root_dir> with the following structure:

<root_dir>/
├── MMVR/
│   ├── P1/
│   └── P2/
└── HIBER/
    ├── train/
    ├── val/
    └── test/

Data Preparation

We first group consecutive radar frames into segments. This section provides instructions to create the segmented (grouped) radar data for the MMVR and HIBER datasets.

Run the following scripts with specified grouping parameters:

  • MMVR:
    python src/data/create_grouped_dataset_mmvr.py --num_frames 4 --overlap 3 --dataset_dir <root_dir>/MMVR --output <root_dir>/MMVR
  • HIBER:
    python src/data/create_grouped_dataset_hiber.py --num_frames 4 --overlap 2 --dataset_dir <root_dir>/HIBER --output <root_dir>/HIBER

Grouping Parameters

  • --num_frames Number of consecutive radar frames combined into a single segment. Example: --num_frames 4

  • --overlap Number of overlapping frames between consecutive segments. Example: --overlap 3

  • --dataset_dir Dataset directory for MMVR/HIBER. Example: --dataset_dir <root_dir>/MMVR

  • --output Output directory for the segmented (grouped) data. For MMVR, it must be the original dataset folder (<root_dir>/MMVR). Example: --output <root_dir>/MMVR

Both scripts create a new segmented (grouped) data folder named segment_{num_frames}_{overlap} under the specified output directory. In the example above, the folder segment_4_3 is created in <root_dir>/MMVR, while segment_4_2 is created in <root_dir>/HIBER. We used the same grouping parameters in our paper for training and evaluation on HIBER and MMVR.
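The grouping itself amounts to a sliding window whose stride is `num_frames - overlap`. The function below is a sketch of the behavior implied by the parameters, not the scripts' actual code:

```python
def group_frames(frame_ids, num_frames, overlap):
    """Group consecutive frame indices into overlapping segments.
    Consecutive segments share `overlap` frames, i.e. the window advances
    by num_frames - overlap (mirrors segment_{num_frames}_{overlap})."""
    stride = num_frames - overlap
    assert 0 < stride <= num_frames, "overlap must be in [0, num_frames)"
    return [frame_ids[i:i + num_frames]
            for i in range(0, len(frame_ids) - num_frames + 1, stride)]

frames = list(range(8))
print(group_frames(frames, num_frames=4, overlap=3))
# [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]
print(group_frames(frames, num_frames=4, overlap=2))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7]]
```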

The HIBER script create_grouped_dataset_hiber.py further refines the original 2D BBoxes on the image plane. In the original HIBER dataset, the 2D BBoxes are oversized. The script tightens them to better align with human objects. A comparison between the original and refined image-plane 2D BBox labels is shown below.

HIBER 2D BBox Label Refinement Example

Original (left) and Refined (right) 2D BBoxes on the image plane in the HIBER dataset.
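As a rough illustration of what "tightening" can mean (this is not necessarily the procedure create_grouped_dataset_hiber.py uses), one can shrink an oversized box to the extent of a person's 2D keypoints plus a small pixel margin:

```python
import numpy as np

def tighten_bbox(keypoints_xy, margin=5):
    """Illustrative refinement only: replace an oversized 2D box with the
    extent of the subject's 2D keypoints, padded by `margin` pixels."""
    kp = np.asarray(keypoints_xy, dtype=float)
    x0, y0 = kp.min(axis=0) - margin
    x1, y1 = kp.max(axis=0) + margin
    return [float(v) for v in (x0, y0, x1, y1)]

# Toy keypoints in pixels; the refined box hugs them.
kps = [[410, 220], [430, 205], [455, 240], [440, 390], [418, 395]]
print(tighten_bbox(kps))  # [405.0, 200.0, 460.0, 400.0]
```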

Train

Use the segmented radar data as the training input and run the following command:

  • MMVR:
    python src/train.py --dataset_name MMVR --root <root_dir>/MMVR/segment_4_3 --split [P1S1/P1S2/P2S1/P2S2] --device [cuda/cpu]
  • HIBER:
    python src/train.py --dataset_name HIBER --root <root_dir>/HIBER/segment_4_2 --split WALK --device [cuda/cpu]

Parameters

  • --dataset_name The dataset name. [MMVR or HIBER]. Example: --dataset_name MMVR

  • --root Data directory for the corresponding grouped/segmented data. Example: --root <root_dir>/MMVR/segment_4_3

  • --split Data protocol and data split. [P1S1, P1S2, P2S1, P2S2] for MMVR, [WALK] for HIBER. Example: --split P2S2

  • --device [cuda or cpu]. Example: --device cuda

You can specify backbone network, diffusion parameters, and training parameters such as batch size, learning rate, number of epochs, and number of workers. Refer to src/utils/configs.py and src/models/module_rexo/model_arguments.py for details.

During training, log files and checkpoints are automatically saved under the directory logs/refined/[mmvr/hiber]/[P1S1/P1S2/P2S1/P2S2/WALK]/rexo/YYYYmmdd_HHMMSS, where YYYYmmdd_HHMMSS represents the timestamp of the training session.
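The timestamp follows Python's strftime convention; a minimal reconstruction of the path pattern (the dataset and split values below are just examples):

```python
from datetime import datetime
from pathlib import Path

# YYYYmmdd_HHMMSS, e.g. 20250101_123456
stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
log_dir = Path("logs") / "refined" / "mmvr" / "P2S2" / "rexo" / stamp
print(log_dir)
```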

Evaluation

  1. Download pretrained checkpoints from the links in the table below or directly from this repo. For these large checkpoint files, we recommend using Git LFS to fetch them.

    Dataset  Protocol & Split  Backbone  BBox AP  BBox AP50  BBox AP75  Pretrained model link  Size
    MMVR     P1S1              ResNet18  39.23    73.46      37.83      P1S1.pth               68MB
    MMVR     P1S2              ResNet18  36.48    87.02      20.51      P1S2.pth               68MB
    MMVR     P2S1              ResNet18  48.35    85.89      48.38      P2S1.pth               68MB
    MMVR     P2S2              ResNet18  23.47    64.41      10.44      P2S2.pth               68MB
    HIBER    WALK              ResNet18  25.29    66.66      13.30      WALK.pth               68MB
  2. Evaluate object detection performance

    • MMVR:
      python src/test.py --dataset_name MMVR --root <root_dir>/MMVR/segment_4_3 --split [P1S1/P1S2/P2S1/P2S2] --ckpt_path pretrained/[P1S1/P1S2/P2S1/P2S2].pth --device [cuda/cpu]
    • HIBER:
      python src/test.py --dataset_name HIBER --root <root_dir>/HIBER/segment_4_2 --split WALK --ckpt_path pretrained/WALK.pth --device [cuda/cpu]

    Parameters

    • --ckpt_path Path to a saved model checkpoint, either pretrained or trained from scratch. Example: --ckpt_path pretrained/P2S2.pth

    You should obtain detection performance metrics such as Average Precision (AP) and Average Recall (AR) across a range of IoU thresholds.
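These metrics hinge on the IoU between predicted and ground-truth boxes; for reference, a minimal 2D IoU for axis-aligned boxes:

```python
def iou(a, b):
    """IoU of two axis-aligned 2D boxes given as [x0, y0, x1, y1]."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detection counts as a true positive at threshold t when IoU >= t,
# so the same prediction can pass the AP50 check but fail AP75.
print(round(iou([0, 0, 10, 10], [5, 0, 15, 10]), 3))  # 0.333
```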

Contributing

See CONTRIBUTING.md for our policy on contributions.

Citation

If you use this repository, please consider citing our paper:

@misc{REXO,
  title         = {{REXO}: Indoor Multi-View Radar Object Detection via 3D Bounding Box Diffusion},
  author        = {Ryoma Yataka and Pu Perry Wang and Petros Boufounos and Ryuhei Takahashi},
  year          = {2025},
  eprint        = {2511.17806},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2511.17806},
}

@inproceedings{REXO26_AAAI,
  title         = {Indoor Multi-View Radar Object Detection via 3D Bounding Box Diffusion},
  author        = {Ryoma Yataka and Pu Perry Wang and Petros Boufounos and Ryuhei Takahashi},
  year          = {2026},
  booktitle     = {The Fortieth AAAI Conference on Artificial Intelligence (AAAI)},
}

License

Released under the AGPL-3.0-or-later License, as found in LICENSE.md.

All files, except as noted below:

Copyright (C) 2025 Mitsubishi Electric Research Laboratories (MERL)

SPDX-License-Identifier: AGPL-3.0-or-later

The following files:

  • ./src/models/module_rexo/util/batch_norm.py
  • ./src/models/module_rexo/util/blocks.py
  • ./src/models/module_rexo/util/comm.py
  • ./src/models/module_rexo/util/develop.py
  • ./src/models/module_rexo/util/env.py
  • ./src/models/module_rexo/util/file_io.py
  • ./src/models/module_rexo/util/memory.py
  • ./src/models/module_rexo/util/registry.py
  • ./src/models/module_rexo/util/shape_spec.py
  • ./src/models/module_rexo/util/wrappers.py

were taken without modification from here (license included in LICENSES/Apache-2.0.txt), with the following copyrights:

Copyright (c) Facebook, Inc. and its affiliates

SPDX-License-Identifier: Apache-2.0

Parts of the following files:

  • ./src/models/rexo.py
  • ./src/models/module_rexo/loss.py
  • ./src/models/module_rexo/head.py

were adapted from here (license included in LICENSES/CC-BY-NC-4.0.txt) and here (license included in LICENSES/Apache-2.0.txt), with the following copyrights:

Copyright (C) 2026 Mitsubishi Electric Research Laboratories (MERL)
Copyright (C) 2022 Shoufa Chen
Copyright (c) Facebook, Inc. and its affiliates

SPDX-License-Identifier: AGPL-3.0-or-later
SPDX-License-Identifier: CC-BY-NC-4.0
SPDX-License-Identifier: Apache-2.0

Parts of the following files:

  • ./src/train.py
  • ./src/test.py
  • ./src/models/module_rexo/model_arguments.py
  • ./src/models/module_rexo/util/misc.py

were adapted from here (license included in LICENSES/CC-BY-NC-4.0.txt), with the following copyrights:

Copyright (C) 2026 Mitsubishi Electric Research Laboratories (MERL)
Copyright (C) 2022 Shoufa Chen

SPDX-License-Identifier: AGPL-3.0-or-later
SPDX-License-Identifier: CC-BY-NC-4.0

Parts of the following files:

  • ./src/models/module_rexo/util/__init__.py
  • ./src/models/module_rexo/util/backbone.py
  • ./src/models/module_rexo/util/box_ops.py
  • ./src/models/module_rexo/util/boxes.py
  • ./src/models/module_rexo/util/build.py
  • ./src/models/module_rexo/util/config.py
  • ./src/models/module_rexo/util/fpn.py
  • ./src/models/module_rexo/util/image_list.py
  • ./src/models/module_rexo/util/instances.py
  • ./src/models/module_rexo/util/logger.py
  • ./src/models/module_rexo/util/masks.py
  • ./src/models/module_rexo/util/nms.py
  • ./src/models/module_rexo/util/postprocessing.py
  • ./src/models/module_rexo/util/resnet.py
  • ./src/models/module_rexo/util/roi_align.py
  • ./src/models/module_rexo/util/poolers.py

were adapted from here (license included in LICENSES/Apache-2.0.txt), with the following copyrights:

Copyright (C) 2026 Mitsubishi Electric Research Laboratories (MERL)
Copyright (c) Facebook, Inc. and its affiliates

SPDX-License-Identifier: AGPL-3.0-or-later
SPDX-License-Identifier: Apache-2.0

Parts of the following files:

  • ./src/data/hiber_dataset.py
  • ./src/data/raw_dataset.py
  • ./src/data/camera_parameters_hiber.py

were adapted from here and here (license included in LICENSES/MIT.txt), with the following copyrights:

Copyright (C) 2026 Mitsubishi Electric Research Laboratories (MERL)
Copyright (c) 2022 wuzhiwyyx

SPDX-License-Identifier: AGPL-3.0-or-later
SPDX-License-Identifier: MIT

About

This repository contains the implementation of the AAAI 2026 paper "REXO: Indoor Multi-View Radar Object Detection via 3D Bounding Box Diffusion". (https://arxiv.org/abs/2511.17806)
