A lightweight, navigation-oriented OCR framework.
It is designed for robotic navigation scenarios, where only navigation-relevant text should be detected, such as:

- Signboards
- Room numbers

Irrelevant text, such as advertisements or price tags, is ignored.

- Focuses on navigation-relevant text to reduce unnecessary information and improve OCR speed
- Supports both standalone use and ROS 2 integration
- Optimized for CPU-first robotic platforms, achieving 8 FPS on Intel CPUs with OpenVINO
- Supports Paddle and PaddleDetection for GPU environments
- navocr_standalone.py: Run detection + OCR on a single image or a directory
- navocr/ros_node.py: ROS 2 node entry point
- configs/navocr_openvino.params.yaml: OpenVINO detector + OpenVINO OCR config
- configs/navocr_paddle.params.yaml: PaddleDetection detector + Paddle OCR config

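The standalone script and the two params files can be combined from Python as well as from the shell. The following is a minimal sketch, not part of the repository: it assumes the flags shown in the usage commands of this README (`--params-file`, `--infer_dir`, `--input`) and the oneDNN environment flags mentioned there. The backend labels `"openvino"` and `"paddle"` and the helper names `build_command` and `run` are hypothetical.

```python
import os
import subprocess
import sys
from pathlib import Path

# Params files named in this README, keyed by a hypothetical backend label.
PARAMS_FILES = {
    "openvino": "configs/navocr_openvino.params.yaml",
    "paddle": "configs/navocr_paddle.params.yaml",
}

# Environment overrides this README suggests for oneDNN issues on CPU.
ONEDNN_WORKAROUND_ENV = {
    "FLAGS_enable_pir_api": "0",
    "FLAGS_enable_pir_in_executor": "0",
}


def build_command(backend, target, is_dir=None):
    """Return the argv list for navocr_standalone.py.

    `target` is an image file or a directory of images. When `is_dir` is
    None, the filesystem decides between --infer_dir and --input.
    """
    if backend not in PARAMS_FILES:
        raise ValueError(f"unknown backend: {backend!r}")
    target = Path(target)
    if is_dir is None:
        is_dir = target.is_dir()
    input_flag = "--infer_dir" if is_dir else "--input"
    return [
        sys.executable,
        "navocr_standalone.py",
        "--params-file", PARAMS_FILES[backend],
        input_flag, str(target),
    ]


def run(backend, target, onednn_workaround=False):
    """Run the standalone script from the repository root."""
    env = dict(os.environ)
    if onednn_workaround:
        env.update(ONEDNN_WORKAROUND_ENV)
    return subprocess.run(build_command(backend, target), env=env, check=True)
```

For example, `run("openvino", "data/example_sequence/images")` would mirror the directory invocation, provided it is called from the repository root with the sample data downloaded.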
| Model format | Runtime / engine | Hardware | Text detection | Text recognition | FPS |
|---|---|---|---|---|---|
| OpenVINO IR | OpenVINO Runtime | Intel CPU | RT-DETRv4 (Fine-tuned) | PP-OCRv5 | ... |
| Paddle model | Paddle Inference | CPU / GPU | PP-YOLOE (Fine-tuned) | PP-OCRv5 | ... |
| ONNX | ONNX Runtime | CPU / GPU | ... | ... | ... |
| PyTorch | PyTorch | CPU / GPU | ... | ... | ... |
Both the OpenVINO models and the PaddlePaddle models are included in this repository.
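To compare backends on your own hardware, throughput can be estimated with a simple timing loop. This is a generic sketch and not necessarily how the FPS figures in the table were obtained; `measure_fps` and `process_frame` are hypothetical names, where `process_frame` stands for any callable that runs detection + OCR on one frame.

```python
import time


def measure_fps(process_frame, frames, warmup=1):
    """Return frames per second of `process_frame` over `frames`.

    The first `warmup` frames are processed but not timed, since the first
    inference calls often include one-off model loading or compilation cost.
    """
    frames = list(frames)
    if len(frames) <= warmup:
        raise ValueError("need more frames than warmup iterations")

    for frame in frames[:warmup]:
        process_frame(frame)

    timed = frames[warmup:]
    start = time.perf_counter()
    for frame in timed:
        process_frame(frame)
    elapsed = time.perf_counter() - start

    # Guard against a zero interval for trivially fast callables.
    return len(timed) / elapsed if elapsed > 0 else float("inf")
```

Timing the whole per-frame call (detection plus recognition) rather than the detector alone gives a number that is closer to what a robot would experience end to end.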
git clone git@github.com:kc-ml2/NavOCR.git
Using a venv keeps NavOCR's Python dependencies isolated from the system Python and avoids conflicts with colcon build.
python3 -m venv ~/.venvs/navocr
source ~/.venvs/navocr/bin/activate
pip install --upgrade pip
pip install colcon-common-extensions
pip install openvino pyyaml opencv-python numpy
The following is only required for the PaddlePaddle backend.
Install PaddlePaddle following the official installation guide for your OS / Python / CUDA version:
Then install PaddleDetection and PaddleOCR:
pip install pyyaml opencv-python numpy
# PaddleDetection
git clone https://github.com/PaddlePaddle/PaddleDetection.git
cd PaddleDetection
pip install -r requirements.txt
python setup.py install
# PaddleOCR
pip install paddleocr
For the ROS 2 node, make sure your ROS environment includes:
- rclpy
- sensor_msgs
- vision_msgs
- cv_bridge

These are required by navocr/ros_node.py and are also declared in package.xml.
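Before building, it can help to confirm that the active interpreter actually sees these packages, especially when a venv is layered on top of a sourced ROS 2 environment. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the pip-to-module mapping covers the OpenVINO backend packages listed in the install commands (`pyyaml` imports as `yaml`, `opencv-python` as `cv2`).

```python
import importlib.util

# pip package name -> importable module name (OpenVINO backend).
CORE_MODULES = {
    "openvino": "openvino",
    "pyyaml": "yaml",
    "opencv-python": "cv2",
    "numpy": "numpy",
}

# ROS 2 packages required by navocr/ros_node.py.
ROS_MODULES = {
    name: name for name in ("rclpy", "sensor_msgs", "vision_msgs", "cv_bridge")
}


def missing_modules(modules):
    """Return the sorted package names whose module cannot be located."""
    return sorted(
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    )


def report():
    """Print which core and ROS 2 dependencies are not importable."""
    print("missing core packages:", missing_modules(CORE_MODULES) or "none")
    print("missing ROS 2 packages:", missing_modules(ROS_MODULES) or "none")
```

Calling `report()` with the venv activated and the ROS 2 setup file sourced shows at a glance which of the two sets still needs attention.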
For ROS 2 Humble, you can install the message and bridge packages with:
sudo apt install ros-humble-cv-bridge ros-humble-sensor-msgs ros-humble-vision-msgs
When building inside the venv, invoke colcon through Python so it uses the venv interpreter:
cd ~/ros2_ws
# Without a venv:
colcon build --symlink-install --packages-select navocr
# Inside a venv:
python -m colcon build --symlink-install --packages-select navocr
source install/setup.bash
# Set up Python env
pip install gdown==5.2.0
# Download sample testset
mkdir data && cd data
gdown "https://drive.google.com/uc?id=1GcgddRm4GsjPKUOVdmWFzeF5gElCZfx2"
unzip example_sequence.zip
cd .. && mkdir results
git clone git@github.com:kc-ml2/NavOCR.git
# If you encounter oneDNN compatibility issues on CPU, set these before running:
export FLAGS_enable_pir_api=0
export FLAGS_enable_pir_in_executor=0
# OpenVINO config, directory of images
python navocr_standalone.py \
  --params-file configs/navocr_openvino.params.yaml \
  --infer_dir data/example_sequence/images

# Paddle config, directory of images
python navocr_standalone.py \
  --params-file configs/navocr_paddle.params.yaml \
  --infer_dir data/example_sequence/images

# OpenVINO config, single image
python navocr_standalone.py \
  --params-file configs/navocr_openvino.params.yaml \
  --input data/example_sequence/images/000000.jpg

# Build ROS 2 package according to "Build ROS 2 package (Optional)" above.
# If you encounter oneDNN compatibility issues on CPU, set these before running:
export FLAGS_enable_pir_api=0
export FLAGS_enable_pir_in_executor=0
ros2 run navocr navocr_with_ocr_node
If you want to select a different params file at runtime:
ros2 run navocr navocr_with_ocr_node --ros-args \
  -p params_file:=/absolute/path/to/configs/navocr_paddle.params.yaml
Published topics:
- detections_topic (default: /navocr/detections)
- annotated_image_topic (default: /navocr/annotated_image)

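A downstream node can consume the detections topic with a small rclpy subscriber. The sketch below rests on assumptions this README does not confirm: that `/navocr/detections` carries `vision_msgs/Detection2DArray` messages, and that the field layout matches the vision_msgs version shipped with ROS 2 Humble (`results[i].hypothesis`, `bbox.center.position`). Where the recognized text is stored in the message is not stated here, so the example only reads the class id, score and bounding box. The node name `navocr_listener` and the helper `summarize` are hypothetical.

```python
def summarize(msg):
    """Return one (class_id, score, cx, cy, width, height) tuple per detection."""
    rows = []
    for det in msg.detections:
        # Use the first hypothesis when the detection carries any.
        hyp = det.results[0].hypothesis if det.results else None
        rows.append((
            hyp.class_id if hyp else "",
            hyp.score if hyp else 0.0,
            det.bbox.center.position.x,
            det.bbox.center.position.y,
            det.bbox.size_x,
            det.bbox.size_y,
        ))
    return rows


def main(topic="/navocr/detections"):
    # Imported here so summarize() stays usable without a ROS 2 environment.
    import rclpy
    from rclpy.node import Node
    from vision_msgs.msg import Detection2DArray  # assumed message type

    rclpy.init()
    node = Node("navocr_listener")  # hypothetical node name

    def on_detections(msg):
        for row in summarize(msg):
            node.get_logger().info(str(row))

    # Queue depth 10 is an arbitrary choice for this sketch.
    node.create_subscription(Detection2DArray, topic, on_detections, 10)
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        pass
    finally:
        node.destroy_node()
        if rclpy.ok():
            rclpy.shutdown()
```

Calling `main()` from a shell where the workspace is sourced and the NavOCR node is running should log one line per detection. If the actual message type differs, `ros2 topic info /navocr/detections` reports the type to subscribe with.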
We gratefully acknowledge the following open-source projects that made this work possible:
- RT-DETRv4: https://github.com/RT-DETRs/RT-DETRv4
- PaddleDetection: https://github.com/PaddlePaddle/PaddleDetection
- PaddleOCR / PP-OCRv5: https://github.com/PaddlePaddle/PaddleOCR
- OpenVINO: https://github.com/openvinotoolkit/openvino
We're working on expanding support beyond the store signboard detection model. Stay tuned for upcoming features covering broader navigation use cases.
- Library migration due to a license issue (ultralytics -> PaddleDetection)
- Alternative inference for higher FPS on CPU (Add OpenVINO support)
- Integration with text recognition (PaddleOCR)
- Integration with SLAM packages via ROS (TextMap)
- Model training scripts (Dataset crawling, model fine-tuning, ...)
- Floor sign detection
- Directional guide text detection
This repository is licensed under the Apache License, Version 2.0.
This project includes code and configuration files derived from PaddleDetection (https://github.com/PaddlePaddle/PaddleDetection), which is also licensed under the Apache License, Version 2.0.
