University of Bonn

Author: Federico Rosatelli
This repository contains a PyTorch implementation of the CVPR 2022 paper "Fast and Unsupervised Action Boundary Detection for Action Segmentation".
The goal of this project is to solve the Temporal Action Segmentation task without relying on frame-wise ground truth during training. Instead of supervised learning, the method uses similarities inherent in the frame features to detect Action Boundaries (Change Points), then clusters the resulting temporal segments to assign action labels. The approach is designed to be efficient and low-latency, and it applies to both offline analysis and online streaming scenarios.
- Unsupervised Learning: Does not require frame-by-frame annotations, leveraging the internal consistency of action features.
- Boundary Detection: Treats cosine similarity as a signal and detects Change Points at its local minima.
- Robust Refinement: Uses a Weighted Merge strategy (Hierarchical Agglomerative Clustering) to correct over-segmentation, weighting each segment by its temporal duration.
- Dual Mode:
  - Offline: Processes the entire video at once for maximum accuracy (evaluated with MoF and F1).
  - Online (Simulated): A buffer-based streaming processor that respects causality and latency constraints for real-time applications.
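
The boundary detection and weighted merge steps listed above can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the repository's PyTorch code. The function names, the `threshold` parameter, the comparison of consecutive frames, and the choice to merge only temporally adjacent segments until `target_count` remain are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Segment = Tuple[List[float], int]  # (mean feature vector, length in frames)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_boundaries(features: Sequence[Sequence[float]],
                      threshold: float = 0.5) -> List[int]:
    """Return frame indices where a new segment is assumed to start.

    A boundary is placed at frame t + 1 when the similarity between
    frames t and t + 1 is a local minimum below `threshold`
    (`threshold` is a hypothetical parameter for this sketch).
    """
    sims = [cosine_similarity(features[t], features[t + 1])
            for t in range(len(features) - 1)]
    boundaries = []
    for t, s in enumerate(sims):
        left = sims[t - 1] if t > 0 else float("inf")
        right = sims[t + 1] if t + 1 < len(sims) else float("inf")
        if s < threshold and s <= left and s < right:
            boundaries.append(t + 1)
    return boundaries


def weighted_merge(segments: Sequence[Segment],
                   target_count: int) -> List[Segment]:
    """Greedily merge the most similar adjacent segments.

    The merged mean is the duration-weighted average of the two means,
    so long segments dominate short, spurious ones (assumed behaviour).
    """
    segs = [(list(mean), length) for mean, length in segments]
    while len(segs) > max(target_count, 1):
        # Index of the adjacent pair with the highest cosine similarity.
        i = max(range(len(segs) - 1),
                key=lambda k: cosine_similarity(segs[k][0], segs[k + 1][0]))
        (mean_a, len_a), (mean_b, len_b) = segs[i], segs[i + 1]
        total = len_a + len_b
        merged = [(len_a * x + len_b * y) / total
                  for x, y in zip(mean_a, mean_b)]
        segs[i:i + 2] = [(merged, total)]
    return segs
```

For example, six frames whose features switch from `[1, 0]` to `[0, 1]` at frame 3 produce a single similarity dip, so `detect_boundaries` reports one boundary at that frame. In `weighted_merge`, a short over-segmented piece is absorbed by its most similar neighbour and contributes to the merged mean in proportion to its length.
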
To run the application locally, ensure you have the necessary dependencies installed:
pip install -r requirements.txt