Enhancement: ML-Aware Macro Placement for SRAM-Heavy and Structured Designs #9996

@orbaps

Description

Problem Statement

The current rtl_macro_placer frequently fails or produces suboptimal results for designs with:

  • a large number of small SRAM macros
  • clustered memory architectures
  • structured layouts (e.g., systolic arrays, tiled accelerators)

This is evident from multiple issues:

Typical failure:

Root Cause Analysis

The current macro placement approach:

  • relies primarily on simulated annealing
  • assumes macros are:
    • few
    • large
    • loosely connected

However, modern accelerator workloads (ML/AI designs) exhibit:

  • many small SRAM macros
  • strong locality and communication constraints
  • structured layouts (grid / tiled / systolic)

Key Mismatch:

OpenROAD → generic macro placement
Modern designs → structured + memory-dense

This leads to:

  • failure to explore the search space effectively
  • invalid placements (MPL-0040)
  • routing congestion
  • degraded PPA

Suggested Solution

Inspiration from Industry (Vivado)

Modern tools like Xilinx Vivado handle similar challenges using:

  • hierarchical placement (clustering)
  • constraint-driven floorplanning (Pblocks)
  • architecture-aware optimization (BRAM/DSP awareness)
  • multi-stage placement (coarse → refine)

Proposed Enhancement: ML-Aware Macro Placement Mode

Introduce a new mode:

rtl_macro_placer -mode ml_aware

This mode enables structure-aware, memory-aware placement.

Proposed Approach

  1. Macro Clustering (Hierarchical Placement)
    • Build a connectivity graph of macros
    • Cluster macros based on:
      • communication intensity
      • shared memory structures
    • Treat clusters as super-macros

Reduces search complexity and improves locality.
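To make the clustering step concrete, here is a minimal Python sketch, not the existing mpl implementation. It assumes "communication intensity" can be approximated by the number of nets two macros share, and it merges macros with a union-find structure once that count reaches a threshold. The function name `cluster_macros` and the `min_weight` parameter are hypothetical.

```python
from collections import defaultdict


def cluster_macros(macros, nets, min_weight=2):
    """Group macros that share at least `min_weight` nets.

    macros: iterable of macro names.
    nets:   iterable of iterables, each listing the macros on one net.
    Returns a sorted list of clusters, each a sorted list of macro names.
    """
    parent = {m: m for m in macros}

    def find(m):
        # Union-find lookup with path halving.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Edge weight = number of nets a macro pair shares
    # (an assumed proxy for communication intensity).
    weight = defaultdict(int)
    for net in nets:
        pins = sorted(set(net) & parent.keys())
        for i, a in enumerate(pins):
            for b in pins[i + 1:]:
                weight[(a, b)] += 1

    # Merge strongly connected pairs into the same cluster.
    for (a, b), w in weight.items():
        if w >= min_weight:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for m in parent:
        clusters[find(m)].append(m)
    return sorted(sorted(c) for c in clusters.values())
```

Each returned cluster could then be treated as one super-macro during coarse placement. A real implementation would likely weight edges by bus width or hierarchy as well, which this sketch does not attempt.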

  2. Structure / Grid-Aware Placement
    • Detect structured patterns:
      • systolic arrays
      • tiled compute blocks
    • Enforce:
      • alignment (row/column)
      • regular spacing

Matches real accelerator layouts.
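The enforcement half of this step (row/column alignment with regular spacing) can be illustrated with a small Python sketch. It assumes all macros in the array are identical in size and that the column count is already known; pattern detection is not covered. The name `grid_positions` and its parameters are hypothetical.

```python
def grid_positions(n_macros, macro_w, macro_h, cols,
                   origin=(0.0, 0.0), gap_x=0.0, gap_y=0.0):
    """Return lower-left (x, y) positions for identical macros in row-major order.

    Every macro in a column shares the same x, every macro in a row shares
    the same y, and the pitch between neighbours is constant.
    """
    if cols <= 0:
        raise ValueError("cols must be positive")
    x0, y0 = origin
    pitch_x = macro_w + gap_x  # column-to-column distance
    pitch_y = macro_h + gap_y  # row-to-row distance
    return [
        (x0 + (i % cols) * pitch_x, y0 + (i // cols) * pitch_y)
        for i in range(n_macros)
    ]
```

For example, `grid_positions(4, 10, 5, cols=2, gap_x=2, gap_y=1)` lays four 10x5 macros out as a 2x2 array with a horizontal pitch of 12 and a vertical pitch of 6.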

  3. Enhanced Cost Function

Extend placement cost:

Cost = α * wirelength
+ β * overlap
+ γ * congestion
+ δ * clustering_penalty
+ ε * locality_reward

Add:

  • locality awareness
  • communication distance penalties
  • cluster compactness
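As a rough illustration of the extended cost, the Python sketch below evaluates the weighted sum exactly as written above. It computes wirelength as half-perimeter wirelength (HPWL) and overlap as pairwise rectangle overlap area, both of which are assumptions about how those terms would be measured; the remaining terms are passed in as numbers. Because `locality_reward` is added to a cost that is minimized, the sketch assumes ε is negative by default. All function names are hypothetical.

```python
def hpwl(nets, centers):
    """Half-perimeter wirelength; centers maps macro name -> (x, y)."""
    total = 0.0
    for net in nets:
        if len(net) < 2:
            continue  # a single-pin net contributes no wirelength
        xs = [centers[m][0] for m in net]
        ys = [centers[m][1] for m in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def overlap_area(rects):
    """Sum of pairwise overlap areas; rects is a list of (x, y, w, h)."""
    total = 0.0
    for i, (ax, ay, aw, ah) in enumerate(rects):
        for bx, by, bw, bh in rects[i + 1:]:
            dx = min(ax + aw, bx + bw) - max(ax, bx)
            dy = min(ay + ah, by + bh) - max(ay, by)
            if dx > 0 and dy > 0:
                total += dx * dy
    return total


def placement_cost(wirelength, overlap, congestion,
                   clustering_penalty, locality_reward,
                   alpha=1.0, beta=1.0, gamma=1.0, delta=1.0, epsilon=-1.0):
    """Weighted sum from the proposal; epsilon < 0 so a reward lowers the cost."""
    return (alpha * wirelength
            + beta * overlap
            + gamma * congestion
            + delta * clustering_penalty
            + epsilon * locality_reward)
```

The weights would need tuning per design; the defaults here are placeholders, not recommended values.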
  4. Congestion-Aware Placement
    • Estimate routing congestion early
    • Penalize dense macro regions

Prevents post-placement routing failures.
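One inexpensive way to penalize dense macro regions is a bin-based density map. The Python sketch below assumes macro area coverage per bin is an acceptable early proxy for routing congestion, which is weaker than a real routing estimate. The function names and the `limit` threshold are hypothetical.

```python
def macro_density(rects, die_w, die_h, bins_x, bins_y):
    """Return a bins_y x bins_x grid with the fraction of each bin covered by macros.

    rects: list of (x, y, w, h) macro rectangles in die coordinates.
    """
    bw, bh = die_w / bins_x, die_h / bins_y
    grid = [[0.0] * bins_x for _ in range(bins_y)]
    for x, y, w, h in rects:
        for j in range(bins_y):
            for i in range(bins_x):
                # Overlap between the macro and bin (i, j).
                dx = min(x + w, (i + 1) * bw) - max(x, i * bw)
                dy = min(y + h, (j + 1) * bh) - max(y, j * bh)
                if dx > 0 and dy > 0:
                    grid[j][i] += dx * dy / (bw * bh)
    return grid


def density_penalty(grid, limit=0.6):
    """Sum of squared excess density above `limit` across all bins."""
    return sum(max(0.0, d - limit) ** 2 for row in grid for d in row)
```

The resulting penalty could feed the congestion term of the cost function, so that the annealer is steered away from packing macros into a few bins.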

  5. Multi-Stage Placement (Vivado-style)
    • Stage 1: Clustering (group macros into clusters)
    • Stage 2: Coarse Placement (place clusters globally)
    • Stage 3: Refinement (expand clusters → place individual macros)
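The three stages can be sketched end to end in a few lines of Python. This is a deliberately simple stand-in: it assumes identical macro sizes, uses shelf packing for the coarse stage instead of annealing, and expands each cluster into a near-square grid. It does not check the die height. `place_multistage` and `cluster_box` are hypothetical names.

```python
import math


def cluster_box(n, macro_w, macro_h):
    """Near-square grid footprint (cols, width, height) for n identical macros."""
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    return cols, cols * macro_w, rows * macro_h


def place_multistage(clusters, macro_w, macro_h, die_w):
    """Return {macro name: (x, y)} for clusters given as lists of macro names.

    Stage 1 (clustering) is assumed done: `clusters` is its output.
    Stage 2 shelf-packs cluster boxes left to right, bottom to top.
    Stage 3 expands each box into individual macro positions.
    """
    placement = {}
    x = y = shelf_h = 0.0
    for members in clusters:
        if not members:
            continue
        cols, w, h = cluster_box(len(members), macro_w, macro_h)
        # Stage 2: start a new shelf when the box does not fit in this row.
        if x > 0 and x + w > die_w:
            x, y, shelf_h = 0.0, y + shelf_h, 0.0
        # Stage 3: place the cluster's macros on a grid inside its box.
        for i, name in enumerate(members):
            placement[name] = (x + (i % cols) * macro_w,
                               y + (i // cols) * macro_h)
        x += w
        shelf_h = max(shelf_h, h)
    return placement
```

In a real flow, the coarse stage would optimize cluster positions against the cost function, and refinement would re-legalize macros rather than assume a fixed grid.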
  6. Failure Recovery Mechanism

Instead of:

FAIL → exit

Introduce:

FAIL → adaptive retry:
- reduce clustering granularity
- adjust placement density
- change annealing parameters
- retry with different seeds
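A possible shape for the adaptive retry loop, sketched in Python. The placer is represented by a caller-supplied callback, and the parameter names (`max_cluster_size`, `target_density`, `init_temperature`, `seed`) are hypothetical rather than existing rtl_macro_placer options. "Reduce clustering granularity" is interpreted here as allowing smaller clusters.

```python
import random


class PlacementError(Exception):
    """Raised by the placer callback when no legal placement is found."""


def place_with_retry(place_fn, max_attempts=4, seed=0):
    """Call place_fn(**params); on failure, relax the parameters and retry.

    Returns (result, attempts_used). Raises PlacementError if every attempt fails.
    """
    params = {"max_cluster_size": 16, "target_density": 0.7, "init_temperature": 1.0}
    rng = random.Random(seed)
    for attempt in range(1, max_attempts + 1):
        params["seed"] = rng.randrange(2**31)  # new seed on every attempt
        try:
            return place_fn(**params), attempt
        except PlacementError:
            # Smaller clusters, lower density, and a hotter annealing start.
            params["max_cluster_size"] = max(1, params["max_cluster_size"] // 2)
            params["target_density"] = max(0.3, params["target_density"] - 0.1)
            params["init_temperature"] *= 2.0
    raise PlacementError(f"no legal placement after {max_attempts} attempts")
```

Bounding the number of attempts keeps runtime predictable, and the final error still surfaces to the user if no relaxation helps.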

Additional Context

I work in the VLSI domain and would be interested in contributing to:

  • macro clustering implementation
  • cost function improvements
  • benchmarking and evaluation

Looking forward to feedback from maintainers.

Metadata

    Labels: mpl (Macro Placement)
