activator: backfill flex-algo node segments on topology events (RFC-18, PR 2/4)#3475

Draft
ben-malbeclabs wants to merge 51 commits into bc/rfc18-smartcontract from bc/rfc18-activator

ben-malbeclabs commented Apr 7, 2026

Summary of Changes

  • When a TopologyInfo account is created onchain, the activator now automatically runs backfill_topology for all currently-activated devices, allocating flex-algo node segment IDs for the new topology.
  • When a new Vpnv4 loopback interface is activated on a device, the activator backfills all existing topologies' node segments for that device.
  • Both paths batch device pubkeys in chunks of 28 to stay within Solana's per-transaction account limit.
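
The batching described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the activator's code: the `Pubkey` alias, the constant name, and `backfill_batches` are hypothetical; only the batch size of 28 comes from the description above.

```rust
/// Hypothetical stand-in for a Solana public key (32 bytes).
type Pubkey = [u8; 32];

/// Batch size quoted in the PR description (the constant name is an assumption).
const BACKFILL_BATCH_SIZE: usize = 28;

/// Split device pubkeys into batches of at most `BACKFILL_BATCH_SIZE`,
/// so that each batch can be sent in its own transaction.
fn backfill_batches(devices: &[Pubkey]) -> Vec<Vec<Pubkey>> {
    devices
        .chunks(BACKFILL_BATCH_SIZE)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

Under these assumptions, 60 activated devices would be split into three batches of 28, 28, and 4 pubkeys, and an empty device list would produce no batches at all.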

Stacked PRs: This is PR 2/4. Depends on #3474 (smartcontract). Can merge in either order relative to #3476 (controller). #3477 (e2e) depends on this.

RFC: rfcs/rfc18-link-classification-flex-algo.md

Diff Breakdown

| Category | Files | Lines (+/-) | Net |
|---|---|---|---|
| Core logic | 4 | +143 / -4 | +139 |

Small, focused change — all additions are activator logic with no test files in this PR.

Key files
  • activator/src/processor.rs: AccountData::Topology arm in the event handler triggers backfill for all devices
  • activator/src/process/iface_mgr.rs: on Vpnv4 loopback activation, backfills existing topologies for that device
  • activator/src/process/device.rs: helper to enumerate activated device pubkeys for backfill batching

Testing Verification

ben-malbeclabs and others added 30 commits March 30, 2026 17:51
Introduces onchain link color model using IS-IS Flexible Algorithm
(RFC 9350) to separate VPN unicast and multicast forwarding topologies.
Defines LinkColorInfo PDA, link_color field on Link, FlexAlgo feature
flag, and controller changes for admin-group tagging, flex-algo
definitions, system-colored-tunnel-rib BGP resolution, and per-tunnel
color extended community stamping.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ulti-color, cleanup

- Replace onchain feature flag with controller features.yaml config file
- Add LinkColorInfo account with AdminGroupBits ResourceExtension for
  persistent bit allocation; bits never reused after deletion
- Change link_color: Pubkey to link_colors: Vec<Pubkey> (cap 8)
- Add include_topology_colors: Vec<Pubkey> on Tenant for per-tenant
  color assignment; defaults to UNICAST-DEFAULT (color 1)
- Redesign interface admin-group cleanup: overwrite remaining colors
  on deletion rather than targeted named no command
- Add full revert: enabled: false removes all flex-algo config
- Pin UNICAST-DEFAULT as protocol invariant (bit 0, first color created)
- Add controller startup check blocking enabled: true if any Vpnv4
  loopback has unset flex_algo_node_segment_idx
- Clarify clear sweep atomicity and idempotency
- Address all PR review comments (nikw9944, vihu, elitegreg)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ew comments

Addresses PR #3288 review feedback and previous second-pass review findings:

- Rename DZF concept: 'color/link color' → 'topology/link topology' throughout
  the RFC wherever referring to the constrained IS-IS forwarding plane concept.
  EOS/BGP protocol terms retained as-is: EOS color value, BGP color extended
  community, system-colored-tunnel-rib, set extcommunity color, Color:CO(00):N.
  Adds 'Topology vs color' terminology entry to make the distinction explicit.

- Fix resolution RIB profile (packethog line 412): replace system-connected with
  tunnel-rib system-tunnel-rib in template, prose, test requirements, verification
  tests, and asymmetric routing description. Confirmed correct via lab test configs.

- Fix flex-algo node segment data model (packethog line 451): replace single
  flex_algo_node_segment_idx: u16 on Interface with Vec<FlexAlgoNodeSegment>
  (topology: Pubkey, node_segment_idx: u16). Fixes multi-topology support and
  moves node segment index to TopologyInfo scope. Loopback template updated to
  iterate .FlexAlgoNodeSegments directly.

- Add auto-tagging at activation: activation processor sets link_topologies[0]
  to UNICAST-DEFAULT pubkey automatically; fails explicitly if UNICAST-DEFAULT
  does not exist.

- Add bootstrapping deployment procedure and migration enforcement: controller
  treats enabled: true as no-op if unset loopbacks are found; migration command
  covers both link tagging and loopback node segment allocation; idempotent with
  --dry-run and summary output.

- Rename CLI flags and struct fields: --link-color → --link-topology,
  --include-topology-colors → --include-topologies, LinkColorInfo → TopologyInfo,
  link_colors → link_topologies, include_topology_colors → include_topologies,
  LinkColorConstraint → TopologyConstraint, b"link-color-info" → b"topology",
  exclude_topology_colors → exclude_topologies.

- Alphabetize New Terminology section; rename multicast table column to
  'Links in path'; update default color framing to clarify all unicast tenants
  receive color 1 by default with include_topologies as override.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rename `eos_color_value` → `color` throughout onchain/controller
  context (formula variable, flex-algo template `.Color`, table columns,
  prose). Terminology entry "EOS color value" retained as the defined
  term for the EOS/BGP mechanism.
- Add UNICAST-DRAINED as the second reserved TopologyInfo (bit 1,
  flex-algo 129, color 2, constraint Exclude). Protocol invariant
  enforced: second topology create must be named unicast-drained.
- Controller template injects `exclude $drainBit` into every include-any
  flex-algo definition; drain is additive to link_topologies.
- Add `doublezero link drain / restore` CLI commands (foundation-only).
- Add bootstrapping step 3 (create UNICAST-DRAINED immediately after
  UNICAST-DEFAULT).
- Add smart contract, controller unit, and e2e test requirements for
  UNICAST-DRAINED invariant, drain/restore lifecycle, and exclude
  precedence (RFC 9350 §5.2.1).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, backfill on topology creation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…location

Implements the TopologyCreate instruction (variant 104) for the
doublezero-serviceability program. The instruction creates a TopologyInfo PDA,
allocates the lowest available bit from the AdminGroupBits ResourceExtension
(skipping pre-reserved bit 1 / UNICAST-DRAINED), derives flex_algo_number =
128 + admin_group_bit, and optionally backfills Vpnv4 loopback interfaces on
Device accounts with a FlexAlgoNodeSegment entry.

Also adds stub TopologyDelete (105) and TopologyClear (106) instructions for
future implementation, and fixes missing flex_algo_node_segments field in CLI
test fixtures for InterfaceV3.
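
The bit allocation and flex-algo numbering described in this commit message might look roughly like the sketch below. It assumes a 32-bit bitmap for AdminGroupBits and uses hypothetical names (`allocate_lowest_bit`, `flex_algo_number`); only the rule "lowest available bit, skipping reserved bit 1" and the formula 128 + admin_group_bit come from the text.

```rust
/// Bit pre-reserved for UNICAST-DRAINED, per the commit message.
const RESERVED_DRAINED_BIT: u8 = 1;
/// Base of the flex-algo number range used in the formula above.
const FLEX_ALGO_BASE: u8 = 128;

/// Allocate the lowest free bit in `used`, skipping the reserved bit.
/// The 32-bit bitmap width is an assumption made for this sketch.
/// Returns `None` when no bit is available.
fn allocate_lowest_bit(used: &mut u32) -> Option<u8> {
    for bit in 0..32u8 {
        if bit == RESERVED_DRAINED_BIT {
            continue;
        }
        let mask = 1u32 << bit;
        if *used & mask == 0 {
            // Mark the bit as taken; it is never cleared again.
            *used |= mask;
            return Some(bit);
        }
    }
    None
}

/// Derive the flex-algo number from an admin-group bit (bit < 32 here).
fn flex_algo_number(admin_group_bit: u8) -> u8 {
    FLEX_ALGO_BASE + admin_group_bit
}
```

Starting from an empty bitmap, this sketch hands out bit 0 first and bit 2 second, and bit 1 maps to flex-algo 129, which matches the UNICAST-DRAINED values given earlier in the log.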
Wire AdminGroupBits resource creation into process_set_globalconfig,
following the same pattern as DeviceTunnelBlock, LinkIds, etc. The PDA
is created on first initialization and a migration path handles existing
deployments that predate RFC-18.

Remove the separate create_admin_group_bits test helper and replace all
call sites with a plain PDA derivation; update all SetGlobalConfig
account lists to include admin_group_bits_pda as the 9th account.
AdminGroupBits is created automatically by global-config set (not by
doublezero resource create), bits are auto-allocated by CreateTopology,
and bits are never deallocated per RFC-18. No valid operator CLI
workflow exists for any of the create/allocate/deallocate paths.
…v4 loopback backfill

Adds a standalone BackfillTopology onchain instruction (variant 109) that
allocates FlexAlgoNodeSegment entries on existing Vpnv4 loopbacks for an
already-created topology. Idempotent — skips loopbacks that already have
an entry for this topology.
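
The idempotency rule above can be illustrated with a small sketch. The `FlexAlgoNodeSegment` fields follow the data model named earlier in this log (topology: Pubkey, node_segment_idx: u16); the `Pubkey` alias, the `backfill_segment` helper, and the way the index is supplied are assumptions, not the onchain implementation.

```rust
/// Hypothetical stand-in for a Solana public key (32 bytes).
type Pubkey = [u8; 32];

/// Per-topology node segment entry on a Vpnv4 loopback.
#[derive(Debug, Clone, PartialEq)]
struct FlexAlgoNodeSegment {
    topology: Pubkey,
    node_segment_idx: u16,
}

/// Add an entry for `topology` unless the loopback already has one.
/// Returns `true` if an entry was added, `false` if it was skipped.
fn backfill_segment(
    segments: &mut Vec<FlexAlgoNodeSegment>,
    topology: Pubkey,
    node_segment_idx: u16,
) -> bool {
    if segments.iter().any(|s| s.topology == topology) {
        // Already backfilled for this topology: leave the entry untouched.
        return false;
    }
    segments.push(FlexAlgoNodeSegment { topology, node_segment_idx });
    true
}
```

Calling the helper twice for the same topology leaves exactly one entry in place with its original index, which is the property the integration test "success + idempotency" is described as checking.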

Also fixes a pre-existing bug where CreateIndex/DeleteIndex unpack arms
used discriminants 104/105, colliding with CreateTopology/DeleteTopology —
corrected to 107/108 per their enum variant comments.

Wires up SDK command (BackfillTopologyCommand), CLI command
(BackfillTopologyCliCommand under `doublezero link topology backfill`),
and rewrites doublezero-admin migrate Part 2 to actually call
BackfillTopology instead of just reporting gaps.

Integration tests: success + idempotency, non-foundation rejected,
nonexistent topology rejected.
…gments reading

- link fixture: add link_topologies (1 entry) and unicast_drained=true
- tenant fixture: add include_topologies (1 entry)
- new topology_info fixture for TopologyInfo account (account_type=16)
- Python/TypeScript Interface: add V3 deserialization with flex_algo_node_segments;
  bump CURRENT_INTERFACE_VERSION to 3
…ady in accounts

topology clear --links was broken: execute_transaction_inner always appended
payer at the end, but clear.rs also included payer explicitly at [2]. Solana
deduplicates by removing the first occurrence, causing the link account to
shift into payer's position. The signer check then fails.

Fix: skip appending payer if it is already present in the accounts list.
Incidentally formats cli/src/topology/create.rs (nightly rustfmt).
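
A minimal sketch of the fix described above, assuming accounts are tracked as a plain list of pubkeys. The `Pubkey` alias and `with_payer` are hypothetical names; the real `execute_transaction_inner` is not shown in this log and may be structured differently.

```rust
/// Hypothetical stand-in for a Solana public key (32 bytes).
type Pubkey = [u8; 32];

/// Append the payer only when the caller has not already listed it,
/// so the positions of the caller's accounts are preserved.
fn with_payer(mut accounts: Vec<Pubkey>, payer: Pubkey) -> Vec<Pubkey> {
    if !accounts.contains(&payer) {
        accounts.push(payer);
    }
    accounts
}
```

With this guard, an account list that already names the payer at index 2 is returned unchanged, while a list without the payer gets it appended at the end as before.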
…ript SDKs

Cherry-pick added V3 deserialization logic but not the type declarations.
Add FlexAlgoNodeSegment dataclass/interface and flexAlgoNodeSegments field
to Interface/DeviceInterface in both SDKs.
  compile errors in cli link commands and activator tests
ben-malbeclabs force-pushed the bc/rfc18-smartcontract branch 2 times, most recently from b9b2b3b to 3c428f2 on April 7, 2026 20:20
ben-malbeclabs marked this pull request as draft on April 8, 2026 18:18