smartcontract: add flex-algo topology accounts and link classification (RFC-18, PR 1/4)#3474
smartcontract: add flex-algo topology accounts and link classification (RFC-18, PR 1/4)#3474ben-malbeclabs wants to merge 55 commits intomainfrom
Conversation
fa02ab9 to
25b32e7
Compare
- interface_test.rs: add admin_group_bits_pda to SetGlobalConfig accounts
in test_interface_create_invalid_mtu_{non_cyoa,cyoa}; RFC-18 added
AdminGroupBits as a required account for SetGlobalConfig
- sdk/go/serviceability/state.go: fix gofmt formatting violation
- sdk/rs/client.rs: fix payer deduplication when user_payer==payer;
check `a.is_signer` before skipping payer insertion so the payer
is added as a signer even when already present as a non-signer
- e2e/compatibility_test.go: add testnet envOverride for all interface
and link steps covering v0.10.0–v0.16.x (testnet v0.10.0–v0.11.0
predate the commands; testnet v0.16.0 was built before RFC-18
InterfaceV2 change); create unicast-default topology in compat test
setup so the activator can activate links (required by RFC-18)
- interface_test.rs: add admin_group_bits_pda to SetGlobalConfig accounts
in test_interface_create_invalid_mtu_{non_cyoa,cyoa}; RFC-18 added
AdminGroupBits as a required account for SetGlobalConfig
- sdk/go/serviceability/state.go: fix gofmt formatting violation
- sdk/rs/client.rs: fix payer deduplication when user_payer==payer;
check `a.is_signer` before skipping payer insertion so the payer
is added as a signer even when already present as a non-signer
- e2e/compatibility_test.go: add testnet envOverride for all interface
and link steps covering v0.10.0–v0.16.x (testnet v0.10.0–v0.11.0
predate the commands; testnet v0.16.0 was built before RFC-18
InterfaceV2 change); create unicast-default topology in compat test
setup so the activator can activate links (required by RFC-18)
d771fbc to
d431ac6
Compare
nikw9944
left a comment
There was a problem hiding this comment.
Code Review: RFC-18 Smartcontract — Flex-Algo Topology Accounts
Overall this is a well-structured, well-tested PR that cleanly introduces the topology lifecycle. The onchain processors have proper authorization checks, the CLI provides good UX with auto-discovery, and the test coverage is thorough. A few issues worth addressing:
Bug: Python SDK AccountType mismatch
sdk/serviceability/python/serviceability/state.py — TOPOLOGY = 16 but the Rust AccountType enum defines Topology = 17 (with Index = 16). This will cause Python SDK consumers to misidentify topology accounts as index accounts and vice versa.
The fixture JSON metadata (topology_info.json) also has "account_type": 16 which appears to be a copy from the Rust enum ordinal rather than the discriminant value. The binary fixture should be correct (generated from borsh::to_vec which uses the Rust discriminant of 17), but the JSON metadata is misleading.
Fix: Change TOPOLOGY = 16 to TOPOLOGY = 17 in the Python SDK, and update topology_info.json's account_type and AccountType.value from 16 to 17.
Bug: PR description says include-all but code has Exclude
The PR summary says "constraint type (include-any or include-all)" but the actual TopologyConstraint enum has IncludeAny = 0 and Exclude = 1. The CLI parser also accepts "include-any" or "exclude". Minor, but worth fixing the PR description to avoid confusion for reviewers of downstream PRs.
Observation: resolve_topology_names duplicated 4 times
The identical resolve_topology_names function is copy-pasted across link/get.rs, link/list.rs, tenant/get.rs, and tenant/list.rs. Consider extracting it to a shared utility (e.g., in smartcontract/cli/src/util.rs) to avoid drift. Not blocking, but worth a follow-up.
Observation: MTU validation removal
The PR removes the LINK_MTU check from process_create_link (link create processor) and process_update_link (link update processor). This is a separate behavioral change from topology support — links can now be created/updated with any MTU value. If this is intentional (not just an artifact of the topology work), it might be worth calling out explicitly in the PR description for reviewer awareness, since it changes the validation contract.
Observation: AdminGroupBits range vs PR description
PR description says "capped at 62 topologies" and "admin-group bit (1–62)" but the AdminGroupBits resource range is IdRange(1, 127), allowing up to 127 topologies. The code is more permissive than the description suggests — just a documentation nit.
Observation: topology_info.json missing trailing newline
Both topology_info.json and the updated link.json / tenant.json fixtures are missing a trailing newline (\ No newline at end of file). Minor, but most linters flag this.
Minor: Unused binding in topology/list.rs
let _ = get_topology_pda(&program_id, &t.name); // kept for symmetry; use key directlyThis binding is explicitly unused (commented "kept for symmetry"). It should be removed — dead code with a comment explaining why it exists is worse than just not having it.
Design: Auto-tagging at activation
The process_activate_link processor unconditionally sets link.link_topologies = vec![*unicast_default_topology_account.key] — this overwrites any topology that might have been assigned between create and activate. In the current flow this is fine (links are created with empty topologies), but it's worth noting that any pre-activation topology assignment would be lost. A push + dedup approach would be more defensive, but given the lifecycle this is acceptable as-is.
Verdict: The AccountType mismatch in the Python SDK is a real bug that should be fixed before merge. The rest are minor observations. Solid work overall — the test coverage (52% of additions) and the stacked-PR approach are well done.
…location Implements the TopologyCreate instruction (variant 104) for the doublezero-serviceability program. The instruction creates a TopologyInfo PDA, allocates the lowest available bit from the AdminGroupBits ResourceExtension (skipping pre-reserved bit 1 / UNICAST-DRAINED), derives flex_algo_number = 128 + admin_group_bit, and optionally backfills Vpnv4 loopback interfaces on Device accounts with a FlexAlgoNodeSegment entry. Also adds stub TopologyDelete (105) and TopologyClear (106) instructions for future implementation, and fixes missing flex_algo_node_segments field in CLI test fixtures for InterfaceV3.
…r test helper usage
…ard; restore Custom(65) test assertion
…mports, unused variables
Wire AdminGroupBits resource creation into process_set_globalconfig, following the same pattern as DeviceTunnelBlock, LinkIds, etc. The PDA is created on first initialization and a migration path handles existing deployments that predate RFC-18. Remove the separate create_admin_group_bits test helper and replace all call sites with a plain PDA derivation; update all SetGlobalConfig account lists to include admin_group_bits_pda as the 9th account.
…ice field addition
…ecks, topology_test mtu fixes
… update_link; add topology tests
…opologies addition
compile errors in cli link commands and activator tests
…V2 (version 1); rustfmt
- interface_test.rs: add admin_group_bits_pda to SetGlobalConfig accounts
in test_interface_create_invalid_mtu_{non_cyoa,cyoa}; RFC-18 added
AdminGroupBits as a required account for SetGlobalConfig
- sdk/go/serviceability/state.go: fix gofmt formatting violation
- sdk/rs/client.rs: fix payer deduplication when user_payer==payer;
check `a.is_signer` before skipping payer insertion so the payer
is added as a signer even when already present as a non-signer
- e2e/compatibility_test.go: add testnet envOverride for all interface
and link steps covering v0.10.0–v0.16.x (testnet v0.10.0–v0.11.0
predate the commands; testnet v0.16.0 was built before RFC-18
InterfaceV2 change); create unicast-default topology in compat test
setup so the activator can activate links (required by RFC-18)
…nd deserialize.go - topology/create.rs: pass &Pubkey directly to validate_program_account! pda param instead of Some(&Pubkey) - deserialize.go: remove duplicate DeserializeTenant function introduced by rebase conflict resolution
b9b2b3b to
3c428f2
Compare
…nterface MTU in test
… topology and wrong loopback MTU - devnet/smartcontract_init.go: create unicast-default topology during devnet init so link activation succeeds (RFC-18 requires this PDA to exist) - topology_test.rs: change Vpnv4 loopback interface MTU from 1500 to 9000 to satisfy the non-CYOA/DIA interface MTU validation
0914071 to
d6881e1
Compare
ecda500 to
a92da56
Compare
- topology_test.rs: fix assign_link_topology helper to pass topology
pubkeys as extra accounts in the UpdateLink transaction; the processor
validates topology accounts as trailing accounts after system_program,
but the helper was only including them in the instruction args
- e2e/compatibility_test.go: extend default known-incompatible range for
interface-update and link steps from [0.12.0, 0.16.0) to [0.10.0, 0.16.0);
RFC-18 added flex_algo_node_segments to InterfaceV2 which causes v0.10.0
and v0.11.0 to fail deserialization ("Interface not found") when reading
back accounts written by the new program (trailing bytes); interface
create steps are unaffected (old CLIs don't read back existing interfaces)
- Add admin_group_bits_pda to SetGlobalConfig call in telemetry ServiceabilityProgramHelper::new(), which RFC-18 requires as the 10th account - Create the unicast-default topology in ServiceabilityProgramHelper::new() and pass it to ActivateLink, which RFC-18 requires for auto-tagging - Add unicast_default_topology_pubkey field to ServiceabilityProgramHelper - Restore WAN link mtu values to 9000 in initialize_device_latency_samples_tests.rs (rebase artifact: incorrectly changed to 0/1500/4500) - Set compute_max_units(1_000_000) on all inline ProgramTest setups that call SetGlobalConfig; RFC-18 creates an additional AdminGroupBits resource extension account which can exceed the default 200k CU limit under parallel test load
…S compat timeout The testnet envOverride for device_interface_create* was incorrectly including v0.10.0, v0.11.0, and v0.16.0. These CLI versions can successfully create interfaces (the create instruction doesn't read back the account), so the step was passing but our known-incompatible table said it should fail. Testnet v0.16.0 can't deserialize the new InterfaceV2 format (built before RFC-18), but that only affects operations that READ interface accounts. The create step itself succeeds. The default range [0.12.0, 0.16.0) covers the MTU issue (versions that send the wrong MTU value). Also increase the TypeScript compat test setDefaultTimeout from 30s to 120s. The getProgramData test fetches all mainnet accounts via RPC, which can take 60-90s during busy periods, causing the test to timeout.
Summary of Changes
TopologyInfoas a new onchain account type. DZF creates topologies with an auto-assigned IS-IS TE admin-group bit (1–62), a derived flex-algo number (128 + bit), and a constraint type (include-anyorinclude-all). Topologies are enforced to be unique by name and limited to 62 total via anAdminGroupBitsresource extension.link_topologies: Vec<Pubkey>(capped at 8) andlink_flags: u8(bit 0 = unicast-drained) to theLinkaccount. Link activation now requires the UNICAST-DEFAULT topology to exist — the activate processor enforces this.include_topologiesto theTenantaccount, allowing tenants to opt into topology-filtered routing.doublezero link topology create/delete/clear/list/backfill.doublezero link updatewith--link-topology(assign a named topology, ordefaultto clear) and--unicast-drained true/false.doublezero link listwith--topology <name>filter (default= untagged links).doublezero tenant updatewith--include-topologies <name>.doublezero-admin migrate flex-algo [--dry-run]to tag existing links with UNICAST-DEFAULT and backfill Vpnv4 loopback node segments for pre-existing accounts.TopologyInfodeserialization and the new link/tenant fields.Diff Breakdown
Test-heavy: ~52% of additions are program integration tests. The topology test file alone is 2,410 lines.
Key files (click to expand)
smartcontract/programs/doublezero-serviceability/tests/topology_test.rs— 2,410-line integration test suite covering all topology processor pathssmartcontract/programs/doublezero-serviceability/src/processors/topology/create.rs— auto-assigns admin-group bit, derives flex-algo number, validates capssmartcontract/programs/doublezero-serviceability/src/processors/link/update.rs— topology assignment andlink_flagsbitmask handlingsmartcontract/programs/doublezero-serviceability/src/processors/link/activate.rs— enforces UNICAST-DEFAULT exists before activationsmartcontract/programs/doublezero-serviceability/src/state/topology.rs—TopologyInfoaccount definitionsmartcontract/cli/src/topology/— create, delete, clear (with auto-discovery), list, backfillsmartcontract/cli/src/link/list.rs—--topologyfilter; displays topology and unicast-drained in outputcontrolplane/doublezero-admin/src/cli/migrate.rs—migrate flex-algowith--dry-runsupportTesting Verification
make rust-testpasses: 675+ tests across workspace, program integration tests, and account compat checkstests/link_wan_test.rsinclude_topologiesserialization verified against binary fixtures in Go, Python, and TypeScript SDKs