Skip to content

fix: skip unresolvable channel bindings instead of crashing on startup#412

Open
l33t0 wants to merge 4 commits intospacedriveapp:mainfrom
l33t0:fix/graceful-missing-messaging-config
Open

fix: skip unresolvable channel bindings instead of crashing on startup#412
l33t0 wants to merge 4 commits intospacedriveapp:mainfrom
l33t0:fix/graceful-missing-messaging-config

Conversation

@l33t0
Copy link
Contributor

@l33t0 l33t0 commented Mar 12, 2026

Summary

When a channel binding references a messaging platform (e.g. telegram) with no corresponding [messaging.telegram] config section, the agent hard-crashes during config validation, causing Fly machines to crash-loop until max restart count.

Changed validate_named_messaging_adapters() from a hard-fail validator to a filter that returns only resolvable bindings. Unresolvable bindings are logged as warnings and skipped. The agent starts successfully with the remaining valid bindings.

All four error cases now warn-and-skip instead of crash:

  • Missing messaging config for platform
  • Missing named adapter instance
  • Named adapter on unsupported platform
  • No default credentials configured

Closes #411

Test plan

  • cargo clippy --all-targets -- -D warnings passes clean
  • All 622 lib tests pass (including 5 updated + 1 new validation test)
  • Configure a binding for a platform with no messaging config, verify agent starts with warning log

Note

Key changes: Refactored validate_named_messaging_adapters() to return filtered Vec<Binding> instead of validating in-place. The function now logs warnings for unresolvable bindings and skips them gracefully rather than hard-failing. Updated 4 test cases to reflect this new behavior and added 1 new test for the missing messaging config scenario.

Written by Tembo for commit ee4f1cb. This will update automatically on new commits.

When a channel binding references a messaging platform with no
corresponding config section, the agent would hard-crash during
config validation, causing Fly machines to crash-loop.

Changed validate_named_messaging_adapters to warn and skip
unresolvable bindings instead of returning a fatal error. The agent
now starts successfully with the remaining valid bindings.

Closes spacedriveapp#411
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 12, 2026

Walkthrough

Validation of named messaging adapters now returns a filtered Vec and accepts a strict flag: unresolvable or unsupported bindings are skipped with warnings in non-strict mode, or cause errors in strict mode. Call sites and tests were updated to consume and assert on the returned bindings.

Changes

Cohort / File(s) Summary
Validator Implementation
src/config/types.rs
Signature changed from Result<()> to Result<Vec<Binding>> and adds a strict: bool flag. Validation now accumulates resolvable bindings; unresolvable/unsupported bindings are logged and skipped (non-strict) or cause errors (strict). Adapter/platform state computation updated to consider only enabled instances.
Loader Call Site
src/config/load.rs
Introduces from_toml_inner(..., strict_bindings) and routes from_toml through it with strict=false. TOML validation calls from_toml_inner(..., true). Updated to consume the returned Vec<Binding> from the validator.
Tests & Config Usage
src/config.rs
Tests updated/renamed to reflect "skipped" behavior for unresolvable bindings; many new tests added covering mixed valid/invalid bindings, missing messaging config, disabled instances/platform defaults, and strict vs non-strict behavior. Assertions now inspect returned binding vectors.
Manifest / Metadata
Cargo.toml
Minor manifest/test metadata adjustments accompanying the changes. (Small lines changed.)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: refactoring to skip unresolvable channel bindings instead of crashing, which is the core objective of the PR.
Description check ✅ Passed The description clearly explains the problem, the solution, and lists all four error cases that now warn-and-skip instead of crash, directly relating to the changeset.
Linked Issues check ✅ Passed The PR fully addresses issue #411 by refactoring validate_named_messaging_adapters() to skip unresolvable bindings with warnings instead of crashing, allowing the agent to start successfully.
Out of Scope Changes check ✅ Passed All changes are scoped to implementing the warn-and-skip behavior for unresolvable bindings; no unrelated modifications detected in the refactored validation logic and updated tests.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Adds a test that verifies the filter behavior when a mix of valid and
invalid bindings are passed - only valid bindings should survive. Covers
the scenario where some bindings reference missing adapters or missing
platform configs alongside valid ones.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config/load.rs`:
- Line 2248: The current call in from_toml() uses the lenient binding filter
(validate_named_messaging_adapters) which is also used by
Config::validate_toml(), allowing invalid bindings to pass validation; change
the flow so Config::validate_toml() performs strict validation (e.g., call a new
strict validator or pass a strict flag to validate_named_messaging_adapters)
while from_toml() keeps the lenient startup filter. Concretely: introduce either
validate_named_messaging_adapters_strict(...) or add a parameter like
validate_named_messaging_adapters(..., strict: bool), update
Config::validate_toml() to call the strict variant (or pass strict=true) to
reject broken bindings, and leave from_toml() calling the lenient variant (or
strict=false) so startup filtering remains permissive.

In `@src/config/types.rs`:
- Around line 1685-1736: The current validation lets bindings reference adapters
that exist but are disabled; update build_adapter_validation_states to consider
instance.enabled and platform-level enabled flags when building named_instances
and default_present so only actually instantiable adapters are considered
resolvable. Specifically, in build_adapter_validation_states use
validate_instance_names over iterator(s) that filter instances by
instance.enabled (e.g., .instances.iter().filter(|i| i.enabled).map(|i|
i.name.as_str())) and set default_present to require the platform config to be
enabled (e.g., platform.enabled && config-field-nonempty). This will make the
binding loop that checks adapter_states (the channel loop / match on
binding.adapter) automatically exclude bindings pointing to disabled adapters or
defaults.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 348e242e-14fe-4e06-9f2a-1b936a677f5f

📥 Commits

Reviewing files that changed from the base of the PR and between ccdbce5 and ee4f1cb.

📒 Files selected for processing (3)
  • src/config.rs
  • src/config/load.rs
  • src/config/types.rs

Address PR review feedback:
- Include agent_id (and adapter name where applicable) in all
  tracing::warn messages for easier debugging of skipped bindings
- Use expect() instead of unwrap() in tests for clearer failure messages
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/config/types.rs (1)

1752-1770: ⚠️ Potential issue | 🟠 Major

Disabled adapters still count as “resolvable”.

validate_named_messaging_adapters() now promises to return only bindings the runtime can resolve, but build_adapter_validation_states() still includes disabled platform defaults and disabled named instances. A binding to messaging.telegram.enabled = false or to an instance.enabled = false adapter will survive this filter even though startup cannot actually instantiate it.

Suggested fix
     if let Some(discord) = &messaging.discord {
         let named_instances = validate_instance_names(
             "discord",
             discord
                 .instances
                 .iter()
+                .filter(|instance| instance.enabled)
                 .map(|instance| instance.name.as_str()),
         )?;
+        let default_present = discord.enabled && !discord.token.trim().is_empty();
         validate_runtime_keys(
             "discord",
-            !discord.token.trim().is_empty(),
+            default_present,
             &named_instances,
         )?;
         states.insert(
             "discord",
             AdapterValidationState {
-                default_present: !discord.token.trim().is_empty(),
+                default_present,
                 named_instances,
             },
         );
     }

Apply the same instance.enabled filtering and platform.enabled && credentials_present check for Slack, Telegram, Twitch, Email, and Signal too.

Also applies to: 1774-1790, 1794-1810, 1813-1829, 1833-1850, 1855-1871

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/types.rs` around lines 1752 - 1770,
build_adapter_validation_states is including disabled platform defaults and
disabled named instances as "resolvable"; update the logic where each platform
(discord, slack, telegram, twitch, email, signal) builds AdapterValidationState
so that instances passed into validate_instance_names and validate_runtime_keys
are first filtered by instance.enabled, and set default_present only when
platform.enabled && credentials_present (e.g., token.trim().is_empty() checks)
is true; mirror the same instance.enabled filtering used by
validate_named_messaging_adapters and apply the platform.enabled &&
credentials_present check for both the unnamed default and for constructing the
named_instances collection for each platform.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/config/types.rs`:
- Around line 1752-1770: build_adapter_validation_states is including disabled
platform defaults and disabled named instances as "resolvable"; update the logic
where each platform (discord, slack, telegram, twitch, email, signal) builds
AdapterValidationState so that instances passed into validate_instance_names and
validate_runtime_keys are first filtered by instance.enabled, and set
default_present only when platform.enabled && credentials_present (e.g.,
token.trim().is_empty() checks) is true; mirror the same instance.enabled
filtering used by validate_named_messaging_adapters and apply the
platform.enabled && credentials_present check for both the unnamed default and
for constructing the named_instances collection for each platform.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0efa9459-2302-47c7-b71e-5ea76af7b461

📥 Commits

Reviewing files that changed from the base of the PR and between 08bf26c and cee5153.

📒 Files selected for processing (2)
  • src/config.rs
  • src/config/types.rs

…nfig authoring

Address CodeRabbit review feedback:

- build_adapter_validation_states now only includes enabled instances in
  the resolvable set and checks platform-level enabled flag for
  default_present. This prevents bindings to disabled adapters from
  passing validation but failing at runtime.

- validate_named_messaging_adapters now accepts a strict flag. When
  strict (used by validate_toml for config authoring), unresolvable
  bindings return errors so users see the problem. When lenient (used by
  from_toml at startup), they log warnings and get skipped so the agent
  can still boot.

- Structural validation (duplicate/empty/reserved instance names) still
  checks ALL instances regardless of enabled status.

Tests added:
- validate_strict_mode_rejects_missing_messaging_config
- validate_disabled_instance_is_filtered_out
- validate_disabled_platform_default_is_filtered_out
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/config.rs (1)

1661-1707: ⚠️ Potential issue | 🟡 Minor

This test doesn't exercise the unsupported-platform branch.

email is listed in src/config/types.rs::is_named_adapter_platform(), so this path currently just re-tests the missing/disabled adapter case. Use a non-named platform such as webhook so the unsupported-platform branch can't regress unnoticed.

Based on learnings, "align test names to intent instead."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` around lines 1661 - 1707, The test
validate_adapter_on_unsupported_platform_rejected is not exercising the
unsupported-platform branch because "email" is considered a named platform;
update the test so the Binding uses a non-named platform (e.g., set channel to
"webhook") while keeping adapter Some("named") and messaging.webhook left None
so validate_named_messaging_adapters(...) will hit the unsupported-platform
path; also ensure the test name/comment matches this intent. Refer to
validate_named_messaging_adapters and is_named_adapter_platform when making the
change.
🧹 Nitpick comments (2)
src/config/types.rs (1)

1697-1763: Prefer message over msg in these new error paths.

The repo's Rust naming rule explicitly calls out message over msg; please expand these locals for consistency.

As per coding guidelines, "Don't abbreviate variable names. Use queue not q, message not msg. Common abbreviations like config are fine"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/types.rs` around lines 1697 - 1763, Replace the local variable
named msg with message in the newly added error paths so they follow the repo
naming rule; specifically update the format! assignments and the subsequent
Err(ConfigError::Invalid(...).into()) calls and any other uses in the branches
inside the binding validation logic (the branches that check platform adapter
support, missing messaging config, missing/disabled named adapter, and missing
default adapter). Keep the exact formatted strings and uses of binding.agent_id
/ binding.channel / adapter_name but rename msg -> message in each place where
it was introduced.
src/config.rs (1)

1863-1889: Add a Config::validate_toml() regression test for the strict path.

This only proves validate_named_messaging_adapters(..., true) is strict. The user-facing guarantee in this PR depends on validate_toml() calling the strict path in src/config/load.rs, so a small TOML-level test would prevent that wiring from regressing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` around lines 1863 - 1889, Add a regression test that verifies
Config::validate_toml() exercises the strict path: create a minimal TOML string
that defines a binding referencing a messaging channel (e.g., "telegram") but
omits any messaging adapter configuration, then parse it into the Config and
call Config::validate_toml() (the code-path in src/config/load.rs that should
invoke validate_named_messaging_adapters). Assert that the call returns an error
(matching the behavior of validate_named_messaging_adapters(..., true)) so the
TOML-level wiring can't regress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/config.rs`:
- Around line 1661-1707: The test
validate_adapter_on_unsupported_platform_rejected is not exercising the
unsupported-platform branch because "email" is considered a named platform;
update the test so the Binding uses a non-named platform (e.g., set channel to
"webhook") while keeping adapter Some("named") and messaging.webhook left None
so validate_named_messaging_adapters(...) will hit the unsupported-platform
path; also ensure the test name/comment matches this intent. Refer to
validate_named_messaging_adapters and is_named_adapter_platform when making the
change.

---

Nitpick comments:
In `@src/config.rs`:
- Around line 1863-1889: Add a regression test that verifies
Config::validate_toml() exercises the strict path: create a minimal TOML string
that defines a binding referencing a messaging channel (e.g., "telegram") but
omits any messaging adapter configuration, then parse it into the Config and
call Config::validate_toml() (the code-path in src/config/load.rs that should
invoke validate_named_messaging_adapters). Assert that the call returns an error
(matching the behavior of validate_named_messaging_adapters(..., true)) so the
TOML-level wiring can't regress.

In `@src/config/types.rs`:
- Around line 1697-1763: Replace the local variable named msg with message in
the newly added error paths so they follow the repo naming rule; specifically
update the format! assignments and the subsequent
Err(ConfigError::Invalid(...).into()) calls and any other uses in the branches
inside the binding validation logic (the branches that check platform adapter
support, missing messaging config, missing/disabled named adapter, and missing
default adapter). Keep the exact formatted strings and uses of binding.agent_id
/ binding.channel / adapter_name but rename msg -> message in each place where
it was introduced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bc5b413c-6a6f-4f8e-a85e-9997366ea381

📥 Commits

Reviewing files that changed from the base of the PR and between cee5153 and e9a74c9.

📒 Files selected for processing (3)
  • src/config.rs
  • src/config/load.rs
  • src/config/types.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Agent crashes on startup when channel binding references missing messaging config

1 participant