Skip to content

Gate operations on device availability (dead-node check)#889

Open
raman325 wants to merge 1 commit intomainfrom
fix/dead-node-gating
Open

Gate operations on device availability (dead-node check)#889
raman325 wants to merge 1 commit intomainfrom
fix/dead-node-gating

Conversation

@raman325
Copy link
Owner

@raman325 raman325 commented Mar 2, 2026

Proposed change

When a Z-Wave lock node dies (dead battery, out of range), LCM continues sending
RF commands that will all fail, flooding the Z-Wave mesh. This adds a device
availability check that prevents commands from being sent to dead nodes.

Inspired by FutureTense/keymaster#571.

Architectural change — two distinct connectivity layers:

  1. Integration connectivity (is_integration_connected) — is the integration's client/driver connected?
  2. Device availability (is_device_available) — is the physical device responsive? (includes integration check)

Renames for clarity:

  • is_connection_up()is_integration_connected()
  • async_is_connection_up()async_is_integration_connected()
  • async_internal_is_connection_up()async_internal_is_integration_connected()

New provider-agnostic hook:

  • is_device_available() / async_is_device_available() in BaseLock — default True, override per provider
  • _execute_rate_limited() now checks both integration connectivity AND device availability before any operation

Z-Wave JS implementation:

  • Overrides async_is_device_available() to return False when node.status == NodeStatus.DEAD
  • Asleep nodes are considered available (they wake periodically)

Type of change

  • Bugfix (non-breaking change which fixes an issue)

Additional information

  • This PR is related to issue:
  • 6 new tests (base class default, rate limiter gate, Z-Wave JS alive/asleep/dead/exception cases)
  • 12 files touched (renames propagated across codebase + tests)

Rename is_connection_up/async_is_connection_up/async_internal_is_connection_up
to is_integration_connected/async_is_integration_connected/
async_internal_is_integration_connected to clarify they check integration
connectivity, not device availability.

Add is_device_available() and async_is_device_available() to BaseLock with
default True. Override in ZWaveJSLock to return False when node status is
DEAD, preventing doomed RF commands to dead nodes.

Gate _execute_rate_limited() on device availability after the existing
integration connectivity check, raising LockDisconnected when the device
is not available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 2, 2026 21:04
@github-actions github-actions bot added python Pull requests that update Python code documentation Documentation changes bug Something isn't working labels Mar 2, 2026
@codecov
Copy link

codecov bot commented Mar 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.99%. Comparing base (712bd04) to head (b2602e8).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #889      +/-   ##
==========================================
- Coverage   96.01%   95.99%   -0.03%     
==========================================
  Files          29       29              
  Lines        2611     2622      +11     
  Branches       83       83              
==========================================
+ Hits         2507     2517      +10     
- Misses        104      105       +1     
Flag Coverage Δ
python 95.94% <100.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
custom_components/lock_code_manager/__init__.py 96.72% <100.00%> (ø)
custom_components/lock_code_manager/coordinator.py 95.58% <100.00%> (+1.47%) ⬆️
...om_components/lock_code_manager/providers/_base.py 95.33% <100.00%> (+0.11%) ⬆️
..._components/lock_code_manager/providers/virtual.py 100.00% <100.00%> (ø)
...components/lock_code_manager/providers/zwave_js.py 84.98% <100.00%> (+0.28%) ⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a provider-agnostic “device availability” gate (in addition to integration connectivity) to prevent sending lock operations to dead/unresponsive devices, and propagates connection-method renames across providers and tests.

Changes:

  • Rename connection checks to clarify they represent integration connectivity (*_is_integration_connected).
  • Add is_device_available() / async_is_device_available() hook to BaseLock and gate _execute_rate_limited() on it.
  • Implement Z-Wave JS device-availability logic using node.status and add test coverage for alive/asleep/dead/exception cases.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
custom_components/lock_code_manager/providers/_base.py Introduces device-availability hook and gates rate-limited operations; renames integration connectivity methods.
custom_components/lock_code_manager/providers/zwave_js.py Implements Z-Wave JS async_is_device_available() based on NodeStatus.
custom_components/lock_code_manager/providers/virtual.py Renames connectivity method to async_is_integration_connected().
custom_components/lock_code_manager/coordinator.py Updates periodic connection polling to use async_internal_is_integration_connected().
custom_components/lock_code_manager/__init__.py Updates initial connectivity check to use the renamed internal method.
tests/providers/test_base.py Adds tests for default device-availability behavior and gating in _execute_rate_limited().
tests/providers/test_zwave_js.py Renames connection tests and adds device-availability tests for Z-Wave JS.
tests/providers/test_virtual.py Updates test to use renamed internal integration-connectivity method.
tests/test_coordinator.py Updates mock/provider method naming to is_integration_connected().
tests/test_exceptions.py Updates mock/provider method naming to is_integration_connected().
tests/common.py Updates test helper provider method naming to is_integration_connected().
AGENTS.md Updates provider interface docs for the connectivity rename.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if not await self.async_internal_is_connection_up():
if not await self.async_internal_is_integration_connected():
raise LockDisconnected(
f"Cannot {_OPERATION_MESSAGES[operation_type]} {self.lock.entity_id} - lock not connected"
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The disconnection error message still says "lock not connected" even though the check is now specifically for integration connectivity (async_internal_is_integration_connected). Consider updating the message to refer to the integration/client/driver being disconnected to avoid confusion when the device itself is offline vs the integration is down.

Suggested change
f"Cannot {_OPERATION_MESSAGES[operation_type]} {self.lock.entity_id} - lock not connected"
f"Cannot {_OPERATION_MESSAGES[operation_type]} {self.lock.entity_id} - integration not connected"

Copilot uses AI. Check for mistakes.
Comment on lines +374 to +380
def is_device_available(self) -> bool:
"""Return whether the physical device is available for commands."""
return True

async def async_is_device_available(self) -> bool:
"""Return whether the physical device is available for commands."""
return await self._async_executor_call(self.is_device_available)
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description states that device availability "includes integration check", but BaseLock.is_device_available()/async_is_device_available() do not check integration connectivity (they only call the device hook). Either adjust the PR description or update the implementation/docstrings to match the intended semantics (e.g., make async_is_device_available incorporate async_is_integration_connected, or clarify that the operation gate checks both independently).

Copilot uses AI. Check for mistakes.
- `virtual.py`: Virtual lock implementation for testing
- Each provider implements: `async_get_usercodes()`, `async_set_usercode()`, `async_clear_usercode()`,
`async_is_connection_up()`, `async_hard_refresh_codes()`
`async_is_integration_connected()`, `async_hard_refresh_codes()`
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The provider interface documentation here was updated for async_is_integration_connected, but it doesn't mention the new provider-agnostic device availability hook (is_device_available/async_is_device_available) that now gates operations. Please document the new hook (and that the default is True) so future provider implementations understand when/why to override it.

Suggested change
`async_is_integration_connected()`, `async_hard_refresh_codes()`
`is_device_available()` / `async_is_device_available()`, `async_is_integration_connected()`,
`async_hard_refresh_codes()`
- Device availability: the default `is_device_available`/`async_is_device_available` implementation returns `True`.
Override this hook if a provider can detect that a specific device is unavailable even when the integration is connected;
operations against that device are gated on this hook.

Copilot uses AI. Check for mistakes.

async def async_is_connection_up(self) -> bool:
async def async_is_integration_connected(self) -> bool:
"""Return whether connection to lock is up."""
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring is now misleading: this method indicates whether the integration is connected, not whether the lock itself is up. Updating the docstring to match the renamed method will make logs/tests easier to interpret.

Suggested change
"""Return whether connection to lock is up."""
"""Return whether the integration is connected."""

Copilot uses AI. Check for mistakes.
"""Return whether the Z-Wave node is available for commands."""
try:
return self.node.status != NodeStatus.DEAD
except Exception: # noqa: BLE001
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async_is_device_available swallows all exceptions and returns False without any logging. To make real failures diagnosable (e.g., unexpected attribute errors vs node missing), consider catching narrower exception types and/or logging the exception at debug level before returning False.

Suggested change
except Exception: # noqa: BLE001
except Exception as err: # noqa: BLE001
_LOGGER.debug(
"Lock %s: failed to determine node availability: %s",
getattr(self.lock, "entity_id", "unknown"),
err,
exc_info=True,
)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working documentation Documentation changes python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants