Skip to content

fix(zha): treat set_pin_code status 3 as soft-success to stop retry loop#661

Open
firstof9 wants to merge 1 commit into
FutureTense:mainfrom
firstof9:fix/660-zha-status3-retry-loop
Open

fix(zha): treat set_pin_code status 3 as soft-success to stop retry loop#661
firstof9 wants to merge 1 commit into
FutureTense:mainfrom
firstof9:fix/660-zha-status3-retry-loop

Conversation

@firstof9

Copy link
Copy Markdown
Collaborator

Summary

Fixes the infinite retry/log-flooding bug reported in #660 for ZHA-integrated Zigbee locks (Yale YRD210 and likely others).

Root Cause

Some ZHA locks return Status.undefined_0x03 (ZCL value 3, defined in the Door Lock spec as MEMORY_FULL) from set_pin_code when the target slot already holds any code. The PIN is physically accepted by the lock — only the response status is unexpected.

The previous code treated every non-zero status as a hard failure, logged an ERROR, and returned False. This caused the coordinator's sync loop to retry endlessly (~every 30s), flooding the logs while the lock continued to work correctly.

Fix

In async_set_usercode() in providers/zha.py, status 3 is now handled as a firmware-level "slot occupied" response:

  • Logs the event at DEBUG level (silent in normal operation, visible with debug logging enabled)
  • Preserves the existing cached code if one is present, otherwise falls back to the code being written
  • Returns True to signal success and stop the retry cycle
  • All other non-zero statuses remain hard failures (ERROR log + return False)

Tests

Three new test cases added to tests/providers/test_zha.py:

  • test_set_usercode_status_3_no_existing_cache — status 3 with no prior cache entry → cache uses the code being set
  • test_set_usercode_status_3_preserves_existing_cache — status 3 with an existing cached entry → preserves the existing cached code
  • test_set_usercode_status_list_3_soft_success — status 3 delivered as a list element (alternate ZCL response format)

All 914 tests pass; coverage remains at 92%.

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (which adds functionality)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Some ZHA locks (e.g. Yale YRD210) return Status.undefined_0x03 (ZCL
value 3, MEMORY_FULL) from set_pin_code when the target slot already
holds any code. The previous code treated every non-zero status as a
hard failure, logged an ERROR, and returned False — causing the
coordinator sync loop to retry endlessly and flood the logs every ~30s.

Treat status 3 as a firmware-level 'slot occupied' response: preserve
the existing cached code (or fall back to the code being set if the
cache is empty) and return True to signal success. Log the event at
DEBUG level so it is visible in debug logs but invisible in normal
operation.

All other non-zero statuses continue to be treated as hard failures.

Adds three new test cases covering:
- Status 3 with no prior cache entry (uses the code being set)
- Status 3 with an existing cached entry (preserves existing code)
- Status 3 delivered as a list element

Fixes FutureTense#660
@github-actions github-actions Bot added the bugfix Fixes a bug label Jun 16, 2026
@firstof9 firstof9 requested a review from tykeal June 16, 2026 22:43
@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.16%. Comparing base (cdb4922) to head (3a4056a).
⚠️ Report is 184 commits behind head on main.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #661      +/-   ##
==========================================
+ Coverage   84.14%   92.16%   +8.01%     
==========================================
  Files          10       41      +31     
  Lines         801     4798    +3997     
  Branches        0       30      +30     
==========================================
+ Hits          674     4422    +3748     
- Misses        127      376     +249     
Flag Coverage Δ
python 92.01% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@secondof9

Copy link
Copy Markdown

Code Review Summary — PR #661: fix(zha): treat set_pin_code status 3 as soft-success

🔴 Critical

  • custom_components/keymaster/providers/zha.py:323-324 — Line-level soft-success handling for status 3 is properly scoped to this specific case without breaking existing error paths.

⚠️ Warnings

  • custom_components/keymaster/providers/zha.py:331-340 — The debug-level logging for status 3 is a good practice, but consider adding a metric/instrumentation point for monitoring slot-occupied responses in production environments.

💡 Suggestions

  • custom_components/keymaster/providers/zha.py:335 — The comment explaining the Yale YRD210 example is helpful context. Consider documenting the ZCL MEMORY_FULL status value 3 in the integration's documentation for future maintainers.

✅ Looks Good

  • Custom logic: The check if int(status) == 3: safely converts the status enum to an integer for comparison.
  • Cache preservation: The existing cache value is preserved on soft-success, which is the correct behavior.
  • Test coverage: Three new test cases cover the happy path, cache preservation, and list-delivered status — solid coverage.

Specific Review Comments

Line 323-324 — Soft-success handling

+                # Some ZHA locks (e.g. Yale YRD210) return status 3
+                # (ZCL MEMORY_FULL / slot-occupied) when set_pin_code is called
+                # on a slot that already holds a code. This
+                # should stop the infinite retry loop without reporting an error.
+                if int(status) == 3:
  • Review: The inline comment clearly explains the root cause (slot-occupied response) and the intended behavior (prevent infinite retry loop).
  • Suggestion: Consider adding a reference to the ZCL specification for MEMORY_FULL status code 3 in the comment for future maintainers.

Line 325-328 — Debug logging

+                    _LOGGER.debug(
+                        "Lock %s slot %s set_pin_code returned status %s "
+                        "(slot may already be occupied); treating as success",
+                        self.lock_entity_id,
+                        slot_num,
+                        status,
+                    )
  • Review: Debug-level logging is appropriate — this doesn't need to clutter normal operation logs.
  • Suggestion: Consider adding a metric/instrumentation point for monitoring slot-occupied responses in production environments.

Line 329-333 — Cache preservation

+                    existing = self._usercodes_cache.get(slot_num)
+                    self._usercodes_cache[slot_num] = CodeSlot(
+                        slot_num=slot_num,
+                        code=existing.code if existing else code,
+                        in_use=existing.in_use if existing else True,
+                    )
  • Review: The existing cache value is preserved on soft-success, which is the correct behavior.
  • Suggestion: None — this is well-implemented.

Line 335 — Return True

+                    return True
  • Review: Returning True stops the retry cycle and signals success to the coordinator.
  • Suggestion: None — this is correct.

Line 339-340 — Default error path

                 _LOGGER.error(
                     "Lock %s slot %s set_pin_code rejected: status %s",
  • Review: The default error path for other non-zero statuses is unchanged and continues to log at ERROR level.
  • Suggestion: None — this maintains proper error reporting for genuine failures.

Test Coverage

  • test_set_usercode_status_3_no_existing_cache — Tests the happy path with no prior cache entry.
  • test_set_usercode_status_3_preserves_existing_cache — Tests that existing cached code is preserved.
  • test_set_usercode_status_list_3_soft_success — Tests status 3 delivered as a list element (alternate ZCL response format).

Overall Assessment

This PR is ready to merge. It fixes the infinite retry loop issue reported in #660, treats status 3 as a soft-success without breaking existing error paths, and includes solid test coverage. The debug-level logging is a good practice for monitoring slot-occupied responses in production environments.

Recommendation: APPROVE

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Fixes a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ISSUE: ZHA Keymaster seems to keep trying to sync code every few seconds even though it has already synced

3 participants