ZNP Adapter Commissioning/Restore Decision Process

Current State

Currently the following startup process is implemented (as per startZnp.ts):

  • Check if the adapter has been configured, using the hasConfigured NV item:
    • Yes: Check if the configuration matches the data in NV RAM:
      • Yes: Regular startupFromApp
      • No: Start BDB commissioning
    • No: Check if a coordinator backup exists:
      • Yes: Restore from backup and startupFromApp
      • No: Check if the configuration matches the data in NV RAM:
        • Yes: Regular startupFromApp
        • No: Start BDB commissioning
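
The decision tree above can be sketched as a small pure function. This is only an illustration of the bullet list, not the actual code in startZnp.ts; the type, field, and function names (StartupAction, AdapterState, decideStartup) are hypothetical.

```typescript
// Hypothetical names throughout: this mirrors the bullet list, not the real startZnp.ts.
type StartupAction = "startupFromApp" | "restoreFromBackup" | "bdbCommissioning";

interface AdapterState {
  hasConfigured: boolean;      // the hasConfigured NV item is present on the adapter
  configMatchesNvram: boolean; // configured network parameters equal the data in NV RAM
  backupExists: boolean;       // a coordinator backup is available
}

function decideStartup(state: AdapterState): StartupAction {
  if (state.hasConfigured) {
    // Configured adapter: the backup is never consulted on this branch.
    return state.configMatchesNvram ? "startupFromApp" : "bdbCommissioning";
  }
  if (state.backupExists) {
    // Unconfigured adapter with a backup: restore, then startupFromApp.
    return "restoreFromBackup";
  }
  return state.configMatchesNvram ? "startupFromApp" : "bdbCommissioning";
}
```

Note that a previously used adapter (hasConfigured set, parameters not matching) ends in BDB commissioning even when a backup exists.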

BDB Commissioning

When BDB commissioning forms a new ZigBee network, part of its job is to make sure that no other network with the same PAN ID is operating on the same channel in the vicinity of the adapter. The Z-Stack adapter firmware does this by sending a Beacon Request when forming the network. The adapter then waits for Beacon frames and determines whether the configured PAN ID is already in use by another ZigBee network nearby.

If a network with the same PAN ID is present, the commissioning process keeps incrementing the PAN ID by one until a suitable (free) PAN ID is found. If the configured PAN ID is not in use nearby, it is used as is. The resulting PAN ID is then stored within the NIB in the adapter’s NV memory.
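
A minimal sketch of that selection rule, assuming the set of PAN IDs seen in Beacon frames is already known. The real logic runs inside the Z-Stack firmware; the function name pickPanId, the 16-bit wrap-around, and the lack of special handling for reserved values are assumptions made for illustration.

```typescript
// Sketch only: Z-Stack performs this in firmware. pickPanId is a hypothetical helper.
function pickPanId(configuredPanId: number, panIdsSeenInBeacons: Set<number>): number {
  let candidate = configuredPanId & 0xffff;
  // Try each 16-bit value at most once so the loop always terminates.
  for (let attempts = 0; attempts < 0x10000; attempts++) {
    if (!panIdsSeenInBeacons.has(candidate)) {
      return candidate; // free PAN ID: this is what would end up in the NIB
    }
    candidate = (candidate + 1) & 0xffff; // assumption: wrap around at 16 bits
  }
  throw new Error("no free PAN ID found");
}
```

For example, with a configured PAN ID of 0x1a62 and nearby networks on 0x1a62 and 0x1a63, this sketch selects 0x1a64, which no longer equals the configured value.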

This means that the contents of ZCD_NV_PANID and ZCD_NV_NIB may differ after network commissioning. Z2M discovers the actual PAN ID through the ZDO:extNwkInfo SREQ/SRSP, but does not take it into account.

Problem

During an adapter upgrade or replacement where the new adapter has previously been used by Z2M and its NVRAM has not been cleared (e.g. used in another network, or used before with other parameters), Z2M starts BDB commissioning instead of restoring the NV backup to recover the previous network. This is due to the decision process implementation: the hasConfigured item is set by the previous Z2M instance but the parameters mismatch, which leads to BDB commissioning. The expected action, however, would be a coordinator restore from backup.

Proposed Solutions

  1. NIB Error - terminate the application if the adapter’s NV memory value of ZCD_NV_PANID (i.e. the configured PAN ID) does not match the PAN ID within ZCD_NV_NIB.
  2. NIB Management - if the ZDO:extNwkInfo SRSP does not match ZCD_NV_PANID (i.e. the configured PAN ID), update the PAN ID in the NIB and write it back to the device, then send SREQ SYS:resetReq.
  3. Restore If Config Matches (suggested by @Koenkk) - restore the coordinator backup if the NV memory and the configuration mismatch but the configuration and the coordinator backup match (I would still keep the NIB mismatch in mind).
  4. “Sign” The NV Memory - instead of (or in addition to) the simple hasConfigured NV item, introduce a value in the adapter that uniquely identifies every Z2M instance (e.g. a UUIDv4). If the instance ID mismatches (the adapter was used in a different Z2M instance), the coordinator NV is restored rather than the network re-commissioned.
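
As a rough sketch of option 3, the decision could compare the configuration against both the NV memory and the backup. Everything here is hypothetical: the NetworkParams fields, the choice of which parameters are compared, and the function names are assumptions for illustration, not an existing zigbee-herdsman API.

```typescript
// Hypothetical sketch of option 3; names and compared fields are assumptions.
type RevisedAction = "startupFromApp" | "restoreFromBackup" | "bdbCommissioning";

interface NetworkParams {
  panId: number;
  extendedPanId: string; // hex string
  channel: number;
  networkKey: string;    // hex string
}

function sameNetwork(a: NetworkParams, b: NetworkParams): boolean {
  return a.panId === b.panId
    && a.extendedPanId === b.extendedPanId
    && a.channel === b.channel
    && a.networkKey === b.networkKey;
}

// nvram or backup is null when the adapter is blank or no backup exists.
function decideWithRestore(
  config: NetworkParams,
  nvram: NetworkParams | null,
  backup: NetworkParams | null,
): RevisedAction {
  if (nvram !== null && sameNetwork(config, nvram)) {
    return "startupFromApp";
  }
  if (backup !== null && sameNetwork(config, backup)) {
    // NV memory mismatches the configuration, but the backup matches it.
    return "restoreFromBackup";
  }
  return "bdbCommissioning";
}
```

Because the commissioned PAN ID in the NIB may differ from the configured one, a real comparison would likely need to account for that mismatch as well.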

This might be two separate issues as well (the decision process and the validation of the commissioned PAN ID). I am posting this issue to open a discussion on the matter, since it has caused me several problems and, from searching forums and other issues, I see other people struggling with it as well (sometimes without a clue where the problem lies).

If we can find a suitable solution, I am willing to do the implementation and submit a PR.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 6
  • Comments: 39 (33 by maintainers)

Top GitHub Comments

5 reactions
puddly commented, Jan 12, 2021

I’ve been messing around with the low-level NIB stuff for a while in zigpy-znp so if you want to compare results, here’s what I am using right now: https://github.com/zigpy/zigpy-znp/blob/dev/zigpy_znp/znp/nib.py#L50-L123

A few other data type sizes change between the different architectures so it’s not enough to tweak padding. I’ve documented all of the differences between the two structs in the comment of the CC2531NIB class right below the linked code snippet.

I also have complete NVRAM dumps for a CC2652R, though the format is a little different than what Z2M uses: https://github.com/zigpy/zigpy-znp/tree/dev/tests/nvram. I would be very interested in some complete CC2538 NVRAM dumps if you have any 😃

Regarding your main issue with PAN ID conflicts in newer Z-Stacks (especially with existing routers on the network): I have been trying to form a network with a PAN ID of 0xFFFF (to let Z-Stack pick) and then overriding it after a reset by modifying the NIB: https://github.com/zigpy/zigpy-znp/blob/dev/zigpy_znp/zigbee/application.py#L385-L396. I believe the reset is necessary to make sure the frame counter in the NIB is rounded up by Z-Stack, since it’s only written to NVRAM periodically (and I think in multiple places in different versions of Z-Stack).

In my tests it has worked and causes my CC2652R to happily form a conflicting network, even with other routers sharing that same PAN ID.

Not entirely related to this discussion, but if you want to collaborate on a unified network backup format between Z2M and zigpy/ZHA and possibly get deCONZ on board, it would be awesome: https://github.com/zigpy/zigpy/issues/557#issuecomment-745072411

2 reactions
castorw commented, Jan 12, 2021

I have updated the original comment with data from the CC2652s provided by you all: https://github.com/Koenkk/zigbee-herdsman/issues/286#issuecomment-758830202

@puddly: First of all, thanks for sharing your knowledge 😉

Here are CC2538 NVRAM dumps for you (created with zigpy_znp/tools/nvram_read.py): https://gist.github.com/castorw/d697a6b0952c9577038296574b872703 https://gist.github.com/castorw/df2cf1c75c4436897925628f067e5063

Regarding the nwkState (the only notable difference between 8-bit/16-bit addressing) - I don’t think we need to care about this parameter. We can just restore it shifted as is. And even if you want to read it, you still read the same position, since the MCUs are all little endian and the maximum value does not come anywhere near 0xff.
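
To illustrate the little-endian point, here is a minimal sketch that reads the same field both as one byte and as a 2-byte little-endian integer. The helper name readSmallField and the sample value 0x08 are made up for illustration; they are not real nwkState values or an existing API.

```typescript
// Hypothetical helper: reads a field both as uint8 and as little-endian uint16.
function readSmallField(field: Uint8Array): { asUint8: number; asUint16LE: number } {
  const view = new DataView(field.buffer, field.byteOffset, field.byteLength);
  return {
    asUint8: view.getUint8(0),           // 1-byte layout: just the first byte
    asUint16LE: view.getUint16(0, true), // 2-byte layout, little endian
  };
}

// A made-up value below 0x100 stored in the 2-byte layout: low byte first, high byte zero.
const wideField = new Uint8Array([0x08, 0x00]);
const result = readSmallField(wideField);
console.log(result.asUint8 === result.asUint16LE);
```

Since the low byte comes first and the high byte stays zero for values below 0x100, reading one byte at the same position gives the same number in either layout.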

The idea of a unified backup format seems nice. And the finding that the NIB can be safely altered is great. I will dedicate some time to testing this with CC2531/CC2538, and conversions between NIBs as well.

@Koenkk: To get this issue back on track (since it seems to have spiralled a bit), I would like to split it into separate issues and close this one:

  • ZNP Commissioning PAN ID Collision Management
  • ZNP Adapter Restore Process (NIB Conversion)

Agreed?
