Topic: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding

rfleming (Topic starter)

  • Regular Contributor
  • *
  • Posts: 73
  • Country: au
Hi Alex,

I have been working on doing OTAs on my coordinator, more precisely just reflashing my atmega2564rfr2 with the same firmware. My issue is that I supplied all the correct information:
- Channel (CS_NWK_LOGICAL_CHANNEL_ID), not the channel mask
- Nwk Key
- Short Pan
- Ext Pan
- MAC
and I have tried to send a simple Identify ZCL Command to my router, but it doesn't respond.

My process thus far is as follows:
1. Program my Coordinator with fresh firmware
2. Store all information in EEPROM on my coord.
3. Allow a router to join
4. Send a ZCL identify command, confirmed it worked
5. Restart Coordinator
6. Send a ZCL identify command, confirmed it worked
7. Switch Coord off for several minutes then back on
8. Send a ZCL identify command, confirmed it worked
9. Read firmware to a separate (working.hex) hex file
10. Reflash Coord, knowing that it then has no known device in any of its tables
11. Send a ZCL identify command, it doesn't work.
12. Restart Coord, try once more, doesn't work either.

Every so often step 12 does work, but only once or twice across several tests, and only after quite a few restarts. I then reflashed the read-back firmware (working.hex) and noted that the router was once more actioning my ZCL command. Then I reflashed my firmware once more and the router did nothing once again.

I am unsure exactly why this is, or what "minor" differences in the frames stop the router from replying. The router is active the whole time and only replies to the coordinator that has the original firmware on it. The payloads being sent to the router look pretty much exactly the same, with the same PAN IDs etc. I even copied out some of the frames to look at them individually, but a file compare showed no differences besides the general "route id" and "sequence numbers", which I *thought* were rather arbitrary.

Note that in the attached Wireshark log, the non-working firmware starts from packet 67 and the working firmware starts from packet 119. It didn't work this time, but when I restarted it, it worked first go at packet 139 (I had to upload the log as .txt due to file extension constraints).

I know ZigBee *generally* isn't supposed to be used like this and the coordinator would have issues not knowing that a device is actually in its network. But when I wrote the working firmware on the unit, it would at least respond back almost all of the time!

Side note: my current workaround is just to restart the router, and then it consistently works. But since this will be for OTAs on many units, I didn't really want to tell customers they have to reboot their ZigBee units after we fixed a bug for their devices to start working again :(

I am not sure if you can help me with this or not, but I hope so :D

Happy times!
Ryan.
« Last Edit: November 01, 2016, 11:43:40 pm by rfleming »
 

ataradov

  • Super Contributor
  • ***
  • Posts: 11238
  • Country: us
    • Personal site
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #1 on: October 31, 2016, 04:14:41 pm »
Ok, so you are entering the world of pain. I strongly suspect that your problems are due to security counters resetting back to 0 on the coordinator. Devices ignore such frames thinking it is a replay attack.
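
To illustrate the replay check described above, here is a minimal Python sketch. It is a toy model under stated assumptions, not BitCloud or ZigBee stack code: the `ReplayFilter` class, the sender label and the counter values are all hypothetical, and a real stack's comparison rules may differ in detail.

```python
class ReplayFilter:
    """Toy receiver-side check: remember the highest counter seen per sender."""

    def __init__(self):
        self.highest_seen = {}  # sender address -> last accepted frame counter

    def accept(self, sender, frame_counter):
        last = self.highest_seen.get(sender)
        if last is not None and frame_counter <= last:
            return False  # not newer than what was already seen: treat as replay
        self.highest_seen[sender] = frame_counter
        return True


router = ReplayFilter()
print(router.accept("coordinator", 5000))  # counter from the original firmware
print(router.accept("coordinator", 42))    # counter after a reflash that reset it
```

In this model the second frame is dropped even though its payload is valid. A receiver that starts again with an empty table (a fresh `ReplayFilter()`) would accept the low counter, which would be consistent with the router-restart workaround mentioned in the first post.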

You should really use BitCloud's PDS to recover from network failures. This does not save you if the format of the PDS data changes between the firmware versions, but it is better than nothing.

There is no API for saving or restoring the counters, and it is probably not going to be implemented ever. But another option is to manually go through the NWK tables and extract the counters. It is a hacky way, but that's what a normal API would do if it existed.
Alex
 

rfleming (Topic starter)
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #2 on: November 01, 2016, 12:19:20 pm »
Bugger, I was hoping I wouldn't have to do anything with those counters. I did note that the counters on the reflashed firmware were always significantly lower, typically less than 100, compared with more than a few thousand on the working firmware.

If I were to go down this route, which counters would I have to read? Moreover, do you know the names of the variables that I could read the counters from?

I am assuming it is probably in the RAM...CS...parameters?
 

ataradov
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #3 on: November 01, 2016, 05:25:51 pm »
Quote from rfleming: "a few thousand on the working firmware."
Normally the 32-bit counter is split into two 16-bit halves. The second (high) half is incremented when the first one overflows (naturally) and on each reset cycle. The increment on each reset cycle is done because the value of the counter is persistently stored only on overflow to save flash writes, so this increment ensures that the counter is ahead of any previous value it may have had.
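
A small Python sketch of that scheme may help. It is only a model of the description above: the `FrameCounter` class and its fields are hypothetical, and it assumes the incremented high half is written back to non-volatile storage on reset as well as on overflow, which the thread does not spell out.

```python
class FrameCounter:
    """Toy model: 32-bit counter = (high << 16) | low; only `high` is persisted."""

    def __init__(self):
        self.nv_high = 0  # stands in for the value kept in non-volatile memory
        self.high = 0     # RAM copy of the high half
        self.low = 0      # low half lives in RAM only and is lost on reset

    def boot(self):
        # On every reset: bump the stored high half and restart the low half,
        # so the new value is ahead of anything sent before the reset.
        self.nv_high = (self.nv_high + 1) & 0xFFFF
        self.high = self.nv_high
        self.low = 0

    def next(self):
        value = (self.high << 16) | self.low
        self.low += 1
        if self.low > 0xFFFF:  # low half overflowed: carry into the high half
            self.low = 0
            self.nv_high = (self.high + 1) & 0xFFFF
            self.high = self.nv_high
        return value


counter = FrameCounter()
counter.boot()
before_reset = [counter.next() for _ in range(3)]
counter.boot()  # power cycle: low half is lost, high half moves forward
after_reset = counter.next()
print(hex(before_reset[-1]), hex(after_reset))
```

If the stored high half is wiped, as it would be when the persistent area is erased by a full reflash, the model starts again from a low value, behind the counters the other devices have already seen.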

Quote from rfleming: "I am assuming it is probably in the RAM...CS...parameters?"

Outgoing frame counters are stored in the securityCounters member of the csNIB structure. But I'm not 100% positive what is involved in properly saving and restoring them with outside code; you will have to experiment.

On the other hand, you may want to rethink the way you handle network startup. There should be no need for any of this.
Alex
 

rfleming (Topic starter)
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #4 on: November 01, 2016, 10:18:47 pm »
No worries, I'll have a crack at it!

Quote from ataradov: "On the other hand, you may want to rethink the way you handle network startup. There should be no need for any of this."

Yea I would think there should be an easier way to do OTA's for ZigBee on the Coordinator, but I am unsure exactly what other ways I could do this?

Do you have any suggestions on another "possibility" on how I can approach this?
 

ataradov
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #5 on: November 01, 2016, 10:20:49 pm »
Quote from rfleming: "Yea I would think there should be an easier way to do OTA's for ZigBee on the Coordinator, but I am unsure exactly what other ways I could do this?"
It is not about OTA. If you don't erase the PDS storage in your update, it will stay the same and the coordinator will just continue working.

In fact, how does it survive normal power cycles right now? Does it still work?
Alex
 

rfleming (Topic starter)
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #6 on: November 01, 2016, 10:56:54 pm »
Wow you reply fast!

I have been considering this: leaving PDS intact (well, the main PDS at the lower addresses, not the backup at the higher addresses). But I keep thinking that defining any program-memory-based variables would push the PDS out. Then again, I am not that familiar with the linker file they have constructed.

The app works fine when I do power cycles. So PDS is storing everything just fine!
 

ataradov
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #7 on: November 01, 2016, 11:04:18 pm »
Quote from rfleming: "So PDS is storing everything just fine!"
Then don't erase it during the update. It is designed for that and will handle the update normally as long as you don't change things in the CS (like adding new variables or changing a table size). If you do change them, the checksum will not match and the PDS will be erased to the default state.
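
The effect of such a checksum can be sketched in Python. This only illustrates the general idea of tying stored data to the firmware's data layout; the thread does not describe how BitCloud actually computes or stores its checksum, so `layout_checksum`, `load_or_reset`, the field names and the use of CRC32 are all assumptions.

```python
import zlib


def layout_checksum(layout):
    """CRC32 over (name, size) pairs; an added field or resized table changes it."""
    blob = ";".join(f"{name}:{size}" for name, size in layout).encode()
    return zlib.crc32(blob)


def load_or_reset(stored, layout, defaults):
    """Return the stored data if it matches this firmware's layout, else defaults."""
    if stored is not None and stored["checksum"] == layout_checksum(layout):
        return stored["data"]
    return dict(defaults)  # mismatch: fall back to the default state


old_layout = [("nwk_key", 16), ("neighbor_table", 160)]
new_layout = old_layout + [("my_custom_setting", 4)]  # hypothetical added variable

defaults = {"joined": False}
stored = {"checksum": layout_checksum(old_layout), "data": {"joined": True}}

print(load_or_reset(stored, old_layout, defaults))  # same layout: data survives
print(load_or_reset(stored, new_layout, defaults))  # changed layout: defaults
```

With an unchanged layout the stored network state is reused after the update. With the extra variable the model discards it, which is the situation the post warns about.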

That's the reason why I tell people to not add custom stuff into the CS.
Alex
 

rfleming (Topic starter)
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #8 on: November 01, 2016, 11:43:17 pm »
Schweet.

Ideally I wanted to increase the size of the PDS from 8 to 16 kB (I spoke to the Atmel guys about it) so I could store 6 kB worth of NVM variables in PDS instead of in EEPROM, since it is a much faster way of interacting with the data. But from the sounds of it, it might be better to leave PDS as is and just use EEPROM.

Any recommendations on where in the hex file I should start flashing my app from, then? I can see the first non-zero byte in hex that appears to be right after PDS starts from byte addr 4400 for the atmega2564. Comparing various example app hex files, it appears most of the app changes start from here. I am assuming the header of the hex has something to do with the interrupt vectors, which are specific to each app.

So just leave initial PDS block alone and write to the rest?
 

ataradov
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #9 on: November 01, 2016, 11:52:53 pm »
Quote from rfleming: "I can see the first non-zero byte in hex that appears to be right after PDS starts from byte addr 4400 for the atmega2564."
That's the offset.
If you look in the linker script you will see this:
Code:
    *(.vectors)
    KEEP(*(.vectors))
    . = ALIGN(0x400);

    /* PDS NV memory section */
    PROVIDE(__d_nv_mem_start = .);
    . = ALIGN(0x4400);
    PROVIDE(__d_nv_mem_end = .);

Quote from rfleming: "So just leave initial PDS block alone and write to the rest?"
You need to update the area from 0 to 0x400 and from 0x4400 onwards.
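
As a rough sketch of what that means for an updater, the Python below splits a write request into the parts that fall outside the PDS area. The 0x400 and 0x4400 boundaries come from the linker script quoted above; the `writable_ranges` helper itself is hypothetical and not part of any Atmel bootloader, and it assumes both boundaries coincide with flash page boundaries on the target.

```python
PDS_START = 0x0400  # end of the vector area, from ALIGN(0x400) in the linker script
PDS_END = 0x4400    # end of the PDS NV memory section, from ALIGN(0x4400)


def writable_ranges(start, length):
    """Split [start, start + length) into the pieces that lie outside the PDS area."""
    if length <= 0:
        return []
    end = start + length
    ranges = []
    if start < PDS_START:
        ranges.append((start, min(end, PDS_START)))  # part below the PDS area
    if end > PDS_END:
        ranges.append((max(start, PDS_END), end))    # part above the PDS area
    return ranges


# A 32 KiB image starting at address 0 is written in two pieces around the PDS area.
for lo, hi in writable_ranges(0x0000, 0x8000):
    print(f"write 0x{lo:04X}..0x{hi - 1:04X}")
```

Anything that lands entirely inside 0x400..0x43FF yields an empty list, so the stored network data is never touched.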
Alex
 

rfleming (Topic starter)
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #10 on: November 02, 2016, 01:05:51 am »
Just tested a serial upgrade through the Atmel bootloader with minor mods. Worked first go :)

Can you see any issue for overwriting the PDS regions defined in the linker as the FD and FF sections?
 

ataradov
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #11 on: November 02, 2016, 01:29:50 am »
Quote from rfleming: "Can you see any issue for overwriting the PDS regions defined in the linker as the FD and FF sections?"
What do you mean?
Alex
 

rfleming (Topic starter)
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #12 on: November 02, 2016, 03:16:04 am »
Quote from rfleming: "Can you see any issue for overwriting the PDS regions defined in the linker as the FD and FF sections?"
Quote from ataradov: "What do you mean?"


Code:
//Looking at the IAR linker, easier to read in some regards...
-Z(CODE)PDS_FF=_APPLICATION_START-_..X_FLASH_NEND               /* PDS files descriptors segment */
-Z(CODE)PDS_FD=_APPLICATION_START-_..X_FLASH_NEND               /* PDS directories descriptors segment */

//The same sections in the GCC linker script:
    /* Non-volatile file system PDS_FF section */
    PROVIDE(__pds_ff_start = .);
    KEEP(*(.pds_ff))
    PROVIDE(__pds_ff_end = .);

    /* Non-volatile file system PDS_FD section */
    PROVIDE(__pds_fd_start = .);
    KEEP(*(.pds_fd))
    PROVIDE(__pds_fd_end = .);

Or are these just identifiers within the same PDS section?
 

ataradov
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #13 on: November 02, 2016, 03:22:11 am »
Quote from rfleming: "Or are these just identifiers within the same PDS section?"
You need to update those, as they contain constants describing the structure of the persistent data. Of course, the actual content will not change, since you are not going to change any CS parameters, but technically they are a part of the firmware.
Alex
 

rfleming (Topic starter)
Re: [BitCloud] Join NWK via Commissioning after OTA. Devices not responding
« Reply #14 on: November 02, 2016, 06:25:42 am »
Awesome.

Thanks once again Alex :)  :-+
 

