Topic: Bitcloud - ZigBee End Device lost after parent power cycle


rfleming (Topic starter)

  • Regular Contributor
  • *
  • Posts: 73
  • Country: au
Bitcloud - ZigBee End Device lost after parent power cycle
« on: June 30, 2016, 07:44:12 am »
Hi guys,

At this stage I am working with just my coordinator and a door lock. The door lock joins successfully and receives commands fine via its "Data Requests". Once I power cycle my coordinator, it starts sending "Route discovery" messages to find my end device whenever I send a command. Apparently the coordinator has lost it, even though my door lock keeps asking the coordinator for data as if nothing had happened.

Is there any way to trigger an orphan scan from my coordinator or something similar that tells my coordinator that it has a child in the network?

Cheers,
Ryan.
 

ataradov

  • Super Contributor
  • ***
  • Posts: 11236
  • Country: us
    • Personal site
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #1 on: June 30, 2016, 04:27:09 pm »
You need to enable and use PDS on the coordinator. Coordinator has to remember its children.
Alex
 

rfleming (Topic starter)
« Reply #2 on: July 01, 2016, 12:22:33 am »
PDS is enabled on my coordinator. When I reboot, it's still in the same network, the routers work fine, etc. So I'm not sure why my child is being lost.

I also tried the BitCloud 3.3.0 HADevice example application with minimal changes (nothing changed in the configuration file) and hit the same issue.

My PDS definitions:
Code: [Select]
/* Enable wear-leveling version of PDS */
#define PDS_ENABLE_WEAR_LEVELING 1

/* ZigBee Platform NV items list*/
#define PERSISTENT_NV_ITEMS_PLATFORM    NWK_SECURITY_COUNTERS_MEM_ID

/* Application NV items list */
#define PERSISTENT_NV_ITEMS_APPLICATION 0xFFFu

When the coordinator reboots it shows the following in the sniffer logs:
Code: [Select]
Frame 880: 56 bytes on wire (448 bits), 54 bytes captured (432 bits) on interface 0
    Interface id: 0 (\\.\pipe\wireshark)
    Encapsulation type: IEEE 802.15.4 Wireless PAN (104)
    Arrival Time: Jan  1, 1970 11:13:22.134423000 AUS Eastern Daylight Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 802.134423000 seconds
    [Time delta from previous captured frame: 4.347481000 seconds]
    [Time delta from previous displayed frame: 4.347481000 seconds]
    [Time since reference or first frame: 797.355508000 seconds]
    Frame Number: 880
    Frame Length: 56 bytes (448 bits)
    Capture Length: 54 bytes (432 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: wpan:zbee_nwk:zbee_aps:zbee_zdp]
IEEE 802.15.4 Data, Dst: Broadcast, Src: 0x0000
    <Frame Length: 56>
    Frame Control Field: 0x8841, Frame Type: Data, Intra-PAN, Destination Addressing Mode: Short/16-bit, Source Addressing Mode: Short/16-bit
        .... .... .... .001 = Frame Type: Data (0x0001)
        .... .... .... 0... = Security Enabled: False
        .... .... ...0 .... = Frame Pending: False
        .... .... ..0. .... = Acknowledge Request: False
        .... .... .1.. .... = Intra-PAN: True
        .... 10.. .... .... = Destination Addressing Mode: Short/16-bit (0x0002)
        ..00 .... .... .... = Frame Version: 0
        10.. .... .... .... = Source Addressing Mode: Short/16-bit (0x0002)
    Sequence Number: 92
    Destination PAN: 0x16cc
    Destination: 0xffff
    Source: 0x0000
    [Extended Source: 00:00:00_12:34:56:78:90 (00:00:00:12:34:56:78:90)]
    [Origin: 746]
    <FCS Valid: Unknown>
ZigBee Network Layer Data, Dst: Broadcast, Src: 0x0000
    Destination: 0xfffd
    Source: 0x0000
    Radius: 10
    Sequence Number: 241
    [Extended Source: 00:00:00_12:34:56:78:90 (00:00:00:12:34:56:78:90)]
    [Origin: 746]
    ZigBee Security Header
        Security Control Field: 0x28, Key Id: Network Key, Extended Nonce
            ...0 1... = Key Id: Network Key (0x01)
            ..1. .... = Extended Nonce: True
        Frame Counter: 65537
        Extended Source: 00:00:00_12:34:56:78:90 (00:00:00:12:34:56:78:90)
        Key Sequence Number: 0
        Message Integrity Code: d00102a3
        [Key: cccccccccccccccccccccccccccccccc]
        [Key Origin: 762]
Frame Control Field: 0x0208, Frame Type: Data, Discover Route: Suppress, Security Data
    .... .... .... ..00 = Frame Type: Data (0x0000)
    .... .... ..00 10.. = Protocol Version: 2
    .... .... 00.. .... = Discover Route: Suppress (0x0000)
    .... ...0 .... .... = Multicast: False
    .... ..1. .... .... = Security: True
    .... .0.. .... .... = Source Route: False
    .... 0... .... .... = Destination: False
    ...0 .... .... .... = Extended Source: False
ZigBee Application Support Layer Data, Dst Endpt: 0, Src Endpt: 0
    Frame Control Field: Data (0x08)
        .... ..00 = Frame Type: Data (0x00)
        .... 10.. = Delivery Mode: Broadcast (0x02)
        ..0. .... = Security: False
        .0.. .... = Acknowledgement Request: False
        0... .... = Extended Header: False
    Destination Endpoint: 0
    Network Address Request (Cluster ID: 0x0000)
    Profile: ZigBee Device Profile (0x0000)
    Source Endpoint: 0
    Counter: 110
ZigBee Device Profile, Network Address Request, Device: EmberCor_00:00:e9:3d:9c
    Sequence Number: 0
    Extended Address: EmberCor_00:00:e9:3d:9c (00:0d:6f:00:00:e9:3d:9c)
    Request Type: Single Device Response (0)
    Index: 0
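
To cross-check the two frame control fields in the dump by hand, the bit breakdown can be reproduced in a few lines of C. This is a minimal sketch: the struct and function names (MacFcf_t, NwkFcf_t, decodeMacFcf, decodeNwkFcf) are made up for illustration and are not part of BitCloud; the bit positions are the ones Wireshark prints above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the IEEE 802.15.4 MAC frame control field, as named in the dump. */
typedef struct
{
  uint8_t frameType;       /* bits 0-2:   1 = Data */
  bool    securityEnabled; /* bit 3 */
  bool    framePending;    /* bit 4 */
  bool    ackRequest;      /* bit 5 */
  bool    intraPan;        /* bit 6 */
  uint8_t dstAddrMode;     /* bits 10-11: 2 = short/16-bit */
  uint8_t frameVersion;    /* bits 12-13 */
  uint8_t srcAddrMode;     /* bits 14-15: 2 = short/16-bit */
} MacFcf_t;

/* Fields of the ZigBee NWK frame control field, as named in the dump. */
typedef struct
{
  uint8_t frameType;       /* bits 0-1: 0 = Data */
  uint8_t protocolVersion; /* bits 2-5 */
  uint8_t discoverRoute;   /* bits 6-7: 0 = Suppress */
  bool    multicast;       /* bit 8 */
  bool    security;        /* bit 9 */
  bool    sourceRoute;     /* bit 10 */
} NwkFcf_t;

/* Split a 16-bit MAC frame control value into its fields. */
void decodeMacFcf(uint16_t fcf, MacFcf_t *out)
{
  assert(out != NULL);
  out->frameType       = (uint8_t)(fcf & 0x7u);
  out->securityEnabled = (fcf & 0x0008u) != 0;
  out->framePending    = (fcf & 0x0010u) != 0;
  out->ackRequest      = (fcf & 0x0020u) != 0;
  out->intraPan        = (fcf & 0x0040u) != 0;
  out->dstAddrMode     = (uint8_t)((fcf >> 10) & 0x3u);
  out->frameVersion    = (uint8_t)((fcf >> 12) & 0x3u);
  out->srcAddrMode     = (uint8_t)((fcf >> 14) & 0x3u);
}

/* Split a 16-bit NWK frame control value into its fields. */
void decodeNwkFcf(uint16_t fcf, NwkFcf_t *out)
{
  assert(out != NULL);
  out->frameType       = (uint8_t)(fcf & 0x3u);
  out->protocolVersion = (uint8_t)((fcf >> 2) & 0xFu);
  out->discoverRoute   = (uint8_t)((fcf >> 6) & 0x3u);
  out->multicast       = (fcf & 0x0100u) != 0;
  out->security        = (fcf & 0x0200u) != 0;
  out->sourceRoute     = (fcf & 0x0400u) != 0;
}
```

For the frame above, 0x8841 decodes to a Data frame with no acknowledge request and 16-bit addressing on both sides, and 0x0208 decodes to a secured NWK data frame with route discovery suppressed, which matches the dump.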
 

ataradov
« Reply #3 on: July 01, 2016, 12:25:06 am »
Can you capture a full log of:
1. Things working
2. Reset C
3. Let ED run for a few seconds (Data Req cycles)
4. Reset ED.

And attach the full Wireshark log. Those text dumps are a nightmare to read.
Alex
 

rfleming (Topic starter)
« Reply #4 on: July 01, 2016, 01:09:40 am »
Rename the Wireshark attachment extension to .cap

Start looking from packet 2321. Bear in mind this is for a Yale door lock that I implemented the ZCL file for. Everything works fine generally speaking.

At 2321, I send an "unlock command" the door unlocks and responds correctly.
At 2350, my coordinator has restarted using HAL_WarmReset() and sends a Permit Duration of 0x00 on boot-up.
At 2352, I try to send an unlock command once more. This time my coordinator is trying to search for it :(
At 2369 (roughly) I pull one of the batteries out of the door lock, wait a few seconds and replace it, and the door lock talks to me in her annoying voice: "Welcome to Yale Home Living" (errggghhh. Never test products that talk to you |O).

The app I used for these tests in the Wireshark log is my own custom one. The first 2K packets were me testing the HADevice app. In the HADevice app I felt it was easier to debug and force commands than to create them with the console. As such, every so often you will see the door lock doing continual data requests while I was implementing it. It gives the exact same result as my custom app.
 

ataradov
« Reply #5 on: July 01, 2016, 02:19:20 am »
Your PDS is either not enabled or not working correctly. In this case coordinator forgot that 0x786c is a child.

At the same time, the ED won't detect that its parent is missing, since Data Reqs happen at the MAC layer and must be ACKed by the receiving device, even if it has nothing to do with them.

I bet that it all will work if you physically disable the coordinator to let ED actively lose the network.

Show your configuration parameters set via configuration.h and application (if changed from what app is doing by default).
Alex
 

ataradov
« Reply #6 on: July 01, 2016, 02:22:34 am »
Also, it looks like ED assumes that network is the same and does not perform a rejoin on power up. This is kind of stupid, since it is now locked out, since even if PAN ID is the same, it may be a completely different network by then, or encryption key may have changed.

I don't know if there is some high level logic on the ED to detect this. It can technically do this by sending requests to the parent and figuring out that it never gets a reply.
Alex
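
The detection idea mentioned above (send requests to the parent and notice that no reply ever comes back) could be sketched as a small miss counter. This is only an illustration, not BitCloud code: ParentWatchdog_t and its functions are hypothetical names, the threshold is arbitrary, and what "leave and rejoin" means on a real end device is left to the application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: tracks consecutive unanswered application requests. */
typedef struct
{
  uint8_t missed;    /* consecutive requests that got no reply */
  uint8_t threshold; /* misses tolerated before giving up on the parent */
} ParentWatchdog_t;

void parentWatchdogInit(ParentWatchdog_t *wd, uint8_t threshold)
{
  assert(wd != NULL && threshold > 0);
  wd->missed = 0;
  wd->threshold = threshold;
}

/* Call when an application-level reply arrives via the parent. */
void parentWatchdogOnReply(ParentWatchdog_t *wd)
{
  assert(wd != NULL);
  wd->missed = 0;
}

/* Call when a request times out without a reply.
   Returns true once the device should give up and start a rejoin. */
bool parentWatchdogOnTimeout(ParentWatchdog_t *wd)
{
  assert(wd != NULL);
  if (wd->missed < wd->threshold)
    wd->missed++;
  return wd->missed >= wd->threshold;
}
```

The counter deliberately looks at application-level replies rather than MAC ACKs, since a rebooted parent can keep ACKing Data Reqs while no longer treating the device as its child.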
 

rfleming (Topic starter)
« Reply #7 on: July 01, 2016, 03:00:49 am »
Quote from: ataradov
Also, it looks like ED assumes that network is the same and does not perform a rejoin on power up. This is kind of stupid, since it is now locked out, since even if PAN ID is the same, it may be a completely different network by then, or encryption key may have changed.

I don't know if there is some high level logic on the ED to detect this. It can technically do this by sending requests to the parent and figuring out that it never gets a reply.

I power cycled the slave a few more times and it did the rejoin, so not sure what happened there :/

Attached another capture. Look from 11158.
 

ataradov
« Reply #8 on: July 01, 2016, 03:06:23 am »
Quote from: rfleming
I power cycled the slave a few more times and it did the rejoin, so not sure what happened there :/

Ok then, now you need to figure out why C does not remember its children.
Alex
 

rfleming (Topic starter)
« Reply #9 on: July 01, 2016, 03:13:19 am »
Quote from: ataradov
I bet that it all will work if you physically disable the coordinator to let ED actively lose the network.

Yep. After 45 seconds (9 attempts) it starts to do beacon requests rather than Data requests. It also won't join another network while doing the beacon requests, only the original coordinator (yay!).

However, you're right, I'm still looking to see if I can fix the coordinator losing its child at all.

My config file:
Code: [Select]
#ifndef _CONFIGURATION_H_
#define _CONFIGURATION_H_

/* Enable wear-leveling version of PDS */
#define PDS_ENABLE_WEAR_LEVELING 1

/* If Bootloader will be used in parallel with application. this should be commented. */
//#define PDS_NO_BOOTLOADER_SUPPORT //defined in makefiles

/* ZigBee Platform NV items list*/
#define PERSISTENT_NV_ITEMS_PLATFORM    NWK_SECURITY_COUNTERS_MEM_ID

/* Application NV items list */
#define PERSISTENT_NV_ITEMS_APPLICATION 0xFFFu
#define CS_APS_BINDING_TABLE_SIZE 14
////////////////////////////////////////////////////////////////////////////////////
// 32-bit mask of channels to be scanned before network is started. Channels that
// should be used are marked with logical 1 at corresponding bit location.
//  Valid channel numbers for 2.4 GHz band are 0x0b - 0x1a
//  Valid channel numbers for 900 MHz band are 0x00 - 0x0a
//
//  Notes:
//  1. for small amount of enabled channels it is more convenient to specify list
// of channels in the form of '(1ul << 0x0b)'
//  2. For 900 MHz band you also need to specify channel page
//
//  Value range: 32-bit values:
//  Valid channel numbers for 2.4 GHz band are 0x0b - 0x1a
//  Valid channel numbers for 900 MHz band are 0x00 - 0x0a
//
//  C-type: uint32_t
//  Can be set: at any time before network start
//#define CS_CHANNEL_MASK (1L<<0x0f)
//#define CS_CHANNEL_MASK 0x07FFF800UL //maximum possible channels
//#define CS_CHANNEL_MASK 0x00300000UL //upper channels only
//#define CS_CHANNEL_MASK (1L<< 0xd) //channel 15 only, appears to be no other traffic on it for debugging in the main CT office.

// The parameter specifies the predefined extended PANID of the network to be
// formed (for the coordinator) or joined (for a router or an end device). For a
// router or an end device the parameter can equal 0 allowing them to join the
// first suitable network that they discover.
//#define CS_EXT_PANID 0xAAAAA2AAAAAAAAAALL

// 64-bit Unique Identifier (UID) determining the device extended address. If this
// value is 0 stack will try to read hardware UID from external UID or EEPROM chip
// at startup. Location of hardware UID is platform dependent and it may not be
// available on all platforms. In the latter case the UID value must be provided
// by user via this parameter. This parameter must be unique for each device in a
// network. This should not be 0 for Coordinator.
//#define CS_UID 0x1234567890ULL

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//SHORT ADDRESS CONFIG
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//Short address is automatically assigned by stack.
// Determines whether the static or automatic addressing mode will be used for the
// short address.
//
//  If set to 1, the CS_NWK_ADDR parameter will be used as the device's short
// address. Otherwise, the short address is assigned automatically by the stack. An
// actual assignment method is specified in CS_ADDRESS_ASSIGNMENT_METHOD.
// #define CS_NWK_UNIQUE_ADDR 0
// //#define CS_NWK_UNIQUE_ADDR 1
//
// //-----------------------------------------------
// //CS_NWK_UNIQUE_ADDR == 1
// //-----------------------------------------------
// #if (CS_NWK_UNIQUE_ADDR == 1)
//   // Specifies short (network) address if CS_NWK_UNIQUE_ADDR equals 1
//   //
//   //  If static addressing is applied the stack uses the value of the parameter as a
//   // short address. Otherwise, the stack assigns the parameter to a randomly chosen
//   // value unique within the network. In both cases after the network start the
//   // parameter holds actual short address of the device. While the device is in the
//   // network its value must not be changed.
//   #define CS_NWK_ADDR 0x0001
// #endif
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// The maximum number of direct children that a given device (the coordinator or a
// router) can have.
//
//  The parameter is only enabled for routers and the coordinator. An end device
// can not have children. If an actual number of children reaches a parameter's
// value, the node will not be able to accept any more children joining the
// network. The parameter can be set to 0 on a router thus preventing it from
// accepting any children and can help form a desired network topology. For
// example, if the parameter is set to 0 on all routers, then the coordinator will
// be the only device that can have children and the network will have star
// topology.
#define CS_MAX_CHILDREN_AMOUNT 100
//#define CS_MAX_CHILDREN_AMOUNT 1
//#define CS_MAX_CHILDREN_AMOUNT CS_NEIB_TABLE_SIZE //no point doing any more than Neib table size
// The maximum number of routers among the direct children of the device
//
//  The parameter determines how many routers the device can have as children. Note
// that the maximum number of end devices is equal to CS_MAX_CHILDREN_AMOUNT -
// CS_MAX_CHILDREN_ROUTER_AMOUNT.
#define CS_MAX_CHILDREN_ROUTER_AMOUNT 50 //(CS_NEIB_TABLE_SIZE/2)
//#define CS_MAX_CHILDREN_ROUTER_AMOUNT 1
// #if CS_MAX_CHILDREN_ROUTER_AMOUNT >= CS_MAX_CHILDREN_AMOUNT
// #error no end devices can join. the maximum number of end devices is equal to CS_MAX_CHILDREN_AMOUNT - CS_MAX_CHILDREN_ROUTER_AMOUNT.
// #endif
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//-----------------------------------------------
//STDLINK_SECURITY_MODE
//-----------------------------------------------
#ifdef STDLINK_SECURITY_MODE
// The parameter enabled in the high security mode specifies the size of the APS
// key-pair set. The APS key-pair set stores pairs of corresponding extended
// address and a link key or a master key. For each node with which the current
// node is going to communicate it must keep an entry with the remote node extended
// address and a link key. If the link key is unknown, the node can request the
// trust center for it via APS_RequestKeyReq(). The trust center must store a link
// key or a master key depending on the CS_SECURITY_STATUS used for each node it is
// going to authenticate. Entries can also be added manually by APS_SetLinkKey()
// and APS_SetMasterKey().
#define CS_APS_KEY_PAIR_DESCRIPTORS_AMOUNT 8

// Security information waiting timeout before secure network join considered
// failed.
//
//  A timeout is started when connection with a parent is established. If the
// security related procedures that are performed after this will not be completed
// before the timeout exceeds, the device will fail joining the network. A value is
// measured in milliseconds.
#define CS_APS_SECURITY_TIMEOUT_PERIOD 10000

// Depending on security key type and security mode this is either network key,
// master key, link key or initial link key.
//
//  Network key is used to encrypt a part of a data frame occupied by the NWK
// payload. This type of encryption is applied in both the standard and high
// security mode. The high security mode also enables encryption of the APS payload
// with a link key, but if the txOptions.useNwkKey field in APS request parameters
// is set to 0, the APS payload is encrypted with the network key.
//
//  The network key must be predefined if standard security is used with
// CS_ZDO_SECURITY_STATUS set to 0. For all other values of CS_ZDO_SECURITY_STATUS
// the network key is received from the trust center during device authentication.
// Note that in the standard security mode with CS_ZDO_SECURITY_STATUS equal to 3
// the network key is transferred to the joining device in an unencrypted frame.
//#define CS_NETWORK_KEY {0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC,0xCC}
#define CS_NETWORK_KEY {0xA0,0xFC,0x19,0x6A,0x6D,0x57,0x44,0x1C,0xC6,0xD7,0xBD,0x11,0xFE,0xA9,0x9B,0x4C}

// The parameter is used to determine the security type.
//
//  Value range: 0,3 - for standard security; 1,2 - for high security.
//  0 - network key is preconfigured ;
//  1 - network join without master key, but with a trust center link key, which
// must be set via APS_SetLinkKey();
//  2 - network join employs a master key, which must be set APS_SetMasterKey();
//  3 - network key is not preconfigured, but rather received from the trust center
// in an unencrypted frame.
//#define CS_ZDO_SECURITY_STATUS 0
#define CS_ZDO_SECURITY_STATUS 1
// The maximum number of network keys that can be stored on the device
//
//  A device in a secured network can keep several network keys up to the value of
// this parameter. Upon frame reception the device extracts key sequence number
// from the auxiliary header of the frame and decrypts the message with the network
// key corresponding to this sequence number. Besides, one key is considered active
// for each device; this is the key that is used for encrypting outgoing frames.
// The keys are distributed by the trust center with the help of the
// APS_TransportKeyReq() command. The trust center can also change the active key
// of a remote node via a call to APS_SwitchKeyReq().
#define CS_NWK_SECURITY_KEYS_AMOUNT 1
#endif

// Maximum amount of records in the Neighbor Table.
//
//  The parameter determines the size of the neighbor table which is used to store
// beacon responses from nearby devices. The parameter puts an upper bound over the
// amount of child devices possible for the node.
#define CS_NEIB_TABLE_SIZE 8

// Maximum amount of records in the network Route Table.
//
//  The parameter sets the maximum number of records that can be kept in the NWK
// route table. The table is used by NWK to store information about established
// routes. Each table entry specifies the next-hop short address for a route from
// the current node to a given destination node. The table is being filled
// automatically during route discovery. An entry is added when a route is
// discovered.
#define CS_ROUTE_TABLE_SIZE 10

// Maximum amount of records in the network Address Map Table.
//
//  The parameter sets the maximum number of records in the address map table used
// by NWK to store pairs of corresponding short and extended addresses. The stack
// appeals to the table when a data frame is being sent to a specified extended
// address to extract the corresponding short address. If it fails to find the
// short address, an error is reported.
#define CS_ADDRESS_MAP_TABLE_SIZE 5

// Maximum amount of records in the network Route Discovery Table.
//
//  The parameter specifies the size of the route discovery table used by NWK to
// store next-hop addresses of the nodes for routes that are not yet established.
// Upon exhausting the capacity of the table, the stack starts rewriting old
// entries. If the size of the route table is big enough after all used routes are
// established the table may not be used.
#define CS_ROUTE_DISCOVERY_TABLE_SIZE 3

// Maximum amount of records in the Duplicate Rejection Table.
//
//  The duplicate rejection table is used by APS to store information about
// incoming unicast messages in order to reject messages that have been already
// received and processed. Following ZigBee specification, the parameter should be
// not less than 1.
#define CS_DUPLICATE_REJECTION_TABLE_SIZE 8

// Maximum amount of records in the Broadcast Transaction Table.
//
//  The broadcast transmission table is used for tracking incoming broadcast
// messages to mark messages that have already been processed by the node. This
// causes only one copy for each broadcast message to be processed. An entry for a
// broadcast message is stored for a certain period of time and then removed.
#define CS_NWK_BTT_SIZE 8

// The number of buffers for data requests on the APS layer.
//
//  The parameter specifies the number of buffers that are allocated by APS to
// store data requests parameters. The parameter puts an upper bound to the number
// of data requests that can be processed by APS simultaneously. If all buffers are
// in use and a new data request appears, it is kept in a queue until a buffer is
// released.
#define CS_APS_DATA_REQ_BUFFERS_AMOUNT 4

// The number of buffers for acknowledgment messages sent by APS.
//
//  This parameter determines the amount of memory that needs to be allocated for a
// special type of buffers used by APS to store payloads for acknowledgment
// frames. The need to use the buffers occurs when the node receives a frame that
// has to be acknowledged. That is, the APS component on the node has to send an
// acknowledgment frame. For frames initiated by the application, the memory for a
// payload is to be allocated by the application on its own, while the payload
// memory for an acknowledgment frame shall be reserved by APS. The request
// parameters are still stored in the data request buffers.
#define CS_APS_ACK_FRAME_BUFFERS_AMOUNT 3

// Amount of buffers on NWK layer used to keep incoming and outgoing frames. This
// parameter affects how many children of a parent are able to get broadcast
// messages.
#define CS_NWK_BUFFERS_AMOUNT 9

// The parameter specifies the TX power of the transceiver device, is measured in
// dBm(s). After the node has entered the network the value can only be changed via
// the ZDO_SetTxPowerReq() function.
//
//  Value range: depends on the hardware. Transmit power must be in the range from
// -17 to 3 dBm for AT86RF231, AT86RF230 and AT86RF230B. For AT86RF233 transmit
// power must be in the range from -17 to 4 dBm. For AT86RF212 transmit power must
// be in the range from -11 to 11 dBm.
#define CS_RF_TX_POWER 3

// Security parameters
//Default ZigBee defined HA Link Key. Must be used if Bitcloud is to communicate with 3rd party products.
#define LINK_KEY {0x5a, 0x69, 0x67, 0x42, 0x65, 0x65, 0x41, 0x6c, 0x6c, 0x69, 0x61, 0x6e, 0x63, 0x65, 0x30, 0x39}
//Randomly generated link key
//#define LINK_KEY {0x23, 0x0C, 0xC4, 0xA8, 0x30, 0x10, 0x48, 0x8E, 0xA5, 0xC3, 0x01, 0x71, 0x3D, 0xB3, 0x4E, 0x4B}

#define MAXIMUM_NETWORK_SIZE 400

#if MAXIMUM_NETWORK_SIZE > 400
#error Network size is larger than ram can handle with other structures
#endif

#endif // _CONFIGURATION_H_

I have taken a bit of stuff out, such as randomly generating the EPID, MAC, channels and bootloader support. These are all set either in the application or in the makefiles for release/debug convenience :)
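
Two comments in the configuration above relate the child limits to each other: the neighbor table size "puts an upper bound" on the number of children, and the number of end device children is CS_MAX_CHILDREN_AMOUNT minus CS_MAX_CHILDREN_ROUTER_AMOUNT. The sketch below just does that arithmetic for the posted values. The helper names are hypothetical, and the assumption that the effective limit is simply the smaller of the two numbers is drawn only from those comments, not from the stack sources.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the configuration file above. */
#define CS_NEIB_TABLE_SIZE            8
#define CS_MAX_CHILDREN_AMOUNT        100
#define CS_MAX_CHILDREN_ROUTER_AMOUNT 50

/* Assumption: the neighbor table caps the number of children,
   so the usable limit is the smaller of the two settings. */
uint16_t effectiveMaxChildren(uint16_t maxChildren, uint16_t neibTableSize)
{
  return (maxChildren < neibTableSize) ? maxChildren : neibTableSize;
}

/* End device slots as described in the comment for
   CS_MAX_CHILDREN_ROUTER_AMOUNT: children minus router children. */
uint16_t maxEndDeviceChildren(uint16_t maxChildren, uint16_t maxRouterChildren)
{
  assert(maxRouterChildren <= maxChildren);
  return (uint16_t)(maxChildren - maxRouterChildren);
}
```

With the posted values the configured child limit is 100 but the neighbor table holds only 8 records, so under this reading the table size is the number that actually bounds the children. Whether that has anything to do with the lost child is not established here.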
 

ataradov
« Reply #10 on: July 01, 2016, 03:20:07 am »
Ok, check that code that starts like this:
Code: [Select]
  PDS_StoreByEvents(BC_ALL_MEMORY_MEM_ID);
  if (PDS_IsAbleToRestore(BC_ALL_MEMORY_MEM_ID) && PDS_Restore(BC_ALL_MEMORY_MEM_ID))
  {
is still present and is executed. And check that restore actually happens.
Alex
 

rfleming (Topic starter)
« Reply #11 on: July 01, 2016, 03:24:50 am »
Well I took my code from WSNDemo instead of the HADevice, but appears to be the same result.

Code: [Select]
//initialise Persistent Data Storage to ensure all network information is kept over restarts
// Configure PDS to store parameters in non-volatile memory and update them
// on occurrence of BitCloud events
PDS_StoreByEvents(BC_ALL_MEMORY_MEM_ID);

//Restore required configuration from non-volatile memory
if (PDS_IsAbleToRestore(BC_ALL_MEMORY_MEM_ID))
{
  restoredFromPds = true;
  PDS_Restore(BC_ALL_MEMORY_MEM_ID);
}

Debugging shows that it is indeed executed.
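
One way to make "check that restore actually happens" concrete is to keep the result of the restore call instead of discarding it. The sketch below is an illustration under assumptions: RestoreOutcome_t, classifyRestore and restoreOutcomeName are hypothetical helpers, and the only thing taken from the thread is that PDS_IsAbleToRestore() and PDS_Restore() can both be used as boolean conditions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-way result, so "nothing was stored" and
   "restore was attempted but failed" are not lumped together. */
typedef enum
{
  RESTORE_NOTHING_STORED,
  RESTORE_FAILED,
  RESTORE_OK
} RestoreOutcome_t;

/* On target the inputs would come from the PDS calls, for example:
 *   bool able = PDS_IsAbleToRestore(BC_ALL_MEMORY_MEM_ID);
 *   bool ok   = able && PDS_Restore(BC_ALL_MEMORY_MEM_ID);
 * (the && keeps PDS_Restore from running when nothing can be restored). */
RestoreOutcome_t classifyRestore(bool ableToRestore, bool restoreSucceeded)
{
  if (!ableToRestore)
    return RESTORE_NOTHING_STORED;
  return restoreSucceeded ? RESTORE_OK : RESTORE_FAILED;
}

/* Short label for printing the outcome on a debug console. */
const char *restoreOutcomeName(RestoreOutcome_t outcome)
{
  switch (outcome)
  {
    case RESTORE_NOTHING_STORED: return "nothing stored";
    case RESTORE_FAILED:         return "failed";
    case RESTORE_OK:             return "ok";
  }
  assert(!"unknown restore outcome");
  return "?";
}
```

With this shape, a flag like restoredFromPds would only be set for RESTORE_OK, rather than as soon as PDS_IsAbleToRestore() returns true.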
 

ataradov
« Reply #12 on: July 01, 2016, 03:26:42 am »
Then I would start querying various tables to see if information is actually restored. Start from the neighbor table. The ED must be present there as an authenticated child.
Alex
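
The check suggested above (is the ED in the neighbor table, and is it an authenticated child?) can be sketched independently of the stack. Everything here is a stand-in: NeighborEntry_t and Relationship_t are hypothetical types rather than BitCloud's own table entry, so on a real device the fields would have to be filled from whatever the stack's neighbor table query returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relationship values for a neighbor table record. */
typedef enum
{
  REL_PARENT,
  REL_CHILD,
  REL_SIBLING,
  REL_NONE
} Relationship_t;

/* Hypothetical, simplified neighbor table record. */
typedef struct
{
  uint16_t       shortAddr;
  Relationship_t relationship;
  bool           authenticated;
} NeighborEntry_t;

/* Linear search by short address; returns NULL when the address is absent. */
const NeighborEntry_t *findNeighbor(const NeighborEntry_t *table, size_t count,
                                    uint16_t shortAddr)
{
  assert(table != NULL || count == 0);
  for (size_t i = 0; i < count; i++)
  {
    if (table[i].shortAddr == shortAddr)
      return &table[i];
  }
  return NULL;
}

/* True only if the address is present, marked as a child, and authenticated. */
bool isAuthenticatedChild(const NeighborEntry_t *table, size_t count,
                          uint16_t shortAddr)
{
  const NeighborEntry_t *entry = findNeighbor(table, count, shortAddr);
  return entry != NULL && entry->relationship == REL_CHILD && entry->authenticated;
}
```

Running the same check before and after the coordinator reset would show whether the record disappears entirely or merely changes its relationship.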
 

rfleming (Topic starter)
« Reply #13 on: July 01, 2016, 03:33:19 am »
After Coord reboot, the device ID is neither in my Neighbour table nor my child table :(
 

ataradov
« Reply #14 on: July 01, 2016, 03:39:08 am »
Read the device flash memory after it worked for a while (after reboot). At the address 0x400 there will be an area of binary data that starts with "AT". Check that it is actually there, since this area is the persistent storage.

Also, try to disable wear leveling.
Alex
 

rfleming (Topic starter)
« Reply #15 on: July 04, 2016, 01:51:00 am »
Sorry for the late reply, I got taken out of the office late Friday.

Quote from: ataradov
Read the device flash memory after it worked for a while (after reboot). At the address 0x400 there will be an area of binary data that starts with "AT". Check that it is actually there, since this area is the persistent storage.

At 0x4000 I have in hex "00004154534E7631FEFFFFFF01000000", which translates to "\0\0ATSNv1þÿÿÿ"
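
The hex-to-text translation above is easy to double-check offline. The following is a small sketch with hypothetical helper names (hexNibble, hexToBytes, findSignature); it only decodes a hex string copied from a flash readout and searches it for an ASCII marker, and says nothing about how PDS lays out its data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
int hexNibble(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Decode a hex string into bytes. Returns the number of bytes written,
   or 0 if the string is malformed or does not fit into the output buffer. */
size_t hexToBytes(const char *hex, uint8_t *out, size_t outSize)
{
  assert(hex != NULL && out != NULL);
  size_t hexLen = strlen(hex);
  if (hexLen % 2 != 0 || hexLen / 2 > outSize)
    return 0;
  for (size_t i = 0; i < hexLen / 2; i++)
  {
    int hi = hexNibble(hex[2 * i]);
    int lo = hexNibble(hex[2 * i + 1]);
    if (hi < 0 || lo < 0)
      return 0;
    out[i] = (uint8_t)((hi << 4) | lo);
  }
  return hexLen / 2;
}

/* Offset of the first occurrence of an ASCII signature, or -1 if absent. */
long findSignature(const uint8_t *buf, size_t len, const char *sig)
{
  assert(buf != NULL && sig != NULL);
  size_t sigLen = strlen(sig);
  if (sigLen == 0 || sigLen > len)
    return -1;
  for (size_t i = 0; i + sigLen <= len; i++)
  {
    if (memcmp(buf + i, sig, sigLen) == 0)
      return (long)i;
  }
  return -1;
}
```

For the 16 bytes quoted above, the bytes 41 54 53 4E 76 31 spell "ATSNv1" starting at offset 2, after the two leading zero bytes.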

After a readout, I found my node ID in flash in only 2 possible PDS places: at addresses 0x0c30 and 0x0D80.

Quote from: ataradov
Also, try to disable wear leveling.

Disabled, same result.
Code: [Select]
#define PDS_ENABLE_WEAR_LEVELING 0

Didn't realise this setting used eeprom too. Much slower!
 

ataradov
« Reply #16 on: July 04, 2016, 02:13:44 am »
ATSNv1 is a good signature, this means that saving things probably works.

Try to perform PDS_FlushData() before doing warm reset.

Quote from: rfleming
Didn't realise this setting used eeprom too. Much slower!

PDS always uses part of the main Flash array to store data, not EEPROM. This is for historical reasons - the first device in this series (ATmega128RFA1) had a horribly low EEPROM write-erase cycle limit due to some technological difficulties of putting EEPROM and RF on the same die. Things have improved a lot since then, but PDS never changed.

WL implementation is completely different from the standard PDS, so they have incompatible database format, and WL is much later addition, so I recommend making things work without WL and then switching to WL if needed.
Alex
 

rfleming (Topic starter)
« Reply #17 on: July 04, 2016, 03:18:39 am »
Tried PDS_FlushData(), assuming I am meant to pass the PDS_ALL_EXISTENT_MEMORY definition. Didn't work, same result :(

Code: [Select]
PDS_FlushData(PDS_ALL_EXISTENT_MEMORY);
HAL_WarmReset();

The reason I say it requires eeprom is that in the HAL makefiles I set eeprom to false, since I wanted to use the eeprom myself to store various things that could potentially use 6KB. As such, with this disabled the compiler didn't like not having definitions for HAL_WriteEeprom and various other eeprom related commands.

EDIT:
Code: [Select]
PDS_BlockingStore(PDS_ALL_EXISTENT_MEMORY);
HAL_WarmReset();
Had the same result. As BlockingStore has superseded FlushData.
« Last Edit: July 04, 2016, 03:22:50 am by rfleming »
 

ataradov
« Reply #18 on: July 04, 2016, 03:42:55 am »
I don't really know what may be wrong here.

So you can reproduce this on a stock application with no changes to application, stack, HAL, or Makefiles?

Quote from: rfleming
As such, with this disabled the compiler didn't like not having definitions for HAL_WriteEeprom and various other eeprom related commands.

EEPROM is needed for OTA, since it is used by the application and bootloader for communicating new image availability and some other info. I believe that the first 4 bytes are reserved for this purpose.

Quote from: rfleming
As BlockingStore has superseded FlushData.

They do the exact same thing.
Alex
 

rfleming (Topic starter)
« Reply #19 on: July 04, 2016, 05:14:23 am »
Thanks Alex.

Testing just now with fresh Bitcloud 3.3.0 package. I did the following to repeat it (some steps probably didn't need to be mentioned, just trying to be thorough)

Unzip fresh firmware at: C:\BitCloud_MEGARF_3_3_0\
Open: C:\BitCloud_MEGARF_3_3_0\Applications\HADevice\atmelStudio_projects\Atmega256rfr2.cproj with AS6.2

Set AS6.2 Configuration to All_Stdlink....8Mhz

Set my fuses on my Atmega 256rfr2 RCB Xplained PRO (http://www.atmel.com/tools/ATRCB256RFR2-XPRO.aspx) to
Ext: 0xFE
High: 0x91
Low: 0xE2

In configuration.h, change wear leveling to:
   
Code: [Select]
#define PDS_ENABLE_WEAR_LEVELING 0
At this point I get a compile error, "Error   2   'NWK_SECURITY_COUNTERS_ITEM_ID' undeclared (first use in this function)". By inspection:
In wlPdsMemIds.h:
   
Code: [Select]
#define NWK_SECURITY_COUNTERS_MEM_ID          NWK_SECURITY_COUNTERS_ITEM_ID
In stdPdsMemIds.h, manually add the definition:
   
Code: [Select]
#define NWK_SECURITY_COUNTERS_ITEM_ID NWK_SECURITY_COUNTERS_MEM_ID
In zclDevice.c initApp(): Force the extAddress to be 0x1234567890 to ensure it works correctly with my semi-unsupported board.
Code: [Select]
  CS_ReadParameter(CS_UID_ID,&extAddr);
  if (extAddr == 0 || extAddr > APS_MAX_UNICAST_EXT_ADDRESS)
  {
    //BSP_ReadUid(&extAddr); //Will read the UID from chip
    extAddr = 0x1234567890;
    CS_WriteParameter(CS_UID_ID, &extAddr); //Writes the read UID to the ram
  }

Added a command to the list so I don't have to fix up the ZAPPSI definitions with "reset"
   
Code: [Select]
{"restart", "", processRestartCmd, "->Restart device\r\n"},
and the restart command is defined as:
Code: [Select]
static void processRestartCmd(const ScanValue_t *args)
{
  PDS_FlushData(PDS_ALL_EXISTENT_MEMORY);
  HAL_WarmReset();
}

I know my short ID by looking at my sniffer.
I then ran the following commands in order:
setPermitJoin 50<lf>
identify -s 0x371e 1 40<lf>
restart<lf>
identify -s 0x371e 1 40<lf>

The first identify works; the second triggers a route discovery.
I hope you can see what is going on from this! Keep me posted.

Cheers once again,
Ryan.
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11236
  • Country: us
    • Personal site
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #20 on: July 04, 2016, 05:25:43 am »
I'll try this myself, but it will take a couple of days - it is 4th of July, so public holiday in the US.

In the meantime, can you dump the child table and neighbor table before the reset? I'm interested in the relationship field for that ED.
Alex
 

Offline rflemingTopic starter

  • Regular Contributor
  • *
  • Posts: 73
  • Country: au
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #21 on: July 04, 2016, 06:12:31 am »
Independence day, roar. Get your partay on :popcorn:!!!!!!

They're having pub deals at some of the bars in Melbourne for US Independence day lol!

No stress mate, will reconvene in the next few days.

Isn't the child table a subset of the Neighbor table?

Nonetheless, I decided to put it all in an excel spreadsheet because it was a bit more readable. I hope you can follow what I have got in there, purely just copies of the variables from my watch list.

For reading out Neighbor list:
Code: [Select]
ZDO_Neib_t NeighborTable[CS_NEIB_TABLE_SIZE]; //Neighbour list
ZDO_GetNeibTable(NeighborTable);

Child List:
Code: [Select]
volatile NodeAddr_t childAddrTable[CS_MAX_CHILDREN_AMOUNT - CS_MAX_CHILDREN_ROUTER_AMOUNT];
volatile ZDO_GetChildrenAddr_t children =
{
  .childrenCount = CS_MAX_CHILDREN_AMOUNT - CS_MAX_CHILDREN_ROUTER_AMOUNT,
  .childrenTable = childAddrTable,
};
ZDO_GetChildrenAddr(&children);
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11236
  • Country: us
    • Personal site
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #22 on: July 04, 2016, 06:35:32 am »
Yeah, children table is a subset.

In this case I assume that only the first record is valid. But then, this record indicates that the device is on channel 0x0d, and the default channel is 0x0f, so the modifications described above are not complete.
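
To make the subset relationship concrete, here is a minimal sketch that derives a child list from a neighbor table by filtering on the relationship field. The types and names (Relationship, NeighborEntry, collectChildren) are hypothetical stand-ins for illustration only and are not the BitCloud ZDO_Neib_t layout or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relationship values; the real stack defines its own. */
typedef enum
{
  REL_PARENT,
  REL_CHILD,
  REL_SIBLING,
  REL_NONE
} Relationship;

/* Hypothetical, simplified neighbor record (not the BitCloud ZDO_Neib_t). */
typedef struct
{
  uint16_t     shortAddr;
  uint64_t     extAddr;
  Relationship relationship;
} NeighborEntry;

/* Copies the short addresses of all child entries into 'out'.
   Returns the number of addresses written, which never exceeds outSize. */
static size_t collectChildren(const NeighborEntry *table, size_t tableSize,
                              uint16_t *out, size_t outSize)
{
  size_t count = 0;

  assert(table != NULL && out != NULL);

  for (size_t i = 0; i < tableSize && count < outSize; i++)
  {
    /* Only entries marked as children belong to the child list. */
    if (REL_CHILD == table[i].relationship)
      out[count++] = table[i].shortAddr;
  }
  return count;
}
```

In this model, an end device that is missing from the filtered list after a restart either was not restored into the neighbor table at all or was restored with a relationship other than child.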
Alex
 

Offline rflemingTopic starter

  • Regular Contributor
  • *
  • Posts: 73
  • Country: au
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #23 on: July 04, 2016, 07:12:36 am »
In this case I assume that only the first record is valid. But then, this record indicates that the device is on channel 0x0d, and the default channel is 0x0f, so the modifications described above are not complete.

I actually got the neighbor list and child table from my custom app because the functionality was already present and I forgot to memset my local variable before reading.

Please see the new attached file. It uses the app described above, with an added command:
Code: [Select]
{"getTable", "", processGetTableCmd, "->Get Neighbor Table and Child table locally\r\n"},
Code: [Select]
static ZDO_Neib_t NeighborTable[CS_NEIB_TABLE_SIZE];

static NodeAddr_t childAddrTable[CS_MAX_CHILDREN_AMOUNT - CS_MAX_CHILDREN_ROUTER_AMOUNT];
static ZDO_GetChildrenAddr_t children =
{
  .childrenCount = CS_MAX_CHILDREN_AMOUNT - CS_MAX_CHILDREN_ROUTER_AMOUNT,
  .childrenTable = childAddrTable,
};


static void processGetTableCmd(const ScanValue_t *args)
{
  /* Clear both buffers first so stale data is not mistaken for real entries */
  memset(NeighborTable, 0, sizeof(NeighborTable));
  memset(childAddrTable, 0, sizeof(childAddrTable));

  ZDO_GetChildrenAddr(&children);
  ZDO_GetNeibTable(NeighborTable);
  volatile int x = 5;
  x++;      //break on me. Useful way for me to guarantee break in code.
}
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11236
  • Country: us
    • Personal site
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #24 on: July 04, 2016, 07:19:55 am »
Ok, but the first entry in the first file must have been invalid as well then? Because it had wrong channel and ExtPanId.

To have a place for a guaranteed breakpoint I use 'asm("nop");'.

I'll try to reproduce this when I'm back in the office.
Alex
 

Offline rflemingTopic starter

  • Regular Contributor
  • *
  • Posts: 73
  • Country: au
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #25 on: July 04, 2016, 07:49:04 am »
Ok, but the first entry in the first file must have been invalid as well then? Because it had wrong channel and ExtPanId.
Of the old file based on my own implemented ZigBee app?

The first entry in the table was the only correct one. I changed the channel to 13 since it is quite a clear network according to my sniffer. I also assigned the EPID to be the same as a coordinator MAC.

To have a place for a guaranteed breakpoint I use 'asm("nop");'.

I'd have to move my little finger too much on the shift key to use that line. But certainly something better to break on!

No worries, eagerly awaiting your reply :)
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11236
  • Country: us
    • Personal site
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #26 on: July 05, 2016, 11:38:29 pm »
Here is an update. I did not have a lot of time to work on this. My build for ATmega256RFR2 seems to be broken (I think it is an AS7 issue), so I used SAM R21, but logically they should be the same.

I also could not get HADemo to run (or react to serial input, anyway) for some unknown reason.

I used WSNDemo and it works properly by default. I suggest you try that on your hardware first.

On closer examination it looks like there is a parameter clearNeighborTable in the joinControl structure. It should be set properly by the default HADemo application as well, but it may be worth checking that it is actually set correctly.

I'll try to make HADemo work on R21, but I won't mess with my AS7 versions.

If nothing else helps, create myAtmel ticket, they may actually know about some issues and have a fix already.
Alex
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11236
  • Country: us
    • Personal site
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #27 on: July 06, 2016, 12:01:07 am »
I can reproduce this on SAM R21 and HADemo. I'll look into this further, but you can basically assume that it is some high-level logic problem, not PDS.

We just need to figure out what is the difference between WSNDemo and HADemo, since the first one works for me.
Alex
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11236
  • Country: us
    • Personal site
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #28 on: July 06, 2016, 12:15:30 am »
Well, that was easy.

In the file zclDevice.c in function setDeviceJoinParameters() replace this line
Code: [Select]
  else if (DEVICE_TYPE_ROUTER == deviceType)
with
Code: [Select]
  else if (DEVICE_TYPE_ROUTER == deviceType || DEVICE_TYPE_COORDINATOR == deviceType)
and things will work as expected.

You can enable WL back, it does not make any difference.
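
A rough sketch of why the one-line change matters, under the assumption that this branch is where the join control is told to keep the neighbor table (the clearNeighborTable parameter mentioned earlier). The enum, the JoinControl struct, and configureJoinControl() below are simplified local stand-ins for illustration, not the actual BitCloud definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the stack's device type enumeration. */
typedef enum
{
  DEVICE_TYPE_COORDINATOR,
  DEVICE_TYPE_ROUTER,
  DEVICE_TYPE_END_DEVICE
} DeviceType;

/* Hypothetical, reduced join control: only the field of interest. */
typedef struct
{
  bool clearNeighborTable;
} JoinControl;

/* Decides whether the neighbor table (and with it the child entries
   restored from persistent storage) survives the network start. */
static void configureJoinControl(DeviceType deviceType, JoinControl *joinControl)
{
  assert(joinControl != NULL);

  if (DEVICE_TYPE_ROUTER == deviceType || DEVICE_TYPE_COORDINATOR == deviceType)
    joinControl->clearNeighborTable = false;  /* parents keep their children */
  else
    joinControl->clearNeighborTable = true;   /* end devices start clean */
}
```

In this model, the original router-only condition would send a coordinator down the branch that clears the table, which would match the lost-child symptom after a restart.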
Alex
 
The following users thanked this post: rfleming

Offline rflemingTopic starter

  • Regular Contributor
  • *
  • Posts: 73
  • Country: au
Re: Bitcloud - ZigBee End Device lost after parent power cycle
« Reply #29 on: July 06, 2016, 04:30:23 am »
That was the issue! I removed this code in the past because it was only implemented on a router not my coordinator :)

Lex, da man!

While I've got you on the topic of End Devices, I have one other small issue :)

When I do a request for my end device to leave when it is attached to a router, it sends a NWK "Leave" message rather than a ZDO Leave response, this is fine. Though because I receive an Extended address through the Leave event, I need a short address to remove it from my locally held network list. I can't query for the Short address anymore because it has already been removed. Do you know any better way to handle this to get the short address through the above procedure?
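
One possible workaround, sketched here purely as an assumption about how the locally held list could be organised: record the extended-to-short pairing when a device joins, then look it up by extended address when the Leave event arrives. All names below (AddrMapEntry, addrMapStore, addrMapRemove, ADDR_MAP_SIZE) are hypothetical and not part of BitCloud.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_MAP_SIZE 8  /* hypothetical capacity of the local list */

typedef struct
{
  bool     used;
  uint64_t extAddr;
  uint16_t shortAddr;
} AddrMapEntry;

static AddrMapEntry addrMap[ADDR_MAP_SIZE];

/* Records or updates a pairing, e.g. when a device joins.
   Returns false if the map is full. */
static bool addrMapStore(uint64_t extAddr, uint16_t shortAddr)
{
  AddrMapEntry *freeSlot = NULL;

  for (size_t i = 0; i < ADDR_MAP_SIZE; i++)
  {
    if (addrMap[i].used && addrMap[i].extAddr == extAddr)
    {
      addrMap[i].shortAddr = shortAddr;  /* known device: refresh address */
      return true;
    }
    if (!addrMap[i].used && NULL == freeSlot)
      freeSlot = &addrMap[i];            /* remember the first free slot */
  }

  if (NULL == freeSlot)
    return false;

  freeSlot->used = true;
  freeSlot->extAddr = extAddr;
  freeSlot->shortAddr = shortAddr;
  return true;
}

/* Removes the entry for extAddr, e.g. from the Leave event handler.
   Returns true and writes the short address if the device was known. */
static bool addrMapRemove(uint64_t extAddr, uint16_t *shortAddr)
{
  assert(shortAddr != NULL);

  for (size_t i = 0; i < ADDR_MAP_SIZE; i++)
  {
    if (addrMap[i].used && addrMap[i].extAddr == extAddr)
    {
      *shortAddr = addrMap[i].shortAddr;
      addrMap[i].used = false;
      return true;
    }
  }
  return false;
}
```

Because the lookup never queries the stack, it keeps working after the stack has already dropped its own entry for the device.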
« Last Edit: July 06, 2016, 04:38:01 am by rfleming »
 

