Netduino Plus Reliability
#1
Posted 11 May 2012 - 04:16 PM
#2
Posted 11 May 2012 - 04:27 PM
#3
Posted 11 May 2012 - 04:39 PM
Could you explain better what was the trouble?
Mario, thank you for the reply.
I am not sure what more I can say. The N+ board is programmed to act as a client to a web server. It is supposed to do a "POST" transaction every few minutes. After several months of operation, this one machine simply stopped making connections to the web server, although all other functionality apparently remained. There was nothing special happening at the time. The last successful message was sent at 1am, and the customer reports that the machine was not disturbed for several hours before or after. The same goes for the server: it was not disturbed before or after the failure. Other customers' N+ products continue to operate.
I had the customer return the failing unit to me because I was concerned that his network was the cause of the trouble, but it failed the same way at my location. Functionality was also restored by swapping in a fresh N+ board. Since erasing everything on the bad board and reinstalling everything restored it to working order, I think the failure was not due to damaged hardware.
#4
Posted 11 May 2012 - 04:44 PM
The only thing I could think of is that the MAC address got reset to 0. If that were the case, the board would be unable to use the network until you restored it. Does this sound plausible?
My .NETMF projects: .NETMF Toolbox / Gadgeteer Light / Some PCB designs
#5
Posted 11 May 2012 - 05:06 PM
#6
Posted 11 May 2012 - 05:31 PM
Hi Robert,
Very interesting. So the hardware appears to be fine, but you had to erase and reflash it to get it back up and working?
If that's the case, then the most likely scenario is that the network settings somehow got changed (MAC address as Stefan mentioned, or IP address). Erasing and reflashing the board and redeploying the app shouldn't change anything other than your network config.
Also, I just want to confirm...after erasing and reflashing, the hardware seems to be working fine?
Chris
Yes, after erasing and reflashing, the hardware seems to be working fine. After I did the reflashing, I sort of "kicked" myself for not checking the MAC address before doing so. The IP in this case is obtained dynamically, so I doubt that was the trouble. But the MAC could have been.
#7
Posted 11 May 2012 - 05:34 PM
#9
Posted 11 May 2012 - 06:22 PM
#10
Posted 12 May 2012 - 01:16 AM
NetworkInterface.PhysicalAddress
I cannot say for sure that the MAC address was wiped out in my case, but it seems a likely cause. To make sure I can recover, should that happen (again), it would seem this code would let my machine recover:
byte[] MACaddress = { 0x5C, 0x86, 0x4A, 0x00, 0x28, 0x29 };
IF = Microsoft.SPOT.Net.NetworkInformation.NetworkInterface.GetAllNetworkInterfaces();
IF[0].PhysicalAddress = MACaddress;    // make sure MAC address is set
IF[0].EnableStaticIP("192.168.5.100", "255.255.255.0", "192.168.5.1");
IF[0].EnableDhcp();                    // make sure DHCP is enabled
When I tried setting the MAC without setting the (static/initial) IP address, the IP address was wiped out, and DHCP failed.
This code also has the advantage that there is no need to set the network configuration in MFdeploy after flashing in ER_CONFIG, ER_FLASH, which I find easier to set in the source code. YMMV.
#11
Posted 12 May 2012 - 03:35 AM
#12
Posted 12 May 2012 - 10:59 AM
#13
Posted 12 May 2012 - 01:05 PM
Hi Robert,
If you're going to set the MAC address and/or IP address from code, I would recommend first _checking_ the current MAC/IP settings and only rewriting them if they've changed.
Due to the implementation of config in NETMF (single-sector, possible chaining of config settings), you could theoretically run out of space and/or rewrite cycles...
Chris
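As a rough illustration of the "check first, write only if different" advice above: in C#, `==` and `!=` on two `byte[]` values compare references rather than contents, so a MAC check needs a byte-by-byte comparison. The sketch below is only a helper under that assumption; the class and method names (`MacCheck`, `SameBytes`) are made up for this example and are not part of NETMF.

```csharp
// Hypothetical helper, not a NETMF API: compares two byte arrays by content.
public static class MacCheck
{
    // True when both arrays are non-null and hold the same bytes in the same order.
    public static bool SameBytes(byte[] a, byte[] b)
    {
        if (a == null || b == null) return false;
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }
}

// Intended use on the device (shown as a comment, since it needs the board):
// if (!MacCheck.SameBytes(IF[0].PhysicalAddress, Global.MACaddress))
//     IF[0].PhysicalAddress = Global.MACaddress;   // write only when it differs
```

With a guard like this, the config area should only be touched when the stored MAC really differs from the expected one.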
Hmmm, would that include the flag to enable DHCP? I notice that there is a check box for enabling DHCP on the network configuration page in MFdeploy. Whenever I reboot, either due to a power cycle or my watchdog, I issue this call:
IF[0].EnableDhcp();
Would that call cause a rewriting of the config area, possibly running out of space or write/rewrite cycles? If so, perhaps that is the cause of the loss of the MAC address that I experienced? I was under the (apparently mistaken) impression that changing the network configuration values changed a working copy rather than the values stored in FLASH. I would guess that this call to EnableDhcp() occurred 500-1000 times before the failure.
Here is the adjusted code that checks to see if updating the configuration is necessary before doing so:
if ((IF[0].PhysicalAddress != Global.MACaddress) ||
    (IF[0].IPAddress != "192.168.5.100") ||
    (IF[0].GatewayAddress != "192.168.5.1") ||
    (IF[0].SubnetMask != "255.255.255.0") ||
    (IF[0].IsDhcpEnabled != true))
{
    IF[0].PhysicalAddress = Global.MACaddress;    // make sure MAC address is set
    IF[0].EnableStaticIP("192.168.5.100", "255.255.255.0", "192.168.5.1");
    IF[0].EnableDhcp();                           // make sure DHCP is enabled
}
#14
Posted 12 May 2012 - 03:22 PM
Hmm, I find the above code does not work as desired. It seems that the values returned by the DHCP server are recorded in the FLASH, and survive a power cycle or reboot. So anytime the DHCP server provides an address change, the network configuration area will be rewritten (apparently).
I wonder if that is a good idea if what Chris says is correct about limitations on writing/rewriting to the network configuration values.
If I want to have the application code determine if a re-write is needed based on the IPAddress, GatewayAddress, or SubnetMask, the best I will be able to do is some kind of sanity check on these values, like this:
if ((MAC(IF[0].PhysicalAddress) != MAC(Global.MACaddress)) ||
    (IF[0].IPAddress == "0.0.0.0") ||
    (IF[0].GatewayAddress == "0.0.0.0") ||
    (IF[0].IsDhcpEnabled != true))
{
    IF[0].PhysicalAddress = Global.MACaddress;    // make sure MAC address is set
    IF[0].EnableStaticIP("192.168.5.100", "255.255.255.0", "192.168.5.1");
    IF[0].EnableDhcp();                           // make sure DHCP is enabled
}

// Format a 6-byte MAC as "XX:XX:XX:XX:XX:XX" so two MACs can be compared as strings
public string MAC(byte[] mac)
{
    int i;
    string str = "";
    for (i = 0; i < 6; i++)
    {
        str += mac[i].ToString("X2");
        if (i != 5)
            str += ":";
    }
    return str;
}
#15
Posted 14 May 2012 - 12:44 PM
After completely erasing the chip and reinstalling everything, the MAC was able to be changed once again. For this test, I did not update the Network Configuration using MFDeploy.
My conclusions at this point are:
1) Changing network values through PhysicalAddress, EnableStaticIP(), or EnableDhcp() does cause the values to be written into the Network Configuration portion of FLASH, even if the values being written match the values already there.
2) Repeated updating of the Network Configuration portion of FLASH will cause the device to be unable to accept further updates.
3) It seems that once the Network Configuration portion of FLASH is "full", the MAC is likely to revert to a value of "00-00-00-00-00-01". Other values may also be possible; further examples are needed to determine this.
4) Some networks will let the N+ operate with a MAC of "00-00-00-00-00-01", assuming it is the only device having that address, while others may not allow it to operate with that MAC.
5) Network Configuration values that are set by DHCP do get written into the Network Configuration portion of FLASH. I am not sure if they are written every time, or just when the values change. If written every time a lease is renewed, that could result in the N+ MAC being reset.
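Following on from conclusions 3 and 4, one option is for the application to notice at boot that the MAC looks like it has reverted, so it can at least log or signal the condition. The sketch below only encodes the two values mentioned in this thread (all zeros, and "00-00-00-00-00-01"); the names `MacSanity` and `LooksReset` are invented for the example, and other reverted values may exist that it would miss.

```csharp
// Hypothetical helper, not a NETMF API: flags a MAC that looks like it has reverted.
public static class MacSanity
{
    // True for a missing or wrong-length MAC, for all zeros,
    // and for 00-00-00-00-00-01 (the value seen in this thread).
    public static bool LooksReset(byte[] mac)
    {
        if (mac == null || mac.Length != 6) return true;
        for (int i = 0; i < 5; i++)
        {
            if (mac[i] != 0) return false;   // any non-zero leading byte: looks real
        }
        return mac[5] == 0 || mac[5] == 1;
    }
}

// Intended use on the device (shown as a comment, since it needs the board):
// if (MacSanity.LooksReset(IF[0].PhysicalAddress)) { /* log it, blink an LED, ... */ }
```

Note that, per conclusion 2, writing the MAC back may not succeed once the config area is full, so a check like this is mainly useful for detecting the condition rather than repairing it.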
Recommendations:
I think the documentation of certain functions should include a warning about the limitations of updating the Network Configuration portion of FLASH.
I worry that using DHCP will eventually fill up the Network Configuration portion of FLASH. If the FLASH is rewritten every time a lease is renewed, the FLASH will likely fill up over the course of months. If rewritten only when values change, the FLASH will likely fill up over the course of years. It would be better if DHCP values did not update the Network Configuration portion of FLASH.
#16
Posted 25 May 2012 - 12:17 PM
I have been tracking one of my customers' units since posting in this thread. I do see that his DHCP server is occasionally assigning different IP addresses to the N+ after restarts. From previous tests, I know that dynamic addresses are written into the FLASH network configuration area. From Chris we learn that this area has some limitations on the number of updates that can be done. Now if we are talking about the (likely) 100,000 rewrite cycles that FLASH typically supports, that will not be a problem for me, as I am seeing fewer than 4 watchdog restarts per day. At 4 rewrites per day, 100,000 rewrites gives a product life of about 68 years. However, if there is some other limitation, for example if writing net configuration values does not overwrite existing data but instead begins to fill up some fixed-size table, then the DHCP updates could mean a much shorter life for the board between "factory" erases.
Does anyone know how many DHCP address changes the N+ can absorb before it stops working?
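To make the arithmetic above explicit, here is a small sketch of the lifetime estimate. The 100,000-cycle figure and the 4 writes per day come from the post; the 1,000-write figure in the second case is only the upper end of the "500-1000" EnableDhcp() calls estimated earlier in the thread, used here as a stand-in for a fixed-size table, not a measured capacity. The class name `FlashLifetime` is made up for the example.

```csharp
// Hypothetical helper for the lifetime estimate discussed above.
public static class FlashLifetime
{
    // Years until totalWrites are used up at a steady writesPerDay.
    public static double Years(double totalWrites, double writesPerDay)
    {
        return totalWrites / writesPerDay / 365.0;
    }
}

// FlashLifetime.Years(100000, 4)  is about 68.5 years (rewrite-cycle limit)
// FlashLifetime.Years(1000, 4)    is about 0.68 years, roughly 8 months
//                                 (if space for ~1000 writes were the limit)
```

The second case is at least in the same range as the "several months of operation" before the original failure, though that is only a consistency check and not proof of the cause.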
#17
Posted 26 May 2012 - 01:55 AM
#18
Posted 26 May 2012 - 02:28 PM
The config sector can be chained (i.e. new settings written one after another), so the rewrite cycles aren't the issue...it's running out of space that is the potential issue.
So the chaining rather than rewriting is a deliberate longevity strategy?
If so, a more sophisticated strategy would rewrite when there was no more room for chaining.
#19
Posted 26 May 2012 - 04:32 PM
I believe that the strategy relates to how flash works...
To write new data in an empty section of flash, you can simply write the data. To write new data over old data you have to read all the data you want to keep out of a sector, erase the entire sector, and then write everything back you want (including the new data). Or have two sectors and then switch which one is the "current" sector by erasing the unused one.
Since the config sector is only one sector, and NETMF doesn't want to create a big buffer to read in all that data, and because NETMF doesn't want to have such an important sector get corrupted by a power outage or code failure in the middle of writing it...chaining may have been implemented. By writing new data after old data and looking for the newest data by browsing the whole sector, enhanced reliability and lower memory consumption are achieved.
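The chaining idea described above can be modelled in a few lines. This is a toy model only, not the NETMF implementation: the record layout (one length byte followed by the payload), the class name `ChainedSector`, and the sector size are all invented for illustration. It shows the two properties being discussed: the newest record is found by walking the sector from the start, and once the sector is full nothing more can be written until it is erased.

```csharp
using System;

// Toy model of an append-only ("chained") config sector. Hypothetical layout.
public class ChainedSector
{
    private const byte Erased = 0xFF;   // erased flash reads back as 0xFF
    private readonly byte[] flash;      // stands in for one flash sector
    private int writePos;               // index of the first unwritten byte

    public ChainedSector(int sizeInBytes)
    {
        flash = new byte[sizeInBytes];
        Erase();
    }

    // Erasing the whole sector is the only way to reclaim space.
    public void Erase()
    {
        for (int i = 0; i < flash.Length; i++) flash[i] = Erased;
        writePos = 0;
    }

    // Append a record as [length][payload]. Returns false when it does not fit.
    public bool Append(byte[] payload)
    {
        // A length byte of 0xFF would be indistinguishable from erased flash.
        if (payload.Length == 0 || payload.Length >= Erased) return false;
        if (writePos + 1 + payload.Length > flash.Length) return false;
        flash[writePos] = (byte)payload.Length;
        Array.Copy(payload, 0, flash, writePos + 1, payload.Length);
        writePos += 1 + payload.Length;
        return true;
    }

    // Walk the chain from the start; the last record found is the current one.
    public byte[] ReadLatest()
    {
        byte[] latest = null;
        int pos = 0;
        while (pos < flash.Length && flash[pos] != Erased)
        {
            int len = flash[pos];
            latest = new byte[len];
            Array.Copy(flash, pos + 1, latest, 0, len);
            pos += 1 + len;
        }
        return latest;
    }
}
```

In this model a 21-byte sector holds exactly three 6-byte MAC records (7 bytes each); a fourth `Append` returns false until `Erase` is called, which mirrors the "erase and reflash" recovery described earlier in the thread.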
The config sector was really designed, way-back-when, as something that would rarely if ever be changed. If NETMF writes to it frequently, then that's an inconsistent design that we need to figure out how to fix in a contribution back to the core...
We can analyze a lot of these things more easily on the new Netduino Go hardware, where we can interactively browse the Flash while running NETMF on the board (using the JTAG connector). I'm really looking forward to getting the first hand-built samples of the new Ethernet module for Netduino Go, and digging into the details of the config sector usage.
Chris
#20
Posted 27 May 2012 - 01:44 PM