The Netduino forums have been replaced by new forums at community.wildernesslabs.co. This site has been preserved for archival purposes only and the ability to make new accounts or posts has been turned off.

Socket Exception 10050 after an hour


18 replies to this topic

#1 AB0TJ

AB0TJ

    New Member

  • Members
  • Pip
  • 7 posts

Posted 29 June 2012 - 07:22 PM

I have a Netduino Plus application that requires a constant connection to another server. It runs fine for about an hour and then all of a sudden the socket dies, with error code 10050. It doesn't have any trouble making a new socket and reconnecting. Google searches seem to indicate this error equates to "dead network". I am running firmware 4.2 RC5 and getting an IP address from DHCP. I have other boxes (Linux based) that have no trouble keeping a connection to the same server alive for days on end. So, some questions:

- Has anyone else encountered this issue?
- Any ideas why this is happening?
- Does the N+ renew its DHCP lease automatically, or does it have to be done in software? (My router hands out 2 hour leases, so this didn't seem to be the issue, but maybe it is?)

I have the code up on GitHub if anyone wants to check it out, but beware I'm far from a professional programmer so it may make your head hurt :)

Thanks!
Alex

#2 Daniel Minnaar

Daniel Minnaar

    Advanced Member

  • Members
  • PipPipPip
  • 46 posts
  • LocationJohannesburg, South Africa

Posted 29 June 2012 - 07:46 PM



What happens if you take the network cable out and then plug it back in? Does it wait until the network is available again, or does nothing happen at all (i.e. it's frozen)?

#3 AB0TJ


Posted 29 June 2012 - 07:51 PM



I'm not sure exactly, I will check it out when I get home.

#4 Chris Walker

Chris Walker

    Secret Labs Staff

  • Moderators
  • 7767 posts
  • LocationNew York, NY

Posted 30 June 2012 - 02:07 AM

Hi Alex,

How often are you sending data? Are you actively communicating when the connection closes?

If you plug the two computers into each other directly, do you still get the same timeout? I'm wondering if there's something happening inside your switch/router...

Chris

#5 AB0TJ


Posted 30 June 2012 - 04:16 AM



The netduino sends data out an average of maybe 10-15 times a minute. The server it's communicating with sends "heartbeat" data at least every 15 seconds. I tried having the netduino send out heartbeat messages during periods of inactivity, but it did not help.

I can't connect them together directly as the other server is out on the internet (not controlled by me). But there is another box on the same network (that the N+ will be replacing if all goes well) that has no problem staying connected. Maybe I can try putting the N+ on the WAN side of the router.

Thanks!
-Alex

#6 Chris Walker


Posted 30 June 2012 - 05:22 PM

Hi Alex,

So to confirm--your Netduino Plus has no trouble staying connected to a local box, but the connection to a remote server over the Internet closes from time to time?

The absolute best diagnostics tool to use here is Wireshark. If you can get a capture of the data on the connection that closes on you, that will give us a lot of data to hunt down the issue.

Finally...the Internet is an unstable place for data streams. Connections can be broken from time to time for various reasons, which is why most IP clients/servers which are "always on" have auto-reconnect software.

Chris

#7 nakchak

nakchak

    Advanced Member

  • Members
  • PipPipPip
  • 404 posts
  • LocationBristol, UK

Posted 30 June 2012 - 05:41 PM

Hi Alex

As an alternative, could you not use a local box on your network to act as a proxy to the internet hosted service?

I ask as I have had issues in the past with other embedded Ethernet devices (Lantronix and Digi): with cheap/SOHO routers, strange things can happen when UPnP and DHCP are enabled with those devices, especially if there are subnet-wide broadcasts going on...

When the socket dies, is it always at the same interval or does it vary?
How busy is your local network?

Nak.

#8 AB0TJ


Posted 30 June 2012 - 09:49 PM


I have not tried keeping it connected to a local box. I could probably rig something up and try it if need be. The connection from the N+ out to the internet dies once an hour like clockwork. It reconnects, but the "network down" exception just seems strange. If one side or the other was closing the connection, shouldn't it be a different exception? There is another box (the one that the N+ will hopefully be replacing) on my network that talks to the same server (round-robin DNS of servers, actually, if it matters) that has no trouble keeping a connection. The logs show it only having to reconnect every few days or so.

I will try to get a Wireshark dump of what is going on.



Yes, a proxy is an option, but probably not necessary. This problem is not a show-stopper, just an annoyance. And I'm not using a cheap/SOHO router; it's a rackmount server running pfSense. Maybe the N+ and pfSense don't get along?

The socket dies once an hour. I have not timed it with a stopwatch or anything but watching the clock you can tell about when it will go out. The Netduino is connected to my home network, which is not all that busy.

Thanks again for the help, guys.
-Alex

#9 Chris Walker


Posted 01 July 2012 - 02:20 AM


It's possible that the target is disconnecting after an hour (lacking some session config setting or keepalive requirement), or that some router in the middle is detaching the connection due to behavioral analytics.

I would recommend including a reconnect strategy in your project, for general IP reliability over the Internet...but it would be good to find out the root cause as well.

Chris
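The reconnect strategy recommended here can be sketched roughly as follows. This is not Netduino/NETMF code: it is a minimal Python illustration of the logic only, and `connect_with_retry`, its parameters, and the backoff values are all assumptions made for the example.

```python
import socket
import time


def connect_with_retry(host, port, max_attempts=5, base_delay=1.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Try to open a TCP connection, backing off between failed attempts.

    `connect` and `sleep` are injectable so the loop can be exercised
    without a real network; every name here is illustrative.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect((host, port), timeout=10)
        except OSError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(delay)
            delay = min(delay * 2, 60.0)  # exponential backoff, capped at 60 s
```

Doubling the delay keeps a device from hammering a server that is down, while the cap keeps the worst-case reconnect time bounded.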

#10 AB0TJ


Posted 02 July 2012 - 12:36 AM

Well, I spent some time with Wireshark, and found that the disconnect happens when the Netduino is renewing its DHCP lease. So, is this a bug, or should I be doing something in software (other than a reconnect routine, that part has been done) to keep this from happening?

Thanks!
Alex
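The hourly pattern is consistent with the 2 hour lease mentioned in the first post. By default (RFC 2131) a DHCP client starts renewing at T1 = 50% of the lease time and rebinding at T2 = 87.5%. A small worked example, assuming the router does not override these defaults (the helper name `dhcp_timers` is made up for illustration):

```python
def dhcp_timers(lease_seconds, t1_factor=0.5, t2_factor=0.875):
    """Return (T1, T2) in seconds using the RFC 2131 default fractions."""
    return lease_seconds * t1_factor, lease_seconds * t2_factor


t1, t2 = dhcp_timers(2 * 60 * 60)  # the 2 hour lease from the first post
print(t1 / 60, t2 / 60)  # renewal at 60 minutes, rebinding at 105 minutes
```

A server can send explicit renewal and rebinding times instead, so the capture remains the authoritative evidence.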

#11 Chris Walker


Posted 02 July 2012 - 03:43 AM

Hi Alex,

Now that is very interesting. Can you tell if it's the router or the mainboard that's closing the network connections during DHCP lease renewal?

[If it's being closed by the lwIP networking stack on the mainboard, we can look to see if the lwIP implementation should be tweaked to behave differently...and to see if it's by design.]

As long as your device is reconnecting and your application works, you shouldn't need to do anything else.

Chris

#12 AB0TJ


Posted 02 July 2012 - 05:51 PM



I think the Netduino is the one closing the connection. DHCP negotiation happens, and then the Netduino sends a RST packet out to the remote server. I can send you the pcap file if you want to check it out.

Thanks,
Alex

#13 Chris Walker


Posted 02 July 2012 - 06:17 PM


It sounds like lwIP (the network stack in NETMF, used on Netduino Plus) is resetting all network connections whenever it gets an IP address update. That makes sense, since you need to re-establish connections when an address changes...but it might be okay to suppress this and only reset the connections if the IP address actually changed.

Can you file an "issue" (bug report) on this over at netmf.codeplex.com, so everyone can look to see if this is the best practice for a network connection or not? There may be room for improvement in NETMF, or this may be desired behavior...

I would love a copy of the Wireshark log, thank you. We'll also put this on our list of things to look at as part of networking tuning.

Chris
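The idea floated in this post (reset connections only when the leased address actually changes) can be illustrated with a small sketch. This is not lwIP or NETMF code: `ConnectionTable` and `on_lease_update` are hypothetical names, and Python is used only to show the decision logic.

```python
class ConnectionTable:
    """Toy stand-in for a network stack's list of open TCP connections."""

    def __init__(self, address):
        self.address = address
        self.open_connections = []
        self.resets = 0

    def on_lease_update(self, new_address):
        """Handle a DHCP lease update; return True if connections were reset."""
        if new_address == self.address:
            return False  # plain renewal: same address, keep sockets alive
        self.address = new_address
        self.resets += len(self.open_connections)
        self.open_connections.clear()  # address changed: old sockets are invalid
        return True


table = ConnectionTable("10.0.1.146")  # the Netduino's address in the capture
table.open_connections.append("remote-server")
table.on_lease_update("10.0.1.146")  # renewal with the same address
```

Whether the real stack can safely skip the reset is exactly the open question raised for the bug report; the sketch only shows the comparison being proposed.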

#14 AB0TJ


Posted 02 July 2012 - 08:44 PM

I will go ahead and file a bug report, thanks.

Chris and anyone else who might be interested, the pcap file is here.
I trimmed it to just the traffic to/from the Netduino (10.0.1.146). There is some "normal" traffic in there, and then the DHCP renew/socket disconnect.

#15 JimmyNet

JimmyNet

    Member

  • Members
  • PipPip
  • 12 posts

Posted 14 July 2012 - 03:47 PM

Did you try using a static IP instead of DHCP?

#16 mohammad

mohammad

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 29 April 2013 - 07:09 PM

Hi all,

 

I have the same problem with some minor differences.

I am sending heartbeats every minute to a data service. My Netduino Plus has a valid IP address with DHCP enabled in its network configuration. After some hours (max 56 hours, mean 32 hours over 10 experiments), the network thread stops with no exception.

I tried 10 different methods as probable solutions, but all failed:

  • Using single threading instead of multi-thread programming (asking the Netduino to process only the single networking task).
  • Using the C# HttpWebRequest/Response library.
  • Socket programming instead of the C# HTTP library.
  • Using a fixed value for Thread.Sleep() in the infinite loop (while true) of the heartbeat instead of the WaitUntilNextPeriod() function, which uses the CPU ticks.
  • Removing the infinite loop and triggering the heartbeat function from an event-driven function (based on the value received on the COM port).
  • Using Debug.GC(true) at the beginning and end of each function to clean the memory.
  • Trying different heartbeat intervals such as 1 min, 2 min, and 3 min.
  • Wrapping each function in try-catch, followed by relevant logs.
  • Mutex_lock and Mutex_unlock for the networking functions.
  • and so forth ....

Unfortunately, the problem still exists after 2 months and I haven't found a solution yet :( Any help is greatly appreciated, since this project is part of my thesis and it should be done ASAP.

 

If you are interested in viewing my code, I have attached it to this message. This is the version with "Socket Programming" and "Thread.Sleep(Constant Value = 2min)", which worked longer than the other versions (56 hours).

 

Cheers,

Mohammad




#17 mohammad


Posted 22 May 2013 - 12:17 AM

Hi all,

 

After spending 2 months on this problem, I was finally able to solve it. It was fixed by using a watchdog solution: http://forums.netdui...lus/#entry49735

Now my device has been working well and continuously for 10 days.

 

Good Luck
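The linked post is not reproduced here, so the following is only a general sketch of the watchdog idea, not the solution actually used. The networking loop "feeds" the watchdog after each successful heartbeat, and a supervisor checks whether the feed has gone stale. It is written in Python for illustration; `Watchdog`, `feed`, and `expired` are assumed names.

```python
import time


class Watchdog:
    """Software watchdog: reports expiry if feed() is not called in time."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.last_fed = clock()

    def feed(self):
        """Call after every successful heartbeat to show the loop is alive."""
        self.last_fed = self.clock()

    def expired(self):
        """True once more than `timeout` seconds have passed since the last feed."""
        return self.clock() - self.last_fed > self.timeout
```

A supervisor would poll `expired()` and, when it returns True, restart the stalled work or reboot the device. The timeout needs to be comfortably longer than the heartbeat interval.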



#18 BeanAnimal

BeanAnimal

    New Member

  • Members
  • Pip
  • 2 posts

Posted 24 May 2013 - 12:05 AM

Mohammad,

 

I am glad you found a workaround, but I am still concerned about why the socket was being dropped with no error, regardless of what you tried. In my application, forced reboots will not be a fix.

 

Is anybody else using TCP/IP connectivity long term without problems?



#19 ziggurat29

ziggurat29

    Advanced Member

  • Members
  • PipPipPip
  • 244 posts

Posted 24 May 2013 - 12:06 PM

I have used it long term without problems if all systems are running correctly.

 

By that bold statement (haha, bold font that is) I mean that I have run a Netduino connected to, and communicating with, a server on the Internet for over a week without apparent errors.

 

However, I have personally had this bad experience (which is different from yours, but nonetheless here it is):

*  if my daemon, running on the server on the Internet, is taken down.  E.g. to update the binary, whatever

*  then the netduino will properly detect that it has disconnected

*  my code will loop around in a retry loop to re-establish connection (after a brief sleep)

*  and at the 'connect' call, the thread will hang.  forever.  even if the daemon is brought up later.  so I wind up having to implement a watchdog also, but at least only the one thread is hanged.

 

Things that do work correctly for me are:

*  netduino ethernet cable unplugged, connection lost, retry connection repeatedly, plug back in, recovers connectivity to server

*  take down whole server on Internet, netduino detects lost connection, retries, recovers

 

So from my viewpoint, in my case, it seems that somewhere deep in the firmware the 'connection refused' state (when the port is not listening) is causing woes, but the various other problems related to 'host not reachable' are OK.  Moreover, the problem seems to manifest itself when it has once successfully connected, rather than when it has never successfully connected at all.

 

Anyway, I know my problem is different from yours -- yours is interesting -- but I wanted to let you know what did and did not work for me, since that's what you last asked.
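The hang described above happens inside a blocking connect call. On platforms whose socket API accepts a timeout, one common way to avoid an indefinite hang is to bound the connect attempt, as in this Python sketch. Whether the NETMF firmware offers an equivalent is not established in this thread, and `try_connect` is a hypothetical helper name.

```python
import socket


def try_connect(host, port, timeout_seconds=5.0):
    """Return a connected socket, or None if refused, unreachable, or timed out."""
    try:
        return socket.create_connection((host, port), timeout=timeout_seconds)
    except OSError:  # covers ConnectionRefusedError and socket.timeout
        return None
```

With a bounded connect, a retry loop can treat 'connection refused' and 'timed out' the same way and simply try again later, instead of relying on a watchdog to rescue a stuck thread.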







Copyright © 2016 Wilderness Labs Inc. | CC BY-SA
This webpage is licensed under a Creative Commons Attribution-ShareAlike License.