4.2.1 with Nwazet DAQ and Touch Screen


63 replies to this topic

#21 Fabien Royer

    Advanced Member

  • Members
  • 406 posts
  • Location: Redmond, WA

Posted 11 November 2012 - 11:13 PM

Chris,

A straight up SPI transaction, which has always worked, as you know


Ah, that could cause troubles. Please grab a copy of the RGB LED, Potentiometer, or Piezo Buzzer module source code from the Wiki for an example of the standard enumeration method for GoBus. We'll be happy to help you update your driver if you'd like.



Can you please clarify how a straight up SPI transaction is suddenly an issue in a system that is purely SPI-based?

What has changed between 4.2.0 and 4.2.1 that affects SPI in the context of the Go?

Why are GoBus 1.5 changes allowed to break GoBus 1.0 devices, when the claim was made that a side-by-side experience will remain functional?

If 4.2.1 was designed to introduce a fundamental breaking change, which should not be the case for a minor release such as this, why were we not given the courtesy of a warning well in advance to prevent any customer impact?

-Fabien.

#22 Chris Walker

    Secret Labs Staff

  • Moderators
  • 7061 posts
  • Location: New York, NY

Posted 11 November 2012 - 11:26 PM

Hi Fabien,

Let me see if I can offer some additional details here...

Can you please clarify how a straight up SPI transaction is suddenly an issue in a system that is purely SPI-based?

What has changed between 4.2.0 and 4.2.1 that affects SPI in the context of the Go?

GoBus is transport-agnostic. The first release leveraged the SPI bus (using a simple frame format which was expanded on and formalized with the GoBus 1.5 spec).

There were some bugs in the SPI and GPIO code in the earlier firmware (pin reservation and clock calculations on one half of the board, IIRC) which were fixed in the latest firmware. These bugfixes may be snagging your driver (either causing a new bug or surfacing a bug in the driver).

Why are GoBus 1.5 changes allowed to break GoBus 1.0 devices, when the claim was made that a side-by-side experience will remain functional?

GoBus 1.5 updates do not break GoBus 1.0 devices. GoBus 1.0 compliant devices continue to run on mainboard GoPorts side-by-side with GoBus 1.5+ compliant devices.

The issue we're discussing here is a code glitch--either in the driver or in the newest mainboard firmware. There is no breaking change in GoBus.

If 4.2.1 was designed to introduce a fundamental breaking change, which should not be the case for a minor release such as this, why were we not given the courtesy of a warning well in advance to prevent any customer impact?

If we were to introduce a breaking change into the firmware (which we have no plans to do), we would try to let folks know in advance. We have taken extra steps to help code continue to function unmodified (even with Netduino Plus 2 -- with its new "unified" hardware provider), and we will continue doing so in the future.

Eventually we will want to update all GoBus 1.0 modules to GoBus 1.5+ to take advantage of recently-announced capabilities such as wired and wireless hubs. But for the moment let's focus on driver/firmware fixes so your users can continue to enjoy using their [nwazet touch display.

Chris

Edited by Chris Walker, 11 November 2012 - 11:31 PM.
fixed grammar


#23 Chris Walker

    Secret Labs Staff

  • Moderators
  • 7061 posts
  • Location: New York, NY

Posted 11 November 2012 - 11:27 PM

Fabien --

So, first steps first: From what you posted, the blocking point in the current driver appears to be during enumeration.

If you copy and paste the sample enumeration code from the RGB LED module driver into the [nwazet Touch Display driver, does that fix the issue? If not, what effect does it have? From your logic analyzer, where is the connection getting lost?

Looking forward to getting you back up and running,

Chris

#24 theTroll

    Advanced Member

  • Members
  • 46 posts

Posted 11 November 2012 - 11:34 PM

But for the moment let's focus on driver/firmware fixes so your users can continue to enjoy using their [nwazet touch display.

Chris


And DAQs. But that is just me.

tt.

#25 Lunddahl

    Advanced Member

  • Members
  • 148 posts
  • Location: Europe, Denmark

Posted 11 November 2012 - 11:55 PM

Thank you for sharing your wisdom with us and showing us the error of our ways. You're clearly experienced in ways that I cannot fathom. Please, by all means, enlighten us all on how to resolve a software regression we have no control over...


I simply reacted to your request that the rest of the world wait for you, because I think it's wrong, and I think it's bad for the ecosystem as a whole.

I have already told you how I think the problem can be avoided or solved in the future, and I can see Chris is offering help with avoiding such issues in the future too.

I don't have a solution to your current problems; I'm just happy that I can work on my solutions instead of waiting for issues with yours to be solved.

I'm also convinced that you and Chris will solve the technical issues quickly, while the rest of the world still moves on.

- Ulrik

#26 Fabien Royer

    Advanced Member

  • Members
  • 406 posts
  • Location: Redmond, WA

Posted 12 November 2012 - 12:52 AM

Ulrik,

I felt that your reaction was offensive and unfair. I'm only interested in one thing: getting our users unblocked asap and for issues such as this one to never repeat themselves in the future.


Software releases generally follow a process which involves regression testing to mitigate potential user impact when changes are introduced. It's a reasonable thing to ask if regression-testing occurred because it is a standard best practice in our industry. No one should be offended by that and I am confused as to why anyone would perceive my question to be a wrongful attempt at keeping anyone waiting.


As far as moving on, I'd like that very much too.


-Fabien.

#27 Fabien Royer

    Advanced Member

  • Members
  • 406 posts
  • Location: Redmond, WA

Posted 12 November 2012 - 01:02 AM

Chris,

So, first steps first: From what you posted, the blocking point in the current driver appears to be during enumeration. If you copy and paste the sample enumeration code from the RGB LED module driver into the [nwazet Touch Display driver, does that fix the issue?


As a first step, I'd prefer installing the 4.2.1 firmware on a stock Netduino Go and compare what I see on the wire against a 4.2.0 trace, without changing anything else.

A few questions:
Can the 4.2.0 SDK live side-by-side with the 4.2.1 SDK?
Is it possible to deploy and run existing 4.2.0 assemblies / code to a Netduino Go running the 4.2.1 firmware and expect things to work or is "recompiling the World" first a requirement?

-Fabien.


#28 Chris Walker

    Secret Labs Staff

  • Moderators
  • 7061 posts
  • Location: New York, NY

Posted 12 November 2012 - 01:07 AM

Hi Fabien,

As a first step, I'd prefer installing the 4.2.1 firmware on a stock Netduino Go and compare what I see on the wire against a 4.2.0 trace, without changing anything else.

Sounds like a good plan.

A few questions:
Can the 4.2.0 SDK live side-by-side with the 4.2.1 SDK?

No, although if you want to copy the assemblies from the first one to a separate folder you could certainly host both on your PC manually.

Is it possible to deploy and run existing 4.2.0 assemblies / code to a Netduino Go running the 4.2.1 firmware and expect things to work or is "recompiling the World" first a requirement?

You can re-deploy, as long as you include the updated GoBus.dll in your deployment.

When we fixed the GPIO ReservePin bug in the firmware, it surfaced a bug in the GoBus DLL where we were double-reserving a pin. So you'll need the updated GoBus.dll in your deployment. It's a QFE so the version # and its object model haven't changed and you don't need to recompile anything.

Chris
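The ReservePin point above is a common pattern: a caller's latent bug stays hidden until a lower layer starts enforcing a rule it previously let slide. The following is a toy Python model of that pattern only. It is not the .NET Micro Framework or GoBus API; the class and function names are invented for illustration, and it assumes the old firmware silently tolerated a duplicate reservation, which the post implies but does not spell out.

```python
class PinRegistry:
    """Toy model of pin reservation (hypothetical, not the NETMF API)."""

    def __init__(self, strict):
        self.strict = strict      # True models a firmware that enforces the rule
        self.reserved = set()

    def reserve(self, pin):
        if pin in self.reserved:
            if self.strict:
                raise RuntimeError(f"pin {pin} is already reserved")
            return                # lenient: the duplicate request goes unnoticed
        self.reserved.add(pin)


def init_module(registry, pin):
    """Caller with a latent bug: it reserves the same pin twice."""
    registry.reserve(pin)
    registry.reserve(pin)


def try_init(strict):
    """Return 'ok' or the error message produced by init_module."""
    try:
        init_module(PinRegistry(strict), pin=7)
        return "ok"
    except RuntimeError as err:
        return str(err)


print("lenient registry:", try_init(strict=False))
print("strict registry: ", try_init(strict=True))
```

In this model the calling code is identical in both runs; only the strictness of the layer underneath changes, which is why shipping the corrected caller (here, the updated GoBus.dll) is the fix.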

#29 ErikN

    Advanced Member

  • Members
  • 119 posts
  • Location: New York, NY

Posted 12 November 2012 - 01:17 AM

Ulrik,

I felt that your reaction was offensive and unfair. I'm only interested in one thing: getting our users unblocked asap and for issues such as this one to never repeat themselves in the future.

Aside from how you perceived his words, it sounds like we had an easy, quick answer to unblock your users. Don't upgrade to 4.2.1 until this bug is worked out. Bugs happen. In all industries. This is not unexpected and to continue to claim this is errant behavior is disingenuous.


Software releases generally follow a process which involves regression testing to mitigate potential user impact when changes are introduced. It's a reasonable thing to ask if regression-testing occurred because it is a standard best practice in our industry. No one should be offended by that and I am confused as to why anyone would perceive my question to be a wrongful attempt at keeping anyone waiting.


It sounds like they did do quite a bit of work to ensure the update did not change the way the spec was designed. From my observation it looks as though the driver was written in such a way that it happened to work through a bug or fluke. It is out of spec. Of course it could break as the spec evolves if it isn't conformant. Please don't put the weight of what appears to be a poor implementation of the spec on the designers of said spec. It's just divisive and won't win you any favors in trying to get problems resolved.

This has moved well beyond anything of benefit to the average user at this point. I'm sorry you feel as though you're going to have to move mountains to make your driver work as it should, but it's just a bit of software. And there's time: there's no absolute need for your users to move to 4.2.1 immediately. They're in good hands while this gets worked out. Take the time to analyze what went wrong, read over the spec and the recommendations on how to implement it, and mull over how your interactions could change in the future to be more conducive to problem solving and just all around a bit more civil.

Bonne chance!


#30 Fabien Royer

    Advanced Member

  • Members
  • 406 posts
  • Location: Redmond, WA

Posted 12 November 2012 - 01:36 AM

Erik. I respectfully disagree with your assessment of what is and isn't out of spec or a so-called "poor implementation" of a spec that never was until recently. Breaking changes happen. In all industries. Merci.

#31 Lunddahl

    Advanced Member

  • Members
  • 148 posts
  • Location: Europe, Denmark

Posted 12 November 2012 - 02:20 AM

I felt that your reaction was offensive and unfair. I'm only interested in one thing: getting our users unblocked asap and for issues such as this one to never repeat themselves in the future.


It was not my intent to be offensive to you, and I don't think it was unfair to voice my opinion, as I think it's a very important matter.

Software releases generally follow a process which involves regression testing to mitigate potential user impact when changes are introduced. It's a reasonable thing to ask if regression-testing occurred because it is a standard best practice in our industry. No one should be offended by that and I am confused as to why anyone would perceive my question to be a wrongful attempt at keeping anyone waiting.


I do know what regression testing is, but I still do not agree with the scope of testing that you asked for. If you are confused, try reading what you wrote again; I even quoted it before my answer.

As far as moving on, I'd like that very much too.


I can see others have voiced their opinion on this matter (the scope of regression testing), and I have given my opinion. I feel Chris is listening, and I trust Secret Labs to make a wise decision on this matter.

I'm not offended in any way, and I hope you are not either...

:)

- Ulrik

#32 theTroll

    Advanced Member

  • Members
  • 46 posts

Posted 12 November 2012 - 02:31 AM

I think that everyone puts a lot of their time and souls into these products, so when something doesn't work quite right, people get a little bit sensitive. These are their babies and they want them to do well.

I have no doubt that the main goal is just to get this working as it should, but I think people should be careful when responding to these threads, because these are not just chunks of electronics and bits of code but countless hours of work and toil.

So I am going to suggest everyone take a few deep breaths and remember that in the end we are all here to make this idea work and become the best system it can be.

tt.

P.S. Yeah, I make a fairly pathetic internet troll.

#33 phantomtypist

    Advanced Member

  • Members
  • 130 posts
  • Location: New York, NY

Posted 12 November 2012 - 02:40 AM

I've developed software since I was a kid. Bugs happen and bugs will be bugs. We as software developers encounter bugs every day, whether they were introduced by ourselves or by someone else. Sometimes we are fortunate and we actually maintain the source code for the product, so we can fix it. This is part of what developers do. We fix bugs. They are to be expected.

I understand that some context can be lost on forums, but from my point of view, there is no need to get into a heated debate over this bug. This is a prototyping community. Things like this happen.

For the sake of civility, Fabien and Secret Labs, might I suggest that you try to resolve the issue outside of the forum?

I understand this is frustrating, but I think you, Fabien, should take this issue outside of the forum and resolve it directly with Secret Labs. Maybe email or some phone calls? I'm sure this problem can be solved in better ways than going back and forth in this thread.

#34 theTroll

    Advanced Member

  • Members
  • 46 posts

Posted 12 November 2012 - 02:42 AM

For the sake of civility, Fabien and Secret Labs, might I suggest that you try to resolve the issue outside of the forum?


I am actually going to disagree with this one. There is a lot of value to this being discussed on an open thread. Most likely there are other devs that can learn from it and avoid having the same issues in something they are working on.

tt.

#35 ErikN

    Advanced Member

  • Members
  • 119 posts
  • Location: New York, NY

Posted 12 November 2012 - 03:54 AM

I am actually going to disagree with this one. There is a lot of value to this being discussed on an open thread. Most likely there are other devs that can learn from it and avoid having the same issues in something they are working on.

tt.


I'm going to disagree with your disagreement. :)
This discussion isn't really for devs - it's for module developers. There is a separate effort being made to establish best practices and, I think, to write up a Module Builders Guide, which would benefit from anything learned during this process.

This particular discussion is just noise on the original question now.

#36 Fabien Royer

    Advanced Member

  • Members
  • 406 posts
  • Location: Redmond, WA

Posted 12 November 2012 - 05:25 AM

I just completed the investigation of the issue and here's what I have observed and my recommendations moving forward.

Context
The touch display SPI frames are fixed-size and 8192 bytes long.
The DAQ SPI frames are fixed-size and 576 bytes long.

By default, the SPI bus speed configured by the display driver is 25 MHz.
By default, the SPI bus speed configured by the DAQ driver is 16 MHz.

Here's what I observed on 4.2.1 with the Touch Display module at the default 25 MHz speed:

First SPI request:

Sent: 0,0,3,0,7,86,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.[...]
Received: 128,91,110,119,97,97,97,97,97,97,97,97,97,97,97,97,97,97,122,101,116,46,100,105,115,112,46,49,46,48,54,0,[...]
Expected: 128,91,110,119,97,122,101,116,46,100,105,115,112,46,49,46,48,54,0,[...]

Second SPI request:

Sent: 0,0,3,0,7,86,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.[...]
Received: 0,[...]
Expected: 128,91,110,119,97,122,101,116,46,100,105,115,112,46,49,46,48,54,0,[...]

Subsequent SPI requests:

Sent: 0,0,3,0,7,86,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.[...]
Received: 0,[...]
Expected: 128,91,110,119,97,122,101,116,46,100,105,115,112,46,49,46,48,54,0,[...]

Observations
1. In the first request, the received buffer is corrupted, as seen with the repeated '97'.
2. Subsequent requests with the same query string only return '0', indefinitely.
3. The data corruption, and where it occurs in the buffer, is random.
4. On occasion, the data is not corrupted and the display responds...until the next data corruption occurs.
5. Connecting the display to either side of the GoBus yields the same results.
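The byte values in the first-request trace are ASCII, so a short off-device script makes the corruption easier to see. This is a minimal Python sketch and not part of the Nwazet driver. The byte lists are copied from the trace above (up to the elided part), and the helper names `decode_ascii` and `longest_run` are made up for this illustration.

```python
# Bytes copied from the first SPI request trace, up to the "[...]" elision.
EXPECTED = [128, 91, 110, 119, 97, 122, 101, 116, 46, 100, 105,
            115, 112, 46, 49, 46, 48, 54, 0]
RECEIVED = [128, 91, 110, 119,
            97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 97,
            122, 101, 116, 46, 100, 105, 115, 112, 46, 49, 46, 48, 54, 0]


def decode_ascii(data):
    """Render printable ASCII bytes as characters and everything else as '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


def longest_run(data):
    """Return (value, start_index, length) of the longest run of equal bytes."""
    best = (None, 0, 0)
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        if j - i > best[2]:
            best = (data[i], i, j - i)
        i = j
    return best


value, start, length = longest_run(RECEIVED)
# Keep one byte of the repeated run and drop the rest.
repaired = RECEIVED[:start + 1] + RECEIVED[start + length:]

print("expected:", decode_ascii(EXPECTED))
print("received:", decode_ascii(RECEIVED))
print("repeated byte:", value, "at offset", start, "run length", length)
print("matches expected once the run is collapsed:", repaired == EXPECTED)
```

The expected reply decodes to the identification string `[nwazet.disp.1.06`. In the received buffer the only difference is the stuck `a` (97): collapsing that run gives back the expected frame, so in this capture bytes were repeated rather than replaced.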

Resolution
1. Dropping the SPI bus speed to 10 MHz eliminates the data corruption and allows the display to work again
2. Anyone who is affected by this issue in 4.2.1 should try initializing their Nwazet modules like so (replace Socketx with the proper socket):

canvas.Initialize(GoSockets.Socketx, 10000); // 10 MHz

or if the issue persists:

canvas.Initialize(GoSockets.Socketx, 5000); // 5 MHz

3. The same recommendation applies to the DAQ
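Because the corruption is random and sometimes absent, a single good reply at a given speed proves little. One way to pick a setting on the bench is to step down through candidate speeds and accept the first one that survives many identification requests. The Python sketch below shows only that selection logic. `probe` is a hypothetical callable standing in for "send one identification request at this speed and compare the reply with the expected frame"; it is not a function of the Nwazet driver, and the candidate values simply mirror the kHz arguments passed to `canvas.Initialize` above.

```python
def pick_spi_speed_khz(probe, candidates_khz=(25000, 10000, 5000), attempts=50):
    """Return the first candidate speed at which every probe succeeds.

    probe(speed_khz) -> bool is supplied by the caller (hypothetical here).
    Candidates are tried in the given order, fastest first; None is
    returned when no candidate passes all attempts.
    """
    for speed_khz in candidates_khz:
        if all(probe(speed_khz) for _ in range(attempts)):
            return speed_khz
    return None


def simulated_probe(speed_khz):
    """Stand-in for real hardware: pretend anything above 10 MHz fails."""
    return speed_khz <= 10000


print(pick_spi_speed_khz(simulated_probe))
```

With real hardware the probe would fail only intermittently at the higher speeds, which is why each candidate is tried repeatedly instead of once.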

Conclusion
With release 4.2.1, the Netduino Go firmware introduced a change which made SPI communication above 10MHz unstable.
The Nwazet display and DAQ modules use ARM Cortex chips capable of high SPI frequencies and are configured to maximize SPI throughput by default, well above 10 MHz.
Other STM8S-based modules are unaffected by the change introduced in 4.2.1 because they work at much lower frequencies, typically much lower than 5MHz.

#37 Chris Walker

    Secret Labs Staff

  • Moderators
  • 7061 posts
  • Location: New York, NY

Posted 12 November 2012 - 07:10 AM

Hi Fabien,

Thank you for digging into this further. Glad that a simple code change seems to get users back up and running.

If I remember correctly, the old firmware had a bug where it would send SPI traffic at either half or double the requested speed (depending on how you look at it). This was fixed in 4.2.1 via a bugfix check-in from the STM32 core. It may have been an issue with only one of the two buses (since they run on different peripheral clocks which could run at different speeds).

Can you check the actual throughput speed on your logic analyzer (both with the old firmware and with the new firmware) and verify that it is now correct? If the bugfix shifted the speed the wrong way, then we'll want to fix that. If it shifted it the right way, then the speed was incorrect before and I would recommend updating your code.

With GoBus 1.5, these sorts of issues go away. The I/O frames are taken care of by the framework...so you can focus on your application. This also enables your gear to be used on different transports.

One other thing: the STM32 chips run SPI at an even divisor. With the 168MHz MCU and internal 84MHz peripheral clock, that means that the SPI bus can run at 21MHz or 10.5MHz. So selecting 25MHz will generate a 21MHz SPI clock and selecting 16MHz will generate a 10.5MHz SPI clock.

Chris
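The divisor arithmetic in the last paragraph can be worked through in a few lines. The sketch below is Python run on a PC, not firmware code. It takes the 84 MHz peripheral clock from the post, uses the power-of-two dividers (2 to 256) of the STM32 SPI baud-rate prescaler, and assumes the firmware picks the fastest clock that does not exceed the request. That rounding rule is an assumption; it matches the two examples in the post, but the post does not state it.

```python
PCLK_HZ = 84_000_000                          # peripheral clock quoted in the post
PRESCALERS = (2, 4, 8, 16, 32, 64, 128, 256)  # STM32 SPI baud-rate dividers


def actual_spi_hz(requested_hz, pclk_hz=PCLK_HZ):
    """Fastest achievable SPI clock that does not exceed the request.

    Assumes round-down selection; falls back to the slowest clock when
    the request is below pclk_hz / 256.
    """
    for divider in PRESCALERS:
        if pclk_hz / divider <= requested_hz:
            return pclk_hz / divider
    return pclk_hz / PRESCALERS[-1]


for requested_mhz in (25, 16, 10, 5):
    actual_mhz = actual_spi_hz(requested_mhz * 1_000_000) / 1_000_000
    print(f"requested {requested_mhz} MHz -> actual {actual_mhz} MHz")
```

Under these assumptions the 25 MHz and 16 MHz defaults land on 21 MHz and 10.5 MHz, as stated in the post, and the 10 MHz and 5 MHz fallbacks would land on 5.25 MHz and 2.625 MHz. A bus fed by a different peripheral clock gives different results, so pass its frequency as `pclk_hz`; a logic analyzer capture remains the way to confirm the real figure.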

#38 Lunddahl

    Advanced Member

  • Members
  • 148 posts
  • Location: Europe, Denmark

Posted 12 November 2012 - 10:43 AM

I am actually going to disagree with this one. There is a lot of value to this being discussed on an open thread. Most likely there are other devs that can learn from it and avoid having the same issues in something they are working on.


I totally agree.

A lot of the threads on the Go forum are actually about the inner workings of the Go, GoBus, and modules. It could be a good idea to split the forum section so we have a separate Go module makers section; novice users could then skip the module development talks, as they are often quite advanced and might look scary.

I won't mind contributing by spending a night or two trawling through it all and moving development posts to a subforum if that is needed.

That this got a little heated is very rare, and as long as it stays rare, it's really not a problem.

#39 Lunddahl

    Advanced Member

  • Members
  • 148 posts
  • Location: Europe, Denmark

Posted 12 November 2012 - 10:54 AM

With GoBus 1.5, these sorts of issues go away. The I/O frames are taken care of by the framework...so you can focus on your application. This also enables your gear to be used on different transports.


Are there any hints as to how GoBus 1.5 will handle frames? How long can they be, how fast can SPI signal to the hub, and will there be a buffer so multiple frames can be sent without an ack from the modules?

As I understand it, GoBus is simplex: the host sends to the module in one SPI transfer, waits for it to interrupt, and then makes another SPI transfer to receive, right?

Will those transfers be buffered in the hub, or cut-through switched?

- Ulrik

#40 neslekkim

    Advanced Member

  • Members
  • 306 posts
  • Location: Oslo, Norway

Posted 12 November 2012 - 11:01 AM

I really wish there was a public source repository where we could check what changes are happening to the Netduino and the modules. That way, we could test against new versions under way, possibly before they're released, to see if we need to change stuff. Arduino has done this for ages, allowing others to change code so the repository owner can pull in source changes if they accept them. Nwazet already has a repository we can watch, but I haven't found any from Secret Labs other than zip files. Is this open source or not?

--
Asbjørn






Copyright © 2010-2014 Secret Labs LLC
This webpage is licensed under a Creative Commons Attribution-ShareAlike License.