The Netduino forums have been replaced by new forums at community.wildernesslabs.co.
This site has been preserved for archival purposes only
and the ability to make new accounts or posts has been turned off.
I think you just pointed out many of the items from my "dream wish list"
Is there a single thread of execution? If more than one were available, it would seem ideal to dedicate one thread to unmanaged/native execution and one to managed code, allowing the native code to run tight, fast loops while pulling messages off a queue coming from the other thread (as opposed to direct method calls that cross context boundaries).
If multiple threads aren't available, would some form of preemptive multitasking to simulate this make sense? And if so, could the developer configure the priority or time sharing divisions based on the needs of the project? For example, in a quadcopter, 80% of the time might be needed for time-sensitive stabilization logic (running native code) and the remaining 20% could handle command processing, waypoint nav, telemetry and data logging (running managed code).
In the world of low-end microcontrollers, there is exactly one thread of execution. .NET MF has a cooperative multitasking scheme which allows multiple managed code threads--but behind the scenes there is only one native code thread.
That said, the way to accomplish something along these lines is to use a hardware timer interrupt to have the microcontroller interrupt the thread of native code execution every ### nano/micro/milliseconds--to run a higher-priority native code function. That's what we're working on as an option for high-priority "control loops" for things like quadcopters. You could take up 50%-80% of the MCU time in this function (which .NET MF knows nothing about--we're just "slowing down the MCU" from its perspective) and leave the rest of the microcontroller's time to the CLR.
With dual core smart phones arriving, is a dual core Netduino being considered?
That would be awesome. Unfortunately, high-end dual-core microprocessors are way, way more expensive than microcontrollers (and generally require external flash and RAM). We could build that, but you'd end up with a multi-hundred-dollar product. At which point a low-end dual-core embedded PC starts to make more sense.
What kind of performance gains could be achieved by compiling managed code and deploying the native image (instead of interpreting)? I understand it consumes more storage space for compiled assemblies, but in some cases this might not matter: some programs might be small anyway, and an SD card or other storage device could be used to store larger programs. By making this a configuration option per project, developers would be able to consider the trade-offs and make the right decision based on the needs of individual projects. The bulk of a large program could be interpreted while time-critical routines could live in a project configured for native pre-compilation.
I would love to do this. Unfortunately, that's one of those things that would take an immense amount of engineering and Microsoft would probably need to spearhead. We've talked to them about it. It could happen someday. But it would essentially require a rewrite of a large part of the .NET MF system/SDK and may require a more expensive microcontroller.
What do you folks think about Suspend & Resume functions for garbage collection? In a desktop, web, or phone app this would be pretty crazy, I know, but on a dedicated embedded device, I wonder if this is the right way to go in advanced scenarios. It would be good then to have another function to manually deallocate memory (maybe by passing in a root object in an object graph to reclaim).
Suspend and Resume for the GC is actually a pretty cool idea. If enough users ask for it, we could probably create an extension method which enabled this fairly easily. It's dangerous and for power users only, but we'd be happy to look into it.
How many of these decisions are made by the Micro Framework team at Microsoft, and which are open for modification by Secret Labs and the broader development community? Some of the benefits of the managed world come at a high price (as the need for this thread attests). Having GC as an option is great, but in the world of tiny devices and resource constraints, we should probably evaluate the necessity of imposing it on all projects at all times. What could we peel away to get as close to the bare metal without losing the ability to use our favorite languages, tools, and libraries? If we suspend GC and mismanage our objects, we won't crash our work computers or brick our cell phones, so how big of a deal would it be to relax some of these constraints?
For the Netduino firmware, we make all the decisions based on community input and contributions. But we also represent the community with a seat on the .NET MF Core Tech Team (open source project). We coordinate with Microsoft pretty much weekly on the direction of .NET MF and have a lot of input--and contribute a lot of code--but ultimately they approve/disapprove of contributions. And if we ask them to write a huge chunk of code, it's really up to them to spare the resources.
Besides garbage collection, what kind of overhead is imposed by the Micro Framework?
Mostly the overhead of running an interpreter and the task switcher. This is a pretty big hit from a numbers standpoint (100x+) but for most applications you don't really run more than a few hundred/thousand instructions per second--and the Netduino's MCU provides much faster managed code execution than that...
<Opinion> Netduino is a fantastic, exciting platform precisely because it brings .NET to the physical world. We'd be better off adapting NETMF to meet the demands of this low-level world than devising clever but complicated mechanisms to circumvent it--creating high-performance NETMF capabilities rather than fragmenting Netduino with a new, non-.NET code base. Once we head down the road that goes around NETMF, there will be less incentive for the NETMF team to make the sort of changes I'm referring to here. </Opinion>
Thanks, Dan We'll try to find the right balance. Since most/all of our enhancements are either offered to or contributed back to the .NET MF core (and we have a representative on the .NET MF core tech team), hopefully we can achieve the best of both worlds for both Netduino community members and the .NET Micro Framework itself.
Hm, I'm not sure your queue suggestion solves the problem unless you can describe a little more how it ought to work. How should C# post to such a queue without itself crossing one of the context boundaries you are concerned about avoiding?
If a native compiler like NGEN existed, along with whatever support libraries such code would need, that would indeed be wonderful.
I don't much care for your Suspend/Resume idea. The garbage collector is invoked at a time when the program has made a memory allocation request and the system doesn't have any more (uncompacted) memory. In that scenario, if your "Suspend" mode is in force, I don't see how the system will have any remaining option other than to throw an OutOfMemoryException.
Likewise, I don't think it helps the system much to tell it about which objects to reclaim. First, the programmer might be wrong (this is one rationale for getting away from the malloc/free model in the first place) and second, I'm not sure it helps all that much anyway. Decent systems (I'm not sure if NETMF actually does it this way) like to have a big pool of contiguous unallocated memory so that the allocation operation is dirt-cheap (more or less just advancing a "next free location" pointer), rather than doing it like C does, which is to root around in a bunch of free pools to find a block of the appropriate size.
Anyway, the cost of GC is that of finding which references are still in use (which is proportional to the number of live objects), plus the cost of compacting them. Unreachable objects don't cost you anything, so this is another argument why the Release() proposal probably won't help much.
Approaches that I think work better are:
A programming style that limits allocations, or does them all at a certain time
Forcing the GC to run at certain times when you know you have some extra CPU time and can afford to freeze your framework
But if you really want to fix NETMF's GC, you should advocate replacing its 50-year-old mark-and-sweep algorithm with something more modern and efficient, like generational or parallel GC. Like your "native code" suggestion, this is totally possible, and is merely "a simple matter of programming" (i.e. someone spends a lot of time and money on it).
I should say for the sake of total disclosure that that is the way things should work in a sane system. I'm only at the beginning of understanding the NETMF source code, so I'm sure I'll find all kinds of surprises.
You're right that message passing doesn't eliminate crossing context boundaries altogether. What it would do is to ensure that all such crossings happen at one carefully crafted place in firmware code. All interop calls using the mechanism could leverage this and obtain looser coupling, providing an async-only-like calling mechanism. This really only makes sense when native and managed code is running in separate, parallel threads.
Your point about GC strategy sounds valid. I'm not familiar with how GC works in NETMF. In normal .NET, it runs well before you run out of all memory. I'm not trying to suggest definitive answers, just trying to brainstorm to figure out how to make NETMF code more predictable and performant. In another forum post here, I read about work on a quadcopter where the GC was "only" running a sweep once every 100 ms or something along those lines. But if allocating all objects up front can eliminate GC activity, I'd agree that a more conscientious programming style is the better approach.
This is a great thread! I've been working with this subject for some time now.
Btw, in regards to the ARM/THUMB issue: from what I've read, ARM code can be compiled as PIC/PIE; THUMB cannot. (Might be worth considering.)
If you cannot compile the code as position-independent, you'd have to preallocate a fixed-size flash area for the potential native code. (Would still be worth it.)
I'm confused. All the code generated by my "Fluent" library is Thumb, and it is all position-independent. Can you provide more detail about what you're referring to or post a URL to what you were reading?
Many users write Arduino code using the Wiring libraries (an abstraction on top of C), but for more powerful features they call into native C code. In our case, most users will write Netduino code using C# (an interpreted runtime on top of C), but for more powerful features they'll be able to call into native C/C++ code.
For those that haven't used the Arduino, I can give a good example of this. Check out the WriteLEDArray() function in the following link. It directly accesses PORTB on the Arduino (pins 8-13). You could cut the code in half by using digitalWrite and specifying the Arduino pin, but behind the scenes digitalWrite has to look up the port and pin. By accessing the pins directly, he could push data to the TLC5947 as fast as possible while still keeping the code readable.
It also states, however, that it *is* possible to create PIC, both with THUMB and ARM, and it indicates that the RVDS compiler has solved the issue.
We, however, have to make it work with GNU. I'll see if I can dig up my other source.
The following is not one of my 'readings' but it does hint at the issue:
OK, I see the sentence "It [the former, deprecated standard] also supports only ARM code (not Thumb)" in that article, but so far as I can tell the only reason it would say that is because their silly little hack makes heavy use of register 9, which is awkward though hardly impossible to access from Thumb (one merely needs to move it to one of R0-R7 first).
I'm certainly curious what their "improved procedure call standard and a new relocatable/reentrant code mechanism" is, which apparently obviate the issue altogether. If you have any references which define these standards, I'd like to read them.
Are you saying that gcc fails to follow this new standard? That's a bummer.
Thank you for this. I think this will help a great deal in the cases where near-realtime is necessary. I know this requires a lot of work to integrate, and I'm glad your team is doing it. I would really hope that the NETMF team will take your and GHI's lead as far as providing interops and push it into NETMF. I love managed code and will always write in it when I can, even knowing I am sacrificing speed for elegance, but given my recent project DotCopter.NET there is no way it can all be done in managed code. So, in an appeal to the NETMF team: to avoid further fragmentation, they need to pull this into the framework for the cases where it's a last resort.
That assembly would also have an embedded resource of compiled native code.
Just out of curiosity, does anybody know the usage of MetaDataProcessor.exe's "-patchNative" parameter? I have found some interesting code (see the WatchAssemblyBuilder::Linker::Generate(...) function) that may be related to the topic--i.e. loading a file and storing it in a 'tableResourceData'. Maybe the native interop is already there...
Just to add a suggestion for the native-side compilation: using Vala (basically a C#-like preprocessor that spits out C code) in some way might be cute. Maybe you could have stub managed classes for IntelliSense for talking to the device, which get swapped out for native ones written in Vala when compiling via Vala + GCC.
Although I suspect statically linking with the stuff Vala needs may cause license issues to arise.
Do you have any more news/documentation about the new Runtime Native Code Interop feature for the v4.1.2 firmware release?
It's something I'm really looking forward to. Do you know when the v4.1.2 firmware will be released?