The Netduino forums have been replaced by new forums at community.wildernesslabs.co. This site has been preserved for archival purposes only and the ability to make new accounts or posts has been turned off.

A different approach to speeding up managed code


23 replies to this topic

#1 BitFlipper

BitFlipper

    Advanced Member

  • Members
  • 61 posts

Posted 27 May 2011 - 07:14 PM

On these forums I have been reading through various ideas, projects and speculations about how to improve the slow .Net MF performance on Netduino. Projects like Corey Kosak's Fluent and SimpleNGen are very ambitious and could help out, but I wish that the solution was to be able to compile our standard .Net code from IL to native code without requiring the special coding styles those approaches need. This is not to diminish the efforts of people like Corey at all; it is just an attempt to throw out additional ideas to see what other solutions there might be.

Since the MF is new to me (I'm quite familiar with the full .Net though), I'm still not sure what can and can't be done, so if this idea seems non-workable I won't be surprised.

So basically my idea is this: what if we could use some of the existing tools that can compile a .Net assembly down to native ARM code? I searched around and found AOT (Ahead Of Time), a Mono utility that can take compiled .Net executables (in theory this should also work with VS-created executables) and compile them to native code. It also happens to support the ARM platform. There are also ways that Mono can combine all assemblies into one executable, so my idea is to first combine all assemblies into one .Net executable (including all the .Net MF DLLs themselves and also your own executable). Then we run the AOT compiler, which compiles this single .Net assembly to native ARM code.

Of course you won't be able to debug this, so one way to do it is to have the usual debug build, but then also have a project configuration like "Release Native" (or whatever), that runs these tools as post-build steps and ultimately flashes the Netduino with the native code.

Would this work, or is this simply not possible at all?

#2 Chris Walker

Chris Walker

    Secret Labs Staff

  • Moderators
  • 7767 posts
  • LocationNew York, NY

Posted 27 May 2011 - 07:23 PM

It's a good start. The problem is that different ARM microcontrollers require different code to use features like SPI, I2C, UARTs, USB, etc. So while you could compile the "add, remove, manipulate objects" code to native code...you'd have to have a whole support library of microcontroller-specific peripheral code to link to.

I think that SimpleNGEN helps us get down this path quite far as well...

Chris

#3 BitFlipper

Posted 27 May 2011 - 07:38 PM

It's a good start. The problem is that different ARM microcontrollers require different code to use features like SPI, I2C, UARTs, USB, etc. So while you could compile the "add, remove, manipulate objects" code to native code...you'd have to have a whole support library of microcontroller-specific peripheral code to link to.

I think that SimpleNGEN helps us get down this path quite far as well...

Chris


But wouldn't those already be in the Netduino .Net assemblies that are going to be included in the AOT compile? Or are you saying that those are native libraries, so they might not be included? I would think that there has to be a way for AOT to deal with this sort of case, right?

Or maybe there could be a way to include all of the required Netduino specific functionality (SPI, I2C, UARTs, etc) in such a way that it automatically gets included in the AOT compile.

Once again I'm just trying to understand all the issues related to such an approach, and whether there are ways to work around the issues that do come up. I believe that if we can make this work, it would really be a great solution, as it would be completely transparent to the developer and we wouldn't have to worry about special coding for the areas we want to speed up, since all of it will be sped up, including the .Net MF code itself.

#4 Chris Walker

Posted 27 May 2011 - 07:44 PM

If all the managed code assemblies were native compiled, then yes the microcontroller-specific native code ones that already exist would work just fine.

AOT is definitely a good path to explore, especially for speed-sensitive sections of code.

Chris

#5 BitFlipper

Posted 27 May 2011 - 07:58 PM

Chris,

Thanks for the feedback. It is very helpful since, as you can tell, I'm not familiar with the .Net MF and only started looking at Netduino a few weeks ago. My background, though, is in electronics, including programming 8051s many, many years ago. For the last 15 years or so I've been doing full-time software development (.Net for most of it).

One of my previous projects was to implement "Reverse P/Invoke", where I define a new attribute "DllExport" so that you can call C# .Net DLLs directly from unmanaged code without requiring a C++/CLI intermediate library (the first time an unmanaged call is made, the .Net framework is loaded automatically). This involved modifying the IL and recompiling it. It's not really relevant to this discussion, but I just want to point out that while I'm a noob at Netduino, I have some past experience that could help me if I decide to tackle a project such as this.

Having said that, would anybody be able to give me a high-level overview of what kind of tools I would need if I wanted to experiment with the above-mentioned approach? Obviously I won't be buying the $6600 compiler you guys use to compile the .Net MF, but I assume there are other approaches (well, assuming I even need to recompile .Net MF at all).

Thanks again for the helpful feedback.

#6 ItsDan

ItsDan

    Advanced Member

  • Members
  • 101 posts

Posted 27 May 2011 - 08:31 PM

I'd be interested in following these efforts. A great article for the new wiki would be a beginner's guide to native development, especially an overview of cheap/free compilers and their strengths and weaknesses, and a "compiling your first firmware" section walking you through adding something to the framework and utilizing it from managed .Net.
Follow the adventures of the Box of Crappy Surplus

Total BOCS Traveled Distance: 9708 miles | States Visited: 5
Track the Box

#7 BitFlipper

Posted 27 May 2011 - 09:20 PM

BTW, this is the same approach that is used to run .Net on the iPhone, which also uses an ARM processor.

#8 ItsDan

Posted 27 May 2011 - 10:12 PM

Is that the MonoTouch project? I used that a little bit, never released anything using it though.

#9 BitFlipper

Posted 27 May 2011 - 10:58 PM

Is that the MonoTouch project? I used that a little bit, never released anything using it though.


Yes, that appears to be correct. See here. I think this could be made to work for Netduino, but I'm unfamiliar enough with the pieces of the puzzle that I'm not sure how much effort it would require. I'm willing to put some work into this.

From what I can tell, you would:

  • Create/debug the Netduino project in VS just like before.
  • Switch to a special project configuration (something called "Release Native" or similar) that has post-build steps to do the following:
      • Use the linker to produce libraries that contain only the minimal code required for your project.
      • Use AOT (called mtouch in the case of iPhone) to compile this down to native code.
      • Deploy the native code to the Netduino.

EDIT: I updated the list with clearer steps.

EDIT EDIT: I found the following comment from here very interesting, and it basically describes the process very clearly:

1. MonoTouch doesn't compile directly to ARM assembler. The C# compiler compiles to IL (which means you can create projects that generate .dll files that you can then distribute/share between other solutions).

Then, for *device* deployment, all the assemblies used by your program (e.g. monotouch.dll, mscorlib.dll, YourApp.exe) are linked together, unused types and members are removed, and ARM assembly is generated. Method IL is removed from the assemblies, leaving only metadata (so you can use Reflection to list types & members


I'm even more convinced now that such an approach is the correct way to get excellent performance on Netduino using "pure" C#. So far I have not been able to find benchmarks comparing MonoTouch and Objective-C, but from the few comments I found, the conclusion was that they were "similar" enough that it didn't really matter. That's far better than the current .Net MF on Netduino, which runs 100 to 1000 times slower than native code. I say that's worth exploring.

Basically, we need a Netduino version of MonoTouch.

#10 Corey Kosak

Corey Kosak

    Advanced Member

  • Members
  • 276 posts
  • LocationHoboken, NJ

Posted 28 May 2011 - 04:05 AM

Basically, we need a Netduino version of MonoTouch.

I am hardly an expert in these matters (more like an interested layman) but my feeling is that the hardest part of this approach is not the AOT compilation, but rather the requirement to provide a runtime for that native code. That is, you have to build runtime support for garbage collection, preemptive threading, interface dispatch, and other stuff, that now has to work with native code rather than interpreted code. Also I worry that in this approach there is more processor-specific code (and therefore less of a motivation for someone to work on it, since it will only work on the Netduino's processor family rather than the full array of processors targeted by NETMF). It strikes me as a huge amount of work, but as always I could be ignorant and I would love to be proven wrong.

I would consider it a less risky move to either limit oneself to the simple subset of C# (like what I was doing in my project); or do something a little more radical and, say, put a FORTH interpreter into the firmware. I do think FORTH code could be made to run pretty fast (though I don't have any hard data that would quantify this)

#11 BitFlipper

Posted 28 May 2011 - 05:33 PM

I am hardly an expert in these matters (more like an interested layman) but my feeling is that the hardest part of this approach is not the AOT compilation, but rather the requirement to provide a runtime for that native code. That is, you have to build runtime support for garbage collection, preemptive threading, interface dispatch, and other stuff, that now has to work with native code rather than interpreted code.


Well if you look at what MonoTouch does, it completely compiles everything to native code. We know this because on iPhone any form of interpreted code is strictly forbidden. So this includes the GC, threading support, etc. It all ends up as native code.

Also I worry that in this approach there is more processor-specific code (and therefore less of a motivation for someone to work on it, since it will only work on the Netduino's processor family rather than the full array of processors targeted by NETMF). It strikes me as a huge amount of work, but as always I could be ignorant and I would love to be proven wrong.


Well I wasn't thinking that this would be something that would be done by, for instance, MS, but rather a project done by one or more Netduino community members. So the motivation would be to use VS with C# but end up with near-native speed with almost no extra developer effort (other than switching to the "Release Native" project configuration once development and debugging is completed).

As far as being a lot of work, that might be true. We see that there are already tools available to fill in all of the building blocks for such an approach. We have tools that can merge .Net libraries into a single executable while stripping away all unused code (mkbundle). We also have a tool that can take such an executable and compile it down to native ARM code (AOT), stripping out the IL code and any other unneeded bits and pieces, while leaving metadata etc intact. That metadata will allow things like reflection etc to keep working just like it did before, even though the code is now all native.
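
The "strip away all unused code" step can be pictured as a reachability pass over a call graph: start from the entry point, follow every call, and keep only what was visited. The following is a minimal C++ sketch of that idea, not how mkbundle or the Mono linker is actually implemented; the `CallGraph` type, the `reachableFrom` function and the method names are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: method name -> methods it calls directly.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every method reachable from `entry`. Anything absent from the
// result is a candidate for removal before native compilation.
std::set<std::string> reachableFrom(const CallGraph& graph, const std::string& entry) {
    assert(!entry.empty());  // an entry point is required
    std::set<std::string> seen;
    std::vector<std::string> pending{entry};
    while (!pending.empty()) {
        std::string method = pending.back();
        pending.pop_back();
        if (!seen.insert(method).second) continue;  // already visited
        auto it = graph.find(method);
        if (it == graph.end()) continue;            // no recorded callees
        for (const auto& callee : it->second) pending.push_back(callee);
    }
    return seen;
}
```

For a graph in which `Main` calls `Blink` and `Delay`, `Blink` calls `WritePin`, and `Unused` is never called, `reachableFrom(graph, "Main")` contains the first four names and omits `Unused`. A real linker also has to account for code reached only through reflection or virtual dispatch, which this sketch ignores.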

The part that I think will need work is to tailor these tools to the Netduino use case. For instance, I took a quick look at the MonoTouch compiler code and I saw that they added support for new command line switches for the iPhone. A lot of the MonoTouch work revolves around creating C# wrappers for all of the iPhone APIs. That is something that would be much easier in this case because there are only a handful of Netduino specific APIs (ports etc). Since this is all open source, we could modify it to work with Netduino.

Of course how much work all this would really require is unknown, but if this could work we could potentially get near-native speed from our standard C# code, and not just parts of it - all of it.

I think a real effort to at least get an understanding of how much work this would require is not a bad idea. I mean, who would not want all of their code to run ~100 times faster without any extra effort or special coding?

#12 BitFlipper

Posted 28 May 2011 - 11:18 PM

Here's a wild idea...

We can see that the .Net MF has an ARM JITter. I wonder if one can use this code as a basis for a custom IL -> native compiler. What are the licensing implications? How hard would it be? Is it just too crazy to work?

It could even be converted to a C# tool that can be run as a post-build step that takes a managed DLL as input and produces a native ARM file.

Or something along those lines...

#13 jameshardiman

jameshardiman

    New Member

  • Members
  • 1 posts

Posted 28 June 2011 - 02:32 AM

As an extreme Netduino and .Net MF user, I was deeply shocked when I benchmarked my dear little Arduino (8 bit, 16MHz) against my shiny new Netduino (32 bit, 48MHz) and discovered that code which runs in 20 seconds on the Arduino takes over 70 seconds on the Netduino. So I'm very interested to hear what has become of this discussion!

#14 Chris Walker

Posted 28 June 2011 - 03:00 AM

Hi jameshardiman,

As an extreme Netduino and .Net MF user, I was deeply shocked when I benchmarked my dear little Arduino (8 bit, 16MHz) against my shiny new Netduino (32 bit, 48MHz) and discovered that code which runs in 20 seconds on the Arduino takes over 70 seconds on the Netduino. So I'm very interested to hear what has become of this discussion!

There is a lot of raw horsepower in there just waiting to be unleashed :) You can do it today by mixing in some C++ code (compiled into your Netduino firmware), and we're working as a community to enable runtime native code interop as well.

It's generally things like bit-banging that benefit from native code speed. High-level control of microcontroller operations like SPI and PWM (v4.2) is faster already, of course.

Welcome to the Netduino community,

Chris

#15 BitFlipper

Posted 28 June 2011 - 03:50 PM

Well I was side-tracked by another project, but plan to get back to this within the next couple of days. This is going to be a big project though, so don't expect anything usable anytime soon, if ever...

To be honest, I'm surprised that code that takes 20 seconds to run natively "only" takes 70 seconds to run as managed code (of course it is a different processor but still). My understanding is that managed code runs anywhere from 100 to 1000 times slower. This is not a problem with .Net in general, as managed code on the full .Net framework runs within 10% of native code speed (and if you do lots of small memory allocations, managed code can actually be faster than native code). The problem is specific to the Micro Framework.
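
As a rough sanity check on those figures, the run times can be converted to clock cycles using the clock rates quoted in the benchmark (16 MHz for the Arduino, 48 MHz for the Netduino). The sketch below is only that arithmetic; the function names are made up for illustration, and it assumes the same work was done on both boards and ignores the fact that an 8-bit AVR and a 32-bit ARM do different amounts of work per cycle.

```cpp
#include <cassert>

// Clock cycles consumed, in millions: run time (s) * clock rate (MHz).
double megacycles(double seconds, double clockMHz) {
    return seconds * clockMHz;
}

// How many times more clock cycles the managed run consumed than the native run.
double cycleRatio(double managedSec, double managedMHz,
                  double nativeSec, double nativeMHz) {
    assert(nativeSec > 0 && nativeMHz > 0);  // avoid dividing by zero
    return megacycles(managedSec, managedMHz) / megacycles(nativeSec, nativeMHz);
}
```

With the numbers from the benchmark, `cycleRatio(70, 48, 20, 16)` gives 10.5: the Netduino run took 3.5 times as long on the wall clock but spent about 10.5 times as many clock cycles, and since the report was "over 70 seconds" that is a lower bound.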

I looked at the .Net MF code and I can see why it runs so slow. Each IL instruction is sent through a giant switch statement and everything is emulated in C++. Of course that is going to be extremely slow. There is no attempt at mapping IL instructions to corresponding ARM instructions at all.

#16 BitFlipper

Posted 01 July 2011 - 06:02 PM

I was looking at the MF code again and once again I can clearly see why this thing executes 100 to 1000 times slower than native code. For instance, look at this source file, and specifically the ArmEmulator::Execute call. Inside it has a giant for loop that executes individual instructions. Then inside the loop, for each instruction, it calls ArmProcessor::Opcode::Execute in this source file. This code has slow written all over it. My feeling is that it was fast enough for the SPOT watches, so optimizing the code just wasn't a priority.

Now I'm thinking that another approach to speeding MF up might be to rewrite this code into something more efficient. At least for me this won't be a very attractive project to work on because the reason I like playing around with this stuff is because I like coding in C#. Taking this approach one would be purely coding in C++, something I've done a lot in the past but don't care so much about anymore. In addition, I have no idea where to start setting up a dev environment to be able to compile the MF. Supposedly the ARM compiler that is used to compile MF costs $6000, so that is out of the question for me. GCC can be used but I read somewhere that it produces larger executables that might cause problems in some cases.

A variation of such an approach might be to write an experimental .Net interpreter in C++ for x86, and if it seems to be a workable solution, port that over to ARM, which should be much easier from an experimental point of view. This is especially true for debugging, since the last thing I'd want to do is debug using GCC.

#17 Corey Kosak

Posted 02 July 2011 - 04:04 PM

I was looking at the MF code again and once again I can clearly see why this thing executes 100 to 1000 times slower than native code. For instance, look at this source file, and specifically the ArmEmulator::Execute call.


I'm slightly confused by this statement. Do we believe that the Netduino has a code path that calls into ArmEmulator::Execute? Why would it?

#18 BitFlipper

Posted 02 July 2011 - 06:02 PM

I'm slightly confused by this statement. Do we believe that the Netduino has a code path that calls into ArmEmulator::Execute? Why would it?


No, not the Netduino itself, but the unmanaged code that runs the virtual machine (something has to interpret the IL instructions). From what I can determine, when you run .Net MF on the Netduino, the code I linked to gets called to execute the managed code. So it takes each IL instruction and executes it via the code path I pointed out.

Is there a different part of the code that does the actual IL instruction interpretation on the Netduino? I might have missed it since there is a lot of code there.

#19 Chris Walker

Posted 03 July 2011 - 04:48 AM

The .NET Micro Framework is an interpreted runtime. Basically, each IL instruction is read and then emulated with native code. This allows .NET MF to use really small IL code, and makes it cross-platform (ARM, THUMB, X86, or most any other MCU/CPU architecture).

There may be quite a bit of room for performance improvements. If a few community members are interested in working on performance enhancements, I can propose them to the NETMF team as part of the next .NET MF release (and we can test them here).

Chris
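
To make "read each IL instruction, then emulate it with native code" concrete, here is a minimal C++ sketch of a switch-based dispatch loop for a toy stack machine. It is an illustration under stated assumptions, not the .NET MF interpreter: the opcodes (`Push`, `Add`, `Mul`, `Halt`), the `Instr` struct and the `run` function are hypothetical and far simpler than real IL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for illustration; these are not real IL encodings.
enum class Op : std::uint8_t { Push, Add, Mul, Halt };

struct Instr {
    Op op;
    std::int32_t operand;  // only used by Push
};

// Removes and returns the top of the evaluation stack.
static std::int32_t pop(std::vector<std::int32_t>& stack) {
    assert(!stack.empty());
    std::int32_t value = stack.back();
    stack.pop_back();
    return value;
}

// Interprets `program` and returns the value left on top of the stack
// (0 if the stack ends up empty).
std::int32_t run(const std::vector<Instr>& program) {
    std::vector<std::int32_t> stack;
    for (std::size_t pc = 0; pc < program.size(); ++pc) {
        const Instr& instr = program[pc];
        switch (instr.op) {  // every instruction pays for this dispatch
        case Op::Push:
            stack.push_back(instr.operand);
            break;
        case Op::Add: {
            std::int32_t b = pop(stack);
            std::int32_t a = pop(stack);
            stack.push_back(a + b);
            break;
        }
        case Op::Mul: {
            std::int32_t b = pop(stack);
            std::int32_t a = pop(stack);
            stack.push_back(a * b);
            break;
        }
        case Op::Halt:
            return stack.empty() ? 0 : stack.back();
        }
    }
    return stack.empty() ? 0 : stack.back();
}
```

Running the program `Push 2, Push 3, Add, Push 4, Mul, Halt` returns 20. Each `Add` here costs a fetch, a switch dispatch, two pops and a push, where natively compiled code could typically use a single ARM add instruction; that per-instruction overhead is where an interpreter loses time. The same loop builds unchanged for any target that has a C++ compiler, which is the portability benefit.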

#20 BitFlipper

Posted 03 July 2011 - 06:33 AM

Chris, yes that is something I would be interested in.




Copyright © 2016 Wilderness Labs Inc. | CC BY-SA
This webpage is licensed under a Creative Commons Attribution-ShareAlike License.