Fluent Interop: Proof of Concept and Request for Comments

Started by Corey Kosak, Nov 22 2010 12:58 AM

13 replies to this topic

#1 Corey Kosak

Advanced Member

Members
276 posts

Location: Hoboken, NJ

Posted 22 November 2010 - 12:58 AM

Executive summary:

Many people want some support for native (ARM) programming of the Netduino in a user-friendly way, usually to support some time-critical requirement. I claim that the best way to do so is via a "fluent" library. As a proof of concept, I have written such a library, and tested it by using it to reimplement the popular "BitBanger".

Lengthy manifesto:

The topic of generating ARM code keeps coming up (e.g. here or here or here or here).

There have been a couple of approaches discussed, but so far they seem to have significant drawbacks. The approaches (and their drawbacks) are:

1. Come up with a smorgasbord of standard routines, such as BitBanger, which are general-purpose enough to solve common problems (the drawback is that you might be slowed down or stuck if you have a custom requirement unmet by these routines).
2. Compile your custom code into the firmware (this requires major knowledge of ARM programming, NETMF internals, and the toolchain; it also significantly slows down the compile-test-fix cycle, makes it easy to introduce subtle bugs, and impairs your ability to share your project with others).
3. SecretLabs makes a standard entry point into the firmware which allows storing ARM code there; the user's program compiles a C function externally and ships its bytes over (this is an improvement over the above, but still requires knowledge of C programming, the toolchain, and perhaps also NETMF internals).

I claim that most people don't need to program the ARM in a 100% general-purpose way. What they need is the ability to create tight loops that act on the I/O pins with custom logic, and perhaps also to make responsive interrupt handlers. They also need a quick compile-test-fix cycle. Most Netduino users are learning as they go, and so they don't actually mind if their program crashes, but when it does crash they want to quickly fix it and try again. I think it's safe to say that any solution that involves reflashing the firmware is out.

Given all the above, I claim the best way to provide ARM functionality is via a Fluent Interface. This allows people to write their logic inside their familiar IDE environment. Although we cannot change the C# language, we can use various techniques to make the interface feel as much like regular programming as possible.

As a proof of concept, I've written such a fluent interface for the Netduino, and tested it by using it to implement part of BitBanger. Because it is a proof of concept, it does not actually execute code yet. However, it produces assembly output that is sufficiently legitimate to validate the approach.

The system assumes the existence of three parts:

An API on the firmware side that can allocate RAM, copy opcodes into that RAM, and execute (move the instruction pointer to) that RAM. (This part does not exist yet)
The ability to output real opcodes rather than assembly syntax (this also does not exist yet)
The ability to translate fluent programming constructs into assembly (This is what the proof of concept does)
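
To give a feel for the second part (real opcodes instead of assembly text), here is a minimal sketch in plain desktop C#. It assumes the classic 32-bit ARM data-processing encoding with the "always" condition and a plain register operand. The class and method names are hypothetical and are not part of FluentInterop.

```csharp
using System;

// Hypothetical helper, not part of FluentInterop: encodes register-only
// data-processing instructions (e.g. ADD Rd,Rn,Rm) as 32-bit ARM opcodes.
public static class ArmEncoder
{
    private const uint ConditionAlways = 0xEu << 28; // "AL" condition field
    public const uint OpcodeSub = 0x2;
    public const uint OpcodeAdd = 0x4;

    // Field layout: cond[31:28] 00 I[25] opcode[24:21] S[20] Rn[19:16] Rd[15:12] operand2[11:0]
    // Here I=0 and S=0, and operand2 is just the register Rm with no shift.
    public static uint EncodeRegisterOp(uint opcode, int rd, int rn, int rm)
    {
        if (rd < 0 || rd > 15 || rn < 0 || rn > 15 || rm < 0 || rm > 15)
            throw new ArgumentException("register numbers must be in the range 0..15");
        return ConditionAlways | (opcode << 21) | ((uint)rn << 16) | ((uint)rd << 12) | (uint)rm;
    }

    // Assumption: the target stores instructions in little-endian byte order.
    public static byte[] ToLittleEndianBytes(uint instruction)
    {
        return new byte[] {
            (byte)instruction,
            (byte)(instruction >> 8),
            (byte)(instruction >> 16),
            (byte)(instruction >> 24)
        };
    }
}
```

With this sketch, ArmEncoder.EncodeRegisterOp(ArmEncoder.OpcodeAdd, 0, 2, 3) yields 0xE0820003, the encoding of ADD R0,R2,R3. Immediate operands, loads, stores, and branches each need their own encoder.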

To give you a better feel for what I am going for and what this program does, I think it may help to talk about some examples. I will do several in order of increasing complexity:

Calculating a+b
Turning a pin on and off X times
Finding a value in an array
BitBanger

I won't talk about interrupt handling, or the continuation/coroutine issues that Chris Walker brought up here. The reason is that this is just a prototype, and a lot more work is required to turn this into a valid approach. I would first like to find out if people like this idea. Maybe there are opportunities to improve it or collaborate.

Example 1: Calculating a+b:

Here is the code on the C# side:

using FluentInterop.CodeGeneration;
using FluentInterop.Deployment;
using Microsoft.SPOT;

namespace Driver {
  public static class Example1 {
    public static void Test() {
      //This is the function we want to build:
      //
      //int add(int x, int y) {
      //  return x+y;
      //}
      //
      //In order to simplify the system, the function we build
      //always fits into a standard signature with 8 arguments.
      //So really, the function we are building is
      //
      //int add(int x, int y, ignore, ignore, ignore, ignore, ignore, ignore) {
      //  return x+y;
      //}
      //
      //Hopefully that will make sense in the below:
      //(1) we provide 8 argument names (this is used for human readability)
      //(2) we get 8 arguments (plus the code generator) in our build callback
      //(3) we call the resulting delegate with 8 arguments

      //Now, the magic happens:
      var code=InteropBuilder.Create(
        "x", "y", null, null, null, null, null, null, //your argument names
        (g, x, y, _, __, ___, ____, _____, ______) => { //the code generator, plus your arguments
          g.Return(x+y); //similar to saying "return x+y;"
        });

      //we also have to call our function with 8 arguments
      var result=code.Invoke(3, 4, 0, 0, null, null, null, null); //calculates 7
      Debug.Print("result: "+result);

      code.Dispose();
    }
  }
}

There is clearly some boilerplate here that may be a little distracting, but the heart of the program is here:

  g.Return(x+y); //similar to saying "return x+y;"

By the way, this is what is both beautiful and eerie about the fluent interface. This code does not add x and y, nor does it return. This code "captures" the concept of return x+y so that a machine language program can be generated from it and executed later. We are writing a program that builds other programs. This is the first step towards the robot apocalypse.
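
For readers wondering how x+y can be "captured" rather than computed, here is a minimal sketch of the usual technique: operator overloading that builds a tree instead of a number. The type names below are hypothetical stand-ins written in plain desktop C#. They are not the actual FluentInterop classes.

```csharp
using System;

// Hypothetical stand-ins for a fluent library's expression types.
public abstract class Expr
{
    // "x + y" does no arithmetic; it records the operation as a tree node.
    public static Expr operator +(Expr left, Expr right) { return new BinaryExpr("+", left, right); }
    public static Expr operator -(Expr left, Expr right) { return new BinaryExpr("-", left, right); }

    // Lets plain integers appear in expressions, as in "count - 1".
    public static implicit operator Expr(int value) { return new ConstantExpr(value); }
}

public sealed class ArgumentExpr : Expr
{
    public readonly string Name;
    public ArgumentExpr(string name) { Name = name; }
    public override string ToString() { return Name; }
}

public sealed class ConstantExpr : Expr
{
    public readonly int Value;
    public ConstantExpr(int value) { Value = value; }
    public override string ToString() { return Value.ToString(); }
}

public sealed class BinaryExpr : Expr
{
    public readonly string Op;
    public readonly Expr Left;
    public readonly Expr Right;

    public BinaryExpr(string op, Expr left, Expr right) { Op = op; Left = left; Right = right; }
    public override string ToString() { return "(" + Left + " " + Op + " " + Right + ")"; }
}
```

Given Expr x = new ArgumentExpr("x") and Expr y = new ArgumentExpr("y"), the expression x + y produces a BinaryExpr whose ToString() is "(x + y)". A code generator can then walk that tree and emit loads and an ADD.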

The reason there are so many arguments is that I wanted to simplify the interface to the native side. From the native side's point of view, every method you generate has exactly the same signature. That signature is:

  int YourMethod(int i0, int i1, int i2, int i3, byte[] ba0, byte[] ba1, int[] wa0, int[] wa1);

The purpose of this is to keep things simple for both sides. Hopefully there are enough arguments in there for whatever people want to do. If you don't need an argument, just ignore it!

When you run this program, it produces the following output:

  MOV R0,#FUNCTION_ARGUMENT_BASE_ADDRESS //something like this
  MOV R1,#SCRATCH_MEMORY_BASE_ADDRESS //something like this
  LDR R2,[R0,0] //arg0 (x)
  LDR R3,[R0,4] //arg1 (y)
  ADD R0,R2,R3
  RETURN //I assume the ARM wants the return value in R0
  MOV R0,#0
  RETURN //I assume the ARM wants the return value in R0

Some points to observe:

As I said above, I am producing textual assembly output only. The work of producing numerical opcodes (as well as actually copying them over to the ARM side) still remains to be done.
Because I am new to ARM assembly, I've figured out what I could from reading ARM manuals, then I've made some assumptions. Some of what is produced is probably wrong but hopefully can be fixed. This is why the above assembly output says "something like this". I'm not sure what the convention is for passing arguments, but I'm going to pretend for now that they're all lined up starting at some base address.
I've tried to make certain simple optimizations, but inefficient stuff does creep in (like the double-return)
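
The "lined up starting at some base address" assumption can be sketched as follows: each of the eight arguments occupies one 4-byte word, so argument n lives at offset 4*n from the base register (R0 in the listing). This is plain desktop C# with hypothetical names, and it only reproduces the textual output format shown above.

```csharp
using System;

// Hypothetical helper illustrating the assumed argument layout.
public static class ArgumentLayout
{
    public const int WordSize = 4;      // bytes per argument slot
    public const int ArgumentCount = 8; // the standard signature has 8 arguments

    // Byte offset of an argument from the argument base address.
    public static int OffsetOf(int argumentIndex)
    {
        if (argumentIndex < 0 || argumentIndex >= ArgumentCount)
            throw new ArgumentOutOfRangeException("argumentIndex");
        return argumentIndex * WordSize;
    }

    // Produces a line in the same style as the assembly listings in this post.
    public static string EmitLoad(int targetRegister, int argumentIndex, string name)
    {
        return "LDR R" + targetRegister + ",[R0," + OffsetOf(argumentIndex) + "] //arg"
            + argumentIndex + " (" + name + ")";
    }
}
```

For example, ArgumentLayout.EmitLoad(2, 0, "x") returns "LDR R2,[R0,0] //arg0 (x)", matching the first load in the listing.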

Example 2: Turning a pin on and off X times

Here is the C# code:

  public class Example2 {
    public static void Test() {
      //This is the function we want to build:
      //int OnOffALot(int count) {
      //  while(count>0) {
      //    pin.Write(true);
      //    pin.Write(false);
      //    count--;
      //  }
      //}

      const Cpu.Pin pin=Pins.GPIO_PIN_D0;
      var code=InteropBuilder.Create(
        "count", null, null, null, null, null, null, null, //your argument names
        (g, count, _, __, ___, ____, _____, ______, _______) => { //the code generator, plus your arguments
          g.While(count>0, () => {
            g.SetPinState(pin, true);
            g.SetPinState(pin, false);
            g.Assign(count, count-1);
          });
        });

      code.Invoke(5000, 0, 0, 0, null, null, null, null); //turns the pin on and off 5000 times

      code.Dispose();
    }
  }

Here we see our first use of a control flow statement. The format is

g.While(condition, whileBodyLambda);

The "condition" looks sort of familiar, but the "whileBodyLambda" may seem sort of strange. Early on, I made the decision to do all my control flow by passing lambdas around. This is a very cool way to do things actually, and it makes the implementation extremely pleasant. I realize it might seem strange to people unfamiliar with functional programming. I hope people don't dislike this too much. As a preview, you can probably guess what "If" looks like. Yes, that's right:

g.If(condition, truePartLambda, falsePartLambda);

Anyway, here is the assembly output for the above program.

  MOV R0,#FUNCTION_ARGUMENT_BASE_ADDRESS //something like this
  MOV R1,#SCRATCH_MEMORY_BASE_ADDRESS //something like this
  B while0_condition
while0_body:
  CALL_SOMETHING (to set cpu pin 27 to True)
  CALL_SOMETHING (to set cpu pin 27 to False)
  LDR R3,[R0,0] //arg0 (count)
  SUB R2,R3,#1
  STR R2,[R0,0] //arg0 (count)
while0_condition:
  LDR R2,[R0,0] //arg0 (count)
  CMP R2,#0
  BGT while0_body
  MOV R0,#0
  RETURN //I assume the ARM wants the return value in R0

Here again we seem to be producing semi-reasonable code. Two things to notice:

I don't actually know what code I need to emit to turn on a pin, so I have stubbed this out as "CALL_SOMETHING"
There are a lot of optimization opportunities in this code, but my feeling is that we don't want to reimplement an optimizing compiler. It's possible that this code is "good enough" for many people's purposes
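
The label layout in the listing above (branch to the condition, then the body, then the condition with a conditional branch back) falls out naturally from the lambda style. Here is a minimal sketch in plain desktop C#. TinyGenerator is hypothetical, and unlike the real library it takes the condition as an emitter lambda plus a branch mnemonic rather than as an expression.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified generator that only knows how to lay out a loop.
public sealed class TinyGenerator
{
    private readonly List<string> lines = new List<string>();
    private int nextLabelId;

    public IList<string> Lines { get { return lines; } }

    public void Emit(string instruction) { lines.Add("  " + instruction); }

    private void Label(string name) { lines.Add(name + ":"); }

    // emitCondition emits the comparison; branchIfTrue is the mnemonic (e.g. "BGT")
    // that jumps back to the body when the loop should continue.
    public void While(Action emitCondition, string branchIfTrue, Action emitBody)
    {
        string prefix = "while" + nextLabelId++; // reserve the id before nested loops run
        Emit("B " + prefix + "_condition");
        Label(prefix + "_body");
        emitBody();                              // the lambda emits the loop body here
        Label(prefix + "_condition");
        emitCondition();
        Emit(branchIfTrue + " " + prefix + "_body");
    }
}
```

Calling While with a condition lambda that emits "CMP R2,#0", the mnemonic "BGT", and a body lambda yields the same skeleton as the listing: "B while0_condition", "while0_body:", the body, "while0_condition:", the comparison, and "BGT while0_body". Because the body is just a lambda that calls back into the generator, nested loops and conditionals need no extra machinery.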

Example 3: Finding a value in an array

This is not likely to be useful in practice (because the overhead of copying data to the ARM side makes it pointless). However, it still hopefully has educational value.

using FluentInterop.CodeGeneration;
using FluentInterop.Deployment;

namespace Driver {
  public static class Example3 {
    private delegate int FriendlySignature(int offset, int length, int target, int[] wordData);

    public static void Test() {
      //This is what we are building
      //
      //int find(int offset, int length, int target, int[] wordData) {
      //  for(var index=offset; index<length; ++index) {
      //    if(wordData[index]==target) {
      //      return index;
      //    }
      //  }
      //  return -1;
      //}
      var handle=InteropBuilder.Create(
        "offset", "length", "target", null, null, null, "wordData", null,
        (g, offset, length, target, _, __, ___, wordData, _____) => {
          g.For(offset, length, 1, index =>
            g.If(wordData[index]==target,
              () => g.Return(index)));
          g.Return(-1);
        });

      //make a little "adapter" to make calling this thing more friendly
      FriendlySignature friendly=(offset, length, target, wordData) =>
        handle.Invoke(offset, length, target, 0, null, null, wordData, null);

      var data=new int[] { 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 };
      var index0=friendly(0, data.Length, 3, data); //find 3  (should return 7)
      var index1=friendly(0, data.Length, 77, data); //find 77 (should return -1)

      handle.Dispose();
    }
  }
}

If you look closely at the above, you can see some crazy stuff going on. I've got a "for", an "if", some array indexing(!!!), all in a relatively convenient package. By the way, the version of "for" I implemented looks like this:

For(inclusiveStart, exclusiveEnd, increment, loopIndexer => {body});

The cool (or strange, depending on your point of view) thing is that the "For" library allocates your loop indexer variable for you, and passes it to your lambda. This is a departure from C# style but it should feel sort of "functional". By the way, there are other variants of "For" that could be implemented easily enough. You could implement your own! If you look at the implementation of "For", you'll see it's not using any black magic; rather, it is implemented via the same API you've already seen. Here is a peek at the implementation of For:

    public static void For(this CodeGenerator g, Expression inclusiveStart, Expression exclusiveEnd,
      Expression increment, ActionWithExpression action) {

      g.AllocateTemporary("loopIndex", loopIndex => {
        g.Assign(loopIndex, inclusiveStart);
        g.While(loopIndex<exclusiveEnd, () => {
          action(loopIndex);
          g.Assign(loopIndex, loopIndex+increment);
        });
      });
    }
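
AllocateTemporary can be pictured as a scoped, stack-like allocator over the scratch area: a slot is claimed when the lambda starts and released when it returns. Here is a minimal sketch in plain desktop C#. The class is hypothetical, and it hands the body a textual operand rather than an Expression.

```csharp
using System;

// Hypothetical sketch of scoped scratch-slot allocation.
public sealed class ScratchAllocator
{
    private const int WordSize = 4; // each temporary takes one 4-byte word
    private int slotsInUse;

    // Reserves the next scratch slot for the duration of "body", then frees it.
    public void AllocateTemporary(string name, Action<string> body)
    {
        int slot = slotsInUse++;
        try
        {
            // R1 holds the scratch base address in the listings in this post.
            body("[R1," + slot * WordSize + "] //scratch" + slot + " (" + name + ")");
        }
        finally
        {
            slotsInUse--; // the slot becomes reusable once the scope ends
        }
    }
}
```

Nested calls receive slots 0, 1, 2 and so on, which is consistent with the scratch0/scratch1/scratch2 operands in the generated assembly, and a slot is reused once its lambda has returned.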

Anyway, before things get too confusing, here is the output of the original program:

  MOV R0,#FUNCTION_ARGUMENT_BASE_ADDRESS //something like this
  MOV R1,#SCRATCH_MEMORY_BASE_ADDRESS //something like this
  LDR R2,[R0,0] //arg0 (offset)
  STR R2,[R1,0] //scratch0 (loopIndex)
  B while0_condition
while0_body:
  LDR R3,[R0,24] //arg6 (wordData)
  LDR R4,[R1,0] //scratch0 (loopIndex)
  LDR R2,[R3,+R4,LSL 2]
  LDR R3,[R0,8] //arg2 (target)
  CMP R2,R3
  BNE conditional1_endif
  LDR R0,[R1,0] //scratch0 (loopIndex)
  RETURN //I assume the ARM wants the return value in R0
conditional1_endif:
  LDR R3,[R1,0] //scratch0 (loopIndex)
  ADD R2,R3,#1
  STR R2,[R1,0] //scratch0 (loopIndex)
while0_condition:
  LDR R2,[R1,0] //scratch0 (loopIndex)
  LDR R3,[R0,4] //arg1 (length)
  CMP R2,R3
  BLT while0_body
  MOV R0,#-1
  RETURN //I assume the ARM wants the return value in R0
  MOV R0,#0
  RETURN //I assume the ARM wants the return value in R0

If you've gotten good at reading assembly code which is perhaps-or-perhaps-not actually conformant to the ARM specification, you will agree that this code seems to be doing the right thing. Not too shabby.

Example 4: BitBanger

As my final example I want to show my implementation of BitBanger in this fluent style. This is my most complicated example. To best understand it, we should think of there being three different players involved:

The Fluent library
The Fluent BitBanger author
The customer

First, let us see the code as the customer would use it:

    public static void FluentBangerTest() {
      const Cpu.Pin clkPin=Pins.GPIO_PIN_D0;
      const Cpu.Pin dataPin=Pins.GPIO_PIN_D1;
      var data=new byte[]{1,2,3,4,5};
      using(var bb=new FluentBanger(clkPin, dataPin, true, false)) {
        bb.Write(data, 0, data.Length);
      }
    }

This is similar to the examples provided for the original BitBanger, by sweetlilmre.

Now we can look at how the FluentBanger class is implemented:

using System;
using FluentInterop.CodeGeneration;
using FluentInterop.Deployment;
using Microsoft.SPOT.Hardware;

namespace FluentBitBanger {
  public sealed class FluentBanger : IDisposable {
    private readonly InteropHandle handle;

    public FluentBanger(Cpu.Pin clockPin, Cpu.Pin dataPin, bool risingClock, bool bigEndian) {
      //use dynamic code generation techniques with a "fluent" syntax,
      //in order to build a routine with the following pseudocode:
      //
      //int doit(byte[] data, int offset, int length)
      //  while(length>0) {
      //    var nextByte=data[offset]
      //    foreach(nextBit in nextByte, going in the direction of "bigEndian") {
      //      SetPinState(clockPin, !risingClock);
      //      if(nextBit) {
      //        SetPinState(dataPin, true);
      //      } else {
      //        SetPinState(dataPin, false);
      //      }
      //      SetPinState(clockPin, risingClock);
      //    }
      //    ++offset;
      //    --length;
      //  }
      //  return 0; (the default return value is zero unless you do something)
      this.handle=InteropBuilder.Create(
        "offset", "length", null, null, "data", null, null, null,
        (g, offset, length, _, __, data, ___, ____, _____) => {
        g.While(length>0, () => {
          g.AllocateTemporary("nextByte", nextByte => {
            g.Assign(nextByte, data[offset]);
            g.ForEachBit(nextByte, 0, 8, bigEndian, nextBit => {
              g.SetPinState(clockPin, !risingClock);
              g.If(nextBit!=0,
                () => g.SetPinState(dataPin, true),
                () => g.SetPinState(dataPin, false));
              g.SetPinState(clockPin, risingClock);
            });
          });
          g.Assign(offset, offset+1);
          g.Assign(length, length-1);
        });
      });
    }

    public void Dispose() {
      handle.Dispose();
    }

    /// <summary>
    /// A user-friendly interface to our compiled code, which
    /// adapts the signature we want to the standard signature
    /// </summary>
    public void Write(byte[] data, int offset, int length) {
      handle.Invoke(offset, length, 0, 0, data, null, null, null);
    }
  }
}

This looks sort of like stuff we've seen so far. There is one juicy extension method "ForEachBit", which was provided to make the code easier to write. As with everything else, it is built using the library API:

    public static void ForEachBit(this CodeGenerator g, Expression expr, int offset, int count,
      bool bigEndian, ActionWithExpression action) {

      int inclusiveStart;
      int exclusiveEnd;
      int increment;
      if(!bigEndian) {
        inclusiveStart=offset;
        exclusiveEnd=offset+count;
        increment=1;
      } else {
        inclusiveStart=offset+count-1;
        exclusiveEnd=offset-1;
        increment=-1;
      }

      g.For(inclusiveStart, exclusiveEnd, increment, loopCounter => g.AllocateTemporary("mask", mask => {
        g.Assign(mask, Expression.ShiftLeft(1, loopCounter));
        action(mask);
      }));
    }

The fact that anyone can provide composable building blocks such as ForEachBit is perhaps the biggest win of this library.
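
For reference, here is the bit walk that the FluentBanger pseudocode describes, written as ordinary (non-generated) desktop C#. It is a sketch of the intended behaviour only, and the names are hypothetical. Two details differ from ForEachBit as posted: the mask is ANDed with the value before testing (the posted version hands the bare mask to the callback), and the loop counts steps upward and derives the bit index, because a descending range would never satisfy the "<" test that For uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference implementation of "for each bit of a byte".
public static class BitWalk
{
    // Returns the selected bits of "value" in transmission order.
    // bigEndian=true walks from the highest selected bit down to the lowest.
    public static List<bool> Bits(byte value, int offset, int count, bool bigEndian)
    {
        var result = new List<bool>();
        for (int step = 0; step < count; ++step)
        {
            int bitIndex = bigEndian ? offset + count - 1 - step : offset + step;
            int mask = 1 << bitIndex;
            result.Add((value & mask) != 0); // test the bit, not the mask itself
        }
        return result;
    }
}
```

For the byte 0x05, BitWalk.Bits(0x05, 0, 8, false) starts with true, false, true (least significant bit first), while the big-endian walk ends with true, false, true.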

And now, the assembly output of BitBanger:

  MOV R0,#FUNCTION_ARGUMENT_BASE_ADDRESS //something like this
  MOV R1,#SCRATCH_MEMORY_BASE_ADDRESS //something like this
  B while0_condition
while0_body:
  LDR R3,[R0,16] //arg4 (data)
  LDR R4,[R0,0] //arg0 (offset)
  LDRB R2,[R3,+R4]
  STR R2,[R1,0] //scratch0 (nextByte)
  MOV R2,#0
  STR R2,[R1,4] //scratch1 (loopIndex)
  B while1_condition
while1_body:
  MOV R3,#1
  LDR R4,[R1,4] //scratch1 (loopIndex)
  MOV R2,R3 LSL R4
  STR R2,[R1,8] //scratch2 (mask)
  CALL_SOMETHING (to set cpu pin 27 to False)
  LDR R2,[R1,8] //scratch2 (mask)
  CMP R2,#0
  BNE conditional2_then
  CALL_SOMETHING (to set cpu pin 28 to False)
  B conditional2_endif
conditional2_then:
  CALL_SOMETHING (to set cpu pin 28 to True)
conditional2_endif:
  CALL_SOMETHING (to set cpu pin 27 to True)
  LDR R3,[R1,4] //scratch1 (loopIndex)
  ADD R2,R3,#1
  STR R2,[R1,4] //scratch1 (loopIndex)
while1_condition:
  LDR R2,[R1,4] //scratch1 (loopIndex)
  CMP R2,#8
  BLT while1_body
  LDR R3,[R0,0] //arg0 (offset)
  ADD R2,R3,#1
  STR R2,[R0,0] //arg0 (offset)
  LDR R3,[R0,4] //arg1 (length)
  SUB R2,R3,#1
  STR R2,[R0,4] //arg1 (length)
while0_condition:
  LDR R2,[R0,4] //arg1 (length)
  CMP R2,#0
  BGT while0_body
  MOV R0,#0
  RETURN //I assume the ARM wants the return value in R0

Whew! That's the end of my manifesto. The solution file that created all of these examples is attached. (You'll want to open Driver\Driver.sln)

Please remember that there are a lot of things missing. I just built enough structure to make the above examples work. I would love to hear from the community regarding their reactions, whether people think this is a worthwhile approach, whether people want to work on making it real, etc. etc.

Attached Files

FluentInterop.zip 22.54KB 8 downloads

#2 Chris Walker

Secret Labs Staff

Moderators
7767 posts

Location: New York, NY

Posted 22 November 2010 - 01:03 AM

That's a pretty cool concept, Corey. Perhaps if we implemented #3 then you could deploy and call your embedded Fluent routines at runtime--all without recompiling the firmware? Chris

#3 Charles

Advanced Member

Members
192 posts

Posted 22 November 2010 - 03:07 AM

Corey - That's a pretty slick system you have come up with there! The only downside is that if people are not paying close attention when using it, and it's not constructed just right, it opens the door to full compromise of the N+ system by viral, worm, or other code that could now survive a reboot (since it could break out and write the flash memory). Takes away Chris Seto's contention that our projects would be safe from worms... http://forums.netdui...c/512-security/ Charles

#4 Corey Kosak

Advanced Member

Members
276 posts

Location: Hoboken, NJ

Posted 22 November 2010 - 05:19 AM

That's a pretty cool concept, Corey. Perhaps if we implemented #3 then you could deploy and call your embedded Fluent routines at runtime--all without recompiling the firmware?

Chris

Chris, that would be totally great! I'm not sure what the API should look like, but I think it should conform to these principles.

Be simple and not impose any policy. Policy should be imposed on the C# side. For example, the C# side may want to divide the native memory region into 10 different sub-buffers and put different executable code into each one. A simple API allows us to try out various competing approaches on the C# side (Fluent vs copying bytecodes vs whatever else we cook up) without being locked into one of them.

Allow copying between managed and unmanaged of both code and data. The copy routines may want to look like this:

void CopyToUnmanaged(byte[] sourceArray, int sourceOffset, int sourceCount, int targetAddress)
void CopyFromUnmanaged(int sourceAddress, int sourceLength, byte[] targetArray)

(this ensures that we can copy little chunks as small as we want without running out of memory on the C# side)

The reason we also want to copy data is that it is possible that some of our little native routines collect data that we want to read at a later time (for example, maybe they are taking some high-frequency samples)
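
As a sketch of how the C# side might push a large buffer across in small pieces, here is a plain desktop C# example. CopyToUnmanaged mirrors the signature proposed above, but its body is a stand-in that writes into an ordinary byte array, since no such firmware API exists yet. All names are hypothetical.

```csharp
using System;

// Hypothetical sketch: chunked copying through a CopyToUnmanaged-style call.
public static class ChunkedCopy
{
    // Stand-in for the native memory region; a real implementation would be firmware-side.
    public static readonly byte[] FakeNativeMemory = new byte[1024];

    // Same shape as the proposed API, simulated with Array.Copy.
    public static void CopyToUnmanaged(byte[] sourceArray, int sourceOffset, int sourceCount, int targetAddress)
    {
        Array.Copy(sourceArray, sourceOffset, FakeNativeMemory, targetAddress, sourceCount);
    }

    // Copies "source" to targetAddress in pieces of at most chunkSize bytes.
    // Returns the number of calls made.
    public static int CopyInChunks(byte[] source, int targetAddress, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        int calls = 0;
        for (int offset = 0; offset < source.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, source.Length - offset);
            CopyToUnmanaged(source, offset, count, targetAddress + offset);
            ++calls;
        }
        return calls;
    }
}
```

Because the offset and count are explicit, the caller never has to allocate a temporary sub-array for each chunk, which is the point of the proposed signature.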

Have some way to invoke code: I'm not sure how this should look. There needs to be a way to specify the execution address, pass arguments, and get a return value. In my proposal I assumed that there would be some interop call that looked like this:

int Invoke(int startAddress, int arg0, int arg1, int arg2, int arg3,
  byte[] ba0, byte[] ba1, int[] wa0, int[] wa1)

But I was making a lot of assumptions and hadn't really worked it out. Another API that would be more flexible if it could be made to work is:

int Invoke(int startAddress, object[] args)

There would need to be some interop code that picked through the args array and did the right thing for the types it encountered there. Here I'm still unsure what the return type should be (int? object?)
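
To illustrate what "picking through the args array" might involve, here is a minimal sketch in plain desktop C#. It only classifies each argument the way a marshaller would have to, and the class name, method name, and set of supported types are all assumptions. The real work of pinning arrays and passing pointers would happen on the native side.

```csharp
using System;

// Hypothetical sketch: deciding how each element of object[] args would be passed.
public static class ArgumentMarshaller
{
    public static string[] Describe(object[] args)
    {
        var result = new string[args.Length];
        for (int i = 0; i < args.Length; ++i)
        {
            object arg = args[i];
            if (arg == null)
                result[i] = "null pointer";
            else if (arg is int)
                result[i] = "word " + (int)arg;
            else if (arg is byte[])
                result[i] = "pointer to " + ((byte[])arg).Length + " bytes";
            else if (arg is int[])
                result[i] = "pointer to " + ((int[])arg).Length + " words";
            else
                throw new ArgumentException("unsupported argument type: " + arg.GetType().Name);
        }
        return result;
    }
}
```

The trade-off against the fixed eight-argument signature is flexibility versus cost: every int placed in an object[] is boxed, and unsupported types are only detected at run time.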

Another, completely different approach. Here's a brainwave. Suppose you set up some simple concepts (buffers, and some structured types to use for communication back and forth), and then allow C# to get unsafe pointers to the memory on the unmanaged side. This would give C# the freedom to do all the diddling it wanted to, and make it more likely that the firmware doesn't have to keep changing. I'm imagining code like the following:

  public class Program {
    public unsafe struct MagicControlStruct {
      public int command;
      public int data0;
      public int data1;
      public byte *array0;
      //...whatever
    }

    private static unsafe void Invoke(int data0, int data1) {
      MagicControlStruct* ssc=GetPointerToMagicControlStructOnTheNativeSide();
      //diddle
      ssc->command=0x514;
      ssc->data0=data0;
      ssc->data1=data1;
      ssc->array0=GetPointerToSomeData();
      InvokeNativeSide();
    }

    public static unsafe void ZeroBuffer(int count) {
      int* p=GetAddressOfSomeNativeBuffer();
      for(var i=0; i<count; ++i) {
        p[i]=0;
      }
    }
  }

#5 Corey Kosak

Advanced Member

Members
276 posts

Location: Hoboken, NJ

Posted 22 November 2010 - 05:52 AM

Another little comment. I'm liking more and more the idea of shared memory being the way to go for this. With a refinement of the above program, it would even allow me to pass arrays to the native side without copying, which would be a massive advantage for moving data back and forth. Using the same arguments I was talking about in my proposal (although many variants are possible), we might have:

  public class Program {
    public unsafe struct InvocationStruct {
      public int data0;
      public int data1;
      public int data2;
      public int data3;
      public byte* ba0;
      public byte* ba1;
      public int* wa0;
      public int* wa1;

      public InvocationStruct(int data0, int data1, int data2, int data3,
        byte* ba0, byte* ba1, int* wa0, int* wa1) {

        this.data0=data0;
        this.data1=data1;
        this.data2=data2;
        this.data3=data3;
        this.ba0=ba0;
        this.ba1=ba1;
        this.wa0=wa0;
        this.wa1=wa1;
      }
    }

    public static unsafe void Invoke(int startAddress, int data0, int data1, int data2, int data3,
      byte[] ba0, byte[] ba1, int[] wa0, int[] wa1) {

      fixed(byte *ba0p=ba0, ba1p=ba1) {
        fixed(int *wa0p=wa0, wa1p=wa1) {
          var s=new InvocationStruct(data0, data1, data2, data3, ba0p, ba1p, wa0p, wa1p);
          MagicCallToInvokeNativeSide(startAddress, &s);
        }
      }
    }
  }

I hope this triggers some ideas.

#6 Chris Seto

Advanced Member

Members
405 posts

Posted 22 November 2010 - 06:26 AM

RE: Security, I would be willing to bet nobody in their right mind would ever target any embedded device specifically with a worm. It would just take too much effort for, most likely, zero payback. Executing foreign code that reflashes the unit is a bit out there.

It's not like the device even poses an attractive target; once you take control of it, there is nothing you can do. No offense to said embedded device, but a botnet running on ~3 (that's extremely generous!) low-resource targets is going to be fairly useless. The target most likely doesn't hold any interesting information (RE: credit card numbers), so there is nothing interesting to harvest, either. It's just too far off the path of something useful for anyone to spend any time trying it.

Even if someone does invest a hefty amount of R&D into such a system and somehow makes something that works, how does the target even acquire the foreign code in the first place? Things don't just go crawling around on Ethernet looking for stuff to "infect".

#7 CW2

Advanced Member

Members
1592 posts

Location: Czech Republic

Posted 22 November 2010 - 02:47 PM

I would love to hear from the community regarding their reactions, whether people think this is a worthwhile approach, whether people want to work on making it real, etc. etc.

This is an interesting concept, but I don't really understand how it differs from a Just-In-Time (JIT) compiler (which is already implemented in .NET MF, but disabled for certain reasons)?

#8 CW2

Advanced Member

Members
1592 posts

Location: Czech Republic

Posted 22 November 2010 - 03:18 PM

IMVHO it should be possible to modify the existing runtime to actually JIT a managed method, for example one marked with a special attribute (or perhaps using an existing one, such as [MethodImpl(MethodImplAttributes.Native)]) (?)

#9 Corey Kosak

Advanced Member

Members
276 posts

Location: Hoboken, NJ

Posted 22 November 2010 - 03:28 PM

This is an interesting concept, but I don't really understand how it differs from Just-In-Time (JIT) compiler (which is already implemented in .NET MF, but disabled for certain reasons)?

Indeed, the whole approach is predicated on my belief that there is no JIT (nor even an offline native image generator like NGEN) for the .NET Micro Framework.

Certainly no one has mentioned it before. If you're saying there is, how can we get it turned on? I'd be very interested in any pointers you have to web pages about how to do this. People have certainly been complaining about Netduino performance for long enough.

#10 CW2

Advanced Member

Members
1592 posts

Location: Czech Republic

Posted 22 November 2010 - 04:06 PM

Certainly no one has mentioned it before. If you're saying there is, how can we get it turned on? I'd be very interested in any pointers you have to web pages about how to do this. People have certainly been complaining about Netduino performance for long enough.

There is a lot of code enclosed by #ifdef TINYCLR_JITTER conditions (mentioned in "JIT support in TinyCLR?"). I got some build errors with that symbol defined, but I have not yet had a chance to investigate further. Colin Miller said in a Hanselminutes episode that JIT was not implemented mainly due to flash issues: the generated native code is about 10 times larger than IL, and this caused performance problems when the code needed to be constantly reflashed because of the limited size of available flash memory, and flashing is a slow operation. However, I am not sure this is really an issue on platforms with megabytes of memory.

I think it should be possible to start with JIT that outputs generated code into RAM, to avoid flash-related issues (at least at the beginning) - there is not really a lot of RAM available on Netduino, but it should be enough for simple experiments, especially if they are done per-method. Another approach would be the NGEN-way (aka Ahead-Of-Time), perhaps some functionality from MetaDataProcessor combined with the JIT-ter...

#11 sweetlilmre

Advanced Member

Members
62 posts

Posted 22 November 2010 - 07:53 PM

Hi,

Wow... some really interesting ideas flowing here. I was thinking along the lines of writing a naked C function with some kind of function support table (read/write pin etc.) and then passing that as a byte stream to the firmware to execute.

As far as the JIT approach goes, I am wary. Even with a JIT to native there is still the potential overhead of all the calls through the managed layers, i.e. in the BitBanger code, everything is marshalled to native pointers and then the pins are toggled to provide the best speed advantage. Managed code, even in JIT form, would be calling other managed functions (Read / Write for pins) that would then be marshalling and calling native functions.

The Fluent approach is very attractive from an easy-to-modify (and extend) point of view. Something that may be worth having a look at is the dynarec compilers used in emulators. These dynamically recompile code from a foreign instruction set into a native one. Looking at this type of code could give some good pointers on how to emit efficient assembly. I know that there exist dynarecs for Z80 <--> ARM and various others, and perhaps this could be a starting point.

Corey: I'll have a look at the code when I can get some time.

-(e)

#12 gedw99

New Member

Members
2 posts

Posted 13 January 2011 - 02:23 PM

Indeed, the whole approach is predicated on my belief that there is no JIT (nor even an offline native image generator like NGEN) for the .NET Micro Framework.

Certainly no one has mentioned it before. If you're saying there is, how can we get it turned on? I'd be very interested in any pointers you have to web pages about how to do this. People have certainly been complaining about Netduino performance for long enough.

Hey Corey,

Great to see this work.
1. I can't believe they are not NGENing. That explains so much about why it's so slow with just C#.

2. ELF invocation has been done on another .NET MF board.
http://www.microfram...g-rlp/#more-516
It would be nice for your framework to allow both. By this I mean go down the ELF route, or use your DSL.

I really hope that you put this on GitHub or something to get others involved in this.
I know a few guys who did PhDs in FPGA design, mostly because they hate ARM for various obfuscation reasons.

Also, on another note: Secret Labs (Chris, I believe), can we get an SCM for Netduino?
CodePlex has some amazing libraries now for .NET MF:
http://mftoolkit.codeplex.com/

Regards

Ged

#13 Chris Walker

Secret Labs Staff

Moderators
7767 posts

Location: New York, NY

Posted 13 January 2011 - 07:15 PM

...can we get a SCM for Netduino ?

Not familiar with the acronym (unless you're speaking of repositories...source control management). We are working with Microsoft to get the Netduino source up on CodePlex.

BTW, welcome to the Netduino community!

Chris

#14 Corey Kosak

Advanced Member

Members
276 posts

Location: Hoboken, NJ

Posted 14 January 2011 - 02:34 AM

Hey Corey,
Great to see this work.

Thanks ged! By the way, I've got a version of this which actually works. The discussion is over on the other thread

Thanks a lot for the RLP pointer. I will check it out!!!