I have been working on a prototype for a little while and, although it is still in the early stages, I've just gotten it to do something useful and so I wanted to share my results.
Here is a Saleae screenshot of an implementation of BitBanger sending the bytes 'Hello'. (If you look closely at the screenshot, just under the time axis, you can see the Saleae UI showing the ASCII characters.)
I'm sending 40 bits in 16.6667 microseconds for a rate of 2.4 megabits/second. Not bad!
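For anyone who wants to check the arithmetic, here is a minimal sketch of the rate calculation. The class and method names are made up for illustration; only the two measured numbers (40 bits, 16.6667 microseconds) come from the capture.

```csharp
using System;

// Hypothetical helper, not part of SimpleNgen: turns a logic-analyzer
// measurement into a bit rate and a per-bit time.
public static class Throughput
{
    // Bits per microsecond is numerically the same as megabits per second.
    public static double MegabitsPerSecond(int bits, double microseconds)
    {
        return bits / microseconds;
    }

    // Average time spent on each bit, in nanoseconds.
    public static double NanosecondsPerBit(int bits, double microseconds)
    {
        return microseconds * 1000.0 / bits;
    }
}
```

With the numbers above, `Throughput.MegabitsPerSecond(40, 16.6667)` comes out at roughly 2.4, which is a bit under 417 ns per bit.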
What follows is a walkthrough of how I think the system ought to work. The items in black text are working now; the items in red text (set off here in square brackets) are not working yet, and so one must work around them, e.g. by doing a bunch of laborious cutting and pasting. In a future version things will work more smoothly. The steps are somewhat involved, so I will try to go over them in detail.
These instructions are meant to show how the system will eventually work. The instructions are included here for completeness, but it is probably too early for you to try them on your system unless you are a glutton for punishment.
- Flash your Netduino with an interop-capable version of the firmware. Chris Walker has graciously built a custom version of the firmware which is based on v4.1.1 beta 1 but which includes the needed entry point. I have attached the firmware to this post.
- Download and build my SimpleNgen "compiler". This is a full-framework program meant to be run on your PC.
- Make a new C# project (that is, a Netduino project, not a full-framework project) which references my SimpleInterop library. Set the build options to "allow unsafe code" and "optimize code". The latter is not necessary but will probably make for better ARM output.
- In the IDE, under "Add Class", make two new files: BitBanger.cs and BitBanger.Designer.cs.
- The first file is the one into which you will write your code. [The second file will be written to by my compiler]
- Write your code conforming to the limitations outlined in the next section.
- [Configure a "post-build step" in your project to invoke my compiler. The result of this post-build step will be to write an array representing compiled code to BitBanger.Designer.cs]
- Compile your code once [if successful, it will invoke the post-build step and rewrite BitBanger.Designer.cs]. Then compile it a second time to include the compiled code in your solution.
Here is the source code I wrote for BitBanger.cs, which appears in the attached NGenSandbox.sln:
    using SecretLabs.NETMF.Hardware.Netduino;

    namespace NgenSandbox {
      public unsafe static partial class BitBanger {
        private const int data=(int)Pins.GPIO_PIN_D0;
        private const int clock=(int)Pins.GPIO_PIN_D1;
        private const int enable=(int)Pins.GPIO_PIN_D2;

        public static int Bang(int length, int dummy1, int dummy2, int dummy3, byte *buffer) {
          //set up the three pins I need for SPI: MOSI, CLOCK, and ENABLE
          var dataPio=(int*)(Pio.BankBase+((data&0x20)<<4));
          const int dataBitmask=1<<(data&0x1f);
          var clockPio=(int*)(Pio.BankBase+((clock&0x20)<<4));
          const int clockBitmask=1<<(clock&0x1f);
          var enablePio=(int*)(Pio.BankBase+((enable&0x20)<<4));
          const int enableBitmask=1<<(enable&0x1f);

          //initialize the hardware registers
          dataPio[Pio.PER]=dataBitmask; //enable data pin
          dataPio[Pio.OER]=dataBitmask; //configure data for output
          //don't care about the initial state of the data pin

          clockPio[Pio.PER]=clockBitmask; //enable clock pin
          clockPio[Pio.OER]=clockBitmask; //configure clock pin for output
          clockPio[Pio.CODR]=clockBitmask; //set clock initially low

          enablePio[Pio.PER]=enableBitmask; //enable SPI 'enable' pin
          enablePio[Pio.OER]=enableBitmask; //configure 'enable' pin for output
          enablePio[Pio.CODR]=enableBitmask; //set enable pin low to begin transaction

          //loop over my buffer
          for(var i=0; i<length; ++i) {
            var nextByte=buffer[i];
            //loop over the bits in my byte
            for(var bitmask=0x80; bitmask!=0; bitmask>>=1) {
              clockPio[Pio.SODR]=clockBitmask; //set clock high
              if((nextByte&(bitmask))!=0) {
                dataPio[Pio.SODR]=dataBitmask; //set data
              } else {
                dataPio[Pio.CODR]=dataBitmask; //clear data
              }
              clockPio[Pio.CODR]=clockBitmask; //set clock low
            }
          }
          enablePio[Pio.SODR]=enableBitmask; //set enable pin high to end transaction
          return 0;
        }
      }

      public static class Pio {
        public const uint BankBase=0xfffff400;
        public const int PER=0;
        public const int OER=4;
        public const int ODR=5;
        public const int SODR=12;
        public const int CODR=13;
        public const int PDSR=15;
      }
    }
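To see what the two nested loops put on the data line, here is a desktop-side model of just the bit ordering, written in safe C#. It is only a sketch: `BitOrderModel` and `DataBits` are invented names, and no hardware registers are touched.

```csharp
using System.Collections.Generic;

// Hypothetical desktop model of the loop structure in Bang.
public static class BitOrderModel
{
    // Returns the data-line level (0 or 1) for each clock pulse,
    // in the same order Bang walks the buffer: byte by byte, MSB first.
    public static List<int> DataBits(byte[] buffer)
    {
        var bits = new List<int>();
        foreach (var nextByte in buffer)
        {
            // Same bitmask walk as Bang: 0x80, 0x40, ..., 0x01
            for (var bitmask = 0x80; bitmask != 0; bitmask >>= 1)
            {
                bits.Add((nextByte & bitmask) != 0 ? 1 : 0);
            }
        }
        return bits;
    }
}
```

For example, `'H'` is 0x48, so `DataBits(new byte[] { 0x48 })` yields 0,1,0,0,1,0,0,0, and the five bytes of "Hello" produce the 40 bits seen in the capture.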
There are several things to note here:
- Most importantly, this is all written in C# (though admittedly making heavy use of unsafe constructs like pointers)
- The routine to be compiled is called "Bang" (in this case). The interop entry point requires a return type of int, and the following types (or some prefix of them) as the arguments: (int, int, int, int, Array, Array, Array, Array)
- Those of you old-timers who recall my 17-element Unicode argument list are probably breathing a sigh of relief
- For the Array items, you can mix-and-match any simple-type arrays you like (e.g. byte[], short[], int[]), so long as the caller and callee agree. There's no type-checking at this level, so take care to get the types right.
- The only types allowed in Bang are int, byte, int*, and byte*. I might support short if I get around to it. In particular there's certainly no support for any reference type, so forget about calling new here or using strings. You can't even use arrays (a reference type); you have to use pointer types. This latter concern is what forces you to mark the class with the unsafe keyword. In this case, a completely apt keyword.
- The class is also marked with the partial keyword. [This is because the compiler will write its output to another part of the partial class defined in BitBanger.Designer.cs]
- Notice I can reference other classes (as I did here with the Pio class) but only in a very limited way: to pull in constant values and enumeration values.
- I have hardcoded the pin numbers at compile time (not necessary, but it makes the code run fast)
- I'm directly accessing the I/O registers for fast throughput
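The pin arithmetic at the top of Bang can be tried out on a desktop with a small sketch like the one below. The `PinMath` class is invented for illustration. The pin numbers 27, 28 and 0 used in the note afterwards are an assumption, inferred from the constants 0x8000000, 0x10000000 and 1 that show up in the compiled output.

```csharp
// Hypothetical helper mirroring the expressions used in Bang.
public static class PinMath
{
    // Same value as Pio.BankBase in the post.
    public const uint BankBase = 0xfffff400;

    // Bit 5 of the pin number selects the bank: pins 0-31 use BankBase,
    // pins 32-63 use the bank 0x200 bytes higher (0x20 << 4 == 0x200).
    public static uint BankAddress(int pin)
    {
        return BankBase + (uint)((pin & 0x20) << 4);
    }

    // The low five bits of the pin number select the bit within the bank.
    public static int Bitmask(int pin)
    {
        return 1 << (pin & 0x1f);
    }
}
```

Under that assumption, `Bitmask(27)` is 0x8000000, `Bitmask(28)` is 0x10000000, `Bitmask(0)` is 1, and all three pins share the bank at 0xfffff400. Because the pin numbers are constants, the C# compiler folds all of this away at compile time.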
Once you compile the above program in Visual Studio, the C# compiler will write out the MSIL (".NET") opcodes to the DLL/EXE/wherever it puts them. In the final system, the SimpleNgen compiler would at this point automatically be invoked to read those opcodes and translate them into ARM. In the current system that step doesn't exist yet, so what we have to do instead is this extremely laborious and time-intensive process:
- Run ildasm
- Copy the opcodes to the clipboard
- Paste them into the source code of the compiler, and rewrite them to the compiler's taste
- Run the compiler
- Paste its output back into BitBanger.Designer.cs
- Finally, run your program on the Netduino
Again, in the real system most of the above would be automated.
Here is the output of ildasm on the above program:
    .method public hidebysig static int32 Bang(int32 length,
                                               int32 dummy1,
                                               int32 dummy2,
                                               int32 dummy3,
                                               uint8* buffer) cil managed
    {
      // Code size 185 (0xb9)
      .maxstack 2
      .locals init ([0] int32* dataPio,
                    [1] int32* clockPio,
                    [2] int32* enablePio,
                    [3] int32 i,
                    [4] uint8 nextByte,
                    [5] int32 bitmask)
      IL_0000: ldc.i4 0xfffff400
      IL_0005: conv.u
      IL_0006: stloc.0
      IL_0007: ldc.i4 0xfffff400
      IL_000c: conv.u
      IL_000d: stloc.1
      IL_000e: ldc.i4 0xfffff400
      IL_0013: conv.u
      IL_0014: stloc.2
      IL_0015: ldloc.0
      IL_0016: ldc.i4 0x8000000
      IL_001b: stind.i4
      IL_001c: ldloc.0
      IL_001d: ldc.i4.s 16
      IL_001f: conv.i
      IL_0020: add
      IL_0021: ldc.i4 0x8000000
      IL_0026: stind.i4
      IL_0027: ldloc.1
      IL_0028: ldc.i4 0x10000000
      IL_002d: stind.i4
      IL_002e: ldloc.1
      IL_002f: ldc.i4.s 16
      IL_0031: conv.i
      IL_0032: add
      IL_0033: ldc.i4 0x10000000
      IL_0038: stind.i4
      IL_0039: ldloc.1
      IL_003a: ldc.i4.s 52
      IL_003c: conv.i
      IL_003d: add
      IL_003e: ldc.i4 0x10000000
      IL_0043: stind.i4
      IL_0044: ldloc.2
      IL_0045: ldc.i4.1
      IL_0046: stind.i4
      IL_0047: ldloc.2
      IL_0048: ldc.i4.s 16
      IL_004a: conv.i
      IL_004b: add
      IL_004c: ldc.i4.1
      IL_004d: stind.i4
      IL_004e: ldloc.2
      IL_004f: ldc.i4.s 52
      IL_0051: conv.i
      IL_0052: add
      IL_0053: ldc.i4.1
      IL_0054: stind.i4
      IL_0055: ldc.i4.0
      IL_0056: stloc.3
      IL_0057: br.s IL_00ac
      IL_0059: ldarg.s buffer
      IL_005b: ldloc.3
      IL_005c: add
      IL_005d: ldind.u1
      IL_005e: stloc.s nextByte
      IL_0060: ldc.i4 0x80
      IL_0065: stloc.s bitmask
      IL_0067: br.s IL_00a4
      IL_0069: ldloc.1
      IL_006a: ldc.i4.s 48
      IL_006c: conv.i
      IL_006d: add
      IL_006e: ldc.i4 0x10000000
      IL_0073: stind.i4
      IL_0074: ldloc.s nextByte
      IL_0076: ldloc.s bitmask
      IL_0078: and
      IL_0079: brfalse.s IL_0088
      IL_007b: ldloc.0
      IL_007c: ldc.i4.s 48
      IL_007e: conv.i
      IL_007f: add
      IL_0080: ldc.i4 0x8000000
      IL_0085: stind.i4
      IL_0086: br.s IL_0093
      IL_0088: ldloc.0
      IL_0089: ldc.i4.s 52
      IL_008b: conv.i
      IL_008c: add
      IL_008d: ldc.i4 0x8000000
      IL_0092: stind.i4
      IL_0093: ldloc.1
      IL_0094: ldc.i4.s 52
      IL_0096: conv.i
      IL_0097: add
      IL_0098: ldc.i4 0x10000000
      IL_009d: stind.i4
      IL_009e: ldloc.s bitmask
      IL_00a0: ldc.i4.1
      IL_00a1: shr
      IL_00a2: stloc.s bitmask
      IL_00a4: ldloc.s bitmask
      IL_00a6: brtrue.s IL_0069
      IL_00a8: ldloc.3
      IL_00a9: ldc.i4.1
      IL_00aa: add
      IL_00ab: stloc.3
      IL_00ac: ldloc.3
      IL_00ad: ldarg.0
      IL_00ae: blt.s IL_0059
      IL_00b0: ldloc.2
      IL_00b1: ldc.i4.s 48
      IL_00b3: conv.i
      IL_00b4: add
      IL_00b5: ldc.i4.1
      IL_00b6: stind.i4
      IL_00b7: ldc.i4.0
      IL_00b8: ret
    } // end of method BitBanger::Bang
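The constants 16, 48 and 52 in the ildasm output are the Pio register indices multiplied by the size of an int. Indexing an `int*` scales the index by 4, and because the indices are constants the C# compiler folds the multiplication in. The class below is a made-up illustration of that scaling; the register indices are copied from the Pio class.

```csharp
// Hypothetical illustration of how int* indexing becomes byte offsets in the IL.
public static class RegisterOffsets
{
    // Word indices, as declared in the Pio class.
    public const int PER = 0;
    public const int OER = 4;
    public const int SODR = 12;
    public const int CODR = 13;

    // p[index] on an int* addresses (byte*)p + index * sizeof(int).
    public static int ByteOffset(int wordIndex)
    {
        return wordIndex * sizeof(int);
    }
}
```

So `ByteOffset(OER)` is 16, `ByteOffset(SODR)` is 48 and `ByteOffset(CODR)` is 52, matching the `ldc.i4.s` operands above. PER has offset 0, which is why the PER stores have no `add` at all.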
And now, for the sake of completeness/sadism, here is that code after being "pasted"/"translated" into what the compiler currently needs (again, this will be much improved in future versions). Again, you should think of this as simply a giant crutch that I need to use in order to get the system to do something. In the real version you will not have to do any of this. This code appears in SimpleNGen.sln in the file testCases/BitBanger.cs
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using Kosak.SimpleNGen.compiler;

    namespace Kosak.SimpleNGen.testCases {
      public static class BitBanger {
        public static void Doit() {
          // .method public hidebysig static int32 BitBanger(int32 length,
          //                                                 int32 dummy1,
          //                                                 int32 dummy2,
          //                                                 int32 dummy3,
          //                                                 uint8* buffer) cil managed
          var args=new[] {
            typeof(int),
            typeof(int),
            typeof(int),
            typeof(int),
            typeof(byte*)
          };
          // Code size 185 (0xb9)
          // .maxstack 2
          // .locals init ([0] int32* dataPio,
          //               [1] int32* clockPio,
          //               [2] int32* enablePio,
          //               [3] int32 i,
          //               [4] uint8 nextByte,
          //               [5] int32 bitmask)
          var locals=new[] {
            typeof(int*),
            typeof(int*),
            typeof(int*),
            typeof(int),
            typeof(byte),
            typeof(int)
          };
          var mb=new TinyMethodBuilder(args, locals);
          // IL_0000: ldc.i4 0xfffff400
          mb.Emit.Ldc_I4(unchecked((int)0xfffff400));
          // IL_0005: conv.u
          mb.Emit.Conv_U();
          // IL_0006: stloc.0
          mb.Emit.Stloc(0);
          // IL_0007: ldc.i4 0xfffff400
          mb.Emit.Ldc_I4(unchecked((int)0xfffff400));
          // IL_000c: conv.u
          mb.Emit.Conv_U();
          // IL_000d: stloc.1
          mb.Emit.Stloc(1);
          // IL_000e: ldc.i4 0xfffff400
          mb.Emit.Ldc_I4(unchecked((int)0xfffff400));
          // IL_0013: conv.u
          mb.Emit.Conv_U();
          // IL_0014: stloc.2
          mb.Emit.Stloc(2);
          // IL_0015: ldloc.0
          mb.Emit.Ldloc(0);
          // IL_0016: ldc.i4 0x8000000
          mb.Emit.Ldc_I4(0x8000000);
          // IL_001b: stind.i4
          mb.Emit.Stind_I4();
          // IL_001c: ldloc.0
          mb.Emit.Ldloc(0);
          // IL_001d: ldc.i4.s 16
          mb.Emit.Ldc_I4(16);
          // IL_001f: conv.i
          mb.Emit.Conv_I();
          // IL_0020: add
          mb.Emit.Add();
          // IL_0021: ldc.i4 0x8000000
          mb.Emit.Ldc_I4(0x8000000);
          // IL_0026: stind.i4
          mb.Emit.Stind_I4();
          // IL_0027: ldloc.1
          mb.Emit.Ldloc(1);
          // IL_0028: ldc.i4 0x10000000
          mb.Emit.Ldc_I4(0x10000000);
          // IL_002d: stind.i4
          mb.Emit.Stind_I4();
          // IL_002e: ldloc.1
          mb.Emit.Ldloc(1);
          // IL_002f: ldc.i4.s 16
          mb.Emit.Ldc_I4(16);
          // IL_0031: conv.i
          mb.Emit.Conv_I();
          // IL_0032: add
          mb.Emit.Add();
          // IL_0033: ldc.i4 0x10000000
          mb.Emit.Ldc_I4(0x10000000);
          // IL_0038: stind.i4
          mb.Emit.Stind_I4();
          // IL_0039: ldloc.1
          mb.Emit.Ldloc(1);
          // IL_003a: ldc.i4.s 52
          mb.Emit.Ldc_I4(52);
          // IL_003c: conv.i
          mb.Emit.Conv_I();
          // IL_003d: add
          mb.Emit.Add();
          // IL_003e: ldc.i4 0x10000000
          mb.Emit.Ldc_I4(0x10000000);
          // IL_0043: stind.i4
          mb.Emit.Stind_I4();
          // IL_0044: ldloc.2
          mb.Emit.Ldloc(2);
          // IL_0045: ldc.i4.1
          mb.Emit.Ldc_I4(1);
          // IL_0046: stind.i4
          mb.Emit.Stind_I4();
          // IL_0047: ldloc.2
          mb.Emit.Ldloc(2);
          // IL_0048: ldc.i4.s 16
          mb.Emit.Ldc_I4(16);
          // IL_004a: conv.i
          mb.Emit.Conv_I();
          // IL_004b: add
          mb.Emit.Add();
          // IL_004c: ldc.i4.1
          mb.Emit.Ldc_I4(1);
          // IL_004d: stind.i4
          mb.Emit.Stind_I4();
          // IL_004e: ldloc.2
          mb.Emit.Ldloc(2);
          // IL_004f: ldc.i4.s 52
          mb.Emit.Ldc_I4(52);
          // IL_0051: conv.i
          mb.Emit.Conv_I();
          // IL_0052: add
          mb.Emit.Add();
          // IL_0053: ldc.i4.1
          mb.Emit.Ldc_I4(1);
          // IL_0054: stind.i4
          mb.Emit.Stind_I4();
          // IL_0055: ldc.i4.0
          mb.Emit.Ldc_I4(0);
          // IL_0056: stloc.3
          mb.Emit.Stloc(3);
          // IL_0057: br.s IL_00ac
          mb.Emit.Br("IL_00ac");
          // IL_0059: ldarg.s buffer
          mb.Label("IL_0059").Emit.Ldarg(4);
          // IL_005b: ldloc.3
          mb.Emit.Ldloc(3);
          // IL_005c: add
          mb.Emit.Add();
          // IL_005d: ldind.u1
          mb.Emit.Ldind_U1();
          // IL_005e: stloc.s nextByte
          mb.Emit.Stloc(4);
          // IL_0060: ldc.i4 0x80
          mb.Emit.Ldc_I4(0x80);
          // IL_0065: stloc.s bitmask
          mb.Emit.Stloc(5);
          // IL_0067: br.s IL_00a4
          mb.Emit.Br("IL_00a4");
          // IL_0069: ldloc.1
          mb.Label("IL_0069").Emit.Ldloc(1);
          // IL_006a: ldc.i4.s 48
          mb.Emit.Ldc_I4(48);
          // IL_006c: conv.i
          mb.Emit.Conv_I();
          // IL_006d: add
          mb.Emit.Add();
          // IL_006e: ldc.i4 0x10000000
          mb.Emit.Ldc_I4(0x10000000);
          // IL_0073: stind.i4
          mb.Emit.Stind_I4();
          // IL_0074: ldloc.s nextByte
          mb.Emit.Ldloc(4);
          // IL_0076: ldloc.s bitmask
          mb.Emit.Ldloc(5);
          // IL_0078: and
          mb.Emit.And();
          // IL_0079: brfalse.s IL_0088
          mb.Emit.Brfalse("IL_0088");
          // IL_007b: ldloc.0
          mb.Emit.Ldloc(0);
          // IL_007c: ldc.i4.s 48
          mb.Emit.Ldc_I4(48);
          // IL_007e: conv.i
          mb.Emit.Conv_I();
          // IL_007f: add
          mb.Emit.Add();
          // IL_0080: ldc.i4 0x8000000
          mb.Emit.Ldc_I4(0x8000000);
          // IL_0085: stind.i4
          mb.Emit.Stind_I4();
          // IL_0086: br.s IL_0093
          mb.Emit.Br("IL_0093");
          // IL_0088: ldloc.0
          mb.Label("IL_0088").Emit.Ldloc(0);
          // IL_0089: ldc.i4.s 52
          mb.Emit.Ldc_I4(52);
          // IL_008b: conv.i
          mb.Emit.Conv_I();
          // IL_008c: add
          mb.Emit.Add();
          // IL_008d: ldc.i4 0x8000000
          mb.Emit.Ldc_I4(0x8000000);
          // IL_0092: stind.i4
          mb.Emit.Stind_I4();
          // IL_0093: ldloc.1
          mb.Label("IL_0093").Emit.Ldloc(1);
          // IL_0094: ldc.i4.s 52
          mb.Emit.Ldc_I4(52);
          // IL_0096: conv.i
          mb.Emit.Conv_I();
          // IL_0097: add
          mb.Emit.Add();
          // IL_0098: ldc.i4 0x10000000
          mb.Emit.Ldc_I4(0x10000000);
          // IL_009d: stind.i4
          mb.Emit.Stind_I4();
          // IL_009e: ldloc.s bitmask
          mb.Emit.Ldloc(5);
          // IL_00a0: ldc.i4.1
          mb.Emit.Ldc_I4(1);
          // IL_00a1: shr
          mb.Emit.Shr();
          // IL_00a2: stloc.s bitmask
          mb.Emit.Stloc(5);
          // IL_00a4: ldloc.s bitmask
          mb.Label("IL_00a4").Emit.Ldloc(5);
          // IL_00a6: brtrue.s IL_0069
          mb.Emit.Brtrue("IL_0069");
          // IL_00a8: ldloc.3
          mb.Emit.Ldloc(3);
          // IL_00a9: ldc.i4.1
          mb.Emit.Ldc_I4(1);
          // IL_00aa: add
          mb.Emit.Add();
          // IL_00ab: stloc.3
          mb.Emit.Stloc(3);
          // IL_00ac: ldloc.3
          mb.Label("IL_00ac").Emit.Ldloc(3);
          // IL_00ad: ldarg.0
          mb.Emit.Ldarg(0);
          // IL_00ae: blt.s IL_0059
          mb.Emit.Blt("IL_0059");
          // IL_00b0: ldloc.2
          mb.Emit.Ldloc(2);
          // IL_00b1: ldc.i4.s 48
          mb.Emit.Ldc_I4(48);
          // IL_00b3: conv.i
          mb.Emit.Conv_I();
          // IL_00b4: add
          mb.Emit.Add();
          // IL_00b5: ldc.i4.1
          mb.Emit.Ldc_I4(1);
          // IL_00b6: stind.i4
          mb.Emit.Stind_I4();
          // IL_00b7: ldc.i4.0
          mb.Emit.Ldc_I4(0);
          // IL_00b8: ret
          mb.Emit.Ret();

          TinyMethodCompiler.Compile(mb.Finish());
        }
      }
    }
If you've read this far without wanting to end your own life, then I am very impressed and wish to reward you with the output of the compiler (which you will also see if you run the attached SimpleNgen.sln on your computer):
    *** Final Output ***
    0000: E92D07F0 enter    STMDB SP!,{R4-R10}           1110|100|1|0|0|1|0|1101|0000011111110000
    0004: E1A07000 PRO0     MOV R7,R0                    1110|00|0|1101|0|0000|0111|00000|00|0|0000
    0008: E59D501C PRO4     LDR R5,[SP,#0x1C]            1110|01|0|1|1|0|0|1|1101|0101|000000011100
    000C: E59F9080 L0.1     LDR R9,[PC,const(FFFFF400)]  1110|01|0|1|1|0|0|1|1111|1001|000010000000
    0010: E3A03302 L10.1    MOV R3,#0x8000000            1110|00|1|1101|0|0000|0011|0011|00000010
    0014: E5893000 L11      STR R3,[R9,#0x0]             1110|01|0|1|1|0|0|0|1001|0011|000000000000
    0018: E5893010 L17      STR R3,[R9,#0x10]            1110|01|0|1|1|0|0|0|1001|0011|000000010000
    001C: E3A04201 L19.1    MOV R4,#0x10000000           1110|00|1|1101|0|0000|0100|0010|00000001
    0020: E5894000 L20      STR R4,[R9,#0x0]             1110|01|0|1|1|0|0|0|1001|0100|000000000000
    0024: E5894010 L26      STR R4,[R9,#0x10]            1110|01|0|1|1|0|0|0|1001|0100|000000010000
    0028: E5894034 L32      STR R4,[R9,#0x34]            1110|01|0|1|1|0|0|0|1001|0100|000000110100
    002C: E3A08001 L34.1    MOV R8,#0x1                  1110|00|1|1101|0|0000|1000|0000|00000001
    0030: E5898000 L35      STR R8,[R9,#0x0]             1110|01|0|1|1|0|0|0|1001|1000|000000000000
    0034: E5898010 L41      STR R8,[R9,#0x10]            1110|01|0|1|1|0|0|0|1001|1000|000000010000
    0038: E5898034 L47      STR R8,[R9,#0x34]            1110|01|0|1|1|0|0|0|1001|1000|000000110100
    003C: E3A0A000 L48.1    MOV R10,#0x0                 1110|00|1|1101|0|0000|1010|0000|00000000
    0040: E3A06000 L49      MOV R6,#0x0                  1110|00|1|1101|0|0000|0110|0000|00000000
    0044: EA00000C L50      B L94                        1110|101|0|000000000000000000001100
    0048: E7D51006 L53      LDRB R1,[R5,R6]              1110|01|1|1|1|1|0|1|0101|0001|00000|00|0|0110
    004C: E3A02080 L56      MOV R2,#0x80                 1110|00|1|1101|0|0000|0010|0000|10000000
    0050: EA000006 L57      B L88.2                      1110|101|0|000000000000000000000110
    0054: E5894030 L62      STR R4,[R9,#0x30]            1110|01|0|1|1|0|0|0|1001|0100|000000110000
    0058: E0010002 L65      AND R0,R1,R2                 1110|00|0|0000|0|0001|0000|00000|00|0|0010
    005C: E3500000 L66.2    CMPS R0,#0x0                 1110|00|1|1010|1|0000|0000|0000|00000000
    0060: 15893030 L72      STRNE R3,[R9,#0x30]          0001|01|0|1|1|0|0|0|1001|0011|000000110000
    0064: 05893034 L78      STREQ R3,[R9,#0x34]          0000|01|0|1|1|0|0|0|1001|0011|000000110100
    0068: E5894034 L83      STR R4,[R9,#0x34]            1110|01|0|1|1|0|0|0|1001|0100|000000110100
    006C: E1A02852 L86      MOV R2,R2 ASR R8             1110|00|0|1101|0|0000|0010|1000|0|10|1|0010
    0070: E3520000 L88.2    CMPS R2,#0x0                 1110|00|1|1010|1|0010|0000|0000|00000000
    0074: 1AFFFFF6 L88.2.1  BNE L62                      0001|101|0|111111111111111111110110
    0078: E2866001 L91      ADD R6,R6,#0x1               1110|00|1|0100|0|0110|0110|0000|00000001
    007C: E1560007 L94      CMPS R6,R7                   1110|00|0|1010|1|0110|0000|00000|00|0|0111
    0080: BAFFFFF0 L94.1    BLT L53                      1011|101|0|111111111111111111110000
    0084: E5898030 L100     STR R8,[R9,#0x30]            1110|01|0|1|1|0|0|0|1001|1000|000000110000
    0088: E1A0000A L102     MOV R0,R10                   1110|00|0|1101|0|0000|0000|00000|00|0|1010
    008C: E8BD07F0 leave    LDMUA SP!,{R4-R10}           1110|100|0|1|0|1|1|1101|0000011111110000
    0090: E12FFF1E leave.1  BX LR                        1110|000100101111111111110001|1110
    0094: FFFFF400 const(FFFFF400) WORD 0xFFFFF400       11111111111111111111010000000000

    *** C# Representation ***
    var code=new uint[] {
      0xE92D07F0,0xE1A07000,0xE59D501C,0xE59F9080,0xE3A03302,0xE5893000,0xE5893010,0xE3A04201,
      0xE5894000,0xE5894010,0xE5894034,0xE3A08001,0xE5898000,0xE5898010,0xE5898034,0xE3A0A000,
      0xE3A06000,0xEA00000C,0xE7D51006,0xE3A02080,0xEA000006,0xE5894030,0xE0010002,0xE3500000,
      0x15893030,0x05893034,0xE5894034,0xE1A02852,0xE3520000,0x1AFFFFF6,0xE2866001,0xE1560007,
      0xBAFFFFF0,0xE5898030,0xE1A0000A,0xE8BD07F0,0xE12FFF1E,0xFFFFF400
    }
It is just this little array at the end (the thing starting with "var code=new uint[]...") which the SimpleNgen compiler will eventually paste into BitBanger.Designer.cs. Until then, we will paste it manually.
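For the hand-pasting step, the generated half of the partial class might look something like the sketch below. The field name `Bang_compiled` is taken from the call in Program.cs; the exact declaration (a `public static readonly uint[]` field) is an assumption, since the real generator output does not exist yet.

```csharp
namespace NgenSandbox
{
    // Sketch of a hand-written BitBanger.Designer.cs.
    // Assumption: the compiled code is exposed as a static field named Bang_compiled.
    public static partial class BitBanger
    {
        public static readonly uint[] Bang_compiled = new uint[]
        {
            0xE92D07F0,0xE1A07000,0xE59D501C,0xE59F9080,0xE3A03302,0xE5893000,0xE5893010,0xE3A04201,
            0xE5894000,0xE5894010,0xE5894034,0xE3A08001,0xE5898000,0xE5898010,0xE5898034,0xE3A0A000,
            0xE3A06000,0xEA00000C,0xE7D51006,0xE3A02080,0xEA000006,0xE5894030,0xE0010002,0xE3500000,
            0x15893030,0x05893034,0xE5894034,0xE1A02852,0xE3520000,0x1AFFFFF6,0xE2866001,0xE1560007,
            0xBAFFFFF0,0xE5898030,0xE1A0000A,0xE8BD07F0,0xE12FFF1E,0xFFFFF400
        };
    }
}
```

The array holds 38 words: 37 instructions covering addresses 0x0000 through 0x0090, plus the literal 0xFFFFF400 at 0x0094.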
Fans of assembly language will notice that I am generating ARM opcodes rather than Thumb as I did in Fluent. The reason for this is that (a) I wanted to get some experience generating a different set of opcodes, and (b) I wanted easy access to more registers (this code uses 11 registers, though if you read closely you can find a couple of places where my "optimizer" went a little crazy).
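One ARM detail that is visible in the listing: a data-processing immediate is stored as an 8-bit value plus a 4-bit rotate field, and the value is rotated right by twice the rotate field. That is how a mask like 0x8000000 fits in a single MOV. The decoder below is a sketch with invented names, handy for checking the last two bit fields of the MOV lines by hand.

```csharp
// Hypothetical decoder for the immediate operand of an ARM data-processing opcode.
public static class ArmImmediate
{
    // Bits 0-7 hold imm8, bits 8-11 hold the rotate field.
    // The operand is imm8 rotated right by (2 * rotate) bits.
    public static uint Decode(uint opcode)
    {
        uint imm8 = opcode & 0xFF;
        int rotate = (int)((opcode >> 8) & 0xF) * 2;
        // Rotate right; the "& 31" keeps the left shift in range when rotate is 0.
        return (imm8 >> rotate) | (imm8 << ((32 - rotate) & 31));
    }
}
```

`ArmImmediate.Decode(0xE3A03302)` gives 0x8000000 (2 rotated right by 6), `Decode(0xE3A04201)` gives 0x10000000 (1 rotated right by 4), and `Decode(0xE3A02080)` gives 0x80 with no rotation.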
The bulk of the work in SimpleNgen is due to these optimizations. It's actually relatively easy to do a naive mapping from MSIL opcodes to ARM, but the resultant code would be a lot slower. My own optimizations are kind of hit and miss. In some cases I'm doing something awesome (notice I was able to use the conditional prefixes to do a branch-free if-then-else at lines 0060 and 0064), but in other cases I'm doing something wasteful (like the compare against zero in line 005C).
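To follow the conditional stores and the branches in the listing, it helps to know that the top four bits of every ARM opcode are the condition code, and that a branch stores a signed 24-bit word offset relative to the instruction's address plus 8. The sketch below decodes both. The class and method names are invented; the encoding rules are standard ARM.

```csharp
// Hypothetical helpers for reading the opcodes in the listing by hand.
public static class ArmDecode
{
    // Condition mnemonics indexed by the top four bits of the opcode.
    private static readonly string[] Conditions =
    {
        "EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
        "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"
    };

    public static string Condition(uint opcode)
    {
        return Conditions[(int)(opcode >> 28)];
    }

    // For B instructions: target = address + 8 + (sign-extended 24-bit offset * 4).
    public static int BranchTarget(int address, uint opcode)
    {
        int offset24 = (int)(opcode & 0x00FFFFFF);
        if ((offset24 & 0x00800000) != 0)
        {
            offset24 -= 0x01000000; // sign-extend the 24-bit field
        }
        return address + 8 + (offset24 << 2);
    }
}
```

For the two stores at 0060 and 0064, `Condition` returns "NE" and "EQ", so exactly one of them executes after the compare. `BranchTarget(0x74, 0x1AFFFFF6)` is 0x54 (the top of the inner loop) and `BranchTarget(0x80, 0xBAFFFFF0)` is 0x48 (the top of the outer loop).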
Anyway, back to the plot. [Once the compiler has written the code into BitBanger.Designer.cs] or you've hand-pasted it there, the invocation looks like this (this is in NGenSandbox.sln in Program.cs):
    using Kosak.SimpleInterop;

    namespace NgenSandbox {
      public class Program {
        public static void Main() {
          var data=new [] {(byte)'H', (byte)'e', (byte)'l', (byte)'l', (byte)'o'};
          NativeInterface.Execute(BitBanger.Bang_compiled,
            data.Length, 0, 0, 0,
            data, null, null, null);
        }
      }
    }
The arguments for this Execute method are, first, the reference to the compiled code, and then the four ints and four arrays as I mentioned above (this is why the length appears in the first int position and the data appears in the first array position). If it helps, this is the prototype of NativeInterface.Execute:
    [MethodImpl(MethodImplOptions.InternalCall)]
    public static extern int Execute(uint[] code, int i0, int i1, int i2, int i3,
                                     Array a0, Array a1, Array a2, Array a3);
(The entry point compiled into the firmware turns the four array references into four pointers, though as I mentioned there is no type checking)
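Since there is no type checking at the Execute level, one option is to hide the raw call behind a small typed wrapper. The sketch below is only an illustration: `BitBangerWrapper` and `Send` are invented names, and `FakeNativeInterface` is a desktop stand-in with the same parameter list as the prototype above so that the example compiles without the firmware. On the device the call would go to the real NativeInterface.Execute instead.

```csharp
using System;

// Desktop stand-in (an assumption for this sketch) for NativeInterface.Execute.
// It only records what it was given; the real method jumps into the compiled code.
public static class FakeNativeInterface
{
    public static int LastLength;
    public static Array LastBuffer;

    public static int Execute(uint[] code, int i0, int i1, int i2, int i3,
                              Array a0, Array a1, Array a2, Array a3)
    {
        LastLength = i0;
        LastBuffer = a0;
        return 0;
    }
}

// Hypothetical typed wrapper: callers pass a byte[] and never see the dummy arguments.
public static class BitBangerWrapper
{
    public static int Send(uint[] compiledCode, byte[] data)
    {
        if (compiledCode == null) throw new ArgumentNullException("compiledCode");
        if (data == null) throw new ArgumentNullException("data");
        // Length goes in the first int slot, the buffer in the first array slot,
        // matching the argument order Bang expects.
        return FakeNativeInterface.Execute(compiledCode, data.Length, 0, 0, 0,
                                           data, null, null, null);
    }
}
```

The point of the wrapper is that the byte[] requirement is enforced once, by the C# compiler, instead of being left to every caller to remember.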
Thank you for reading and I hope you'll let me know what you think! I'll attach all the referenced files to my next post.