Recently, while looking for some unneeded junk on eBay, I stumbled across a listing for an antique voice synthesiser chip, the SP0256. It made me remember that I might have some on hand, so I looked in my junkbox and Lo! and Behold! amongst decaying antistatic foam were a couple samples, replete with the odd-valued crystal it prefers. So I thought I'd amuse myself for a couple hours, hook it up, and reminisce over the scarcely intelligible sounds of yesteryear.
After the fun of a buzzy 'Hello' had worn off, I found myself inspired to do a particular Doors song, but one thing about this chip is that you have to manually specify a sequence of phoneme variants called 'allophones'. This is a bit time consuming, and so drains the fun, but by now I was doomed because improvement obsession had set in, and so I wasted several days implementing a text-to-speech algorithm for it (don't be too impressed, this is derived from some research done in the 70s, and some concrete code done in the years since).
Anyway, I thought I'd share for your amusement if you're bored. (well, I can't really promise this will truly cure that, but I did try!):
While a trivial pursuit, there were some useful things I did learn that I want to share:
1) C#/NetMF is not great at constant data, like for tables of... 'stuff'
By 'not great' I mean 'horrible'. The text2speech uses about 2000 rules, each consisting of 4 parts: three patterns and a phoneme sequence to emit for the match. This is the sort of stuff that in C, etc, you declare something like:
[color=#4b0082;][font="'courier new', courier, monospace;"] struct thingy { const char* first; const char* second; const char* third; const char* fourth; };[/font][/color]
[color=#4b0082;][font="'courier new', courier, monospace;"] static const struct thingy _rules[] = { { "l1", "m1", "r1", "f1" }, { "l2", "m2", "r2", "f2" }, };[/font][/color]
and when you did that, the linker would have the sense to put it all in ROM, thereby consuming no RAM. Because it is read-only; what do you need it in RAM for? (on some processors you need to add some compiler-specific qualifiers, or some linker magikry, but that is not pertinent here).
On netmf, apparently 'const' and 'readonly' are neither, and netmf will take such a naively constructed array and blithely instantiate it in RAM. Goodbye RAM, we'll miss you. And miss it, I did, because I consumed it all (about 105k available to user programs on the NP2) and had none with which to run my program. This was when I had a little over 300 rules that were derived from the US Navy research.
So in eagerness I cut a few, and it sounded like crap, but it ran, a little, with about 20K RAM.
So 'caveat programmer' on tables of data. I'll describe how I coped, and with what results, after explaining a few more things....
2) Netmf apps need more than about 15K RAM
If you go below this, the system will become unstable. You will need to furiously litter your code with
[color=#000080;][font="'courier new', courier, monospace;"] Debug.GC(true);[/font][/color]
just to keep it alive. You'll see memory allocations fail on the debugger output, and your program will run for a little while, but things will come to an end soon enough.
Nothing to say here other than realize that you will need to help netmf out by not getting too rammy with your impls, and that a [color=#000080;][font="'courier new', courier, monospace;"]GC(true)[/font][/color] now and then is most efficacious.
3) An array (i.e. []) of something will cost you RAM
All the const and readonly in the world apparently will not help you; there will be a RAM-based array of references. Hmm!
4) Too many initializers makes your metadataprocessor mad
A companion to the RAM consumption problem is a build tool called the 'MetaDataProcessor'. It evidently processes the data that is meta, but it doesn't like it if you have too much. It will give you abstruse error messages (a sample of which I do not have at the ready just now, alas), so, like, you just cut a few. Like when you have too many notes in your music score. Well, you coded them, and it's not like you're just typing them in to keep the blood flowing to your fingers on a cold winter's eve; they're important!
I found that the count of items is important, and the type drastically even more so. Which leads to the next point...
5) NetMF is pretty good at [color=rgb(0,0,128);][font="'courier new', courier, monospace;"]string[/font][/color]s
Strings in dotnet, as in other recent languages of the javainian persuasion, are immutable, and the runtime seems to have some sense in that regard. So, if you declare a string constant, its value actually will be stored in the Flash, and not come into RAM until you fiddle with it......
So, OK, this last bit is called
What I Did To Cram 2000 Structs Of Read Only Data Into My App And Only Waste 25k Of RAM,
or alternatively
Thank Goodness My Processor Is Fast And Doesn't Have Better Things To Do.
I'll spare you the details of the 5 or so experiments I did before this final one -- you've been kind (or bored!) enough to read this far. Ultimately, I:
* flattened my jagged array into a single array
and a secondary array of offsets to the starts of sections, and then manually treated it as an array-of-arrays in code.
This helped a little with the RAM usage; a small gain, but a gain nonetheless.
* transcoded my binary fields into text fields
This was surprisingly helpful, because evidently metadataprocessor gets really miffed at initializers for binary fields, way more so than for text fields. Hmmm. I'm going to have to look at this metadataprocessor someday. Anyway, this made it possible to even compile with a reasonable-sized rule set. I find it rather puts a damper on your fun if you can't compile, but maybe that's just me....
* flattened my structs into a string
This was much more tedious, because I was effectively encoding the struct into a delimited format, and scootching the character set to avoid the field delimiter.
This was the big win by far. Ultimately it was what made it possible to include the total rule set, and still have a healthy bit of RAM left over for the app. Mind you, in a sane world the RAM cost would be zero. That's a '0' with infinitely many '0' after it. So I'm still a little disappointed (and some of my crypto code is as well, because I need tables for my S-boxes etc and compiles fail sporadically because of it, but that's a separate rant), but this project was for fun so whatevs.
Then, I punished my STM32F4 by making it depersist the encoded rule on-the-fly every time it was needed. Which really is a lot. But speech is slow to emit, so there's time abounding to translate text into speech, and you can keep it quite fluid if you have a worker thread that emits speech sequences from a queue, separate from your text-to-speech translation process.
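For the curious, here is a minimal sketch of the general shape of that scheme, not my actual code. The [font="'courier new', courier, monospace;"]Rule[/font] and [font="'courier new', courier, monospace;"]RuleTable[/font] names, the '|' delimiter, and the rule contents are all made up for illustration. It is plain C# without generics, and it assumes the delimiter never occurs inside a field (which is why I had to scootch the character set):

```csharp
using System;

// Hypothetical decoded form of one rule; field names are illustrative only.
public struct Rule
{
    public string Left;
    public string Match;
    public string Right;
    public string Phonemes;
}

public static class RuleTable
{
    // Field delimiter; assumed never to occur inside a field.
    private const char Sep = '|';

    // One flat array of encoded rules, "left|match|right|phonemes".
    // Placeholder contents, not real text-to-speech rules.
    private static readonly string[] Encoded =
    {
        "l1|m1|r1|f1",   // section 0 starts here
        "l2|m2||f2",     // an empty 'right' pattern
        "|m3|r3|f3",     // section 1 starts here
        "l4|m4|r4|f4",   // section 2 starts here
        "l5|m5|r5|f5",
    };

    // Start index of each section in Encoded, plus a final end marker,
    // so section s covers [SectionStart[s], SectionStart[s + 1]).
    private static readonly int[] SectionStart = { 0, 2, 3, 5 };

    public static int SectionCount
    {
        get { return SectionStart.Length - 1; }
    }

    public static int RuleCount(int section)
    {
        return SectionStart[section + 1] - SectionStart[section];
    }

    // Decode one rule on demand. Nothing is cached: CPU time is traded for RAM.
    public static Rule Get(int section, int index)
    {
        string s = Encoded[SectionStart[section] + index];
        int a = s.IndexOf(Sep);
        int b = s.IndexOf(Sep, a + 1);
        int c = s.IndexOf(Sep, b + 1);

        Rule r = new Rule();
        r.Left = s.Substring(0, a);
        r.Match = s.Substring(a + 1, b - a - 1);
        r.Right = s.Substring(b + 1, c - b - 1);
        r.Phonemes = s.Substring(c + 1);
        return r;
    }
}
```

The lookup code then walks a section with [font="'courier new', courier, monospace;"]RuleCount(section)[/font] and [font="'courier new', courier, monospace;"]Get(section, i)[/font] as if it were still an array-of-arrays. Note that the array of string references itself still ends up in RAM; only the string contents get to stay in Flash.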
Oh! That reminds me that there is one other thing I learned in all this:
* Event Notifications Come In On a High(est!) Priority Thread
To make the demo, I added an [font="'courier new', courier, monospace;"][color=rgb(0,0,128);]InterruptPort[/color][/font] to execute one of my test sequences. Click the button, text-to-speech something fun, and out comes sound. EasyPeasy. But my speech came out sllooowwww for a couple seconds, and then picked up the normal pace. Well, the reason why is that the button event handler gets invoked on a thread executing at priority 4 (highest), and my speech rendering worker thread was running on priority 2 (normal). So my text-to-speech was running in a higher-priority thread than my output-to-synthesiser worker thread, and so was taking time away from feeding the synth. Actually, I am impressed that the synthesizer thread was not starved outright, so kudos to the scheduler implementer, but nonetheless the text to speech needed to be at or less than the priority of the output. The interrupt handler had nowhere to go but down, and anyway I felt dirty fiddling with its priority since it was not my thread, and so I made yet another queue for text to speech work.
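Roughly, the arrangement looks like the sketch below. This is an illustration rather than my real code: [font="'courier new', courier, monospace;"]WorkQueue[/font], [font="'courier new', courier, monospace;"]SpeechPipeline[/font], [font="'courier new', courier, monospace;"]Translate[/font] and [font="'courier new', courier, monospace;"]Emit[/font] are hypothetical names, the two stand-in methods do nothing useful, and putting the translator at BelowNormal is just one choice that satisfies 'at or less than the output'. It assumes one consumer thread per queue.

```csharp
using System;
using System.Collections;
using System.Threading;

// Minimal blocking queue without generics; assumes a single consumer thread.
public class WorkQueue
{
    private readonly Queue items = new Queue();
    private readonly AutoResetEvent ready = new AutoResetEvent(false);

    public void Put(object item)
    {
        lock (items) { items.Enqueue(item); }
        ready.Set();                       // wake the consumer if it is waiting
    }

    public object Take()
    {
        while (true)
        {
            lock (items)
            {
                if (items.Count > 0) return items.Dequeue();
            }
            ready.WaitOne();               // sleep until something is queued
        }
    }
}

public static class SpeechPipeline
{
    private static readonly WorkQueue textQueue = new WorkQueue();
    private static readonly WorkQueue speechQueue = new WorkQueue();

    // Hypothetical stand-in for the text-to-allophone translation.
    public static string Translate(string text) { return "<" + text + ">"; }

    // Hypothetical stand-in for feeding the synthesiser.
    public static void Emit(string allophones) { /* write to the chip here */ }

    // Call this from the button event handler: it only queues the text,
    // so no translation work runs on the high-priority event thread.
    public static void Say(string text) { textQueue.Put(text); }

    public static void Start()
    {
        Thread output = new Thread(new ThreadStart(OutputLoop));
        output.Priority = ThreadPriority.Normal;

        Thread translator = new Thread(new ThreadStart(TranslateLoop));
        translator.Priority = ThreadPriority.BelowNormal;  // never above output

        output.Start();
        translator.Start();
    }

    private static void TranslateLoop()
    {
        while (true) speechQueue.Put(Translate((string)textQueue.Take()));
    }

    private static void OutputLoop()
    {
        while (true) Emit((string)speechQueue.Take());
    }
}
```

With this shape the button handler just calls [font="'courier new', courier, monospace;"]SpeechPipeline.Say("...")[/font] and returns immediately, and the output thread always gets first dibs on the processor.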
Oh, there were two last things I learned that are worth mentioning:
* arrays cost
For fun I tried setting all my rules to null. Guess what? I still used 25k. So that RAM usage seems to be for the array of object references, and it WILL be in RAM. No ROMming for you!
* netmf uses utf8
hahaha I was worried that encoding as a string would incur a UTF-16 penalty, but it does seem that netmf uses UTF-8 for at least its const string literals. Hollow victory, though, because once it's ROM-able, the space is less of an issue. Still, an interesting fact to note.
Anyway, there it is. I have spent a little more time than I should have on this, and now that I have posted, perhaps I can dismantle my breadboard and move on with more useful things!....