encode.utf8 problem
Started by stafil, Nov 02 2011 08:16 AM
5 replies to this topic
#1
Posted 02 November 2011 - 08:16 AM
Hi all,
I have a problem with Encoding.UTF8 and the characters ľščťžýáíé.
Please help me.
Thanks
#2
Posted 02 November 2011 - 08:17 AM
Hi stafil,
Can you post some code, so we could try to see what's going on?
Which firmware version do you have on your Netduino?
"Fact that I'm a moderator doesn't make me an expert in things." Stefan, the eternal newb!
My .NETMF projects: .NETMF Toolbox / Gadgeteer Light / Some PCB designs
#3
Posted 02 November 2011 - 08:19 AM
I'm using the 4.1 framework.
#4
Posted 02 November 2011 - 08:21 AM
using (FileStream fStr = new FileStream(@"\SD\aa.txt", FileMode.Open, FileAccess.Read))
{
    const int lenBuff = 1;
    byte[] buff = new byte[lenBuff];
    int bRead = 0;
    char[] buffc = new char[lenBuff];
    string line = "";
    Debug.Print(DateTime.Now.TimeOfDay.ToString());
    while ((bRead = fStr.Read(buff, 0, lenBuff)) > 0)
    {
        line += new string(Encoding.UTF8.GetChars(buff));
        try
        {
            curSocket.Send(Encoding.UTF8.GetBytes(line), line.Length, SocketFlags.None);
        }
        catch (SocketException e)
        {
            Debug.Print("THROW");
            // break;
        }
    }
}
#5
Posted 02 November 2011 - 08:53 AM
Are you sure the file content is UTF-8 encoded? If yes, you may need to skip the preamble, if present (0xEF, 0xBB, 0xBF). Also, a one-byte buffer is not enough for reading UTF-8 encoded characters (a single character can take up to 4 bytes). Try reading and decoding the whole line.
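A minimal sketch of both suggestions, assuming the bytes of one complete line have already been read into an array. The class and method names (Utf8Preamble, PreambleLength, DecodeLine) are made up for illustration. It only uses Encoding.UTF8.GetChars(byte[]), the same call as in the code posted above.

```csharp
using System;
using System.Text;

static class Utf8Preamble
{
    // Returns 3 when the data starts with the UTF-8 preamble (EF BB BF), otherwise 0.
    public static int PreambleLength(byte[] data)
    {
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return 3;
        return 0;
    }

    // Decodes the bytes of one complete line in a single call,
    // so no multi-byte character is split across decode calls.
    public static string DecodeLine(byte[] lineBytes)
    {
        int skip = PreambleLength(lineBytes);
        byte[] payload = new byte[lineBytes.Length - skip];
        Array.Copy(lineBytes, skip, payload, 0, payload.Length);
        return new string(Encoding.UTF8.GetChars(payload));
    }
}
```

For example, the five bytes EF BB BF C4 BE are the preamble followed by the two-byte encoding of ľ, so DecodeLine should return just that one character. Note also that when sending the text on, the byte count passed to Send should be the length of the array returned by Encoding.UTF8.GetBytes, not line.Length, because these characters take more than one byte each.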
#6
Posted 02 November 2011 - 09:36 AM
To elaborate further on that: you cannot read a fixed number of bytes and then reliably decode them as UTF-8, because the read may always end in the middle of a code point. If the file is too big to fit in memory whole, but the lines have a maximum size that does fit, you could do as CW2 says: read until end-of-line (assuming the file uses \n for end-of-line and not the more elaborate Unicode line/page breaks) and decode the line. That works because the "usual" end-of-line character is a single byte, and that byte value never occurs inside a multi-byte sequence. Otherwise you really have to buffer and specifically look for code point boundaries.
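A rough sketch of the "buffer and look for code point boundaries" option, in case reading line by line is not possible. All names here (Utf8Chunks, CompletePrefixLength, DecodeInChunks) are hypothetical, and it assumes the input is well-formed UTF-8 and that the chunk size is at least 4 bytes. It is meant as an illustration, not a tested Netduino implementation.

```csharp
using System;
using System.IO;
using System.Text;

static class Utf8Chunks
{
    // Number of bytes a sequence needs, judged from its lead byte.
    static int SequenceLength(byte lead)
    {
        if ((lead & 0x80) == 0x00) return 1;   // 0xxxxxxx
        if ((lead & 0xE0) == 0xC0) return 2;   // 110xxxxx
        if ((lead & 0xF0) == 0xE0) return 3;   // 1110xxxx
        if ((lead & 0xF8) == 0xF0) return 4;   // 11110xxx
        return 1;                              // not a valid lead byte; pass it through
    }

    // How many of the first 'count' bytes form only complete sequences.
    public static int CompletePrefixLength(byte[] buf, int count)
    {
        int i = count - 1;
        int back = 0;
        // Step back over at most 3 continuation bytes (10xxxxxx) to the last lead byte.
        while (i >= 0 && back < 3 && (buf[i] & 0xC0) == 0x80) { i--; back++; }
        if (i < 0) return count;
        // If the last sequence is cut short, stop just before its lead byte.
        return (i + SequenceLength(buf[i]) > count) ? i : count;
    }

    // Reads fixed-size chunks and decodes only whole sequences,
    // carrying an incomplete tail over to the next read.
    public static string DecodeInChunks(Stream input, int chunkSize)
    {
        byte[] buf = new byte[chunkSize];      // assumption: chunkSize >= 4
        int pending = 0;                       // bytes carried over from the previous read
        string result = "";
        int read;
        while ((read = input.Read(buf, pending, chunkSize - pending)) > 0)
        {
            int count = pending + read;
            int whole = CompletePrefixLength(buf, count);
            if (whole > 0)
            {
                byte[] part = new byte[whole];
                Array.Copy(buf, 0, part, 0, whole);
                result += new string(Encoding.UTF8.GetChars(part));
            }
            pending = count - whole;
            // Move the incomplete tail to the front of the buffer.
            for (int k = 0; k < pending; k++) buf[k] = buf[whole + k];
        }
        // An incomplete sequence left over at end of file is dropped.
        return result;
    }
}
```

With the FileStream from the original post, this could be called as Utf8Chunks.DecodeInChunks(fStr, 64). On a memory-constrained board it would probably be better to send each decoded part as it is produced rather than build up one big string.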
I believe that no discovery of fact, however trivial, can be wholly useless to the race, and that no trumpeting of falsehood, however virtuous in intent, can be anything but vicious.
-- H.L. Mencken, "What I Believe"