
The Netduino forums have been replaced by new forums at community.wildernesslabs.co. This site has been preserved for archival purposes only and the ability to make new accounts or posts has been turned off.

encode.utf8 problem

Started by stafil, Nov 02 2011 08:16 AM


5 replies to this topic

#1 stafil

Member

Members
27 posts

Posted 02 November 2011 - 08:16 AM

Hi all, I have a problem with encode.utf8 and the characters ľščťžýáíé. Please help me. Thanks.


#2 Stefan

Moderator

Members
1965 posts

Location: Breda, the Netherlands

Posted 02 November 2011 - 08:17 AM

Hi stafil, can you post some code so we can try to see what's going on? Which firmware version do you have on your Netduino?

"Fact that I'm a moderator doesn't make me an expert in things." Stefan, the eternal newb!
My .NETMF projects: .NETMF Toolbox / Gadgeteer Light / Some PCB designs


#3 stafil

Member

Members
27 posts

Posted 02 November 2011 - 08:19 AM

I am using the 4.1 framework.


#4 stafil

Member

Members
27 posts

Posted 02 November 2011 - 08:21 AM

using (FileStream fStr = new FileStream(@"\SD\aa.txt", FileMode.Open, FileAccess.Read))
{
    const int lenBuff = 1;
    byte[] buff = new byte[lenBuff];
    int bRead = 0;
    char[] buffc = new char[lenBuff];
    string line = "";

    Debug.Print(DateTime.Now.TimeOfDay.ToString());

    while ((bRead = fStr.Read(buff, 0, lenBuff)) > 0)
    {
        line += new string(Encoding.UTF8.GetChars(buff));
        try
        {
            curSocket.Send(Encoding.UTF8.GetBytes(line), line.Length, SocketFlags.None);
        }
        catch (SocketException e)
        {
            Debug.Print("THROW");
            // break;
        }
    }
}


#5 CW2

Advanced Member

Members
1592 posts

Location: Czech Republic

Posted 02 November 2011 - 08:53 AM

Are you sure the file content is UTF-8 encoded? If so, you may need to skip the preamble (0xEF, 0xBB, 0xBF) if it is present. Also, a one-byte buffer is not enough for reading UTF-8 encoded characters, which can take up to 4 bytes each. Try reading and decoding the whole line.
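
For illustration, here is a rough sketch of what reading and decoding a whole line could look like, including the preamble check. The class and method names (Utf8Lines, PreambleLength, Decode, ReadLine) are made up for this example and are not part of any framework. It is written against the desktop .NET API and assumes that Stream.ReadByte and Array.Copy behave the same on the 4.1 Micro Framework, which has not been verified here. Decoding deliberately uses only Encoding.UTF8.GetChars(byte[]), the overload already used in the code above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not a framework class.
public static class Utf8Lines
{
    // Returns 3 if the data starts with the UTF-8 preamble (EF BB BF), otherwise 0.
    public static int PreambleLength(byte[] data, int count)
    {
        bool hasPreamble = count >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
        return hasPreamble ? 3 : 0;
    }

    // Decodes `count` bytes starting at `offset` by copying them into a
    // right-sized array first, so only GetChars(byte[]) is needed.
    public static string Decode(byte[] data, int offset, int count)
    {
        byte[] slice = new byte[count];
        Array.Copy(data, offset, slice, 0, count);
        return new string(Encoding.UTF8.GetChars(slice));
    }

    // Reads bytes up to '\n' (or end of stream) and decodes them in one go.
    // Returns null when the stream is already at its end.
    // Pass skipPreamble = true for the first line of the file only.
    public static string ReadLine(Stream stream, byte[] lineBuffer, bool skipPreamble)
    {
        int count = 0;
        bool readAnything = false;
        int b;
        while ((b = stream.ReadByte()) >= 0)
        {
            readAnything = true;
            if (b == '\n')
                break;
            if (count == lineBuffer.Length)
                throw new IOException("Line is longer than the line buffer");
            lineBuffer[count++] = (byte)b;
        }
        if (!readAnything)
            return null;

        // Drop a trailing '\r' so that "\r\n" line endings also work.
        if (count > 0 && lineBuffer[count - 1] == '\r')
            count--;

        int offset = skipPreamble ? PreambleLength(lineBuffer, count) : 0;
        return Decode(lineBuffer, offset, count - offset);
    }
}
```

In the loop from the original post this would be called once per line, with skipPreamble set to true only for the first call. When sending the result, the length passed to Send should probably be the length of the byte array returned by GetBytes rather than line.Length, since the two differ for characters such as ľščťž. Reading one byte at a time from the SD card is likely to be slow, so this is meant to show the idea rather than to be fast.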


#6 Stefan W.

Advanced Member

Members
153 posts

Posted 02 November 2011 - 09:36 AM

To elaborate further on that: there is no way to read a fixed number of bytes and then decode them as UTF-8, because you might always end up in the middle of a code point. If the file is too big to fit in memory whole, but its lines have a maximum size that does fit, you could do as CW2 says: read until end-of-line (assuming the file uses \n for end-of-line and not the more elaborate Unicode line/page breaks) and decode the line. That works because the "usual" end-of-line character is a single byte, and that byte (0x0A) never appears inside a multi-byte UTF-8 sequence. Otherwise, you really have to buffer and specifically look for code point boundaries.
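
As a sketch of the second option, buffering and looking for code point boundaries could go roughly like this. The names (Utf8Chunks, CompleteLength, DecodeStream) are invented for the example. It is written against the desktop .NET API and assumes Stream.Read and Array.Copy behave the same on the Micro Framework. The idea is to decode only the part of each chunk that ends on a complete UTF-8 sequence and carry the remaining 1 to 3 bytes over to the next read.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not a framework class.
public static class Utf8Chunks
{
    // Returns how many of the first `count` bytes can be decoded without
    // cutting the last UTF-8 sequence in half.
    public static int CompleteLength(byte[] data, int count)
    {
        // Step back over at most 3 trailing continuation bytes (10xxxxxx)
        // to reach the lead byte of the last sequence.
        int i = count - 1;
        int back = 0;
        while (i >= 0 && back < 3 && (data[i] & 0xC0) == 0x80)
        {
            i--;
            back++;
        }
        if (i < 0)
            return count; // no lead byte in sight, leave it to the decoder

        int lead = data[i];
        int needed;
        if ((lead & 0x80) == 0x00) needed = 1;      // 0xxxxxxx
        else if ((lead & 0xE0) == 0xC0) needed = 2; // 110xxxxx
        else if ((lead & 0xF0) == 0xE0) needed = 3; // 1110xxxx
        else if ((lead & 0xF8) == 0xF0) needed = 4; // 11110xxx
        else return count;                          // invalid lead byte, leave it to the decoder

        int have = count - i;
        return have < needed ? i : count;
    }

    // Reads the stream in chunks of `chunkSize` bytes and decodes it,
    // carrying incomplete trailing bytes over to the next chunk.
    // An incomplete sequence at the very end of the stream is dropped.
    public static string DecodeStream(Stream stream, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize + 3]; // room for up to 3 carried bytes
        int carried = 0;
        string result = "";
        int read;
        while ((read = stream.Read(buffer, carried, chunkSize)) > 0)
        {
            int count = carried + read;
            int complete = CompleteLength(buffer, count);

            byte[] slice = new byte[complete];
            Array.Copy(buffer, 0, slice, 0, complete);
            result += new string(Encoding.UTF8.GetChars(slice));

            // Move the incomplete tail to the front of the buffer.
            carried = count - complete;
            Array.Copy(buffer, complete, buffer, 0, carried);
        }
        return result;
    }
}
```

With this approach even a chunk size of 1, as in the original post, decodes correctly, although a larger chunk would be the sensible choice. On a device with little memory you would probably send or process each decoded piece inside the loop instead of collecting everything in one string as DecodeStream does here.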

I believe that no discovery of fact, however trivial, can be wholly useless to the race, and that no trumpeting of falsehood, however virtuous in intent, can be anything but vicious.
-- H.L. Mencken, "What I Believe"