The Netduino forums have been replaced by new forums at community.wildernesslabs.co. This site has been preserved for archival purposes only and the ability to make new accounts or posts has been turned off.

encode.utf8 problem


5 replies to this topic

#1 stafil

stafil

    Member

  • Members
  • 27 posts

Posted 02 November 2011 - 08:16 AM

Hi all, I have a problem with Encoding.UTF8 and the characters ľščťžýáíé. Please help me. Thanks.

#2 Stefan

Stefan

    Moderator

  • Members
  • 1965 posts
  • Location: Breda, the Netherlands

Posted 02 November 2011 - 08:17 AM

Hi stafil, Can you post some code, so we could try to see what's going on? Which firmware version do you have on your Netduino?
"Fact that I'm a moderator doesn't make me an expert in things." Stefan, the eternal newb!
My .NETMF projects: .NETMF Toolbox / Gadgeteer Light / Some PCB designs

#3 stafil

stafil

    Member

  • Members
  • 27 posts

Posted 02 November 2011 - 08:19 AM

I am using the 4.1 framework.

#4 stafil

stafil

    Member

  • Members
  • 27 posts

Posted 02 November 2011 - 08:21 AM

using (FileStream fStr = new FileStream(@"\SD\aa.txt", FileMode.Open, FileAccess.Read))
{
    const int lenBuff = 1;
    byte[] buff = new byte[lenBuff];
    int bRead = 0;
    char[] buffc = new char[lenBuff];
    string line = "";

    Debug.Print(DateTime.Now.TimeOfDay.ToString());

    while ((bRead = fStr.Read(buff, 0, lenBuff)) > 0)
    {
        line += new string(Encoding.UTF8.GetChars(buff));
        try
        {
            curSocket.Send(Encoding.UTF8.GetBytes(line), line.Length, SocketFlags.None);
        }
        catch (SocketException e)
        {
            Debug.Print("THROW");
            // break;
        }
    }
}


#5 CW2

CW2

    Advanced Member

  • Members
  • 1592 posts
  • Location: Czech Republic

Posted 02 November 2011 - 08:53 AM

Are you sure the file content is UTF-8 encoded? If so, you may need to skip the preamble (0xEF, 0xBB, 0xBF), if present. Also, a one-byte buffer is not enough for reading UTF-8 encoded characters (each can take up to 4 bytes). Try reading and decoding the whole line.
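
A minimal sketch of that suggestion, not tested on a Netduino. The names Utf8LineReader, SkipPreamble, ReadLine and maxLineBytes are made up for illustration, and the code assumes that lines end with \n, that the stream is seekable, and that only the Encoding.UTF8.GetChars(byte[]) overload already used in the post above is available:

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8LineReader
{
    // Skips the UTF-8 preamble (EF BB BF) if the stream starts with it.
    // Assumes a seekable stream (such as a FileStream) and that Read
    // returns all 3 requested bytes when they are available.
    public static void SkipPreamble(Stream stream)
    {
        byte[] head = new byte[3];
        long start = stream.Position;
        int n = stream.Read(head, 0, 3);
        bool isPreamble = n == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
        if (!isPreamble)
        {
            stream.Seek(start, SeekOrigin.Begin); // not a preamble: rewind
        }
    }

    // Reads raw bytes up to the next '\n' and decodes them in a single call,
    // so a multi-byte character is never split. Returns null at end of stream.
    // maxLineBytes is an assumed upper bound on the line length in bytes.
    public static string ReadLine(Stream stream, int maxLineBytes)
    {
        byte[] line = new byte[maxLineBytes];
        int count = 0;
        bool sawAny = false;
        int b;
        while ((b = stream.ReadByte()) >= 0)
        {
            sawAny = true;
            if (b == '\n') break;
            if (count == maxLineBytes)
            {
                throw new InvalidOperationException("Line longer than maxLineBytes");
            }
            line[count++] = (byte)b;
        }
        if (!sawAny) return null;

        // Drop a trailing '\r' so Windows-style line endings also work.
        if (count > 0 && line[count - 1] == '\r') count--;

        // Copy into an exactly sized array before decoding.
        byte[] exact = new byte[count];
        Array.Copy(line, 0, exact, 0, count);
        return new string(Encoding.UTF8.GetChars(exact));
    }
}
```

Call SkipPreamble once right after opening the file, then call ReadLine until it returns null. Reading byte by byte keeps the sketch short but is likely slow on the device; reading larger blocks and scanning them for \n would be the next step.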

#6 Stefan W.

Stefan W.

    Advanced Member

  • Members
  • 153 posts

Posted 02 November 2011 - 09:36 AM

To elaborate further on that: you cannot read a fixed number of bytes and then decode them as UTF-8, because the read may end in the middle of a code point. If the file is too big to fit in memory as a whole, but its lines have a maximum size that does fit, you can do as CW2 says: read until end-of-line (assuming the file uses \n for end-of-line and not the more elaborate Unicode line/page breaks ;)) and decode the line. This works because the "usual" end-of-line character is a single byte that never occurs inside a multi-byte sequence. Otherwise you really have to buffer and look specifically for code point boundaries.
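
One way the buffering variant could look, as a sketch only. Utf8Boundary, SequenceLength, CompletePrefixLength and DecodeAll are hypothetical names, the input is assumed to be well-formed UTF-8, and only Encoding.UTF8.GetChars(byte[]) is assumed to exist. The idea is to decode only the part of each chunk that ends on a code point boundary and to carry the remaining 1 to 3 bytes over into the next read:

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8Boundary
{
    // Length of a UTF-8 sequence judged by its lead byte;
    // 0 for a continuation byte (10xxxxxx) or an invalid lead byte.
    public static int SequenceLength(byte lead)
    {
        if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx
        if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
        if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
        if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
        return 0;
    }

    // Returns how many of the first 'count' bytes can be decoded without
    // cutting the last character in half.
    public static int CompletePrefixLength(byte[] buffer, int count)
    {
        // Step back over at most 3 trailing continuation bytes to the last lead byte.
        int i = count - 1;
        int back = 0;
        while (i >= 0 && back < 3 && (buffer[i] & 0xC0) == 0x80) { i--; back++; }
        if (i < 0) return count; // no lead byte found; leave it to the decoder

        int need = SequenceLength(buffer[i]);
        int have = count - i;
        return need > have ? i : count;
    }

    // Decodes a stream in chunks of chunkSize bytes, carrying incomplete
    // trailing bytes into the next read. Bytes of a character that is still
    // incomplete at end of stream are dropped.
    public static string DecodeAll(Stream stream, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize + 3]; // room for up to 3 carried bytes
        int carried = 0;
        string text = "";
        int read;
        while ((read = stream.Read(buffer, carried, chunkSize)) > 0)
        {
            int count = carried + read;
            int complete = CompletePrefixLength(buffer, count);
            if (complete > 0)
            {
                byte[] exact = new byte[complete];
                Array.Copy(buffer, 0, exact, 0, complete);
                text += new string(Encoding.UTF8.GetChars(exact));
            }
            // Move the unfinished tail to the front of the buffer.
            carried = count - complete;
            for (int k = 0; k < carried; k++) buffer[k] = buffer[complete + k];
        }
        return text;
    }
}
```

DecodeAll builds one string only to keep the example small; for a large file each decoded chunk would be processed or sent right away instead of being appended.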
I believe that no discovery of fact, however trivial, can be wholly useless to the race, and that no trumpeting of falsehood, however virtuous in intent, can be anything but vicious.
-- H.L. Mencken, "What I Believe"





Copyright © 2016 Wilderness Labs Inc.
This webpage is licensed under a Creative Commons Attribution-ShareAlike License.