This article has been excerpted from the book "The Complete Visual C# Programmer's Guide"
by the authors of C# Corner.
All strings in a .NET Framework program are stored as 16-bit Unicode characters.
At times you might need to convert from Unicode to some other character
encoding, or from some other character encoding to Unicode. The .NET Framework
provides several classes for encoding (converting Unicode characters to a block
of bytes in another encoding) and decoding (converting a block of bytes in
another encoding to Unicode characters).
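As a minimal sketch of the two directions described above, the following round-trips a string through an encoding with GetBytes and GetString. The class and method names (RoundTripDemo, RoundTrip) are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text;

static class RoundTripDemo
{
    // Hypothetical helper: encode a string to bytes, then decode it back.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);   // encoding: characters -> bytes
        return encoding.GetString(bytes);         // decoding: bytes -> characters
    }

    static void Main()
    {
        string original = "caf\u00E9"; // "cafe" with an accented final e (U+00E9)

        // UTF-8 can represent every Unicode character, so nothing is lost.
        Console.WriteLine(RoundTrip(original, Encoding.UTF8) == original); // True

        // ASCII cannot represent U+00E9; by default it is replaced with '?'.
        Console.WriteLine(RoundTrip(original, Encoding.ASCII));            // caf?
    }
}
```

The ASCII result shows why the choice of target encoding matters: characters outside its range are silently replaced unless a different fallback is configured.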
The System.Text namespace has a number of Encoding implementations:
- The ASCIIEncoding class encodes Unicode
characters as single 7-bit ASCII characters. This class supports only
character values between U+0000 and U+007F.
- The UnicodeEncoding class encodes Unicode
characters in UTF-16, using two consecutive bytes for each 16-bit code unit. This
supports both little-endian (code page 1200) and big-endian (code page 1201) byte orders.
- The UTF7Encoding class encodes Unicode
characters using UTF-7 encoding (UTF-7 stands for UCS Transformation Format,
7-bit form). This supports all Unicode character values and can also be
accessed as code page 65000.
- The UTF8Encoding class encodes Unicode
characters using UTF-8 encoding (UTF-8 stands for UCS Transformation Format,
8-bit form). This supports all Unicode character values and can also be
accessed as code page 65001.
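To make the differences between these classes concrete, here is a small sketch that measures how many bytes the same three-character string occupies in several encodings, and shows the two UTF-16 byte orders. It uses the static Encoding.ASCII, Encoding.UTF8, Encoding.Unicode, and Encoding.BigEndianUnicode properties; the EncodingSizes class and ByteCount helper are hypothetical names.

```csharp
using System;
using System.Text;

static class EncodingSizes
{
    // Hypothetical helper: number of bytes the text needs in the given encoding.
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetByteCount(text);
    }

    static void Main()
    {
        string text = "A\u00E9\u20AC"; // 'A', U+00E9 (e acute), U+20AC (euro sign)

        // ASCII: one byte per character; the two non-ASCII characters become '?'.
        Console.WriteLine("ASCII:  " + ByteCount(text, Encoding.ASCII));   // ASCII:  3
        // UTF-8: 1 + 2 + 3 bytes for these three characters.
        Console.WriteLine("UTF-8:  " + ByteCount(text, Encoding.UTF8));    // UTF-8:  6
        // UTF-16: two bytes per 16-bit code unit.
        Console.WriteLine("UTF-16: " + ByteCount(text, Encoding.Unicode)); // UTF-16: 6

        // The same character in little-endian and big-endian UTF-16.
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes("A")));          // 41-00
        Console.WriteLine(BitConverter.ToString(Encoding.BigEndianUnicode.GetBytes("A"))); // 00-41
    }
}
```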
Each of these classes has methods for both
encoding (such as GetBytes) and decoding (such as GetChars) a single array all
at once. In addition, each supports GetEncoder and GetDecoder, which return
encoders and decoders capable of maintaining shift state so they can be used
with streams and blocks.
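The value of a state-keeping decoder shows up when a multi-byte character is split across two blocks. The sketch below, which assumes UTF-8 input arriving in arbitrary chunks, uses a single Decoder from GetDecoder so the trailing bytes of one chunk are carried into the next. ChunkedDecoding and DecodeChunks are hypothetical names for illustration.

```csharp
using System;
using System.Text;

static class ChunkedDecoding
{
    // Hypothetical helper: decodes UTF-8 bytes delivered in chunks, reusing one
    // Decoder so that a character split across chunks is reassembled.
    public static string DecodeChunks(byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        StringBuilder result = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            // GetCharCount takes the decoder's pending bytes into account.
            char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            int written = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            result.Append(chars, 0, written);
        }
        return result.ToString();
    }

    static void Main()
    {
        // U+00E9 is the two bytes C3 A9 in UTF-8; they are split across chunks here.
        byte[][] chunks = { new byte[] { 0x63, 0xC3 }, new byte[] { 0xA9 } };

        Console.WriteLine(DecodeChunks(chunks) == "c\u00E9"); // True

        // Decoding each chunk on its own cannot join the split bytes.
        string naive = Encoding.UTF8.GetString(chunks[0]) + Encoding.UTF8.GetString(chunks[1]);
        Console.WriteLine(naive == "c\u00E9");                // False
    }
}
```

Calling GetString on each block independently is safe only for single-byte encodings such as ASCII; for UTF-8 or UTF-16 data read in blocks, a shared Decoder (or a StreamReader, which wraps one) avoids corrupting characters at block boundaries.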
Listing 20.33 passes Encoding.UTF8 to a StreamWriter and a StreamReader to write and read a UTF-8 text file.
Listing 20.33: Encoding and Decoding
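// Requires: using System; using System.IO; using System.Text;

// Writing: the using blocks flush and close the writer and the file.
using (FileStream fs = new FileStream("text.txt", FileMode.OpenOrCreate))
using (StreamWriter writer = new StreamWriter(fs, Encoding.UTF8))
{
    writer.Write("This is in UTF8");
}

// Reading: decode the file's bytes as UTF-8.
using (FileStream fs = new FileStream("text.txt", FileMode.Open))
using (StreamReader reader = new StreamReader(fs, Encoding.UTF8))
{
    String s = reader.ReadLine();
}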
Listing 20.34 makes a Web page request and then decodes the bytes it reads
as ASCII characters.
Listing 20.34: String Encoding
// encoding example
using System;
using System.Net;
using System.IO;
using System.Text;

class MyApp
{
    static void Main()
    {
        try
        {
            WebRequest theRequest = WebRequest.Create(@"http://www.mindcracker.com");
            // The using blocks close the response and its stream when done.
            using (WebResponse theResponse = theRequest.GetResponse())
            using (Stream responseStream = theResponse.GetResponseStream())
            {
                byte[] buffer = new byte[256]; // buffer size
                StringBuilder strResponse = new StringBuilder();
                int bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                while (bytesRead != 0)
                {
                    // Encoding.ASCII returns an encoding for the ASCII (7-bit)
                    // character set. ASCII characters are limited to the lowest
                    // 128 Unicode characters, from U+0000 to U+007F.
                    strResponse.Append(Encoding.ASCII.GetString(buffer, 0, bytesRead));
                    bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                }
                Console.Write(strResponse.ToString());
            }
        }
        catch (Exception e)
        {
            Console.Write("Exception occurred! {0}", e.ToString());
        }
    }
}
Conclusion
I hope this article has helped you understand string encoding, decoding,
and conversions in C#. See the other articles on the website for more on .NET and C#.
The Complete Visual C# Programmer's Guide covers most of the major components that make
up C# and the .NET environment. The book is geared toward the
intermediate programmer, but contains enough material to satisfy the
advanced developer.