[Serializable] |
A number of Encoding implementations are provided in the System.Text namespace, including:
The ASCIIEncoding class encodes Unicode characters as single 7-bit ASCII characters. This encoding only supports character values between U+0000 and U+007F.
The UnicodeEncoding class encodes each Unicode character as two consecutive bytes. Both little-endian (code page 1200) and big-endian (code page 1201) byte orders are supported.
The UTF7Encoding class encodes Unicode characters using the UTF-7 encoding (UTF-7 stands for UCS Transformation Format, 7-bit form). This encoding supports all Unicode character values, and can also be accessed as code page 65000.
The UTF8Encoding class encodes Unicode characters using the UTF-8 encoding (UTF-8 stands for UCS Transformation Format, 8-bit form). This encoding supports all Unicode character values, and can also be accessed as code page 65001.
Use the Encoding.GetEncoding method with a code page or name parameter to obtain other encodings.
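The difference between the encodings listed above can be seen by comparing how many bytes each one produces for the same text. The following is a minimal sketch; the class name EncodingSizes and the method ByteCounts are hypothetical names chosen for illustration and are not part of System.Text.

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    // Hypothetical helper: returns the encoded size of the same text
    // under three of the encodings described above.
    public static int[] ByteCounts(string text)
    {
        return new int[]
        {
            Encoding.ASCII.GetByteCount(text),   // one byte per character in the ASCII range
            Encoding.Unicode.GetByteCount(text), // two bytes per UTF-16 code unit
            Encoding.UTF8.GetByteCount(text)     // variable width, one byte for U+0000 to U+007F
        };
    }
}
```

For the string "abc" the three counts are 3, 6, and 3, because every character falls in the range U+0000 to U+007F.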
An application can use the static properties of this class, such as Encoding.ASCII, Encoding.Default, Encoding.Unicode, Encoding.UTF7, and Encoding.UTF8, to obtain encodings. Applications can initialize new instances of Encoding objects through the ASCIIEncoding, UnicodeEncoding, UTF7Encoding, and UTF8Encoding classes.
Through an encoding, the Encoding.GetBytes method is used to convert arrays of Unicode characters to arrays of bytes, and the Encoding.GetChars method is used to convert arrays of bytes to arrays of Unicode characters. The Encoding.GetBytes and Encoding.GetChars methods maintain no state between conversions.
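A round trip through Encoding.GetBytes and Encoding.GetChars might look like the following sketch. The class RoundTrip and the method EncodeThenDecode are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text;

public static class RoundTrip
{
    // Hypothetical helper: encodes text to bytes and decodes the bytes back.
    // Each call is independent; no state is carried between conversions.
    public static string EncodeThenDecode(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text.ToCharArray());
        char[] chars = encoding.GetChars(bytes);
        return new string(chars);
    }
}
```

With Encoding.UTF8 the result equals the input. With Encoding.ASCII, a character outside U+0000 to U+007F does not survive the round trip, because the ASCII encoding cannot represent it.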
When the data to be converted is only available in sequential blocks (such as data read from a stream) or when the amount of data is so large that it needs to be divided into smaller blocks, an application can use a Decoder or an Encoder to perform the conversion. Decoders and encoders allow sequential blocks of data to be converted and they maintain the state required to support conversions of data that spans adjacent blocks. Decoders and encoders are obtained using the Encoding.GetDecoder and Encoding.GetEncoder methods.
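The following sketch shows why the retained state matters: a UTF-8 character whose bytes are split across two blocks is still decoded correctly by a Decoder. The class BlockDecoding and the method DecodeBlocks are hypothetical names, and the block boundaries are an assumption made for the example.

```csharp
using System;
using System.Text;

public static class BlockDecoding
{
    // Hypothetical helper: decodes UTF-8 data that arrives as separate blocks.
    public static string DecodeBlocks(params byte[][] blocks)
    {
        // One decoder is shared by all blocks so that a partial byte
        // sequence at the end of one block is completed by the next block.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        StringBuilder result = new StringBuilder();
        foreach (byte[] block in blocks)
        {
            // Size the buffer for the worst case for this block.
            char[] chars = new char[Encoding.UTF8.GetMaxCharCount(block.Length)];
            int produced = decoder.GetChars(block, 0, block.Length, chars, 0);
            result.Append(chars, 0, produced);
        }
        return result.ToString();
    }
}
```

The character U+00E9 is encoded in UTF-8 as the two bytes 0xC3 0xA9. Passing the blocks { 0x68, 0xC3 } and { 0xA9 } yields the two-character string "h" followed by U+00E9, whereas decoding each block separately with Encoding.GetChars would not, because that method keeps no state between calls.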
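The core Encoding.GetBytes and Encoding.GetChars methods require the caller to provide the destination buffer and ensure that the buffer is large enough to hold the entire result of the conversion. An application can use one of the following methods to calculate the required size of the destination buffer.
The Encoding.GetByteCount and Encoding.GetCharCount methods compute the exact size of the result of a particular conversion, so a buffer of exactly that size can be allocated for that conversion.
The Encoding.GetMaxByteCount and Encoding.GetMaxCharCount methods compute the maximum possible size of the result for a given number of characters or bytes, so a buffer of that size can be reused for multiple conversions.
The first method generally uses less memory, whereas the second method generally executes faster.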
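The two sizing strategies, exact sizing with Encoding.GetByteCount and worst-case sizing with Encoding.GetMaxByteCount, can be sketched as follows. The class BufferSizing and its methods are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text;

public static class BufferSizing
{
    // Exact sizing: the input is examined once to count bytes,
    // then a buffer of exactly that size is allocated and filled.
    public static byte[] EncodeExact(Encoding encoding, char[] chars)
    {
        byte[] bytes = new byte[encoding.GetByteCount(chars, 0, chars.Length)];
        encoding.GetBytes(chars, 0, chars.Length, bytes, 0);
        return bytes;
    }

    // Worst-case sizing: one buffer large enough for any input of up to
    // maxChars characters, intended to be reused across conversions.
    public static byte[] CreateSharedBuffer(Encoding encoding, int maxChars)
    {
        return new byte[encoding.GetMaxByteCount(maxChars)];
    }

    // Encodes into the shared buffer and returns the number of bytes written.
    public static int EncodeIntoShared(Encoding encoding, char[] chars, byte[] shared)
    {
        return encoding.GetBytes(chars, 0, chars.Length, shared, 0);
    }
}
```

The shared buffer is usually larger than any single result, so the caller has to keep the returned byte count to know how much of the buffer is valid.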
ASCII | Read-only Gets an encoding for the ASCII (7 bit) character set. |
BigEndianUnicode | Read-only Gets an encoding for the Unicode format in the big-endian byte order. |
BodyName | Read-only Gets the name for this encoding that can be used with mail agent body tags. |
CodePage | Read-only When overridden in a derived class, gets the code page identifier of this encoding. |
Default | Read-only Gets an encoding for the system's current ANSI code page. |
EncodingName | Read-only Gets the human-readable description of the encoding. |
HeaderName | Read-only Gets the name for this encoding that can be used with mail agent header tags. |
IsBrowserDisplay | Read-only Gets an indication whether this encoding can be used for display by browser clients. |
IsBrowserSave | Read-only Gets an indication whether this encoding can be used for saving by browser clients. |
IsMailNewsDisplay | Read-only Gets an indication whether this encoding can be used for display by mail and news clients. |
IsMailNewsSave | Read-only Gets an indication whether this encoding can be used for saving by mail and news clients. |
Unicode | Read-only Gets an encoding for the Unicode format in little-endian byte order. |
UTF7 | Read-only Gets an encoding for the UTF-7 format. |
UTF8 | Read-only Gets an encoding for the UTF-8 format. |
WebName | Read-only Gets the name registered with the Internet Assigned Numbers Authority (IANA) for this encoding. |
WindowsCodePage | Read-only Gets the Windows operating system code page that most closely corresponds to this encoding. |
Convert | Overloaded:Convert(Encoding srcEncoding, Encoding dstEncoding, byte[] bytes) Converts a byte array from one encoding to another. |
Convert | Overloaded:Convert(Encoding srcEncoding, Encoding dstEncoding, byte[] bytes, int index, int count) Converts a range of bytes in a byte array from one encoding to another. |
Equals | Overridden: Determines whether the current instance and the specified Object represent the same type and value. |
GetByteCount | Overloaded:GetByteCount(char[] chars) Calculates the number of bytes required to encode a specified character array. |
GetByteCount | Overloaded:GetByteCount(string s) Calculates the number of bytes required to encode the specified String. |
GetByteCount | Overloaded:GetByteCount(char[] chars, int index, int count) When overridden in a derived class, returns the number of bytes required to encode a range of characters in the specified character array. |
GetBytes | Overloaded:GetBytes(char[] chars) Encodes a specified character array into a byte array. |
GetBytes | Overloaded:GetBytes(string s) Encodes a specified String into an array of bytes. |
GetBytes | Overloaded:GetBytes(char[] chars, int index, int count) Encodes a range of characters from a character array into a byte array. |
GetBytes | Overloaded:GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex) When overridden in a derived class, encodes a range of characters from a character array into a byte array. |
GetBytes | Overloaded:GetBytes(string s, int charIndex, int charCount, byte[] bytes, int byteIndex) Encodes the specified range of a String into the specified range of a byte array. |
GetCharCount | Overloaded:GetCharCount(byte[] bytes) Calculates the number of characters produced by decoding an array of bytes. |
GetCharCount | Overloaded:GetCharCount(byte[] bytes, int index, int count) When overridden in a derived class, calculates the number of characters produced by decoding a specified range of elements in an array of bytes. |
GetChars | Overloaded:GetChars(byte[] bytes) Decodes a byte array into an array of characters. |
GetChars | Overloaded:GetChars(byte[] bytes, int index, int count) Decodes a range of bytes from a byte array into a character array. |
GetChars | Overloaded:GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex) When overridden in a derived class, decodes a range of bytes in a byte array into a range of characters in a character array. |
GetDecoder | Returns a Decoder for this encoding. |
GetEncoder | Returns an Encoder for this encoding. |
GetEncoding | Overloaded:GetEncoding(int codepage) Returns an Encoding that corresponds to the specified code page value. |
GetEncoding | Overloaded:GetEncoding(string name) Returns an Encoding for the specified name. |
GetHashCode | Overridden: Returns the hash code for this instance. |
GetMaxByteCount | When overridden in a derived class, returns the maximum number of bytes required to encode a given number of characters. |
GetMaxCharCount | When overridden in a derived class, returns the maximum number of characters produced by decoding a given number of bytes. |
GetPreamble | Returns a set of bytes used at the beginning of a stream to determine which encoding a file was created with. This can include the Unicode byte order mark. |
GetString | Overloaded:GetString(byte[] bytes) Returns a string containing the decoded representation of the specified byte array. |
GetString | Overloaded:GetString(byte[] bytes, int index, int count) Returns a string containing the decoded representation of a range of bytes in a byte array. |
GetType (inherited from System.Object) |
See base class member description: System.Object.GetType Derived from System.Object, the primary base class for all objects. |
ToString (inherited from System.Object) |
See base class member description: System.Object.ToString Derived from System.Object, the primary base class for all objects. |
ctor #1 | Overloaded:.ctor() Default constructor. This constructor is called by derived class constructors to initialize state in this type. Initializes a new instance of the Encoding class. |
ctor #2 | Overloaded:.ctor(int codePage) Initializes a new instance of the Encoding class. |
Finalize (inherited from System.Object) |
See base class member description: System.Object.Finalize Derived from System.Object, the primary base class for all objects. |
MemberwiseClone (inherited from System.Object) |
See base class member description: System.Object.MemberwiseClone Derived from System.Object, the primary base class for all objects. |
Hierarchy:
protected Encoding(); |
protected Encoding( |
codePage
Exception Type | Condition |
---|---|
ArgumentOutOfRangeException | codePage is less than zero. |
public static Encoding ASCII {get;}
|
public static Encoding BigEndianUnicode {get;}
|
Unicode files can be distinguished by the presence of the byte order mark (U+FEFF), which is represented as hexadecimal 0xFE 0xFF on big-endian platforms and hexadecimal 0xFF 0xFE on little-endian platforms.
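The byte order mark for each Unicode encoding can be inspected through Encoding.GetPreamble. The following is a small sketch; the class ByteOrderMarks and the method PreambleHex are hypothetical names.

```csharp
using System;
using System.Text;

public static class ByteOrderMarks
{
    // Hypothetical helper: formats an encoding's preamble as hexadecimal
    // byte values separated by hyphens.
    public static string PreambleHex(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetPreamble());
    }
}
```

PreambleHex(Encoding.BigEndianUnicode) returns "FE-FF" and PreambleHex(Encoding.Unicode) returns "FF-FE", matching the two byte orders described above.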
public virtual string BodyName {get;}
|
If the encoding cannot be used, the property value is the empty string ("").
public virtual int CodePage {get;}
|
public static Encoding Default {get;}
|
public virtual string EncodingName {get;}
|
public virtual string HeaderName {get;}
|
If the encoding cannot be used, the string is empty.
public virtual bool IsBrowserDisplay {get;}
|
public virtual bool IsBrowserSave {get;}
|
public virtual bool IsMailNewsDisplay {get;}
|
public virtual bool IsMailNewsSave {get;}
|
public static Encoding Unicode {get;}
|
Unicode files can be distinguished by the presence of the byte order mark (U+FEFF), which is represented as hexadecimal 0xFE 0xFF on big-endian platforms and hexadecimal 0xFF 0xFE on little-endian platforms.
public static Encoding UTF7 {get;}
|
public static Encoding UTF8 {get;}
|
public virtual string WebName {get;}
|
public virtual int WindowsCodePage {get;}
|
srcEncoding
dstEncoding
bytes
Exception Type | Condition |
---|---|
ArgumentNullException | srcEncoding, dstEncoding, or bytes is null. |
public static byte[] Convert( |
srcEncoding
dstEncoding
bytes
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | srcEncoding, dstEncoding, or bytes is null. |
ArgumentOutOfRangeException | index and count do not denote a valid range in the byte array. |
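Both Convert overloads can be exercised with a sketch such as the following, which converts little-endian Unicode bytes to UTF-8. The class Transcoding and its method names are hypothetical.

```csharp
using System;
using System.Text;

public static class Transcoding
{
    // Converts an entire array of UTF-16 (little-endian) bytes to UTF-8.
    public static byte[] Utf16ToUtf8(byte[] utf16Bytes)
    {
        return Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);
    }

    // Converts only the bytes in [index, index + count) to UTF-8.
    public static byte[] Utf16RangeToUtf8(byte[] utf16Bytes, int index, int count)
    {
        return Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes, index, count);
    }
}
```

When a range is given, index and count are measured in bytes of the source array, so for a two-byte encoding they should fall on character boundaries.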
value
~Encoding(); |
chars
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
Alternatively, Encoding.GetMaxByteCount can be used to determine the maximum number of bytes that will be produced from converting a given number of characters. A buffer of that size can then be reused for multiple conversions.
The Encoding.GetByteCount method generally uses less memory, whereas the Encoding.GetMaxByteCount method generally executes faster.
s
Alternatively, Encoding.GetMaxByteCount can be used to determine the maximum number of bytes that will be produced from converting a String with a given number of characters. A buffer of that size can then be reused for multiple conversions.
The Encoding.GetByteCount method generally uses less memory, whereas the Encoding.GetMaxByteCount method generally executes faster.
chars
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
ArgumentOutOfRangeException | index and count do not denote a valid range in the character array. |
Alternatively, Encoding.GetMaxByteCount can be used to determine the maximum number of bytes that will be produced from converting a given number of characters. A buffer of that size can then be reused for multiple conversions.
The Encoding.GetByteCount method generally uses less memory, whereas the Encoding.GetMaxByteCount method generally executes faster.
chars
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
s
Exception Type | Condition |
---|---|
ArgumentNullException | s is null. |
chars
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
ArgumentOutOfRangeException | The index and count parameters do not denote a valid range in chars. |
public abstract int GetBytes( |
chars
charIndex
charCount
bytes
byteIndex
Exception Type | Condition |
---|---|
ArgumentNullException | chars or bytes is null. |
ArgumentOutOfRangeException | charIndex, charCount or byteIndex is less than zero. -or- charIndex + charCount is greater than the length of chars. -or- byteIndex + charCount is greater than the length of bytes. |
Encoding.GetByteCount can be used to determine the exact number of bytes that will be produced for a given range of characters. Alternatively, Encoding.GetMaxByteCount can be used to determine the maximum number of bytes that will be produced for a given number of characters, regardless of the actual character values.
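A caller that fills one destination buffer in several steps might combine Encoding.GetByteCount with the byteIndex parameter as in the sketch below. The class EncodeIntoBuffer and the method Append are hypothetical names, and the explicit capacity check is an assumption about how a caller may want to fail.

```csharp
using System;
using System.Text;

public static class EncodeIntoBuffer
{
    // Hypothetical helper: encodes text into buffer starting at offset
    // and returns the offset that follows the last byte written.
    public static int Append(Encoding encoding, string text, byte[] buffer, int offset)
    {
        char[] chars = text.ToCharArray();

        // Check the exact size first so the caller gets a clear error
        // instead of a failure from inside GetBytes.
        int needed = encoding.GetByteCount(chars, 0, chars.Length);
        if (needed > buffer.Length - offset)
        {
            throw new ArgumentException("Destination buffer is too small.", "buffer");
        }

        // GetBytes returns the number of bytes actually written.
        return offset + encoding.GetBytes(chars, 0, chars.Length, buffer, offset);
    }
}
```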
public virtual int GetBytes( |
s
charIndex
charCount
bytes
byteIndex
Exception Type | Condition |
---|---|
ArgumentNullException | s or bytes is null. |
ArgumentOutOfRangeException | charIndex, charCount, or byteIndex is less than zero. -or- charIndex and charCount do not specify a valid range in s (that is, (charIndex + charCount) is greater than the length of s). -or- byteIndex and charCount do not specify a valid range in bytes (that is, (byteIndex + charCount) is greater than the length of bytes). |
bytes
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
Alternatively, the Encoding.GetMaxCharCount method can be used to determine the maximum number of characters that will be produced for a given number of bytes. A buffer of that size can then be reused for multiple conversions.
The Encoding.GetCharCount method generally uses less memory, whereas the Encoding.GetMaxCharCount method generally executes faster.
bytes
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
ArgumentOutOfRangeException | index and count do not denote a valid range in the byte array. |
Alternatively, the Encoding.GetMaxCharCount method can be used to determine the maximum number of characters that will be produced for a given number of bytes. A buffer of that size can then be reused for multiple conversions.
The Encoding.GetCharCount method generally uses less memory, whereas the Encoding.GetMaxCharCount method generally executes faster.
bytes
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
bytes
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
ArgumentOutOfRangeException | index and count do not denote a valid range in the byte array. |
public abstract int GetChars( |
bytes
byteIndex
byteCount
chars
charIndex
Exception Type | Condition |
---|---|
ArgumentNullException | bytes or chars is null. |
ArgumentOutOfRangeException | byteIndex, byteCount, or charIndex is less than zero. -or- byteIndex + byteCount is greater than the length of bytes. -or- charIndex + byteCount is greater than the length of chars. |
The Encoding.GetChars method requires the caller to provide the destination buffer and ensure that the buffer is large enough to hold the entire result of the conversion. An application can use Encoding.GetCharCount or Encoding.GetMaxCharCount to calculate the required size of the destination buffer.
The Encoding.GetCharCount method can be used to determine the exact number of characters that will be produced for a given range of bytes. Alternatively, the Encoding.GetMaxCharCount method can be used to determine the maximum number of characters that will be produced for a given number of bytes, regardless of the actual byte values.
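Decoding a range of bytes into a caller-provided character buffer sized with Encoding.GetCharCount could be sketched as follows. The class DecodeIntoBuffer and the method DecodeRange are hypothetical names.

```csharp
using System;
using System.Text;

public static class DecodeIntoBuffer
{
    // Hypothetical helper: decodes bytes[index .. index + count) into a
    // character buffer of exactly the required size.
    public static string DecodeRange(Encoding encoding, byte[] bytes, int index, int count)
    {
        char[] chars = new char[encoding.GetCharCount(bytes, index, count)];

        // GetChars returns the number of characters actually written.
        int written = encoding.GetChars(bytes, index, count, chars, 0);
        return new string(chars, 0, written);
    }
}
```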
public virtual Decoder GetDecoder(); |
This default implementation returns a Decoder that forwards calls to the Encoding.GetCharCount and Encoding.GetChars methods to the corresponding methods of this encoding. Encodings that require state to be maintained between successive conversions can override this method and return an instance of an appropriate Decoder implementation.
public virtual Encoder GetEncoder(); |
This default implementation returns an Encoder that forwards calls to Encoding.GetByteCount and Encoding.GetBytes to the corresponding methods of this encoding. Encodings that require state to be maintained between successive conversions can override this method and return an instance of an appropriate Encoder implementation.
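UTF-8 is an example of an encoding that needs state between conversions, because a surrogate pair can be split across two character blocks. The sketch below uses the Encoder returned by Encoding.UTF8.GetEncoder; the class BlockEncoding and the method EncodeBlocks are hypothetical names, and flushing only on the last block is an assumption about how the caller delimits its data.

```csharp
using System;
using System.IO;
using System.Text;

public static class BlockEncoding
{
    // Hypothetical helper: encodes character blocks to UTF-8 with one
    // shared encoder, so a surrogate pair split across blocks is kept together.
    public static byte[] EncodeBlocks(params char[][] blocks)
    {
        Encoder encoder = Encoding.UTF8.GetEncoder();
        MemoryStream output = new MemoryStream();
        for (int i = 0; i < blocks.Length; i++)
        {
            char[] block = blocks[i];
            // Flush only after the final block so that a trailing high
            // surrogate is held back until its low surrogate arrives.
            bool flush = (i == blocks.Length - 1);
            byte[] bytes = new byte[Encoding.UTF8.GetMaxByteCount(block.Length)];
            int produced = encoder.GetBytes(block, 0, block.Length, bytes, 0, flush);
            output.Write(bytes, 0, produced);
        }
        return output.ToArray();
    }
}
```

For the blocks { 'a', '\uD83D' } and { '\uDE00' }, the surrogate pair for U+1F600 is encoded as the single four-byte sequence 0xF0 0x9F 0x98 0x80 following the byte for 'a'.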
codepage
Exception Type | Condition |
---|---|
ArgumentOutOfRangeException | codepage is less than zero or greater than 65535. |
NotSupportedException | codepage is not supported by the current regional options of the computer executing this method. |
For example, the encoding for the windows-1252 code page (code page value 1252) can be created by the following C# code:
Encoding enc = Encoding.GetEncoding(1252);
A specific code page might not be supported by certain platforms. For example, the Japanese shift-jis code page (code page 932) might not be supported in the United States version of Windows 98. In that case, the Encoding.GetEncoding method throws NotSupportedException when the following C# code is executed:
Encoding enc = Encoding.GetEncoding(932);
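An application that cannot assume a code page is available might guard the call as in the following sketch. The class CodePageLookup and the method GetOrFallback are hypothetical names, and returning a caller-chosen fallback encoding is only one possible policy.

```csharp
using System;
using System.Text;

public static class CodePageLookup
{
    // Hypothetical helper: returns the encoding for codePage, or the
    // supplied fallback when the platform does not support that code page.
    public static Encoding GetOrFallback(int codePage, Encoding fallback)
    {
        try
        {
            return Encoding.GetEncoding(codePage);
        }
        catch (NotSupportedException)
        {
            // The code page is valid but unavailable on this platform.
            return fallback;
        }
    }
}
```

An out-of-range code page value still raises ArgumentOutOfRangeException, since that indicates a programming error rather than a platform limitation.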
name
Exception Type | Condition |
---|---|
NotSupportedException | The name encoding is not supported by the current regional options of the computer executing this method. |
Specify one of the names listed in the following table to obtain the system supported encoding with the corresponding code page.
code page | name |
---|---|
1200 | "UTF-16LE", "utf-16", "ucs-2", "unicode", or "ISO-10646-UCS-2" |
1201 | "UTF-16BE" or "unicodeFFFE" |
1252 | "windows-1252" |
65000 | "utf-7", "csUnicode11UTF7", "unicode-1-1-utf-7", "unicode-2-0-utf-7", "x-unicode-1-1-utf-7", or "x-unicode-2-0-utf-7" |
65001 | "utf-8", "unicode-1-1-utf-8", "unicode-2-0-utf-8", "x-unicode-1-1-utf-8", or "x-unicode-2-0-utf-8" |
20127 | "us-ascii", "us", "ascii", "ANSI_X3.4-1968", "ANSI_X3.4-1986", "cp367", "csASCII", "IBM367", "iso-ir-6", "ISO646-US", or "ISO_646.irv:1991" |
54936 | "GB18030" |
A specific code page might not be supported by certain platforms. For example, the Japanese shift-jis code page (code page 932) might not be supported in the United States version of Windows 98. In that case, the Encoding.GetEncoding method throws NotSupportedException when the following C# code is executed:
Encoding enc = Encoding.GetEncoding("shift-jis");
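The relationship between names and code pages in the table above can be checked with a sketch like the following. The class NameLookup and the method CodePageOf are hypothetical names.

```csharp
using System;
using System.Text;

public static class NameLookup
{
    // Hypothetical helper: resolves an encoding name and reports the
    // code page of the encoding that was returned.
    public static int CodePageOf(string name)
    {
        return Encoding.GetEncoding(name).CodePage;
    }
}
```

For the names "utf-8", "us-ascii", and "utf-16" the results are 65001, 20127, and 1200 respectively, as listed in the table.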
public override int GetHashCode(); |
charCount
All encodings must guarantee that no buffer overflow exceptions will occur if buffers are sized according to the results of this method.
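The guarantee can be expressed as a simple check: the exact byte count for a string never exceeds the worst case reported for its length. The class WorstCaseCheck and the method FitsWorstCase are hypothetical names used for this sketch.

```csharp
using System;
using System.Text;

public static class WorstCaseCheck
{
    // Hypothetical helper: returns true when the exact encoded size of text
    // fits in a buffer sized by GetMaxByteCount for the same character count.
    public static bool FitsWorstCase(Encoding encoding, string text)
    {
        return encoding.GetByteCount(text) <= encoding.GetMaxByteCount(text.Length);
    }
}
```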
byteCount
public virtual byte[] GetPreamble(); |
bytes
Exception Type | Condition |
---|---|
ArgumentNullException | The bytes parameter is null. |
private string ReadAuthor(Stream binary_file)
{
    System.Text.Encoding encoding = System.Text.Encoding.UTF8;
    // Read string from binary file with UTF8 encoding
    byte[] buffer = new byte[30];
    binary_file.Read(buffer, 0, 30);
    // The whole buffer is decoded; bytes that were not filled remain zero
    // and decode as U+0000 characters.
    return encoding.GetString(buffer);
}
bytes
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | The bytes parameter is null. |
ArgumentOutOfRangeException | The index and count parameters do not denote a valid range in the byte array. |
private string ReadAuthor(Stream binary_file)
{
    System.Text.Encoding encoding = System.Text.Encoding.UTF8;
    // Read string from binary file with UTF8 encoding
    byte[] buffer = new byte[30];
    int bytesRead = binary_file.Read(buffer, 0, buffer.Length);
    // Decode only the range of the buffer that was actually filled
    return encoding.GetString(buffer, 0, bytesRead);
}
public Type GetType(); |
protected object MemberwiseClone(); |
public virtual string ToString(); |