[Serializable] |
This class contains the UTF8Encoding.GetCharCount method that reports the number of Unicode characters that result from decoding an array of bytes, and the UTF8Encoding.GetChars method that actually decodes an array of bytes. The UTF8Encoding.GetByteCount method reports the number of bytes that result from encoding strings or arrays of Unicode characters, and the UTF8Encoding.GetBytes method actually encodes characters into an array of bytes.
The UTF8Encoding.GetDecoder method obtains an object to convert (decode) UTF-8 encoded bytes into Unicode characters, while the UTF8Encoding.GetEncoder method obtains an object to convert (encode) Unicode characters into UTF-8 encoded bytes. The UTF8Encoding.GetPreamble method can obtain a Unicode byte order mark, which when prefixed to a series of bytes, indicates how those bytes are encoded.
UTF-8 encodes Unicode characters with a variable number of bytes per character. This encoding is optimized for the 128 ASCII characters (U+0000 through U+007F), each of which is encoded in a single byte, yielding an efficient mechanism to encode English text in an international way. The UTF-8 identifier is the Unicode byte order mark, U+FEFF, which is represented in UTF-8 as the bytes 0xEF 0xBB 0xBF. The byte order mark is used to distinguish UTF-8 text from other encodings.
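As a rough illustration of the variable-width behavior described above, the following sketch reports how many bytes a string occupies in UTF-8 and shows the encoded form of U+FEFF. The class and method names (`Utf8WidthDemo`, `ByteLength`, `ByteOrderMarkHex`) are hypothetical and exist only for this example.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the class library.
public static class Utf8WidthDemo
{
    // Returns the number of bytes needed to encode the text as UTF-8.
    public static int ByteLength(string text)
    {
        return new UTF8Encoding().GetByteCount(text);
    }

    // Encodes the single character U+FEFF and returns its bytes as
    // a hyphen-separated hexadecimal string.
    public static string ByteOrderMarkHex()
    {
        byte[] bom = new UTF8Encoding().GetBytes("\uFEFF");
        return BitConverter.ToString(bom);
    }
}
```

Under these assumptions, `ByteLength("A")` is 1, `ByteLength("\u00E9")` is 2, `ByteLength("\u20AC")` is 3, and a character outside the Basic Multilingual Plane such as `"\uD834\uDD1E"` (one surrogate pair) takes 4 bytes.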
This class offers an error detection feature that can be turned on when an instance of the class is constructed. Certain methods in this class check for invalid sequences of surrogate pairs. If error detection is turned on and an invalid sequence is found, ArgumentException is thrown. If error detection is not turned on and an invalid sequence is found, no exception is thrown and execution continues in a manner defined by that method.
The error detection feature also works during decoding operations. If error detection is on and an invalid byte sequence is found, ArgumentException is thrown. Examples of invalid byte sequences include invalid leading or trailing UTF-8 bytes, UTF-8 byte sequences consisting of more than four bytes, and non-shortest forms as defined in Unicode 3.0.1. When error detection is off, invalid bytes are discarded.
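The error detection behavior described above can be sketched as follows. The helper names (`Utf8ErrorDetectionDemo`, `ThrowsOnDecode`, `ThrowsOnEncode`) are hypothetical. The sketch only checks whether an ArgumentException (or a type derived from it) is thrown when error detection is on; it deliberately does not assert what the non-throwing mode produces, since that may differ between runtime versions.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the class library.
public static class Utf8ErrorDetectionDemo
{
    // Returns true if decoding the bytes throws when error detection is on.
    public static bool ThrowsOnDecode(byte[] bytes)
    {
        // Second argument (throwOnInvalidBytes) turns error detection on.
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(bytes);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // Returns true if encoding the text throws when error detection is on,
    // for example because it contains an unpaired surrogate.
    public static bool ThrowsOnEncode(string text)
    {
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetBytes(text);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

For instance, the byte 0xFF can never appear in well-formed UTF-8, so `ThrowsOnDecode(new byte[] { 0x41, 0xFF, 0x42 })` is expected to return true, as is `ThrowsOnEncode("\uD800")` for a lone high surrogate.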
This class inherits from the Encoding class.
ctor #1 | Overloaded:.ctor() Default constructor. This constructor is called by derived class constructors to initialize state in this type. Initializes a new instance of the UTF8Encoding class. |
ctor #2 | Overloaded:.ctor(bool encoderShouldEmitUTF8Identifier) Initializes a new instance of the UTF8Encoding class. A parameter specifies whether to prefix an encoding with a Unicode byte order mark. |
ctor #3 | Overloaded:.ctor(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes) Initializes a new instance of the UTF8Encoding class. Parameters specify whether to prefix an encoding with a Unicode byte order mark, and whether to throw an exception when an invalid encoding is detected. |
BodyName (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.BodyName Gets the name for this encoding that can be used with mail agent body tags. |
CodePage (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.CodePage When overridden in a derived class, gets the code page identifier of this encoding. |
EncodingName (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.EncodingName Gets the human-readable description of the encoding. |
HeaderName (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.HeaderName Gets the name for this encoding that can be used with mail agent header tags. |
IsBrowserDisplay (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.IsBrowserDisplay Gets an indication whether this encoding can be used for display by browser clients. |
IsBrowserSave (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.IsBrowserSave Gets an indication whether this encoding can be used for saving by browser clients. |
IsMailNewsDisplay (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.IsMailNewsDisplay Gets an indication whether this encoding can be used for display by mail and news clients. |
IsMailNewsSave (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.IsMailNewsSave Gets an indication whether this encoding can be used for saving by mail and news clients. |
WebName (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.WebName Gets the name registered with the Internet Assigned Numbers Authority (IANA) for this encoding. |
WindowsCodePage (inherited from System.Text.Encoding) |
Read-only See base class member description: System.Text.Encoding.WindowsCodePage Gets the Windows operating system code page that most closely corresponds to this encoding. |
Equals | Overridden: Returns a value indicating whether this instance is equal to a specified object. |
GetByteCount (inherited from System.Text.Encoding) |
Overloaded:GetByteCount(char[] chars) See base class member description: System.Text.Encoding.GetByteCount Calculates the number of bytes required to encode a specified character array. |
GetByteCount | Overloaded:GetByteCount(string chars) Overridden: Calculates the number of bytes required to store the results of encoding the characters from a specified String. |
GetByteCount | Overloaded:GetByteCount(char[] chars, int index, int count) Overridden: Calculates the number of bytes required to store the results of encoding a set of characters from a specified Unicode character array. |
GetBytes (inherited from System.Text.Encoding) |
Overloaded:GetBytes(char[] chars) See base class member description: System.Text.Encoding.GetBytes Encodes a specified character array into a byte array. |
GetBytes | Overloaded:GetBytes(string s) Overridden: Encodes the characters from a specified String and returns the results in a byte array. |
GetBytes (inherited from System.Text.Encoding) |
Overloaded:GetBytes(char[] chars, int index, int count) See base class member description: System.Text.Encoding.GetBytes Encodes a range of characters from a character array into a byte array. |
GetBytes | Overloaded:GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex) Overridden: Encodes a specified range of elements from a Unicode character array and stores the results in a specified range of elements in a byte array. |
GetBytes | Overloaded:GetBytes(string s, int charIndex, int charCount, byte[] bytes, int byteIndex) Overridden: Encodes a specified range of characters from a String and stores the results in a specified range of elements in a byte array. |
GetCharCount (inherited from System.Text.Encoding) |
Overloaded:GetCharCount(byte[] bytes) See base class member description: System.Text.Encoding.GetCharCount Calculates the number of characters produced by decoding an array of bytes. |
GetCharCount | Overloaded:GetCharCount(byte[] bytes, int index, int count) Overridden: Calculates the number of characters that would result from decoding a specified range of elements in a byte array. |
GetChars (inherited from System.Text.Encoding) |
Overloaded:GetChars(byte[] bytes) See base class member description: System.Text.Encoding.GetChars Decodes a byte array into an array of characters. |
GetChars (inherited from System.Text.Encoding) |
Overloaded:GetChars(byte[] bytes, int index, int count) See base class member description: System.Text.Encoding.GetChars Decodes a range of bytes from a byte array into a character array. |
GetChars | Overloaded:GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex) Overridden: Decodes a range of elements from a specified byte array and stores the result into a specified range of elements in a Unicode character array. |
GetDecoder | Overridden: Obtains a decoder that can convert a UTF-8 encoded sequence of bytes into a sequence of Unicode characters. |
GetEncoder | Overridden: Obtains an encoder that can convert a sequence of Unicode characters into a UTF-8 encoded sequence of bytes. |
GetHashCode | Overridden: Returns the hash code for this instance. |
GetMaxByteCount | Overridden: Calculates the maximum number of bytes required to encode a specified number of characters. |
GetMaxCharCount | Overridden: Calculates the maximum number of characters that can result from decoding a specified number of bytes. |
GetPreamble | Overridden: Returns a Unicode byte order mark encoded in UTF-8 format, if the constructor for this instance requested byte order mark support. |
GetString (inherited from System.Text.Encoding) |
Overloaded:GetString(byte[] bytes) See base class member description: System.Text.Encoding.GetString Returns a string containing the decoded representation of the specified byte array. |
GetString (inherited from System.Text.Encoding) |
Overloaded:GetString(byte[] bytes, int index, int count) See base class member description: System.Text.Encoding.GetString Returns a string containing the decoded representation of a range of bytes in a byte array. |
GetType (inherited from System.Object) |
See base class member description: System.Object.GetType Derived from System.Object, the primary base class for all objects. |
ToString (inherited from System.Object) |
See base class member description: System.Object.ToString Derived from System.Object, the primary base class for all objects. |
Finalize (inherited from System.Object) |
See base class member description: System.Object.Finalize Derived from System.Object, the primary base class for all objects. |
MemberwiseClone (inherited from System.Object) |
See base class member description: System.Object.MemberwiseClone Derived from System.Object, the primary base class for all objects. |
Hierarchy:
public UTF8Encoding(); |
public UTF8Encoding( |
encoderShouldEmitUTF8Identifier
encoderShouldEmitUTF8Identifier
throwOnInvalidBytes
public virtual string BodyName {get;}
|
If the encoding cannot be used, the property value is the empty string ("").
public virtual int CodePage {get;}
|
public virtual string EncodingName {get;}
|
public virtual string HeaderName {get;}
|
If the encoding cannot be used, the string is empty.
public virtual bool IsBrowserDisplay {get;}
|
public virtual bool IsBrowserSave {get;}
|
public virtual bool IsMailNewsDisplay {get;}
|
public virtual bool IsMailNewsSave {get;}
|
public virtual string WebName {get;}
|
public virtual int WindowsCodePage {get;}
|
value
~UTF8Encoding(); |
chars
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
Alternatively, Encoding.GetMaxByteCount can be used to determine the maximum number of bytes that will be produced from converting a given number of characters. A buffer of that size can then be reused for multiple conversions.
The Encoding.GetByteCount method generally uses less memory, whereas the Encoding.GetMaxByteCount method generally executes faster.
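The two buffer-sizing strategies above might look like the following sketch. The class and method names (`Utf8BufferSizingDemo`, `EncodeExact`, `CreateReusableBuffer`, `EncodeInto`) are hypothetical, and the caller is assumed to know an upper bound on the number of characters per conversion.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the class library.
public static class Utf8BufferSizingDemo
{
    // Exact sizing: ask GetByteCount first, then allocate a buffer of
    // precisely that size. Uses less memory but scans the input twice.
    public static byte[] EncodeExact(UTF8Encoding encoding, char[] chars)
    {
        byte[] bytes = new byte[encoding.GetByteCount(chars)];
        encoding.GetBytes(chars, 0, chars.Length, bytes, 0);
        return bytes;
    }

    // Worst-case sizing: allocate once for the largest expected input.
    public static byte[] CreateReusableBuffer(UTF8Encoding encoding, int maxChars)
    {
        return new byte[encoding.GetMaxByteCount(maxChars)];
    }

    // Encodes into the reusable buffer and returns the number of bytes
    // actually written; the rest of the buffer is unused for this call.
    public static int EncodeInto(UTF8Encoding encoding, char[] chars, byte[] buffer)
    {
        return encoding.GetBytes(chars, 0, chars.Length, buffer, 0);
    }
}
```

With the reusable buffer, only the first `EncodeInto(...)` bytes are meaningful after each call, so the returned count has to be carried along with the buffer.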
chars
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
ArgumentException | Return value is greater than Int32.MaxValue. -or- chars contains an invalid Unicode surrogate character. |
If error detection is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and do not affect the return value, and no exception is thrown.
chars
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
ArgumentOutOfRangeException | index or count is less than zero. -or- index plus count is greater than the length of chars. -or- Return value overflowed. |
ArgumentException | chars contains an invalid sequence of characters and the UTF8Encoding constructor for this instance specified throwing an exception when an invalid encoding is detected. |
If error detection is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and do not affect the return value, and no exception is thrown.
chars
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
s
Exception Type | Condition |
---|---|
ArgumentNullException | s is null. |
ArgumentException | An invalid high or low member of a surrogate pair was encountered during encoding. |
chars
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | chars is null. |
ArgumentOutOfRangeException | The index and count parameters do not denote a valid range in chars. |
public override int GetBytes( |
chars
charIndex
charCount
bytes
byteIndex
Exception Type | Condition |
---|---|
ArgumentNullException | chars or bytes is null. |
ArgumentOutOfRangeException | charIndex, charCount, or byteIndex is less than zero. -or- The sum of charIndex and charCount is greater than the length of chars. -or- byteIndex is greater than the length of bytes. |
ArgumentException | byteIndex is equal to the length of bytes. -or- No bytes have been stored in bytes. -or- An invalid high or low member of a surrogate pair was encountered during encoding. |
If error detection is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and do not affect the return value, and no exception is thrown.
public override int GetBytes( |
s
charIndex
charCount
bytes
byteIndex
Exception Type | Condition |
---|---|
ArgumentNullException | s or bytes is null. |
ArgumentOutOfRangeException | charIndex, charCount, or byteIndex is less than zero. -or- charIndex plus charCount is greater than the length of s. -or- byteIndex is greater than the length of bytes. |
ArgumentException | byteIndex is equal to the length of bytes. -or- No bytes have been stored in bytes. -or- An invalid high or low member of a surrogate pair was encountered during encoding. |
If error detection is turned off and an invalid surrogate sequence is detected, the invalid characters are ignored and do not affect the return value, and no exception is thrown.
bytes
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
Alternatively, the Encoding.GetMaxCharCount method can be used to determine the maximum number of characters that will be produced for a given number of bytes. A buffer of that size can then be reused for multiple conversions.
The Encoding.GetCharCount method generally uses less memory, whereas the Encoding.GetMaxCharCount method generally executes faster.
bytes
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
ArgumentOutOfRangeException | index or count is less than zero. -or- index plus count is greater than the length of bytes. |
ArgumentException | An invalid surrogate pair sequence was detected. |
bytes
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
bytes
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | bytes is null. |
ArgumentOutOfRangeException | index and count do not denote a valid range in the byte array. |
public override int GetChars( |
bytes
byteIndex
byteCount
chars
charIndex
Exception Type | Condition |
---|---|
ArgumentNullException | bytes or chars is null. |
ArgumentOutOfRangeException | byteIndex, byteCount, or charIndex is less than zero. -or- byteIndex plus byteCount is greater than the length of bytes. -or- charIndex is greater than the length of chars. |
ArgumentException | bytes contains an invalid sequence of bytes, and the UTF8Encoding constructor for this instance specified throwing an exception when an invalid encoding is detected. |
If error detection is turned off and an invalid UTF-8 byte sequence is detected, the invalid bytes are ignored and are not decoded into chars, and no exception is thrown.
public override Decoder GetDecoder(); |
If this UTF8Encoding instance is constructed with error detection turned on (that is, the throwOnInvalidBytes parameter is true), the Decoder returned by this method also has error detection turned on. If an error is detected, the decoder is in an undefined state and should not be reused. Use error detection if you intend to stop processing after an error is encountered. Otherwise, if you intend to continue processing after an error is found, do not use error detection.
The UTF8Encoding.GetDecoder method obtains a Decoder that preserves trailing bytes at the end of decoded blocks and uses the trailing bytes in the next decoding operation. UTF8Encoding.GetDecoder and UTF8Encoding.GetEncoder are useful for network transmission and file operations since those operations often deal with blocks of data instead of a complete stream.
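A minimal sketch of this block-wise decoding, assuming the data arrives as separate byte arrays whose boundaries may fall in the middle of a multi-byte sequence. The names `Utf8DecoderDemo` and `DecodeBlocks` are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the class library.
public static class Utf8DecoderDemo
{
    // Decodes the blocks in order with one shared Decoder, so that bytes
    // left over at the end of a block are combined with the next block.
    public static string DecodeBlocks(params byte[][] blocks)
    {
        UTF8Encoding encoding = new UTF8Encoding();
        Decoder decoder = encoding.GetDecoder();
        StringBuilder result = new StringBuilder();
        foreach (byte[] block in blocks)
        {
            // Worst-case sized scratch buffer for this block.
            char[] chars = new char[encoding.GetMaxCharCount(block.Length)];
            int written = decoder.GetChars(block, 0, block.Length, chars, 0);
            result.Append(chars, 0, written);
        }
        return result.ToString();
    }
}
```

For example, the euro sign U+20AC is encoded as 0xE2 0x82 0xAC. Passing `new byte[] { 0xE2, 0x82 }` and `new byte[] { 0xAC }` as two blocks should still yield the single character, whereas decoding each block independently with GetString would not.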
public override Encoder GetEncoder(); |
If this UTF8Encoding instance is constructed with error detection turned on (that is, the throwOnInvalidBytes parameter is true), the Encoder returned by this method also has error detection turned on.
The UTF8Encoding.GetEncoder method obtains an Encoder that preserves trailing characters (such as a high surrogate) at the end of a block and uses the trailing Unicode characters in the next encoding operation. UTF8Encoding.GetDecoder and UTF8Encoding.GetEncoder are useful for network transmission and file operations since those operations often deal with blocks of data instead of a complete stream.
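The encoder's handling of a surrogate pair split across blocks could be sketched like this. The names `Utf8EncoderDemo` and `EncodeBlocks` are hypothetical, and the sketch assumes more blocks may follow, so it never asks the encoder to flush.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class, not part of the class library.
public static class Utf8EncoderDemo
{
    // Encodes the blocks in order with one shared Encoder, so that a high
    // surrogate at the end of one block is paired with the low surrogate
    // at the start of the next.
    public static byte[] EncodeBlocks(params char[][] blocks)
    {
        UTF8Encoding encoding = new UTF8Encoding();
        Encoder encoder = encoding.GetEncoder();
        List<byte> result = new List<byte>();
        foreach (char[] block in blocks)
        {
            // Worst-case sized scratch buffer for this block.
            byte[] bytes = new byte[encoding.GetMaxByteCount(block.Length)];
            // flush = false keeps any trailing high surrogate for the next call.
            int written = encoder.GetBytes(block, 0, block.Length, bytes, 0, false);
            for (int i = 0; i < written; i++)
            {
                result.Add(bytes[i]);
            }
        }
        return result.ToArray();
    }
}
```

For example, U+1D11E is represented by the surrogate pair `\uD834` `\uDD1E`. Passing the two halves as separate one-character blocks should produce the same four bytes (0xF0 0x9D 0x84 0x9E) as encoding the pair in a single call.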
public override int GetHashCode(); |
charCount
Exception Type | Condition |
---|---|
ArgumentOutOfRangeException | charCount is less than zero. -or- Return value is greater than Int32.MaxValue. |
byteCount
Exception Type | Condition |
---|---|
ArgumentOutOfRangeException | byteCount is less than zero. |
public override byte[] GetPreamble(); |
Concatenate this return value to the beginning of a UTF-8 encoded sequence of bytes if the constructor for this instance requested support for a byte order mark.
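As a sketch of that concatenation, the following helper prefixes the preamble (which is empty unless the instance was constructed with encoderShouldEmitUTF8Identifier set to true) to the encoded text. The names `Utf8PreambleDemo`, `PreambleLength`, and `EncodeWithPreamble` are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the class library.
public static class Utf8PreambleDemo
{
    // Returns the preamble length for an instance constructed with the
    // given encoderShouldEmitUTF8Identifier value.
    public static int PreambleLength(bool emitIdentifier)
    {
        return new UTF8Encoding(emitIdentifier).GetPreamble().Length;
    }

    // Returns the preamble followed by the UTF-8 encoding of the text.
    // GetBytes itself does not emit the preamble, so it is copied in here.
    public static byte[] EncodeWithPreamble(UTF8Encoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);
        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }
}
```

With `new UTF8Encoding(true)`, encoding `"A"` this way should give the bytes 0xEF 0xBB 0xBF 0x41; with `new UTF8Encoding(false)`, only 0x41.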
bytes
Exception Type | Condition |
---|---|
ArgumentNullException | The bytes parameter is null. |
    private string ReadAuthor(Stream binary_file)
    {
        System.Text.Encoding encoding = System.Text.Encoding.UTF8;
        // Read string from binary file with UTF8 encoding
        byte[] buffer = new byte[30];
        binary_file.Read(buffer, 0, 30);
        return encoding.GetString(buffer);
    }
bytes
index
count
Exception Type | Condition |
---|---|
ArgumentNullException | The bytes parameter is null. |
ArgumentOutOfRangeException | The index and count parameters do not denote a valid range in the byte array. |
    private string ReadAuthor(Stream binary_file)
    {
        System.Text.Encoding encoding = System.Text.Encoding.UTF8;
        // Read up to 30 bytes from the binary file; Read may return fewer
        byte[] buffer = new byte[30];
        int bytesRead = binary_file.Read(buffer, 0, buffer.Length);
        // Decode only the bytes that were actually read, as UTF-8
        return encoding.GetString(buffer, 0, bytesRead);
    }
public Type GetType(); |
protected object MemberwiseClone(); |
public virtual string ToString(); |