Type: System.Text.UTF8Encoding

System.Text.UTF8Encoding Class

Assembly: Mscorlib.dll
Namespace: System.Text

Summary: Represents a UTF-8 encoding of Unicode characters.

C# Syntax:

[Serializable]
public class UTF8Encoding : Encoding

Remarks

This class encodes Unicode characters using UCS Transformation Format, 8-bit form (UTF-8). This encoding supports all Unicode character values and surrogates. For more information regarding surrogate pairs, see UnicodeCategory.

This class contains the UTF8Encoding.GetCharCount method that reports the number of Unicode characters that result from decoding an array of bytes, and the UTF8Encoding.GetChars method that actually decodes an array of bytes. The UTF8Encoding.GetByteCount method reports the number of bytes that result from encoding strings or arrays of Unicode characters, and the UTF8Encoding.GetBytes method actually encodes characters into an array of bytes.
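The count-then-convert pattern described above can be sketched as follows. This is a minimal illustration rather than a prescribed usage; the class name Utf8RoundTrip and its helper methods are assumptions made for the example and are not part of the class library.

```csharp
using System;
using System.Text;

public static class Utf8RoundTrip
{
    // Encodes text into a byte array whose size is reported by GetByteCount.
    public static byte[] Encode(string text)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        char[] chars = text.ToCharArray();
        byte[] bytes = new byte[utf8.GetByteCount(chars, 0, chars.Length)];
        utf8.GetBytes(chars, 0, chars.Length, bytes, 0);
        return bytes;
    }

    // Decodes bytes into a char array whose size is reported by GetCharCount.
    public static string Decode(byte[] bytes)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        char[] chars = new char[utf8.GetCharCount(bytes, 0, bytes.Length)];
        utf8.GetChars(bytes, 0, bytes.Length, chars, 0);
        return new string(chars);
    }

    public static void Main()
    {
        byte[] encoded = Encode("caf\u00E9");
        // Three ASCII characters take one byte each; U+00E9 takes two.
        Console.WriteLine(encoded.Length);                  // 5
        Console.WriteLine(Decode(encoded) == "caf\u00E9");  // True
    }
}
```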

The UTF8Encoding.GetDecoder method obtains an object to convert (decode) UTF-8 encoded bytes into Unicode characters, while the UTF8Encoding.GetEncoder method obtains an object to convert (encode) Unicode characters into UTF-8 encoded bytes. The UTF8Encoding.GetPreamble method can obtain a Unicode byte order mark, which when prefixed to a series of bytes, indicates how those bytes are encoded.
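One reason to obtain a Decoder object rather than call GetChars directly is that a Decoder keeps state between calls, so a character whose bytes are split across two buffers is still decoded correctly. The sketch below illustrates this under that assumption; Utf8Streaming and its methods are hypothetical names chosen for the example. An Encoder obtained from GetEncoder can be used in the same way for the opposite direction.

```csharp
using System;
using System.Text;

public static class Utf8Streaming
{
    // Decodes two chunks with a single Decoder so that a multi-byte
    // sequence split across the chunk boundary is not lost.
    public static string DecodeChunks(byte[] first, byte[] second)
    {
        Decoder decoder = new UTF8Encoding().GetDecoder();
        StringBuilder result = new StringBuilder();
        Append(decoder, first, result);
        Append(decoder, second, result);
        return result.ToString();
    }

    private static void Append(Decoder decoder, byte[] chunk, StringBuilder result)
    {
        // GetCharCount takes the decoder's pending bytes into account.
        char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
        int written = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
        result.Append(chars, 0, written);
    }

    public static void Main()
    {
        // "c" followed by U+00E9 (0xC3 0xA9), split in the middle of U+00E9.
        string text = DecodeChunks(new byte[] { 0x63, 0xC3 }, new byte[] { 0xA9 });
        Console.WriteLine(text == "c\u00E9");  // True
    }
}
```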

UTF-8 encodes Unicode characters with a variable number of bytes per character. The encoding is optimized for the 128 ASCII characters (U+0000 through U+007F), each of which occupies a single byte, which makes it an efficient way to encode English text in an internationally compatible form. The UTF-8 identifier is the Unicode byte order mark, U+FEFF, which is represented in UTF-8 as the bytes 0xEF 0xBB 0xBF. The byte order mark is used to distinguish UTF-8 text from other encodings.
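The variable width can be observed by asking for the byte count of individual characters. The following sketch uses sample characters chosen for illustration; the class name Utf8Widths and its methods are assumptions made for the example.

```csharp
using System;
using System.Text;

public static class Utf8Widths
{
    // Number of UTF-8 bytes needed for the given text.
    public static int ByteCount(string text)
    {
        return new UTF8Encoding().GetByteCount(text);
    }

    // Hexadecimal form of the UTF-8 bytes for the given text.
    public static string Hex(string text)
    {
        return BitConverter.ToString(new UTF8Encoding().GetBytes(text));
    }

    public static void Main()
    {
        Console.WriteLine(ByteCount("A"));             // 1 (ASCII)
        Console.WriteLine(ByteCount("\u00E9"));        // 2
        Console.WriteLine(ByteCount("\u20AC"));        // 3
        Console.WriteLine(ByteCount("\uD83D\uDE00"));  // 4 (surrogate pair)
        Console.WriteLine(Hex("\uFEFF"));              // EF-BB-BF
    }
}
```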

This class offers an error detection feature that can be turned on when an instance of the class is constructed. Certain methods in this class check for invalid sequences of surrogate pairs. If error detection is turned on and an invalid sequence is found, ArgumentException is thrown. If error detection is not turned on and an invalid sequence is found, no exception is thrown and execution continues in a manner defined by that method.
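As a rough illustration of error detection during encoding, the sketch below tries to encode a high surrogate that has no matching low surrogate. The class and method names are hypothetical. The bytes produced when detection is off are deliberately not shown, since that result is defined by the method being called.

```csharp
using System;
using System.Text;

public static class SurrogateCheck
{
    // Returns true if encoding an unpaired high surrogate raises
    // ArgumentException (or an exception type derived from it).
    public static bool ThrowsOnLoneSurrogate(bool throwOnInvalidBytes)
    {
        UTF8Encoding utf8 = new UTF8Encoding(false, throwOnInvalidBytes);
        char[] loneHighSurrogate = new char[] { '\uD800' };
        try
        {
            utf8.GetBytes(loneHighSurrogate);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsOnLoneSurrogate(true));   // True
        Console.WriteLine(ThrowsOnLoneSurrogate(false));  // False
    }
}
```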

The error detection feature also works during decoding operations. If error detection is on and an invalid byte sequence is found, ArgumentException is thrown. Examples of invalid byte sequences are invalid leading or trailing UTF-8 bytes, UTF-8 byte sequences of more than four bytes, and non-shortest forms as defined in Unicode 3.0.1. When error detection is off, invalid bytes are discarded.
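The decoding case can be sketched in the same way. The byte sequences below are sample inputs chosen for illustration (a non-shortest form and a stray trailing byte), and DecodeCheck is a hypothetical helper. The example only reports whether an exception occurs and does not print the decoded text for the invalid inputs.

```csharp
using System;
using System.Text;

public static class DecodeCheck
{
    // Returns true if decoding the bytes raises ArgumentException
    // (or an exception type derived from it).
    public static bool ThrowsOnDecode(byte[] bytes, bool throwOnInvalidBytes)
    {
        UTF8Encoding utf8 = new UTF8Encoding(false, throwOnInvalidBytes);
        try
        {
            utf8.GetString(bytes);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        byte[] nonShortestForm = new byte[] { 0xC0, 0x80 };  // two-byte form of U+0000
        byte[] strayTrailByte = new byte[] { 0x41, 0x80 };   // trailing byte with no lead byte
        byte[] valid = new byte[] { 0x41, 0xC3, 0xA9 };      // "A" followed by U+00E9

        Console.WriteLine(ThrowsOnDecode(nonShortestForm, true));   // True
        Console.WriteLine(ThrowsOnDecode(strayTrailByte, true));    // True
        Console.WriteLine(ThrowsOnDecode(valid, true));             // False
        Console.WriteLine(ThrowsOnDecode(nonShortestForm, false));  // False
    }
}
```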

This class inherits from the Encoding class.

System.Text.UTF8Encoding Member List:

Public Constructors

ctor #1	Overloaded: `.ctor()` Default constructor. This constructor is called by derived class constructors to initialize state in this type. Initializes a new instance of the UTF8Encoding class.
ctor #2	Overloaded: `.ctor(bool encoderShouldEmitUTF8Identifier)` Initializes a new instance of the UTF8Encoding class. A parameter specifies whether to prefix an encoding with a Unicode byte order mark.
ctor #3	Overloaded: `.ctor(bool encoderShouldEmitUTF8Identifier, bool throwOnInvalidBytes)` Initializes a new instance of the UTF8Encoding class. Parameters specify whether to prefix an encoding with a Unicode byte order mark, and whether to throw an exception when an invalid encoding is detected.
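The effect of the encoderShouldEmitUTF8Identifier parameter can be seen through GetPreamble, as in the sketch below. The class name CtorPreamble and its helper are assumptions made for the example.

```csharp
using System;
using System.Text;

public static class CtorPreamble
{
    // Length of the preamble that the given instance reports.
    public static int PreambleLength(UTF8Encoding encoding)
    {
        return encoding.GetPreamble().Length;
    }

    public static void Main()
    {
        Console.WriteLine(PreambleLength(new UTF8Encoding()));            // 0
        Console.WriteLine(PreambleLength(new UTF8Encoding(true)));        // 3
        Console.WriteLine(PreambleLength(new UTF8Encoding(true, true)));  // 3
        Console.WriteLine(PreambleLength(new UTF8Encoding(false, true))); // 0
    }
}
```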

Public Properties

BodyName (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.BodyName Gets the name for this encoding that can be used with mail agent body tags.
CodePage (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.CodePage When overridden in a derived class, gets the code page identifier of this encoding.
EncodingName (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.EncodingName Gets the human-readable description of the encoding.
HeaderName (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.HeaderName Gets the name for this encoding that can be used with mail agent header tags.
IsBrowserDisplay (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.IsBrowserDisplay Gets an indication whether this encoding can be used for display by browser clients.
IsBrowserSave (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.IsBrowserSave Gets an indication whether this encoding can be used for saving by browser clients.
IsMailNewsDisplay (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.IsMailNewsDisplay Gets an indication whether this encoding can be used for display by mail and news clients.
IsMailNewsSave (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.IsMailNewsSave Gets an indication whether this encoding can be used for saving by mail and news clients.
WebName (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.WebName Gets the name registered with the Internet Assigned Numbers Authority (IANA) for this encoding.
WindowsCodePage (inherited from System.Text.Encoding)	Read-only See base class member description: System.Text.Encoding.WindowsCodePage Gets the Windows operating system code page that most closely corresponds to this encoding.

Public Methods

Equals	Overridden: Returns a value indicating whether this instance is equal to a specified object.
GetByteCount (inherited from System.Text.Encoding)	Overloaded: `GetByteCount(char[] chars)` See base class member description: System.Text.Encoding.GetByteCount Calculates the number of bytes required to encode a specified character array.
GetByteCount	Overloaded: `GetByteCount(string chars)` Overridden: Calculates the number of bytes required to store the results of encoding the characters from a specified String.
GetByteCount	Overloaded: `GetByteCount(char[] chars, int index, int count)` Overridden: Calculates the number of bytes required to store the results of encoding a set of characters from a specified Unicode character array.
GetBytes (inherited from System.Text.Encoding)	Overloaded: `GetBytes(char[] chars)` See base class member description: System.Text.Encoding.GetBytes Encodes a specified character array into a byte array.
GetBytes	Overloaded: `GetBytes(string s)` Overridden: Encodes the characters from a specified String and returns the results in a byte array.
GetBytes (inherited from System.Text.Encoding)	Overloaded: `GetBytes(char[] chars, int index, int count)` See base class member description: System.Text.Encoding.GetBytes Encodes a range of characters from a character array into a byte array.
GetBytes	Overloaded: `GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex)` Overridden: Encodes a specified range of elements from a Unicode character array and stores the results in a specified range of elements in a byte array.
GetBytes	Overloaded: `GetBytes(string s, int charIndex, int charCount, byte[] bytes, int byteIndex)` Overridden: Encodes a specified range of characters from a String and stores the results in a specified range of elements in a byte array.
GetCharCount (inherited from System.Text.Encoding)	Overloaded: `GetCharCount(byte[] bytes)` See base class member description: System.Text.Encoding.GetCharCount Calculates the number of characters produced by decoding an array of bytes.
GetCharCount	Overloaded: `GetCharCount(byte[] bytes, int index, int count)` Overridden: Calculates the number of characters that would result from decoding a specified range of elements in a byte array.
GetChars (inherited from System.Text.Encoding)	Overloaded: `GetChars(byte[] bytes)` See base class member description: System.Text.Encoding.GetChars Decodes a byte array into an array of characters.
GetChars (inherited from System.Text.Encoding)	Overloaded: `GetChars(byte[] bytes, int index, int count)` See base class member description: System.Text.Encoding.GetChars Decodes a range of bytes from a byte array into a character array.
GetChars	Overloaded: `GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex)` Overridden: Decodes a range of elements from a specified byte array and stores the result into a specified range of elements in a Unicode character array.
GetDecoder	Overridden: Obtains a decoder that can convert a UTF-8 encoded sequence of bytes into a sequence of Unicode characters.
GetEncoder	Overridden: Obtains an encoder that can convert a sequence of Unicode characters into a UTF-8 encoded sequence of bytes.
GetHashCode	Overridden: Returns the hash code for this instance.
GetMaxByteCount	Overridden: Calculates the maximum number of bytes required to encode a specified number of characters.
GetMaxCharCount	Overridden: Calculates the maximum number of characters that can result from decoding a specified number of bytes.
GetPreamble	Overridden: Returns a Unicode byte order mark encoded in UTF-8 format, if the constructor for this instance requested byte order mark support.
GetString (inherited from System.Text.Encoding)	Overloaded: `GetString(byte[] bytes)` See base class member description: System.Text.Encoding.GetString Returns a string containing the decoded representation of the specified byte array.
GetString (inherited from System.Text.Encoding)	Overloaded: `GetString(byte[] bytes, int index, int count)` See base class member description: System.Text.Encoding.GetString Returns a string containing the decoded representation of a range of bytes in a byte array.
GetType (inherited from System.Object)	See base class member description: System.Object.GetType Derived from System.Object, the primary base class for all objects.
ToString (inherited from System.Object)	See base class member description: System.Object.ToString Derived from System.Object, the primary base class for all objects.
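GetMaxByteCount is useful when a buffer has to be allocated before the exact encoded size is known. The sketch below allocates a worst-case buffer and then relies on the value returned by GetBytes for the number of bytes actually written. The names MaxCountBuffer and EncodeInto are hypothetical, and the exact value returned by GetMaxByteCount is not printed because only its role as an upper bound is assumed here.

```csharp
using System;
using System.Text;

public static class MaxCountBuffer
{
    // Encodes text into a worst-case-sized buffer and returns the
    // number of bytes actually written.
    public static int EncodeInto(string text, out byte[] buffer)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        buffer = new byte[utf8.GetMaxByteCount(text.Length)];
        return utf8.GetBytes(text, 0, text.Length, buffer, 0);
    }

    public static void Main()
    {
        byte[] buffer;
        int written = EncodeInto("caf\u00E9", out buffer);
        Console.WriteLine(written);                   // 5
        Console.WriteLine(buffer.Length >= written);  // True
    }
}
```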

Protected Methods

Finalize (inherited from System.Object)	See base class member description: System.Object.Finalize Derived from System.Object, the primary base class for all objects.
MemberwiseClone (inherited from System.Object)	See base class member description: System.Object.MemberwiseClone Derived from System.Object, the primary base class for all objects.

Hierarchy:

System.Object

System.Text.Encoding

System.Text.UTF8Encoding

System.Text.UTF8Encoding Member Details

Overloaded ctor #1

Summary: Initializes a new instance of the UTF8Encoding class.

Default constructor. This constructor is called by derived class constructors to initialize state in this type.

C# Syntax:


public UTF8Encoding();

Remarks

An instance created with this constructor does not prefix encoded output with a Unicode byte order mark and does not throw an exception when an invalid encoding is detected.
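One way to check these defaults is to compare a default instance with instances built from the explicit constructor, using the overridden Equals method. This is a sketch under the assumption that Equals compares the constructor settings; DefaultCtorCheck is a hypothetical name.

```csharp
using System;
using System.Text;

public static class DefaultCtorCheck
{
    // True if a default instance equals one built with the given settings.
    public static bool DefaultEquals(bool emitIdentifier, bool throwOnInvalidBytes)
    {
        UTF8Encoding defaults = new UTF8Encoding();
        return defaults.Equals(new UTF8Encoding(emitIdentifier, throwOnInvalidBytes));
    }

    public static void Main()
    {
        Console.WriteLine(DefaultEquals(false, false));  // True
        Console.WriteLine(DefaultEquals(true, false));   // False
        Console.WriteLine(DefaultEquals(false, true));   // False
    }
}
```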

Exception Type	Condition
ArgumentNullException	s is null.
ArgumentException	An invalid high or low member of a surrogate pair was encountered during encoding.

Exception Type	Condition
ArgumentNullException	chars is null.
ArgumentOutOfRangeException	The index and count parameters do not denote a valid range in chars.

Exception Type	Condition
ArgumentNullException	bytes is null.
ArgumentOutOfRangeException	index and count do not denote a valid range in the byte array.

Exception Type	Condition
ArgumentNullException	The bytes parameter is null.
ArgumentOutOfRangeException	The index and count parameters do not denote a valid range in the byte array.