CESU-8 defines an encoding scheme for Unicode identical to UTF-8 except for its representation of supplementary characters.
In CESU-8, supplementary characters are represented as six-byte sequences resulting from the transformation of each UTF-16 surrogate code unit into an eight-bit form similar to the UTF-8 transformation, but without first converting the input surrogate pairs to a scalar value.
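The transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `cesu8_encode` and `_encode_unit` are hypothetical, and the input is assumed to be an ordinary Python string.

```python
def _encode_unit(unit: int) -> bytes:
    """Encode one 16-bit code unit with the UTF-8 bit layout (1 to 3 bytes)."""
    if unit < 0x80:
        return bytes([unit])
    if unit < 0x800:
        return bytes([0xC0 | (unit >> 6), 0x80 | (unit & 0x3F)])
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])


def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper for illustration)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Supplementary character: split into a UTF-16 surrogate pair,
            # then encode each surrogate separately as three bytes.
            offset = cp - 0x10000
            high = 0xD800 | (offset >> 10)
            low = 0xDC00 | (offset & 0x3FF)
            out += _encode_unit(high) + _encode_unit(low)
        else:
            # BMP character: same bytes as UTF-8.
            out += _encode_unit(cp)
    return bytes(out)


# U+10400 is the surrogate pair D801 DC00 in UTF-16.
print(cesu8_encode("\U00010400").hex(" "))    # ed a0 81 ed b0 80
print("\U00010400".encode("utf-8").hex(" "))  # f0 90 90 80
```

For characters in the Basic Multilingual Plane the output is byte-for-byte the same as UTF-8; only supplementary characters differ, taking six bytes instead of four.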
CESU-8 is useful in 8-bit processing environments where binary collation with UTF-16 is required. It is designed and recommended for use only within products requiring this UTF-16 binary collation equivalence.
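The collation point can be illustrated with a small Python comparison. This is a sketch under stated assumptions: the helpers `utf16_units` and `to_cesu8` are hypothetical names, and the CESU-8 bytes are produced by passing each UTF-16 code unit through Python's UTF-8 codec with the `surrogatepass` error handler. It compares a BMP character above the surrogate range (U+FFFD) with a supplementary character (U+10000).

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def to_cesu8(text: str) -> bytes:
    """Encode each UTF-16 code unit separately in UTF-8 form (CESU-8)."""
    return b"".join(
        chr(unit).encode("utf-8", "surrogatepass") for unit in utf16_units(text)
    )


bmp = "\uFFFD"       # BMP character above the surrogate range
supp = "\U00010000"  # first supplementary character

# Does the supplementary character sort before the BMP one in binary order?
supp_first_utf16 = utf16_units(supp) < utf16_units(bmp)
supp_first_utf8 = supp.encode("utf-8") < bmp.encode("utf-8")
supp_first_cesu8 = to_cesu8(supp) < to_cesu8(bmp)

print(supp_first_utf16)  # True:  D800 DC00 sorts before FFFD
print(supp_first_utf8)   # False: F0 90 80 80 sorts after EF BF BD
print(supp_first_cesu8)  # True:  ED A0 80 ED B0 80 sorts before EF BF BD
```

In this example the byte-wise order of the CESU-8 forms agrees with the code-unit order of UTF-16, while the byte-wise order of the UTF-8 forms does not.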
CESU-8 is a Compatibility Encoding Scheme for UTF-16 (CESU) that serializes a Unicode code point as a sequence of one, two, or three bytes for characters in the Basic Multilingual Plane, or six bytes for supplementary characters.
SAP glossary and definitions under the letter C.