UTF-8 uses a variable number of bytes per code point. ASCII characters (0 to 127) use 1 byte, code points 128 to 2047 use 2 bytes, and code points 2048 to 65535 use 3 bytes. Code points 65536 to 1114111 use 4 bytes and represent the character range for Supplementary Characters. UTF-16, by contrast, uses at least 16 bits for every character in code points 0 to 65535 (available in UCS-2 and UTF-16 alike), and code points 65536 to 1114111 use the same 4 bytes as in UTF-8.

If your dataset uses primarily ASCII characters (which represent the majority of Latin alphabets), significant storage savings may be achieved compared to UTF-16 data types. For example, changing an existing column data type from NCHAR(10) to CHAR(10) using a UTF-8 enabled collation translates into a nearly 50% reduction in storage requirements, because each ASCII character takes 1 byte in UTF-8 instead of 2 bytes in UTF-16.
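The byte boundaries described above can be checked outside the database. The following is a minimal Python sketch, not SQL Server code: the helper name `encoded_sizes` and the sample characters are chosen here for illustration, and the numbers are raw encoded lengths, which do not include any per-row or per-column overhead the storage engine may add.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for the given text."""
    utf8 = len(text.encode("utf-8"))
    # "utf-16-le" is used because plain "utf-16" prepends a 2-byte byte-order mark.
    utf16 = len(text.encode("utf-16-le"))
    return utf8, utf16


# One sample character from each code point range discussed in the text.
samples = {
    "U+0041 (ASCII, 0 to 127)": "A",
    "U+00E9 (128 to 2047)": "\u00e9",
    "U+20AC (2048 to 65535)": "\u20ac",
    "U+1F600 (65536 to 1114111)": "\U0001F600",
}

for label, ch in samples.items():
    u8, u16 = encoded_sizes(ch)
    print(f"{label}: UTF-8 = {u8} bytes, UTF-16 = {u16} bytes")

# A 10-character ASCII string, mirroring the NCHAR(10) vs CHAR(10) example.
ascii_u8, ascii_u16 = encoded_sizes("ABCDEFGHIJ")
print(f"10 ASCII characters: UTF-8 = {ascii_u8} bytes, UTF-16 = {ascii_u16} bytes")
```

For the ASCII-only string the UTF-8 length is half the UTF-16 length, which is where the roughly 50% figure comes from. For code points 2048 to 65535 the relationship reverses (3 bytes versus 2), so the savings depend on the data being mostly ASCII.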