Showing how different charsets encode the same string
Output:

Charset: US-ASCII
  input: ¿Mañana?
Encoded:
  0: 3f (?)  1: 4d (M)  2: 61 (a)  3: 3f (?)  4: 61 (a)  5: 6e (n)  6: 61 (a)  7: 3f (?)

Charset: ISO-8859-1
  input: ¿Mañana?
Encoded:
  0: bf (¿)  1: 4d (M)  2: 61 (a)  3: f1 (ñ)  4: 61 (a)  5: 6e (n)  6: 61 (a)  7: 3f (?)

Charset: UTF-8
  input: ¿Mañana?
Encoded:
  0: c2 (Â)  1: bf (¿)  2: 4d (M)  3: 61 (a)  4: c3 (Ã)  5: b1 (±)  6: 61 (a)  7: 6e (n)  8: 61 (a)  9: 3f (?)

Charset: UTF-16BE
  input: ¿Mañana?
Encoded:
  0: 00  1: bf (¿)  2: 00  3: 4d (M)  4: 00  5: 61 (a)  6: 00  7: f1 (ñ)  8: 00  9: 61 (a)  10: 00  11: 6e (n)  12: 00  13: 61 (a)  14: 00  15: 3f (?)

Charset: UTF-16LE
  input: ¿Mañana?
Encoded:
  0: bf (¿)  1: 00  2: 4d (M)  3: 00  4: 61 (a)  5: 00  6: f1 (ñ)  7: 00  8: 61 (a)  9: 00  10: 6e (n)  11: 00  12: 61 (a)  13: 00  14: 3f (?)  15: 00

Charset: UTF-16
  input: ¿Mañana?
Encoded:
  0: fe (þ)  1: ff (ÿ)  2: 00  3: bf (¿)  4: 00  5: 4d (M)  6: 00  7: 61 (a)  8: 00  9: f1 (ñ)  10: 00  11: 61 (a)  12: 00  13: 6e (n)  14: 00  15: 61 (a)  16: 00  17: 3f (?)
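A listing of this shape could be produced by a small program along the following lines. This is only a minimal sketch, not the original example: the class name EncodeDump and the method dump are hypothetical, and it assumes the input string is "\u00bfMa\u00f1ana?" (which matches the bf and f1 bytes in the ISO-8859-1 result). Each byte is shown in hex, followed by the same byte value interpreted as a character; whitespace and control values are shown without a character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class EncodeDump {
    // Encode the input with the named charset and return one
    // "index: hex (char)" entry per encoded byte.
    static List<String> dump(String input, String charsetName) {
        // Charset.encode() replaces unmappable characters (for example
        // '\u00bf' in US-ASCII) with the charset's replacement byte, '?'.
        ByteBuffer bytes = Charset.forName(charsetName).encode(input);
        List<String> entries = new ArrayList<>();
        for (int i = 0; bytes.hasRemaining(); i++) {
            int b = bytes.get() & 0xff;   // treat the byte as unsigned
            char c = (char) b;
            // Omit the character for whitespace and control values such as 0x00.
            String shown = (Character.isWhitespace(c) || Character.isISOControl(c))
                    ? "" : " (" + c + ")";
            entries.add(i + ": " + String.format("%02x", b) + shown);
        }
        return entries;
    }

    public static void main(String[] args) {
        String input = "\u00bfMa\u00f1ana?";
        String[] charsets = {
            "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
        };
        for (String name : charsets) {
            System.out.println("Charset: " + name);
            System.out.println("  input: " + input);
            System.out.println("Encoded:");
            System.out.println("  " + String.join("  ", dump(input, name)));
            System.out.println();
        }
    }
}
```

Whether the accented characters print correctly on the console depends on the terminal's own encoding, which is a separate matter from the bytes produced by each charset.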
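UTF-16BE and UTF-16LE encode each character as a two-byte numeric value. A decoder
for such data must therefore either know in advance how the data was encoded, or
have a way to determine the byte order from the encoded stream itself. The UTF-16
encoding recognizes a byte-order mark: the Unicode character \uFEFF. The mark has
its special meaning only when it occurs at the very beginning of an encoded stream.
If the value is encountered later, it is mapped according to its defined Unicode
meaning (zero-width non-breaking space). A foreign, little-endian system may
prepend the mark, which then appears in the stream as the bytes FF FE, and encode
the stream as UTF-16LE. Using the UTF-16 encoding to prepend and recognize the
byte-order mark lets systems with different internal byte orders exchange Unicode
data.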
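The role of the byte-order mark during decoding can be illustrated with a short sketch. The class name BomDecode and the sample byte arrays are made up for illustration. It relies on the documented behaviour of Java's UTF-16 charset: a mark at the start of the input selects the byte order, and big-endian is assumed when there is no mark.

```java
import java.nio.charset.StandardCharsets;

class BomDecode {
    // Decode with the generic UTF-16 charset, which inspects a leading
    // byte-order mark and falls back to big-endian if none is present.
    static String decodeUtf16(byte[] data) {
        return new String(data, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // "Ma" preceded by a big-endian mark (FE FF).
        byte[] bigEndian = {(byte) 0xfe, (byte) 0xff, 0x00, 0x4d, 0x00, 0x61};
        // "Ma" preceded by a little-endian mark (FF FE).
        byte[] littleEndian = {(byte) 0xff, (byte) 0xfe, 0x4d, 0x00, 0x61, 0x00};
        // No mark at the start; a later FE FF is an ordinary character.
        byte[] markInMiddle = {0x00, 0x4d, (byte) 0xfe, (byte) 0xff, 0x00, 0x61};

        System.out.println(decodeUtf16(bigEndian).equals("Ma"));
        System.out.println(decodeUtf16(littleEndian).equals("Ma"));
        System.out.println(decodeUtf16(markInMiddle).equals("M\ufeffa"));
    }
}
```

In the first two cases the mark is consumed and does not appear in the decoded string; in the third it survives as the character \uFEFF because it is not at the beginning of the stream.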
UTF-16BE: no byte-order mark; encodes in big-endian byte order
UTF-16LE: no byte-order mark; encodes in little-endian byte order
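The difference between the explicit-order charsets and plain UTF-16 can be checked with a few lines. This is a sketch under the assumption of standard JDK behaviour, with a hypothetical class name ExplicitOrder: UTF-16BE and UTF-16LE write no mark when encoding, and when decoding they treat a leading FE FF as an ordinary \uFEFF character rather than as a mark.

```java
import java.nio.charset.StandardCharsets;

class ExplicitOrder {
    // Number of bytes produced when encoding "A" with each charset.
    static int[] encodedLengths() {
        return new int[] {
            "A".getBytes(StandardCharsets.UTF_16BE).length, // 00 41
            "A".getBytes(StandardCharsets.UTF_16LE).length, // 41 00
            "A".getBytes(StandardCharsets.UTF_16).length    // FE FF 00 41
        };
    }

    public static void main(String[] args) {
        int[] lengths = encodedLengths();
        System.out.println("UTF-16BE: " + lengths[0] + " bytes");
        System.out.println("UTF-16LE: " + lengths[1] + " bytes");
        System.out.println("UTF-16:   " + lengths[2] + " bytes");

        // The same bytes decoded two ways: UTF-16BE keeps the leading
        // FE FF as a character, UTF-16 consumes it as a byte-order mark.
        byte[] data = {(byte) 0xfe, (byte) 0xff, 0x00, 0x41};
        String asBe = new String(data, StandardCharsets.UTF_16BE);
        String asUtf16 = new String(data, StandardCharsets.UTF_16);
        System.out.println("UTF-16BE decoded length: " + asBe.length());
        System.out.println("UTF-16 decoded length:   " + asUtf16.length());
    }
}
```

A practical consequence is that data written with UTF-16BE or UTF-16LE carries no indication of its byte order, so the reader has to be told which of the two to use.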
For more information, see Chapter 6 of Java NIO, published by O'Reilly.