unicode:bom_to_encoding/1
Detects the UTF byte order mark of a binary
Usage:
bom_to_encoding(Bin) -> {Encoding, Length}
Internal implementation:
-spec bom_to_encoding(Bin) -> {Encoding, Length} when
      Bin :: binary(),
      Encoding :: 'latin1' | 'utf8'
                | {'utf16', endian()}
                | {'utf32', endian()},
      Length :: non_neg_integer().
bom_to_encoding(<<239,187,191,_/binary>>) ->
    {utf8,3};
bom_to_encoding(<<0,0,254,255,_/binary>>) ->
    {{utf32,big},4};
bom_to_encoding(<<255,254,0,0,_/binary>>) ->
    {{utf32,little},4};
bom_to_encoding(<<254,255,_/binary>>) ->
    {{utf16,big},2};
bom_to_encoding(<<255,254,_/binary>>) ->
    {{utf16,little},2};
bom_to_encoding(Bin) when is_binary(Bin) ->
    {latin1,0}.
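Detects the UTF byte order mark (BOM) at the start of the binary Bin and returns the corresponding encoding together with the length of the BOM in bytes.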
unicode:bom_to_encoding(<<239,187,191>>).
unicode:bom_to_encoding(<<0,0,254,255>>).
unicode:bom_to_encoding(<<255,254>>).
If no byte order mark is found, {latin1,0} is returned.
unicode:bom_to_encoding(<<"abc">>).
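Because the returned Length is the size of the BOM in bytes, it can be used to drop the BOM before the text is decoded. The following is a minimal sketch and not part of the unicode module: the module name bom_demo and the function strip_bom/1 are hypothetical names chosen for illustration.

```erlang
-module(bom_demo).
-export([strip_bom/1]).

%% Hypothetical helper: returns the detected encoding together with the
%% part of the binary that follows the BOM.
-spec strip_bom(binary()) -> {unicode:encoding(), binary()}.
strip_bom(Bin) when is_binary(Bin) ->
    {Encoding, Length} = unicode:bom_to_encoding(Bin),
    %% Length is 0 when no BOM is found, so Rest is then the whole binary.
    <<_Bom:Length/binary, Rest/binary>> = Bin,
    {Encoding, Rest}.
```

For example, bom_demo:strip_bom(<<239,187,191,"abc">>) returns {utf8,<<"abc">>}, while a binary without a BOM comes back unchanged and tagged latin1.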
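The following reads the first bytes of the file test.txt, detects its encoding from the BOM, and sets that encoding on the file's I/O device: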
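%% Open in binary mode and read 4 bytes, the length of the longest BOM.
{ok, File} = file:open("test.txt", [read, binary]),
{ok, Bin} = file:read(File, 4),
{Encoding, Length} = unicode:bom_to_encoding(Bin),
%% Move back to the first byte after the BOM (offset 0 if there is none).
{ok, Length} = file:position(File, Length),
%% Subsequent io reads on File decode using the detected encoding.
io:setopts(File, [{encoding, Encoding}]).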
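The same steps can be wrapped in a function that also copes with an empty file and positions the device just past the BOM before any text is read. This is a sketch under those assumptions: bom_file and open_for_reading/1 are hypothetical names, and a real caller would probably want to return errors instead of crashing on a failed match.

```erlang
-module(bom_file).
-export([open_for_reading/1]).

%% Hypothetical helper: opens Path, detects a BOM, skips it, and sets the
%% detected encoding on the returned I/O device.
open_for_reading(Path) ->
    {ok, File} = file:open(Path, [read, binary]),
    %% An empty file yields eof; treat it as having no BOM.
    Head = case file:read(File, 4) of
               {ok, Bin} -> Bin;
               eof -> <<>>
           end,
    {Encoding, Length} = unicode:bom_to_encoding(Head),
    %% Reposition just past the BOM (offset 0 when none was found).
    {ok, Length} = file:position(File, Length),
    ok = io:setopts(File, [{encoding, Encoding}]),
    {ok, File, Encoding}.
```

After {ok, File, Encoding} = bom_file:open_for_reading("test.txt"), functions such as io:get_line(File, '') read text from the first character after the BOM. The file is opened without the raw option on purpose, since io:setopts/2 needs an I/O device process to talk to.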