The \ufeff character at the start of a file
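While translating a .po file, I got an error saying the file was invalid. After some digging, the cause turned out to be an extra \ufeff character at the very start of the file. \ufeff is the Unicode byte order mark (BOM).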
For UTF-16, for example, if the receiver reads the BOM as FE FF, the byte stream is big-endian; if it reads FF FE, the byte stream is little-endian.
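A short Python sketch of the byte-order point, using only the standard library. The BOM is the code point U+FEFF, and its bytes depend on the byte order chosen. The sample string "po" is only an illustration.

```python
import codecs

BOM = "\ufeff"

# The same code point serializes differently under each byte order.
be_bytes = BOM.encode("utf-16-be")  # FE FF
le_bytes = BOM.encode("utf-16-le")  # FF FE
print(be_bytes.hex(), le_bytes.hex())

# The generic "utf-16" decoder reads the BOM, picks the byte order, and drops the BOM.
text = "po"
from_be = (codecs.BOM_UTF16_BE + text.encode("utf-16-be")).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + text.encode("utf-16-le")).decode("utf-16")
print(from_be == from_le == text)
```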
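UTF-8 does not need a BOM to indicate byte order, but a BOM can still be used to signal "this is UTF-8". The UTF-8 encoding of the BOM is EF BB BF (you can see it by opening the file in UltraEdit and switching to hex mode). So a receiver that sees a byte stream starting with EF BB BF knows it is UTF-8.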
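The EF BB BF check can also be done in Python instead of a hex editor. This is a minimal sketch; the helper name has_utf8_bom is made up for illustration.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte stream starts with EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)

# U+FEFF encoded as UTF-8 gives the three bytes EF BB BF.
bom_bytes = "\ufeff".encode("utf-8")
print(bom_bytes.hex())

body = "试试编码".encode("utf-8")
print(has_utf8_bom(bom_bytes + body))  # stream with a BOM
print(has_utf8_bom(body))              # stream without a BOM
```

For a file on disk, passing the first three bytes is enough, for example the result of reading 3 bytes from the file opened in "rb" mode.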
So the problem was the file's encoding. Opening the file in Windows Notepad and saving it again with Save As fixed it.
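A possible scripted alternative to re-saving by hand, assuming the only problem is a leading UTF-8 BOM. The helper name strip_utf8_bom and the demo file are hypothetical; the sketch works on raw bytes so that the rest of the file, including line endings, is left unchanged.

```python
import codecs
import os
import tempfile

def strip_utf8_bom(path: str) -> bool:
    """Rewrite the file without a leading UTF-8 BOM. Return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True

# Demo on a throwaway file rather than the real .po file.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.po")
    with open(demo, "wb") as f:
        f.write(codecs.BOM_UTF8 + "试试编码".encode("utf-8"))
    print(strip_utf8_bom(demo))  # first call removes the BOM
    print(strip_utf8_bom(demo))  # second call finds nothing to remove
```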
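# coding=utf-8
# Case 1: a UTF-8 file that starts with a BOM
with open("aa.po", "r", encoding="utf-8") as f:
    file = f.read()
file1 = file.split(",")
print(file1)  # the BOM is kept and shows up as '\ufeff'
file2 = file.encode("utf-8").decode("utf-8-sig")
print(file2)  # decoding with utf-8-sig drops the BOM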
['\ufeff试试编码']
试试编码
Process finished with exit code 0
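# coding=utf-8
# Case 2: a UTF-8 file without a BOM
with open("aautf8.txt", "r", encoding="utf-8") as f:
    file = f.read()
file1 = file.split(",")
print(file1)  # no '\ufeff' this time
file2 = file.encode("utf-8").decode("utf-8-sig")
print(file2)  # utf-8-sig also accepts input that has no BOM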
['试试编码']
试试编码
Process finished with exit code 0
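Both runs give the same final text, so the encode/decode round trip can be skipped by opening the file with utf-8-sig from the start. A minimal sketch; the helper name read_text_without_bom and the temporary file names are made up for illustration.

```python
import os
import tempfile

def read_text_without_bom(path: str) -> str:
    """Read a UTF-8 text file, dropping a leading BOM if there is one."""
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    with_bom = os.path.join(tmp, "with_bom.txt")
    without_bom = os.path.join(tmp, "without_bom.txt")
    # Writing with utf-8-sig puts EF BB BF at the start of the file.
    with open(with_bom, "w", encoding="utf-8-sig") as f:
        f.write("试试编码")
    with open(without_bom, "w", encoding="utf-8") as f:
        f.write("试试编码")
    # Both files read back as the same string, with no '\ufeff' in front.
    print(read_text_without_bom(with_bom) == read_text_without_bom(without_bom))
```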
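# coding=utf-8
# Case 3: a file saved as ANSI, read as if it were UTF-8
with open("aaansi.txt", "r", encoding="utf-8") as f:
    file = f.read()  # fails here: the bytes are not valid UTF-8
file1 = file.split(",")
print(file1)
file2 = file.encode("utf-8").decode("utf-8-sig")
print(file2)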
Traceback (most recent call last):
  File "D:/odoo141229/调试/filebm.py", line 5, in <module>
    file = f.read()
  File "D:\odsoft\python37\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xca in position 0: invalid continuation byte
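The error means the bytes starting at 0xca are not valid UTF-8, which fits a file saved in a legacy code page. A minimal sketch of one way to cope, assuming the ANSI file is GBK (the usual ANSI code page on Simplified Chinese Windows). That fallback codec is an assumption, not something the traceback proves, and the helper name decode_text is made up for illustration.

```python
def decode_text(data: bytes, fallback: str = "gbk") -> str:
    """Decode as UTF-8 (dropping any BOM); if that fails, retry with `fallback`."""
    try:
        return data.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: input that is not UTF-8 is in the code page named by `fallback`.
        return data.decode(fallback)

print(decode_text("试试编码".encode("gbk")))            # legacy-encoded bytes
print(decode_text("\ufeff试试编码".encode("utf-8")))    # UTF-8 bytes with a BOM
```

A wrong fallback can still decode without raising and produce garbled text, so the result is worth checking by eye.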