Using Python 2: pitfalls with decoding, garbled print output, and more

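Contents

- 0. Preparation before using Python 2
  - (1) Set PyCharm's default encoding to UTF-8
  - (2) Set the Python 2 file template to UTF-8 (so you don't have to write it in every new file)
  - (3) Check the collation of your database columns
- 1. An unexpected discovery
- 2. Some text just refuses to decode. What now?
- 3. Printing a list or dict does not show the Chinese text (elements appear as escaped bytes)
  - (1) When printing a list
  - (2) When printing a dict
  - (3) Use the demon-revealing mirror to make it show its true form!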

Before we start:

If you can avoid Python 2, avoid it!
But if your day job really requires it, you still need to know the common pitfalls.

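0. Preparation before using Python 2

(1) Set PyCharm's default encoding to UTF-8

(2) Set the Python 2 file template to UTF-8 (so you don't have to write it in every new file)

(3) Check the collation of your database columns

utf8_bin is case-sensitive
utf8_general_ci is case-insensitive (this is the one normally used)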

1. An unexpected discovery

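# -*- coding: utf-8 -*-
import sys

# Python 2 only: reload(sys) brings back setdefaultencoding,
# which site.py removes at interpreter startup
reload(sys)
sys.setdefaultencoding('utf8')


name = "你好, Alien"
name2 = "你好Alien"
print name2, name      # print statement with two arguments
print (name2, name)    # print statement with ONE argument: a tuple
print name2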

你好Alien 你好, Alien
('\xe4\xbd\xa0\xe5\xa5\xbdAlien', '\xe4\xbd\xa0\xe5\xa5\xbd, Alien')		# The wildest part: add parentheses and print shows escaped bytes instead of the text
你好Alien
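
The parentheses matter because in Python 2 `print (name2, name)` prints a tuple, and a container is rendered with repr() of each element, which escapes non-ASCII bytes. Below is a minimal sketch of that behavior, written so that it runs on both Python 2 and Python 3. The variable names are illustrative, and the byte string is built with .encode() as a stand-in for a Python 2 str literal.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# UTF-8 bytes, standing in for a Python 2 str literal such as "你好, Alien"
name = u"你好, Alien".encode("utf-8")

# A tuple is rendered with repr() of each element,
# so the non-ASCII bytes come out as \x.. escapes
as_tuple = str((name,))
print(as_tuple)

# Decoding the element itself gives back the readable text
as_text = name.decode("utf-8")
print(as_text == u"你好, Alien")
```

Printing each element on its own, or joining the elements into one string first, avoids the escaped form.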

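2. Some text just refuses to decode. What now?

Walk enough night roads and you will eventually run into a few demons: input that fails no matter which decoding you try!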

Take this byte string, for example (note that error_msg is cut off in the middle of a multi-byte character):

init_str = b'{"m_strategy_execution_price": 6.066980440315375, "m_strategy_state": 3, "m_strategy_asset": 0.0, "m_strategy_ordered_asset": 1022656.29, "m_strategy_market_price": 0.0, "m_strategy_type": 4818, "m_client_strategy_id": 110900033, "m_strategy_price_diff": -1518.302897856544, "m_strategy_asset_diff": -557891.2899999999, "m_strategy_qty": 1540000, "m_strategy_execution_qty": 168561, "m_strategy_execution_asset": 1022656.29, "m_xtp_strategy_id": 1744072474645, "error_id": 0, "m_strategy_ordered_qty": 168561, "error_msg": "600863\xe6\x89\xa7\xe8\xa1\x8cT0\xe4\xba\xa4\xe6\x98\x9377000\xe8\x82\xa1\xef\xbc\x9b600777\xe6\x89\xa7\xe8\xa1\x8cT0\xe4\xba\xa4\xe6\x98\x9377000\xe8\x82\xa1\xef\xbc\x9b601015\xe6\x89\xa7\xe8\xa1\x8cT0\xe4\xba\xa4\xe6\x98\x9377000\xe8\x82\xa1\xef\xbc\x9b600956\xe6\x89\xa7\xe8\xa1\x8cT0\xe4\xba\xa4\xe6\x98\x9377000\xe8\x82\xa1\xef\xbc", "m_strategy_cancelled_qty": 24539, "m_strategy_unclosed_qty": -168561}'


# After N failed attempts you are probably at a loss for how to decode this at all
str_001 = init_str.decode("utf8")
print(str_001)

str_002 = init_str.decode("gbk")
print(str_002)

str_003 = init_str.decode("gb2312")
print(str_003)

str_004 = bytes.decode(init_str)
print(str_004)
...
...
# N more attempts omitted here

The lazy way out is as follows:

# Pass "ignore" so that bytes that cannot be decoded are simply skipped

str_666 = init_str.decode("utf8", "ignore")
print(str_666)

{"m_strategy_execution_price": 6.066980440315375, "m_strategy_state": 3, "m_strategy_asset": 0.0, "m_strategy_ordered_asset": 1022656.29, "m_strategy_market_price": 0.0, "m_strategy_type": 4818, "m_client_strategy_id": 110900033, "m_strategy_price_diff": -1518.302897856544, "m_strategy_asset_diff": -557891.2899999999, "m_strategy_qty": 1540000, "m_strategy_execution_qty": 168561, "m_strategy_execution_asset": 1022656.29, "m_xtp_strategy_id": 1744072474645, "error_id": 0, "m_strategy_ordered_qty": 168561, "error_msg": "600863执行T0交易77000股；600777执行T0交易77000股；601015执行T0交易77000股；600956执行T0交易77000股", "m_strategy_cancelled_qty": 24539, "m_strategy_unclosed_qty": -168561}
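
To make the effect of the error handler concrete, here is a small sketch that runs on both Python 2 and Python 3. It assumes the cause of the failure is a multi-byte character cut off at the end of the data, as in error_msg above. The sample bytes and variable names are made up for illustration, and "replace" is shown only for comparison with "ignore".

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical sample: UTF-8 bytes with the last byte cut off, so the final
# character (a full-width semicolon) is incomplete
truncated = u"执行T0交易77000股；".encode("utf-8")[:-1]

try:
    truncated.decode("utf-8")  # strict error handling is the default
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" silently drops the bytes that cannot be decoded
ignored = truncated.decode("utf-8", "ignore")
# "replace" substitutes U+FFFD, so the damage stays visible
replaced = truncated.decode("utf-8", "replace")

print(strict_failed)
print(len(ignored), len(replaced))
```

Keep in mind that "ignore" discards data without any trace. When it matters whether the input was damaged, "replace" or fixing the truncation at the source is the safer choice.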

3. Printing a list or dict does not show the Chinese text (elements appear as escaped bytes)

(1) When printing a list

my_list = ["alien", "乾坤未定，你我皆黑马！","hello,world!"]

print my_list

# The output looks like this

['alien', '\xe4\xb9\xbe\xe5\x9d\xa4\xe6\x9c\xaa\xe5\xae\x9a\xef\xbc\x8c\xe4\xbd\xa0\xe6\x88\x91\xe7\x9a\x86\xe9\xbb\x91\xe9\xa9\xac\xef\xbc\x81', 'hello,world!']

(2) When printing a dict

my_dict = {"name": "alien", "slogan": "乾坤未定，你我皆黑马！", "project": "hello,world!"}

print my_dict

# The output looks like this

{'project': 'hello,world!', 'slogan': '\xe4\xb9\xbe\xe5\x9d\xa4\xe6\x9c\xaa\xe5\xae\x9a\xef\xbc\x8c\xe4\xbd\xa0\xe6\x88\x91\xe7\x9a\x86\xe9\xbb\x91\xe9\xa9\xac\xef\xbc\x81', 'name': 'alien'}

(3) Use the demon-revealing mirror to make it show its true form!

my_list = ["alien", "乾坤未定，你我皆黑马！","hello,world!"]
my_dict = {"name": "alien", "slogan": "乾坤未定，你我皆黑马！", "project": "hello,world!"}

print "{}".format(my_dict).decode("string-escape")
print str(my_dict).decode("string-escape")
print("\n")
print "{}".format(my_list).decode("string-escape")
print str(my_list).decode("string-escape")

{'project': 'hello,world!', 'slogan': '乾坤未定，你我皆黑马！', 'name': 'alien'}
{'project': 'hello,world!', 'slogan': '乾坤未定，你我皆黑马！', 'name': 'alien'}


['alien', '乾坤未定，你我皆黑马！', 'hello,world!']
['alien', '乾坤未定，你我皆黑马！', 'hello,world!']
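
The trick works because str() of a container yields text with literal \x.. escapes, and the "string-escape" codec turns those escapes back into the raw bytes. That codec exists only in Python 2. As a different technique for code that must also run on Python 3, the sketch below uses json.dumps with ensure_ascii=False. Note that this is JSON output (double quotes), not Python's repr, and the sample dict is a shortened, illustrative version of my_dict.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import json

# Illustrative data with unicode text (u"" literals work on Python 2 and 3.3+)
my_dict = {u"name": u"alien", u"slogan": u"乾坤未定，你我皆黑马！"}

# Default behavior: non-ASCII characters are written as \uXXXX escapes
escaped = json.dumps(my_dict, sort_keys=True)

# ensure_ascii=False keeps the characters themselves in the result
readable = json.dumps(my_dict, ensure_ascii=False, sort_keys=True)

print(escaped)
```

Printing `readable` then shows the Chinese text directly, provided the terminal encoding can represent it.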

Likes and support are welcome!

Feel free to share; when reposting, please credit the source: 内存溢出

Original URL: http://outofmemory.cn/zaji/5479987.html
