urllib.urlencode doesn't like unicode values: how about this workaround?

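You should indeed be nervous. The whole idea that some data structure might hold a mixture of bytes and text is horrifying. It violates the fundamental principle of working with string data: decode at input time, work exclusively in unicode, encode at output time.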
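The principle can be sketched in a few lines. This is only an illustration, written in Python 3 syntax (an assumption; the rest of this answer is Python 2, where the corresponding types are `str` and `unicode`). The function name `handle_request` and the upper-casing step are hypothetical stand-ins for whatever real processing happens in between.

```python
# Sketch only: Python 3 bytes/str stand in for Python 2 str/unicode.
# handle_request is a hypothetical name, not part of any library.

def handle_request(raw: bytes) -> bytes:
    text = raw.decode("utf-8")       # decode at input time
    result = text.upper()            # work exclusively with text
    return result.encode("utf-8")    # encode at output time


# Bytes come in, bytes go out; everything in between is text.
print(handle_request("caf\u00e9".encode("utf-8")))
```

With this shape, bytes and text never sit side by side in the same data structure, which is the situation the paragraph above warns against.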

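Update in response to comment:

You are about to output some sort of HTTP request, and that needs to be prepared as a byte string. It is indeed unfortunate that urllib.urlencode cannot properly prepare that byte string when your dict contains unicode characters with ordinal >= 128. If your dict holds a mixture of byte strings and unicode strings, you need to be careful. Let's examine just what urlencode() does: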

    >>> import urllib
    >>> tests = ['\x80', '\xe2\x82\xac', 1, '1', u'1', u'\x80', u'\u20ac']
    >>> for test in tests:
    ...     print repr(test), repr(urllib.urlencode({'a':test}))
    ...
    '\x80' 'a=%80'
    '\xe2\x82\xac' 'a=%E2%82%AC'
    1 'a=1'
    '1' 'a=1'
    u'1' 'a=1'
    u'\x80'
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "C:\python27\lib\urllib.py", line 1282, in urlencode
        v = quote_plus(str(v))
    UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in position 0: ordinal not in range(128)

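The last two tests demonstrate the problem with urlencode(). Now let's look at the str tests.

If you insist on having a mixture, then you should at the very least ensure that the str objects are encoded in UTF-8.

'\x80' is suspicious: it is not the result of any_valid_unicode_string.encode('utf8').
'\xe2\x82\xac' is OK; it is the result of u'\u20ac'.encode('utf8').
'1' is OK: all ASCII characters are fine as input to urlencode(), which will percent-encode characters such as '%' if necessary.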
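The three cases above can be checked mechanically by attempting a UTF-8 decode. The following is a minimal sketch in Python 3 syntax (an assumption; in Python 2 the same test would be `s.decode('utf8')` on a `str`). The helper name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same three byte strings discussed above.
for sample in (b"\x80", b"\xe2\x82\xac", b"1"):
    print(repr(sample), is_valid_utf8(sample))
```

A lone 0x80 byte is a UTF-8 continuation byte with no lead byte before it, which is why it can never be the output of a valid encode.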

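Here is a suggested converter function. Unlike yours, it does not mutate the input dict and then return it; it returns a new dict. It forces an exception if a value is a str object that is not a valid UTF-8 string. By the way, your concern about it not handling nested objects is a little misdirected: your code works only with dicts, and the concept of nested dicts doesn't really fly.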

    def encoded_dict(in_dict):
        out_dict = {}
        for k, v in in_dict.iteritems():
            if isinstance(v, unicode):
                v = v.encode('utf8')
            elif isinstance(v, str):
                # Must be encoded in UTF-8
                v.decode('utf8')
            out_dict[k] = v
        return out_dict

Here is the output, using the same tests in reverse order (because the nasty one is at the front this time):

    >>> for test in tests[::-1]:
    ...     print repr(test), repr(urllib.urlencode(encoded_dict({'a':test})))
    ...
    u'\u20ac' 'a=%E2%82%AC'
    u'\x80' 'a=%C2%80'
    u'1' 'a=1'
    '1' 'a=1'
    1 'a=1'
    '\xe2\x82\xac' 'a=%E2%82%AC'
    '\x80'
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "<stdin>", line 8, in encoded_dict
      File "C:\python27\lib\encodings\utf_8.py", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte
    >>>
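For readers on Python 3, the same idea might look roughly like the sketch below. It is an assumption-laden port, not part of the original answer: `unicode` becomes `str`, the byte-string type becomes `bytes`, `iteritems()` becomes `items()`, and `urlencode` lives in `urllib.parse`. The name `encoded_dict_py3` is hypothetical.

```python
from urllib.parse import urlencode


def encoded_dict_py3(in_dict):
    """Hypothetical Python 3 port: return a new dict with text values as UTF-8 bytes."""
    out_dict = {}
    for k, v in in_dict.items():
        if isinstance(v, str):
            v = v.encode("utf-8")
        elif isinstance(v, bytes):
            # Must already be UTF-8; raises UnicodeDecodeError otherwise.
            v.decode("utf-8")
        out_dict[k] = v
    return out_dict


# Same test values as above, expressed with Python 3 literals.
for test in ["\u20ac", "\x80", "1", b"1", 1, b"\xe2\x82\xac"]:
    print(repr(test), repr(urlencode(encoded_dict_py3({"a": test}))))
```

As in the Python 2 version, passing the byte string `b"\x80"` is rejected with a `UnicodeDecodeError` instead of being silently percent-encoded.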

Does that help?

Source: 内存溢出, original URL: http://outofmemory.cn/zaji/5647841.html
