>>> from env_helper import info; info()

Page last updated: 2024-01-23 21:49:24
Runtime environment:
    Linux distribution: Debian GNU/Linux 12 (bookworm)
    OS kernel: Linux-6.1.0-17-amd64-x86_64-with-glibc2.36
    Python version: 3.11.2

9.2. Decoding UTF-8 escape strings in Python 3

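Suppose you run into a UTF-8 escape string such as '\xe4\xb8\xad\xe5\x9b\xbd', where the backslash sequences are literal text rather than something written in your own source code. The string comes from somewhere else and cannot be changed at the source, so it needs a special decoding step.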

In Python 2 this could be solved directly with decode("string_escape"). In Python 3, however, the str type has no decode method, so what can be done?

There are two approaches. The first comes from Stack Overflow:

https://stackoverflow.com/questions/26311277/evaluate-utf-8-literal-escape-sequences-in-a-string-in-python3

>>> s = r'\xe4\xb8\xad\xe5\x9b\xbd'
>>>
>>> c = s.encode().decode('unicode-escape').encode('raw_unicode_escape').decode('utf-8')
>>>
>>> print(c)

中国

That's right: after decode('unicode-escape'), the string actually becomes '\xe4\xb8\xad\xe5\x9b\xbd', that is, six characters whose code points equal the original UTF-8 byte values. From there the usual .encode('raw_unicode_escape').decode('utf-8') recovers the text.
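To make the chain easier to follow, here is a minimal sketch that splits the one-liner into its four steps. The variable names step1 through step4 are only illustrative and do not come from the original answer.

```python
# Step-by-step version of the one-liner, using the same input as above.
s = r'\xe4\xb8\xad\xe5\x9b\xbd'             # 24 ASCII characters: literal backslash, x, two hex digits

step1 = s.encode()                           # bytes that still contain the literal backslash sequences
step2 = step1.decode('unicode-escape')       # str of 6 characters: U+00E4, U+00B8, U+00AD, ...
step3 = step2.encode('raw_unicode_escape')   # the 6 real UTF-8 bytes
step4 = step3.decode('utf-8')                # the intended text

print(len(s), len(step2), len(step3), step4)  # 24 6 6 中国
```

The key point is step2: the escapes have been interpreted, but each byte value is stored as a separate character, so one more encode/decode round trip is needed.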

The second approach:

>>> s = r'\xe4\xbd\xa0\xe5\xa5\xbd'
>>> eval("print('"+s+"'.encode('raw_unicode_escape').decode('utf-8'))")
>>>
>>> # Wrapped as a function
>>> def getUtf8Escape(s):
...     return eval("'" + s + "'.encode('raw_unicode_escape').decode('utf-8')")
...
>>> print(getUtf8Escape(s))

你好
你好

A crude approach, but it does work.
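Since eval executes whatever text it is handed, it is risky on strings that come from outside the program. One possible alternative, sketched here under the assumption that the input contains only ASCII characters and no single quotes, is to build a bytes literal and parse it with ast.literal_eval, which accepts literals only. The helper name decode_utf8_escape is hypothetical.

```python
import ast

def decode_utf8_escape(s):
    """Hypothetical helper: decode a string of literal \\xNN escapes as UTF-8."""
    # Build the source text of a bytes literal; literal_eval parses literals
    # only and raises ValueError or SyntaxError on anything else.
    raw = ast.literal_eval("b'" + s + "'")
    return raw.decode('utf-8')

print(decode_utf8_escape(r'\xe4\xbd\xa0\xe5\xa5\xbd'))  # 你好
```

Input that tries to smuggle in an expression is rejected with an exception instead of being executed.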

Incidentally, if what you have is a Unicode escape string, or UTF-8 byte values that are already unescaped, the fix is simple.

Unicode escapes as plain text:

>>> a = r'\u8bf7'
>>> b = a.encode().decode("unicode_escape")
>>> print(b)

请
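One caveat worth illustrating: the unicode_escape codec interprets every non-escape byte as Latin-1, so this trick is reliable only when the input is pure ASCII. The sketch below uses made-up sample strings (escaped, mixed) to show both the working case and the failure.

```python
# Pure ASCII input: every character is written as a \uXXXX escape.
escaped = r'\u8bf7\u95ee'
print(escaped.encode('ascii').decode('unicode_escape'))  # 请问

# Mixed input: one real non-ASCII character followed by an escape.
mixed = '请' + r'\u95ee'
garbled = mixed.encode('utf-8').decode('unicode_escape')
print(garbled == '请问')  # False: the UTF-8 bytes of '请' were read as Latin-1
```

Encoding with 'ascii' in the first case is a deliberate choice: it fails loudly with UnicodeEncodeError if the input turns out not to be pure ASCII.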

>>> # UTF-8 byte values stored as characters
>>> a = '\xe4\xbd\xa0\xe5\xa5\xbd'
>>> b = a.encode('raw_unicode_escape').decode('utf-8')
>>> b

'你好'
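For a string like this one, where every character is below U+0100, encoding with 'latin-1' produces the same bytes as 'raw_unicode_escape'. The following sketch compares the two; the names via_raw and via_latin1 are illustrative only.

```python
a = '\xe4\xbd\xa0\xe5\xa5\xbd'   # 6 characters, each below U+0100

via_raw = a.encode('raw_unicode_escape')
via_latin1 = a.encode('latin-1')

print(via_raw == via_latin1)        # True
print(via_latin1.decode('utf-8'))   # 你好
```

The two codecs differ once a character above U+00FF appears: 'latin-1' raises UnicodeEncodeError, while 'raw_unicode_escape' writes it out as a \uXXXX sequence.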
