python decode error - UnicodeDecodeError: 'utf-8' codec can't decode
Download file from web in Python 3
decode 시에 error 가 발생하면 error 파라미터를 같이 넘긴다.
- snippet.py
>>> html = html.decode('utf-8') Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 38395: invalid start byte
The errors argument specifies the response when the input string can’t be converted according to the encoding’s rules. Legal values for this argument are 'strict' (raise a UnicodeDecodeError exception), 'replace' (use U+FFFD, REPLACEMENT CHARACTER), 'ignore' (just leave the character out of the Unicode result), or 'backslashreplace' (inserts a \xNN escape sequence). The following examples show the differences:
https://docs.python.org/3/howto/unicode.html#the-unicode-type
- snippet.py
>>> b'\x80abc'.decode("utf-8", "strict") Traceback (most recent call last): ... UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte >>> b'\x80abc'.decode("utf-8", "replace") '\ufffdabc' >>> b'\x80abc'.decode("utf-8", "backslashreplace") '\\x80abc' >>> b'\x80abc'.decode("utf-8", "ignore") 'abc'
관련 문서
Plugin Backlinks: 아무 것도 없습니다.