~emersion/public-inbox

Fwd: [lists.sr.ht] UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 43: invalid continuation byte

Message ID: <BVAJQ1RXNCRE.16DDFWUTXGJTE@homura>
Sender timestamp: 1562250895
DKIM signature: pass

This looks like an emailthreads issue to me, can you change this to
decode(errors='ignore')?

Forwarded message from sr.ht errors on Tue Jul 2, 2019 at 1:00 PM:

Exception occured on GET https://lists.sr.ht/~philmd/qemu/patches/5556

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/python3.7/site-packages/listssrht/blueprints/patches.py", line 118, in patchset
    [m.parsed() for m in [msg] + msg.replies])
  File "/usr/lib/python3.7/site-packages/listssrht/blueprints/patches.py", line 56, in _parse_thread
    parsed = parse_thread(thread)
  File "/usr/lib/python3.7/site-packages/emailthreads/threads.py", line 186, in parse
    parse_reply(msg, in_reply_to, thread)
  File "/usr/lib/python3.7/site-packages/emailthreads/threads.py", line 146, in parse_reply
    blocks = parse_blocks(msg)
  File "/usr/lib/python3.7/site-packages/emailthreads/quotes.py", line 54, in parse_blocks
    text = get_text(msg)
  File "/usr/lib/python3.7/site-packages/emailthreads/util.py", line 39, in get_text
    text = text_part.get_payload(decode=True).decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 43: invalid continuation byte


Request body:

(no request body)

Request headers:

Host: lists.sr.ht
X-Real-Ip: [redacted]
X-Forwarded-For: [redacted]
X-Forwarded-Proto: https
Connection: close
Cache-Control: no-cache
Pragma: no-cache
Accept: */*
Accept-Encoding: gzip, deflate
From: bingbot(at)microsoft.com
User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

Message ID: <iqxirkaPTUIOTC6Ezxuyf4h5_m1REw__qayBW_tebemhjOaZB8B3mnXFnIx0xJXaHo81F1YXjuF5sp4mMPtRqHFc2WS9N8_hG0fOsVdbUSw=@emersion.fr>
In-Reply-To: <BVAJQ1RXNCRE.16DDFWUTXGJTE@homura>
Sender timestamp: 1562269553
DKIM signature: pass

On Thursday, July 4, 2019 5:34 PM, Drew DeVault <sir@cmpwn.com> wrote:
> This looks like an emailthreads issue to me, can you change this to
> decode(errors='ignore')?

I've pushed a commit which uses errors='replace'.
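
For reference, here is a minimal sketch of how the two error handlers discussed above differ on a stray 0xe9 byte. It is not the actual emailthreads commit. The sample bytes and the helper get_text_lenient() are hypothetical, and honoring the part's declared charset is an assumption on my side, not something the thread confirms was done.

```python
import email

# Hypothetical sample: "café" encoded as Latin-1, so "é" is the lone byte 0xe9.
raw = "Subject: caf\u00e9 patch".encode("latin-1")

# Strict decoding (the default) fails the same way as in the traceback.
try:
    raw.decode("utf-8")
    strict_reason = None
except UnicodeDecodeError as exc:
    strict_reason = exc.reason

# errors='ignore' silently drops the offending byte.
ignored = raw.decode("utf-8", errors="ignore")

# errors='replace' substitutes U+FFFD, so the data loss stays visible.
replaced = raw.decode("utf-8", errors="replace")


def get_text_lenient(part):
    """Hypothetical helper: decode a text part without ever raising.

    Tries the charset declared in Content-Type first and falls back to
    UTF-8; undecodable bytes become U+FFFD in both cases.
    """
    payload = part.get_payload(decode=True)
    charset = part.get_content_charset() or "utf-8"
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown or misspelled charset name in the header.
        return payload.decode("utf-8", errors="replace")


# Hypothetical message that declares its charset correctly.
msg = email.message_from_bytes(
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)
text = get_text_lenient(msg)
```

With 'replace', the reader sees a replacement character where the byte could not be decoded, whereas 'ignore' hides the problem entirely. Looking at the declared charset first would recover the original text when the header is correct, at the cost of trusting the sender's header.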