~sircmpwn/sr.ht-discuss

Email header corruption in mbox archives

Message-ID: <1e5ffe85-4347-43a2-b54d-d11ad937474b@www.fastmail.com>
DKIM signature: pass

Hi folks,

I'm currently testing out patatt[1] which inserts additional headers into patch emails to allow them to be cryptographically verified, providing attestation that a patch does indeed come from the purported author. I'd like to use this as part of my workflow for handling patches submitted to a mailing list hosted on sr.ht.

I set up a test mailing list and sent a signed patch by email. The headers can be seen in the raw email [2]:

    X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;
        bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=;
        b=owGbwMvMwCF2w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7 l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ8+0MF7GrfzWnP K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=
    X-Developer-Key: i=paul@pbarker.dev; a=openpgp; fpr=D2DDFDAE30017AF4CB62AA96A67255DFCCE62ECD

However, looking at the mbox export of this message [3], the headers appear mangled:

    X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;
     bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=; =?utf-8?q?b=3DowGbwMvMwCF2?=
     =?utf-8?q?w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7_?=
     =?utf-8?q?l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ?=
     =?utf-8?q?8+0MF7GrfzWnP?=
     K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=
    X-Developer-Key:
     i=paul@pbarker.dev; a=openpgp; fpr=D2DDFDAE30017AF4CB62AA96A67255DFCCE62ECD

This mangling causes patatt to die when trying to parse the headers in the downloaded mbox file. That's obviously a bug in patatt, it should reject the signature rather than crashing, but that's a bug to solve elsewhere. I think the header corruption when exporting an mbox file is likely to be a bug in sourcehut as email headers should be passed through without modification. Am I right on that? Or is there some user error/misunderstanding I haven't noticed?

[1]: https://pypi.org/project/patatt/
[2]: https://lists.sr.ht/~pbarker/test/%3C20210530120341.14262-1-paul%40pbarker.dev%3E/raw
[3]: https://lists.sr.ht/~pbarker/test/%3C20210530120341.14262-1-paul%40pbarker.dev%3E/mbox

Thanks,

-- 
Paul Barker
https://pbarker.dev/

Message-ID: <CBQL12B7MAWL.3AXFJ89N79X4R@taiga>
In-Reply-To: <1e5ffe85-4347-43a2-b54d-d11ad937474b@www.fastmail.com>
DKIM signature: fail

On Sun May 30, 2021 at 8:32 AM EDT, Paul Barker wrote:
> X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;
> bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=;
> b=owGbwMvMwCF2w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7
> l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ8+0MF7GrfzWnP
> K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=
> X-Developer-Key: i=paul@pbarker.dev; a=openpgp;
> fpr=D2DDFDAE30017AF4CB62AA96A67255DFCCE62ECD
>
> However, looking at the mbox export of this message [3], the headers
> appear mangled:
>
> X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;
> bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=;
> =?utf-8?q?b=3DowGbwMvMwCF2?=
> =?utf-8?q?w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7_?=
> =?utf-8?q?l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ?=
> =?utf-8?q?8+0MF7GrfzWnP?=
> K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=
> X-Developer-Key:
> i=paul@pbarker.dev; a=openpgp;
> fpr=D2DDFDAE30017AF4CB62AA96A67255DFCCE62ECD
>
> This mangling causes patatt to die when trying to parse the headers in
> the downloaded mbox file. That's obviously a bug in patatt, it should
> reject the signature rather than crashing, but that's a bug to solve
> elsewhere. I think the header corruption when exporting an mbox file is
> likely to be a bug in sourcehut as email headers should be passed
> through without modification. Am I right on that? Or is there some user
> error/misunderstanding I haven't noticed?

At a glance, this looks like an acceptable re-encoding of the same
header. Technically seems more correct since the headers in the original
email exceed the length limit. Shouldn't your mail parser be able to
handle this?

Message-ID: <O0VqPi4mAJh9GqrwnDQUQ1ZzPc1l1fUAv0fKG0LP4YRR8WYhZVTy1XR3Ay5-jKA2F-omnHSz4HaFoDz5uOq8jE6t6uKxzTx6dHsykGqTdZQ=@emersion.fr>
In-Reply-To: <CBQL12B7MAWL.3AXFJ89N79X4R@taiga>
DKIM signature: pass

On Sunday, May 30th, 2021 at 2:34 PM, Drew DeVault <sir@cmpwn.com> wrote:

> At a glance, this looks like an acceptable re-encoding of the same
> header. Technically seems more correct since the headers in the original
> email exceed the length limit.

The 78 char limit is a soft limit. The hard limit is 998 chars.

It's incorrect in general to use quoted-printable encoding in arbitrary
header fields. Only a few select header fields support quoted-printable
encoding, the RFCs explicitly state when it's allowed.

Message-ID: <CBQLFQMYLK2I.2MCGK7L3KJH77@taiga>
In-Reply-To: <O0VqPi4mAJh9GqrwnDQUQ1ZzPc1l1fUAv0fKG0LP4YRR8WYhZVTy1XR3Ay5-jKA2F-omnHSz4HaFoDz5uOq8jE6t6uKxzTx6dHsykGqTdZQ=@emersion.fr>
DKIM signature: pass

On Sun May 30, 2021 at 8:52 AM EDT, Simon Ser wrote:
> The 78 char limit is a soft limit. The hard limit is 998 chars.
>
> It's incorrect in general to use quoted-printable encoding in arbitrary
> header fields. Only a few select header fields support quoted-printable
> encoding, the RFCs explicitly state when it's allowed.

Gotcha. Probably another obnoxious Python issue, then.

Message-ID: <4LUuYrzCgoLInaGige21bur30YDUv8omUDDKOo7oodVgqmBylbgXI6LaKD15P_Hsy1T33_ekSaPCXrihzpbf7C1jWqColmgBNNp2rNoWVYc=@emersion.fr>
In-Reply-To: <CBQLFQMYLK2I.2MCGK7L3KJH77@taiga>
DKIM signature: pass

On Sunday, May 30th, 2021 at 2:53 PM, Drew DeVault <sir@cmpwn.com> wrote:

> Gotcha. Probably another obnoxious Python issue, then.

Yeah, I think we'll want to always set max_line_length=998 in the email
policy. I have an old WIP patch that tries to unify sr.ht's email lib
usage; I'll try to re-spin it and see if it fixes this issue.
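
For readers following along, here is a minimal sketch of what raising the limit looks like with Python's standard `email` package. It is not sr.ht's code. The message text, the `LONG_VALUE` placeholder and the `reserialize` helper are made up for illustration. The only assumption about sr.ht is that the export re-serializes parsed messages under a policy with the default 78-character limit.

```python
from email import message_from_string, policy

# Hypothetical stand-in for a signature value: one long token with no
# whitespace where the header could be folded.
LONG_VALUE = "v=1; a=openpgp-sha256; h=from:subject; b=" + "A" * 200

RAW = (
    "From: someone@example.org\n"
    "Subject: [PATCH] example\n"
    f"X-Developer-Signature: {LONG_VALUE}\n"
    "\n"
    "body\n"
)


def reserialize(raw: str, max_line_length: int) -> str:
    """Parse a message and write it back out under the given line-length limit."""
    pol = policy.default.clone(max_line_length=max_line_length)
    return message_from_string(raw, policy=pol).as_string()


# With the 998 hard limit the header is short enough to be left alone,
# so it is written back exactly as it was received.
relaxed = reserialize(RAW, 998)
print(f"X-Developer-Signature: {LONG_VALUE}\n" in relaxed)

# With the 78 soft limit the library has to refold the header. How it
# does so depends on the Python version, so no output is claimed here.
strict = reserialize(RAW, 78)
print("=?utf-8?" in strict)
```

The design point is that a header is only refolded when one of its lines exceeds `max_line_length`, so a 998 limit leaves any header that is already legal on the wire untouched.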

Message-ID: <1cd4d72f-b268-43f7-8edb-62edb9446712@www.fastmail.com>
In-Reply-To: <CBQL12B7MAWL.3AXFJ89N79X4R@taiga>
DKIM signature: pass

On Sun, 30 May 2021, at 13:34, Drew DeVault wrote:
> On Sun May 30, 2021 at 8:32 AM EDT, Paul Barker wrote:
> > X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;
> > bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=;
> > b=owGbwMvMwCF2w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7
> > l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ8+0MF7GrfzWnP
> > K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=
> > X-Developer-Key: i=paul@pbarker.dev; a=openpgp;
> > fpr=D2DDFDAE30017AF4CB62AA96A67255DFCCE62ECD
> >
> > However, looking at the mbox export of this message [3], the headers
> > appear mangled:
> >
> > X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;
> > bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=;
> > =?utf-8?q?b=3DowGbwMvMwCF2?=
> > =?utf-8?q?w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7_?=
> > =?utf-8?q?l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ?=
> > =?utf-8?q?8+0MF7GrfzWnP?=
> > K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=
> > X-Developer-Key:
> > i=paul@pbarker.dev; a=openpgp;
> > fpr=D2DDFDAE30017AF4CB62AA96A67255DFCCE62ECD
> >
> > This mangling causes patatt to die when trying to parse the headers in
> > the downloaded mbox file. That's obviously a bug in patatt, it should
> > reject the signature rather than crashing, but that's a bug to solve
> > elsewhere. I think the header corruption when exporting an mbox file is
> > likely to be a bug in sourcehut as email headers should be passed
> > through without modification. Am I right on that? Or is there some user
> > error/misunderstanding I haven't noticed?
> 
> At a glance, this looks like an acceptable re-encoding of the same
> header. Technically seems more correct since the headers in the original
> email exceed the length limit. Shouldn't your mail parser be able to
> handle this?
> 

Looks like I can decode this with a bit of Python:

    >>> import mailbox, email.header
    >>> msg = mailbox.mbox('patch.mbx')[0]
    >>> encoded_header = msg.get('X-Developer-Signature')
    >>> print(encoded_header)
    v=1; a=openpgp-sha256; l=672; h=from:subject;
     bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=; =?utf-8?q?b=3DowGbwMvMwCF2?=
     =?utf-8?q?w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7_?=
     =?utf-8?q?l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ?=
     =?utf-8?q?8+0MF7GrfzWnP?=
     K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=
    >>> decoded_header = str(email.header.make_header(email.header.decode_header(encoded_header)))
    >>> print(decoded_header)
    v=1; a=openpgp-sha256; l=672; h=from:subject; bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=; b=owGbwMvMwCF2w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7 l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ8+0MF7GrfzWnP K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA=

So maybe I could extend patatt to handle this.
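
A rough sketch of what that could look like as a helper, reusing the same `email.header` calls as above. This is not patatt's actual parser: `parse_signature_header` is a hypothetical name, and the `tag=value; tag=value` splitting assumes DKIM-style syntax in which whitespace inside a value is insignificant. That assumption should be checked against patatt before relying on it.

```python
from email.header import decode_header, make_header

# The mangled header value from the mbox export, as returned by msg.get().
MANGLED = (
    "v=1; a=openpgp-sha256; l=672; h=from:subject;\n"
    " bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=; =?utf-8?q?b=3DowGbwMvMwCF2?=\n"
    " =?utf-8?q?w7xIXuiX9CvG02pJDAmb67lTNi0+IeF97TL76vtKD7xjSjaluz0o/KfmZLX8rMi7_?=\n"
    " =?utf-8?q?l3M6O0pZGMQ4GGTFFFl2z951+fqDJVt7b0gHw8xhZQIZwsDFKQATydFhZJi+fFfvJ?=\n"
    " =?utf-8?q?8+0MF7GrfzWnP?=\n"
    " K7mAM/3n/r/UC+bprf6/g114QYGdbHcsaK7b1nanfA4IeZi1V0lL26cruXUWxgSEnNDP1FrAA="
)


def parse_signature_header(value: str) -> dict:
    """Decode any encoded-words, then split the value into tag=value fields."""
    decoded = str(make_header(decode_header(value)))
    fields = {}
    for part in decoded.split(";"):
        name, _, val = part.strip().partition("=")
        if not name:
            continue  # skip an empty piece after a trailing semicolon
        # Assumption: whitespace inside a value carries no meaning.
        fields[name] = "".join(val.split())
    return fields


fields = parse_signature_header(MANGLED)
print(sorted(fields))
print(fields["bh"])
```

Splitting on the first `=` only (via `partition`) matters here, because the base64 values end in `=` padding.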

It's probably also worth extending patatt to produce shorter header lines in the first place.

... And probably worth switching to a mail client which wraps emails correctly when sending, I'll do that today ...

Thanks,

-- 
Paul Barker
https://pbarker.dev/

Message-ID: <20210530152817.00007a21@pbarker.dev>
In-Reply-To: <1cd4d72f-b268-43f7-8edb-62edb9446712@www.fastmail.com>
DKIM signature: pass

On Sun, 30 May 2021 14:00:47 +0100
"Paul Barker" <paul@pbarker.dev> wrote:

> On Sun, 30 May 2021, at 13:34, Drew DeVault wrote:
> > On Sun May 30, 2021 at 8:32 AM EDT, Paul Barker wrote:
> > >
> > > This mangling causes patatt to die when trying to parse the
> > > headers in the downloaded mbox file. That's obviously a bug in
> > > patatt, it should reject the signature rather than crashing, but
> > > that's a bug to solve elsewhere. I think the header corruption
> > > when exporting an mbox file is likely to be a bug in sourcehut as
> > > email headers should be passed through without modification. Am I
> > > right on that? Or is there some user error/misunderstanding I
> > > haven't noticed?  
> > 
> > At a glance, this looks like an acceptable re-encoding of the same
> > header. Technically seems more correct since the headers in the
> > original email exceed the length limit. Shouldn't your mail parser
> > be able to handle this?
> >   
> 

On the sending side it doesn't look like I have much option but to have
this sent as a single long line.

With the header wrapped neatly in the .patch file:

    X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;
     bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=;
     b=owGbwMvMwCF2w7xIXuiX9CvG02pJDAmb571P2bT4hIT3tcvsq+8rPfCOKdmU7vag8J+ak9XysyLv
     Xs7p7ChlYRDjYJAVU2TZPXvX5esPlmztvSEdDDOHlQlkCAMXpwBM5JA3I8O5hP6Tqm7lJst0rldcux
     1V7M4q8T5o1fPU6Zs+hxj+SjvN8D/DK3rn8b0m34/Xy388Yeu8jvFdJf/c6Y6LDU7Hulj01nAAAA==

I then run `git send-email --smtp-debug=1 0001.patch` and see that the
header is joined into a single long line before sending:

    Net::SMTP::_SSL=GLOB(0x5646fbdc3ac8)>>> X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject; bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=; b=owGbwMvMwCF2w7xIXuiX9CvG02pJDAmb571P2bT4hIT3tcvsq+8rPfCOKdmU7vag8J+ak9XysyLv Xs7p7ChlYRDjYJAVU2TZPXvX5esPlmztvSEdDDOHlQlkCAMXpwBM5JA3I8O5hP6Tqm7lJst0rldcux 1V7M4q8T5o1fPU6Zs+hxj+SjvN8D/DK3rn8b0m34/Xy388Yeu8jvFdJf/c6Y6LDU7Hulj01nAAAA==
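
The joined line is what plain RFC 5322 unfolding of the wrapped header would produce: each line break followed by a space or tab is dropped and the whitespace itself is kept, which is where the spaces inside the b= value come from. A small sketch of that rule in Python follows. The `unfold` helper is illustrative only, and whether git send-email or Net::SMTP does exactly this internally is an assumption based on the debug output.

```python
import re


def unfold(header: str) -> str:
    """Drop every line break that is followed by a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", header)


# The wrapped header from the .patch file.
FOLDED = (
    "X-Developer-Signature: v=1; a=openpgp-sha256; l=672; h=from:subject;\n"
    " bh=C40yOKgIfnNIUP+OW9WyPdBfljkZPpfUL1NepOODlx8=;\n"
    " b=owGbwMvMwCF2w7xIXuiX9CvG02pJDAmb571P2bT4hIT3tcvsq+8rPfCOKdmU7vag8J+ak9XysyLv\n"
    " Xs7p7ChlYRDjYJAVU2TZPXvX5esPlmztvSEdDDOHlQlkCAMXpwBM5JA3I8O5hP6Tqm7lJst0rldcux\n"
    " 1V7M4q8T5o1fPU6Zs+hxj+SjvN8D/DK3rn8b0m34/Xy388Yeu8jvFdJf/c6Y6LDU7Hulj01nAAAA=="
)

line = unfold(FOLDED)
# The result is over the 78-character soft limit but well under the
# 998-character hard limit.
print(len(line) > 78, len(line) <= 998)
```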

I don't fancy trying to change the behaviour of Net::SMTP in Perl, so I
think we just have to accept this one.

The best solution is probably to decode the header fully when parsing
it in patatt or whatever other script someone wants to write. Hopefully
with this mail in the archives other folks will avoid spending too
much time down this rabbit hole :)

(Also, wrapping should now be fixed as I'm sending via claws-mail)

Thanks,

-- 
Paul Barker
https://pbarker.dev/