~sircmpwn/aerc


non ASCII letters in header fields

Ondřej Synáček
Details
Message ID
<C99WK4R2GGLC.2URW0OS09S9MM@Prefab.local>
DKIM signature
pass
I'm not sure if this problem directly ties to aerc but I will ask anyway.
I've set up `isync` (mbsync) to sync my emails from my email provider
(Fastmail). However when I load the emails into aerc, some of them
are not shown correctly.

A good example is my Sent folder. My full name contains accented letters and
the "From" field in the email contains it (`"FirstName LastName" <.....>`).
When I look into the maildir files, I see this From header has value like
so: `=?utf-8?Q?Ond=C5=99ej_Syn=C3=A1=C4=8Dek?= <ondrejsynacek@fastmail.com>`.

This value is then used within aerc, making it unreadable. I've reached out
to the team maintaining isync to tell them about this. They told me that isync
does not alter the contents of the messages. I see the file has utf-8 encoding.

When I'm using aerc without maildir, this problem does not occur. Other email
clients that I use do not have this problem but I do not use maildir with
them.

If this is some issue on Fastmail's side, I would like to tell them. I just
wanted to get your perspective first.

Thanks
Details
Message ID
<YGNH5_G_1qBO-iMQXgMLLNnamQrWFnbwH8h2qgEaXK_onEIqAuESoWT153wNTPIkxjdKP2dgaFa9KZiRDllf08lXqdXjaTDLLhrB_QXCd3o=@emersion.fr>
In-Reply-To
<C99WK4R2GGLC.2URW0OS09S9MM@Prefab.local> (view parent)
DKIM signature
pass
Sounds like an aerc bug to me. go-message parsing functions such as
Header.AddressList should decode the charset, maybe aerc isn't using
these properly for the maildir side?
Details
Message ID
<20210215082637.wpbzbljfz3henvbg@feather.localdomain>
In-Reply-To
<C99WK4R2GGLC.2URW0OS09S9MM@Prefab.local> (view parent)
DKIM signature
pass
On Mon, Feb 15, 2021 at 07:49:24AM +0100, Ondřej Synáček wrote:
> When I look into the maildir files, I see this From header has value like
> so: `=?utf-8?Q?Ond=C5=99ej_Syn=C3=A1=C4=8Dek?= <ondrejsynacek@fastmail.com>`.

That's normal. Email is horribly broken and only expects ascii...
So we need to encode all other languages in transit.

> When I'm using aerc without maildir, this problem does not occur. Other email
> clients that I use do not have this problem but I do not use maildir with
> them.

Without maildir implying imap?
The encoding is valid in your case, a mail file I sent to myself with that
display name works just fine[1] in all workers (notmuch / maildir / imap)

Is the email very private? If not I could have a look if you upload the file as
is from your mailstore to a pastebin site, or drop it to me via email (as an
attachment).

> If this is some issue on Fastmail's side, I would like to tell them. I just
> wanted to get your perspective first.

Can't tell without the raw email

[1] https://labrat.space/irc/c145557e826f1897/Tl.jpeg

Cheers,
Reto
Ondřej Synáček
Details
Message ID
<5CC4FB74-FFEB-4241-AE62-2D9C922AC22F@fastmail.com>
In-Reply-To
<YGNH5_G_1qBO-iMQXgMLLNnamQrWFnbwH8h2qgEaXK_onEIqAuESoWT153wNTPIkxjdKP2dgaFa9KZiRDllf08lXqdXjaTDLLhrB_QXCd3o=@emersion.fr> (view parent)
DKIM signature
pass
On 15 Feb 2021, at 9:24, Simon Ser wrote:

> Sounds like an aerc bug to me. go-message parsing functions such as
> Header.AddressList should decode the charset, maybe aerc isn't using
> these properly for the maildir side?

The thing is that it’s not even consistent. Some emails that have “From”
headers with accents display fine, some not.

> Without maildir implying imap?

No, what I mean is that I use plain IMAP and the client does not read from
maildir folders. Sorry for my phrasing, I'm not very experienced with doing
email on the command line.

> Is the email very private? If not I could have a look if you upload the file as
> is from your mailstore to a pastebin site, or drop it to me via email (as an
> attachment).

Not very private. I will send it to your address as an attachment if that is
fine. Thank you very much!
Kian Kasad
Details
Message ID
<20210215203241.k7ead3fn4j4ua6yv@frisbee>
In-Reply-To
<C99WK4R2GGLC.2URW0OS09S9MM@Prefab.local> (view parent)
DKIM signature
pass
On 21/02/15 07:49AM, Ondřej Synáček wrote:
> I'm not sure if this problem directly ties to aerc but I will ask anyway.
> I've set up `isync` (mbsync) to sync my emails from my email provider
> (Fastmail). However when I load the emails into aerc, some of them
> are not shown correctly.
> 
> Good example is my Sent folder. My full name contains accented letters and
> the "From" field in the email contains it (`"FirstName LastName" <.....>`.
> When I look into the maildir files, I see this From header has value like
> so: `=?utf-8?Q?Ond=C5=99ej_Syn=C3=A1=C4=8Dek?= <ondrejsynacek@fastmail.com>`.

This looks like a standard encoded-word email header as defined in RFC
2047 §2 [1]. As Reto mentioned, non-ASCII charsets get encoded to ASCII in
emails. So the way you see that header (in the Maildir file) is actually
the way it should look. It shows up the same for me in the Maildir file
of the email you sent:

	From: =?utf-8?q?Ond=C5=99ej_Syn=C3=A1=C4=8Dek?= <ondrejsynacek@fastmail.com>

So this seems like an aerc bug. Aerc should parse that encoding and
display the decoded UTF-8 (or whatever charset you use) in the client.
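
For illustration, here is a minimal Go sketch of that decoding step, using
the standard library's `mime.WordDecoder`. The helper name
`decodeHeaderValue` is made up for this example, and the sketch treats the
header value as plain text rather than parsing it as an address, so it only
shows what the encoded-word stands for. It is not a claim about how aerc
handles headers internally.

```go
package main

import (
	"fmt"
	"mime"
)

// decodeHeaderValue decodes every RFC 2047 encoded-word found in a raw
// header value and leaves the rest of the text untouched.
// The function name is hypothetical, chosen for this example only.
func decodeHeaderValue(raw string) (string, error) {
	dec := new(mime.WordDecoder)
	return dec.DecodeHeader(raw)
}

func main() {
	// Header value as quoted earlier in this thread.
	raw := "=?utf-8?Q?Ond=C5=99ej_Syn=C3=A1=C4=8Dek?= <ondrejsynacek@fastmail.com>"

	decoded, err := decodeHeaderValue(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(decoded)
}
```

The Q encoding maps `_` to a space and each `=XX` to one byte, so the bytes
`C5 99` become "ř" once they are read as UTF-8.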

> This value is then used within aerc making it unreadable. I've reached out
> to team maintaining isync to tell them about this. They told me that isync does
> not alter the contents of the messages. I see the file has utf-8 encoding.

This is definitely not an isync bug. What you're seeing is completely
valid email format. (And also, isync does not alter the contents of the
message, so it can't be isync's fault.)

> When I'm using aerc without maildir, this problem does not occur. Other email
> clients that I use do not have this problem but I do not use maildir with
> them.

This is a bit weird. I don't know why it would work when using IMAP
directly but fail when using Maildir. I'm using Neomutt, and that header
format shows up fine for me in a Maildir. So my guess is some parsing
step is skipped in aerc when reading from Maildirs.

> If this is some issue on Fastmail's side, I would like to tell them. I just
> wanted to get your perspective first.

Not a Fastmail issue, since the format is valid.

You also mentioned in your reply that some headers with accents appear
normally. Are these also using the encoded-word format, or are they just
plain UTF-8 in the header?

References:
1. https://tools.ietf.org/html/rfc2047

--
Kian Kasad
PGP 0x1715EEAA14DAEC1
Ondřej Synáček
Details
Message ID
<C9AEG7XFICKO.W4GS96J6OTZG@Prefab.local>
In-Reply-To
<20210215203241.k7ead3fn4j4ua6yv@frisbee> (view parent)
DKIM signature
pass
On Mon Feb 15, 2021 at 9:32 PM CET, Kian Kasad wrote:
> You also mentioned in your reply that some headers with accents appear
> normally. Are these also using the encoded-word format, or are they just
> plain UTF-8 in the header?

They display as they should to the reader, so I guess plain UTF-8. The
strangest part is that I have some emails that contain my name inside
the From header and it's showing fine in some of those mails.
Details
Message ID
<20210216221134.qr4hrqzvok6ops4g@feather.localdomain>
In-Reply-To
<20210215082637.wpbzbljfz3henvbg@feather.localdomain> (view parent)
DKIM signature
pass
On Mon, Feb 15, 2021 at 09:26:37AM +0100, Reto wrote:
> Is the email very private? If not I could have a look if you upload the file as
> is from your mailstore to a pastebin site, or drop it to me via email (as an
> attachment).

So, I did have a look.
Turns out it's the quotes.

runnable example here: http://ix.io/2PFf

Given something like

>From: "=?utf-8?b?T25kxZllaiBTeW7DocSNZWs=?=" <whatever@fastmail.com>

it fails:

```
fails: h.AddressList('from')[0].Name: =?utf-8?b?T25kxZllaiBTeW7DocSNZWs=?=
works: ent.Header.Text('from'): "Ondřej Synáček" <whatever@fastmail.com>
fails: msg.Header.AddressList('from')[0].Name: =?utf-8?b?T25kxZllaiBTeW7DocSNZWs=?=
```

If you remove the quotes it works as expected:

>From: =?utf-8?b?T25kxZllaiBTeW7DocSNZWs=?= <whatever@fastmail.com>

```
fails: h.AddressList('from')[0].Name: Ondřej Synáček
works: ent.Header.Text('from'): Ondřej Synáček <whatever@fastmail.com>
fails: msg.Header.AddressList('from')[0].Name: Ondřej Synáček
```

Now, that's either a mistake in go-message and friends or a lenient RFC interpretation
of all other MUAs

Let's see, RFC2047 tells us in paragraph 5.3:
>An 'encoded-word' MUST NOT appear within a 'quoted-string'

Now "quoted-string" (pun intended) is defined by RFC2822 3.2.5 as
>quoted-string = [CFWS] DQUOTE *([FWS] qcontent) [FWS] DQUOTE [CFWS]

So it looks like go-message does the right thing.
In other words you MUST NOT (rfc speak) quote the display name outside of the
encoding.

What did you use to write the mail then? Was that aerc? Aerc seems to work correctly
if I set the display name to something strange like "some µ crazy thing"

>To: =?utf-8?q?some_=C2=B5_crazy_thing?= <spam@labrat.space>

As you can see, no quotes meaning it also happily decodes back.
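
To make the quoted and unquoted cases easy to compare, here is a small Go
sketch built only on the standard library (`net/mail` and `mime`). It
parses both forms of the header shown above and prints the display name the
parser returns. It also includes a hypothetical helper, `lenientName`,
which runs a second decoding pass over the display name. That is one
possible way to imitate the more tolerant clients, and it is an assumption
made for this example, not something go-message or aerc is known to do.

```go
package main

import (
	"fmt"
	"mime"
	"net/mail"
)

// lenientName parses a single address and then runs the display name
// through an RFC 2047 decoder a second time. If the parser already decoded
// the name, the second pass finds no encoded-word and changes nothing.
// Hypothetical helper, written for this example only.
func lenientName(raw string) (string, error) {
	addr, err := mail.ParseAddress(raw)
	if err != nil {
		return "", err
	}
	name, err := new(mime.WordDecoder).DecodeHeader(addr.Name)
	if err != nil {
		// For example an unknown charset: keep what the parser returned.
		return addr.Name, nil
	}
	return name, nil
}

func main() {
	inputs := []string{
		// Encoded-word used directly as the display name.
		`=?utf-8?b?T25kxZllaiBTeW7DocSNZWs=?= <whatever@fastmail.com>`,
		// Same encoded-word wrapped in a quoted-string.
		`"=?utf-8?b?T25kxZllaiBTeW7DocSNZWs=?=" <whatever@fastmail.com>`,
	}
	for _, in := range inputs {
		addr, err := mail.ParseAddress(in)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		lenient, err := lenientName(in)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("parser: %q  lenient: %q\n", addr.Name, lenient)
	}
}
```

Whether the extra pass is a good idea is a separate question: it trades
strict RFC 2047 behaviour for compatibility with mail that breaks the rule.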

Cheers,
Reto
Ondřej Synáček
Details
Message ID
<C9BN4TK4SA28.3LOFQZ62GQTZX@Prefab.local>
In-Reply-To
<20210216221134.qr4hrqzvok6ops4g@feather.localdomain> (view parent)
DKIM signature
pass
On Tue Feb 16, 2021 at 11:11 PM CET, Reto wrote:
> So, I did have a look.
> Turns out it's the quotes.

Interesting, I did not consider this would be an issue because other email
clients always displayed the "From" field just fine.
>
> runnable example here: http://ix.io/2PFf
> Now, that's either a mistake in go-message and friends or a lenient RFC
> interpretation of all other MUAs

Mail on iOS, Mailmate on macOS and Apple Mail on macOS seem to cope with
this fine.

> What did you use to write the mail then? Was that aerc? Aerc seems to
> work correctly

No these emails are written mostly with Mailmate (macOS). I tried
composing new message there and yes, it's reproducible.
The composer in that application allows me to customize sender
email address, when I open that window, it's showing quotes
around my name (see attachment).

The accounts in that application were imported from the operating system.
MacOS has preferences where you can import various internet accounts
and then other applications can use them.

What is weird is that when I remove the quotes around my name in
Mailmate and send the email, I get a mangled header anyway.
I found out that when I remove the quotes around my name via
Mailmate UI, and confirm it in the dialog box, then try editing it
again, the quotes reappear.

So it's weird. It seems like it's not applying those changes and it
also seems that this is the way it's importing it to the application.

Is it possible to use regex, search for that pattern in my maildir
and then sync it back to the server? Or can this have some bad
consequences?

Anyway thanks for researching this, I'm glad the culprit was found.
I have opened a ticket for the application in their support forum[1] and
linked back here.

[1]
https://freron.lighthouseapp.com/projects/58672-mailmate/tickets/2758-raw-messages-sent-with-mailmate-have-incompatible-from-header
Details
Message ID
<40600A22-9F2B-4D87-A260-246C5FEB8199@labrat.space>
In-Reply-To
<C9BN4TK4SA28.3LOFQZ62GQTZX@Prefab.local> (view parent)
DKIM signature
pass
On 17 February 2021 08:51:34 CET, "Ondřej Synáček" <ondrejsynacek@fastmail.com> wrote:

>No these emails are written mostly with Mailmate (macOS). I tried
>composing new message there and yes, it's reproducible.
>The composer in that application allows me to customize sender
>email address, when I open that window, it's showing quotes
>around my name (see attachment).
>
>The accounts into that application were imported from operating system.
>MacOS has preferences where you can import various internet accounts
>and then other applications can use them.
>
>What is weird is that when I remove the quotes around my name in
>Mailmate and send the email, I get mangled header anyway.
>I found out that when I remove the quotes around my name via
>Mailmate UI, and confirm it in the dialog box, then try editing it
>again, the quotes reappear.
>
>So it's weird. It seems like it's not applying those changes and it
>also seems that this is the way it's importing it to the application.

Hm... maybe it's me misinterpreting the RFCs?

@Emersion, what do you think?
The current behavior is strange in the sense that h.Text("from") decodes but h.AddressList("from") doesn't.

Not sure what the proper behavior is, but it looks like other MUAs pretty much all decode it anyhow and some apparently also force the quotes.

Would you rather have me open up a bug in go-message than keep the discussion on this list?

>Is it possible to use regex, search for that pattern in my maildir
>and then sync it back to the server? Or can this have some bad
>consequences?

Ehm, bad idea I think.
For now consider it a bug until we can figure this out.

>Anyway thanks for researching this, I'm glad the culprit was found.
>I have opened a ticket for the application in their support forum[1]
>and linked back here.
>
>[1]
>https://freron.lighthouseapp.com/projects/58672-mailmate/tickets/2758-raw-messages-sent-with-mailmate-have-incompatible-from-header

Can't open, so if they reply please do send their reply to the list, although I guess it will be a pre canned unhelpful answer with barely a human involved. But I'm happy to let myself be surprised.

Cheers,
Reto
Ondřej Synáček
Details
Message ID
<C9BNNH0MB0N4.2T1UZKSOAYPL3@Prefab.local>
In-Reply-To
<40600A22-9F2B-4D87-A260-246C5FEB8199@labrat.space> (view parent)
DKIM signature
pass
On Wed Feb 17, 2021 at 9:14 AM CET, Reto wrote:
> Can't open, so if they reply please do send their reply to the list,
> although I guess it will be a pre canned unhelpful answer with barely a
> human involved. But I'm happy to let myself be surprised.

Will do. I think Mailmate is made by a single developer (at least that's
what I always assumed), so maybe his support could be friendlier.
I have purchased a license for it, so that's one thing.
Well, there are many open tickets there, so if the developer is also
doing his own support, it might take longer.
Details
Message ID
<XfVf4XGU4OoxGjqGBtPnORT8ZXi5V7IXtP71SaehvQzvs51ic87nt1Gmy3IpMF7oanyVf_anl2nb2-YBXN2LVvFjrq0jsW1bfQEqboxqOn4=@emersion.fr>
In-Reply-To
<40600A22-9F2B-4D87-A260-246C5FEB8199@labrat.space> (view parent)
DKIM signature
pass
On Wednesday, February 17th, 2021 at 9:14 AM, Reto <reto@labrat.space> wrote:

> @Emersion, what do you think?
> The current behavior is strange in the sense that h.Text("from")
> decodes but h.AddressList("from") doesn't.

That's expected.

There are two types of header fields: structured and unstructured.

Structured fields need a parser to properly be interpreted. They have
an ABNF grammar, and q-encoding isn't allowed everywhere. Examples:
To, From, In-Reply-To.

Unstructured fields are just text and don't need to be parsed.
q-encoding is allowed everywhere. Example: Subject.

The Header.Text method parses unstructured header fields only, and
decodes any q-encoding it finds. Note, some fields don't allow
q-encoding at all.
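
The split between the two kinds of fields can be sketched with the Go
standard library alone. In the example below the Subject is handled as
plain text with `mime.WordDecoder`, while From goes through the address
parser in `net/mail`. The sample message, the Subject text and the helper
name `readFields` are all invented for this illustration, and the code does
not use go-message itself.

```go
package main

import (
	"fmt"
	"mime"
	"net/mail"
	"strings"
)

// sample is a made-up message used only for this illustration.
const sample = "From: =?utf-8?q?Ond=C5=99ej_Syn=C3=A1=C4=8Dek?= <ondrejsynacek@fastmail.com>\r\n" +
	"Subject: =?utf-8?q?P=C5=99=C3=ADklad?=\r\n" +
	"\r\n" +
	"body\r\n"

// readFields returns the decoded Subject and the display name of the first
// From address. Hypothetical helper, written for this example only.
func readFields(raw string) (string, string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", "", err
	}

	// Unstructured field: plain text, decode encoded-words wherever they occur.
	subject, err := new(mime.WordDecoder).DecodeHeader(msg.Header.Get("Subject"))
	if err != nil {
		return "", "", err
	}

	// Structured field: hand it to the address parser, which follows the
	// address grammar and decides where an encoded-word may be decoded.
	addrs, err := msg.Header.AddressList("From")
	if err != nil {
		return "", "", err
	}
	return subject, addrs[0].Name, nil
}

func main() {
	subject, name, err := readFields(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Subject:", subject)
	fmt.Println("From name:", name)
}
```

Running a text decoder over a structured field skips the grammar entirely,
which fits the difference between the two methods reported earlier in the
thread.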
Details
Message ID
<20210222142719.gimyj22jz5qwehor@feather.localdomain>
In-Reply-To
<XfVf4XGU4OoxGjqGBtPnORT8ZXi5V7IXtP71SaehvQzvs51ic87nt1Gmy3IpMF7oanyVf_anl2nb2-YBXN2LVvFjrq0jsW1bfQEqboxqOn4=@emersion.fr> (view parent)
DKIM signature
pass
On Mon, Feb 22, 2021 at 02:20:27PM +0000, Simon Ser wrote:
> That's expected.

Thanks a bunch.
Am I correct that this is then a case of "broken email, not a go-message bug"?

Cheers,
Reto
Details
Message ID
<-7NVKg5CNX2UC7phbaLcQMBAMuwgE5xFWFWqX9F0aN5QElk0_TojIUncqB3CpsLJo2Jgb1IiTS1ShI4ywpK01ahF5m4k-0PgQREhbU6Jrvo=@emersion.fr>
In-Reply-To
<20210222142719.gimyj22jz5qwehor@feather.localdomain> (view parent)
DKIM signature
pass
On Monday, February 22nd, 2021 at 3:27 PM, Reto <reto@labrat.space> wrote:

> Am I correct that this is then a case of "broken email, not a go-message bug"?

Since the RFC disallows q-encoding in quoted strings, I'd say yes. In
any case, we just delegate address parsing to the stdlib, so if there
was a bug, it would need to be reported upstream to the Go issue
tracker.
Ondřej Synáček
Details
Message ID
<C9G4SWTL0C20.IQ5L7EY80MDG@Prefab.local>
In-Reply-To
<-7NVKg5CNX2UC7phbaLcQMBAMuwgE5xFWFWqX9F0aN5QElk0_TojIUncqB3CpsLJo2Jgb1IiTS1ShI4ywpK01ahF5m4k-0PgQREhbU6Jrvo=@emersion.fr> (view parent)
DKIM signature
pass
On Mon Feb 22, 2021 at 3:31 PM CET, Simon Ser wrote:
> Since the RFC disallows q-encoding in quoted strings, I'd say yes. In
> any case, we just delegate address parsing to the stdlib, so if there
> was a bug, it would need to be reported upstream to the Go issue
> tracker.

So I guess then I have to

1. Stop using mailing clients that do this
2. Somehow find a way to fix my old emails

Does that mean all other mail clients are just more tolerant in parsing
the email headers?