~sircmpwn/aerc


aerc character encoding

Details
Message ID
<C2ORNHTD8YCI.1S8PJWTZU29S9@kobain>
Hello,

From time to time, I receive email that’s not UTF-8, and characters
are not rendered and instead show as \xxx. This is a problem when
reading mail. Is this a weird bug on my end, or is it because the mail
is never converted to UTF-8 when displayed? Is there a nice way to
convert the encoding of received mail to the encoding set by the
current locale?

I use Emacs when replying to mail. My workaround is to run aerc's
:toggle-headers, find the character encoding, use Emacs'
revert-buffer-with-coding-system function to reload the file with the
correct encoding, and then use set-buffer-file-coding-system to convert
the mail to UTF-8 (because it’s 2020). If you’re an aerc+Emacs guru,
do you think that this process can be automated?

I’d be happy to hack together a prototype of text encoding conversion
(with some pointers!). I see at least three ways:
- using the LESSCHARSET env var when displaying the mail;
- using Go's [x/text/encoding] package;
- piping to iconv.

[x/text/encoding]: https://pkg.go.dev/golang.org/x/text/encoding?tab=doc

Thank you all,

-- Antonin
Details
Message ID
<42E38109-3BFF-4077-9B17-7921FE6F967B@labrat.space>
In-Reply-To
<C2ORNHTD8YCI.1S8PJWTZU29S9@kobain>
On 12 May 2020 14:56:53 CEST, "Antonin Décimo" <antonin.decimo@gmail.com> wrote:
>From time to time, I receive email that’s not UTF-8, and characters
>are not rendered and instead show as \xxx. This is a problem when
>reading mail. Is this a weird bug on my end, or is it because the mail
>is never converted to utf-8 when displayed? 

You have encoding issues when reading the message? Are you on the master branch version? It should just decode automatically there. The latest release version is still buggy iirc.

What is still a bug on master is replying... it's a known issue no one has fixed yet. But there should be no shenanigans with piping involved.
Rather it should just do the right thing and decode it on the fly.


Cheers,
Reto
Details
Message ID
<C2OU33GT7AHE.2W8KU74P7FDAN@kobain>
In-Reply-To
<42E38109-3BFF-4077-9B17-7921FE6F967B@labrat.space>
> You have encoding issues when reading the message? Are you on the master
> branch version? It should just decode automatically there.

Yes, I’m using aerc 0.3.0.r178.gea2646f from the AUR. Should have
stated that before, sorry.

For instance, I received a message encoded in windows-1252. The word
"électricité" was displayed as "<E9>lectricit<E9>" on both termite
and gnome-terminal. The same is happening with iso-8859-1 mails.

> What is still a bug on master is replying... it's a known issue no one
> fixed yet. But there should be no shenanigans with piping involved.

Anyway, the filter mechanism is not powerful enough to extract the
encoding from the header and pass it to iconv.
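
For what it's worth, pulling the charset out of a Content-Type value looks
simple enough on the Go side. A rough sketch using the standard library's
mime.ParseMediaType follows; charsetOf is a hypothetical helper of mine and
not something that exists in aerc, and the us-ascii fallback is the RFC 2045
default for text parts rather than anything aerc-specific.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// charsetOf is a hypothetical helper: it returns the lowercased charset
// parameter of a Content-Type header value. When the parameter is absent it
// falls back to "us-ascii", the RFC 2045 default for text parts.
func charsetOf(contentType string) (string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", err
	}
	cs, ok := params["charset"]
	if !ok {
		return "us-ascii", nil
	}
	return strings.ToLower(cs), nil
}

func main() {
	cs, err := charsetOf(`text/plain; charset="Windows-1252"`)
	if err != nil {
		panic(err)
	}
	// The result could then select a decoder, or be handed to iconv -f.
	fmt.Println(cs) // windows-1252
}
```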

> Rather it should just do the right thing and decode it on the fly.

Agreed.