~bptato/chawan-devel

Unicode support?

Details
Message ID
<CZX1DKA0D2S8.M9XJ214UUSXN@posteo.net>
DKIM signature
pass
I haven't looked at the codebase yet, but I imagined that unicode would
be supported? However, when I set `cha -T text/html` as the command to
pipe text/html emails to in aerc, I see that e.g. the unicode character
’ (U+2019) is rendered as

   ’ 

and similar garbled text for other unicode things. I checked with Nim's
`validateUtf8` and the character that is in the email is valid UTF-8. If
it's not supported: is that a goal of the project?
Details
Message ID
<Zfh_GEVvJwgC_6c_@touya>
In-Reply-To
<CZX1DKA0D2S8.M9XJ214UUSXN@posteo.net>
> I haven't looked at the codebase yet, but I imagined that unicode would
> be supported?

UTF-8 is fully supported, as are several other character sets.

> However, when I set `cha -T text/html` as the command to pipe text/html emails
> to in aerc, I see that e.g. the unicode character ’ (U+2019) is rendered as
> 
>    ’ 
> 
> and similar garbled text for other unicode things.

I assume charset detection went wrong somewhere, and Chawan used the default
latin-2 fallback. Either that, or the output charset is wrong.
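
To illustrate the suspected mechanism, here is a minimal Python sketch of how
valid UTF-8 turns into this kind of garbage when it is decoded with a
single-byte legacy charset. It is not Chawan's code. The choice of
windows-1252 is an assumption made only because it reproduces the reported
glyphs; which charset Chawan actually fell back to is not verified here.

```python
# U+2019 (RIGHT SINGLE QUOTATION MARK), the character from the report.
original = "\u2019"

# UTF-8 encodes it as three bytes: E2 80 99.
raw = original.encode("utf-8")

# A single-byte charset maps every byte to its own character, so the
# three bytes come out as three unrelated glyphs.
# Assumption: windows-1252 is used here purely for illustration.
garbled = raw.decode("windows-1252")

# Decoding the same bytes as UTF-8 restores the original character.
restored = raw.decode("utf-8")

print(raw.hex(" "), garbled, restored)
```

The bytes themselves are never damaged, which is why a UTF-8 validity check
on the email passes even though the rendered text is wrong.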

Could you please:

* send the output of `echo "LC_ALL=$LC_ALL LC_CTYPE=$LC_CTYPE LANG=$LANG"`,
* check if UTF-8 works *anywhere* at all, e.g. `echo Straße | cha -T text/html`
  (if it does, please send a file where it does not), and
* try adding the switch `-I utf-8` and then `-O utf-8` to the aerc command?

Thanks.
Details
Message ID
<CZX7JKZ6M9EF.3W08N314IGK7F@posteo.net>
In-Reply-To
<Zfh_GEVvJwgC_6c_@touya>
Of the three environment variables, I only have one set:

    LANG=en_AU.UTF-8

Adding `-I utf-8` to the aerc command was enough to get it working.

Thanks,
Leon