Re: Some sanity for C and C++ development on Windows

Message ID: <df749edc-0413-4735-9cf2-c77db202cc6e@app.fastmail.com>

Hi, something that isn't stressed in the article is that
SetConsoleOutputCP() changes global state: the code page belongs to the
console, not to your process. So if your program calls it, then yes, it may
correctly process UTF-8 with libc functions, but once it exits, the console
is left in the new code page and is effectively broken! A simple example:

1. Open w64devkit
2. Verify that the current active code page is IBM437:

    $ chcp
    Active code page: 437

3. Create and enter a directory with a Unicode character in its name:

    $ mkdir tortüga
    $ cd tortüga

4. Create a sample file main.c:

    #include <stdio.h>

    typedef signed int b32;
    typedef unsigned int u32;

    enum {
        IBM437 = 437,
        CP_UTF8 = 65001,
    };

    #define W32(r) __declspec(dllimport) r __stdcall
    W32(b32) SetConsoleOutputCP(u32);

    int main(int argc, char *argv[])
    {
        fprintf(stderr, "USAGE: %s <args>\n", argv[0]);
        /* SetConsoleOutputCP(IBM437); */
        return 0;
    }

5. Obtain libwinsane.o and copy it to the directory.
6. Compile: gcc main.c libwinsane.o
7. Exit the directory and launch the binary:

    $ cd ..
    $ ./tortüga/a.exe
    USAGE: ./tortüga/a.exe <args>

8. As you can see, it printed the character correctly. However, the shell
   is now left with the wrong code page:

    $ cd tort�ga
    $ pwd
    C:/Users/aragnir/code/tort�ga

9. If you uncomment the line that sets the code page back to IBM437, then
   there is no lingering side effect. No way I'm doing that at every exit
   point in my program!
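
One way to avoid touching every exit point, sketched under assumptions: save
the code page once at startup and register a handler with atexit() to restore
it. The helper names use_utf8_output and restore_output_cp are hypothetical and
not part of libwinsane, and the non-Windows branch is only a stand-in so the
sketch compiles elsewhere. On Windows it relies on GetConsoleOutputCP() and
SetConsoleOutputCP() from windows.h.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch builds off Windows; they only mimic the API. */
typedef unsigned int UINT;
#define CP_UTF8 65001
static UINT fake_console_cp = 437;
static UINT GetConsoleOutputCP(void) { return fake_console_cp; }
static int SetConsoleOutputCP(UINT cp) { fake_console_cp = cp; return 1; }
#endif

static UINT saved_output_cp;  /* 0 means there is nothing to restore */

/* Hypothetical helper: put the console back the way we found it. */
static void restore_output_cp(void)
{
    if (saved_output_cp) {
        SetConsoleOutputCP(saved_output_cp);
        saved_output_cp = 0;
    }
}

/* Hypothetical helper: switch console output to UTF-8 and arrange for
 * the original code page to be restored at normal exit.
 * Returns 0 on success, nonzero otherwise. */
static int use_utf8_output(void)
{
    UINT old = GetConsoleOutputCP();
    if (!old) return 1;                       /* no console, or query failed */
    if (atexit(restore_output_cp)) return 1;  /* register before changing state */
    if (!SetConsoleOutputCP(CP_UTF8)) return 1;
    saved_output_cp = old;
    return 0;
}
```

Calling use_utf8_output() at the top of main covers a return from main and
calls to exit(). It does not cover abort() or the process being killed, so
the side effect can still leak in those cases.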

On a side note, if you compile libwinsane.o with clang (from llvm-mingw) but
link it with ld (not lld), then building the final program with gcc produces
a malformed binary:

    $ cd ~/code/skeeto_scratch/libwinsane
    $ make CC=x86_64-w64-mingw32-clang
    x86_64-w64-mingw32-clang -Os -g -Wall -Wextra   -c -o init.o init.c
    x86_64-w64-mingw32-windres -o manifest.o manifest.rc
    x86_64-w64-mingw32-ld -relocatable -o libwinsane.o init.o manifest.o

    $ cd ../../tortüga
    $ cp ../skeeto_scratch/libwinsane/libwinsane.o .

    $ gcc main.c libwinsane.o
    C:/Users/aragnir/code/shared/w64devkit/bin/ld.exe: a.exe:/4: section below image base
    $ ./a.exe
    sh: ./a.exe: Exec format error

So it seems that gcc and clang don't cooperate here.

Re: Some sanity for C and C++ development on Windows

Message ID: <20241221174915.fcdj42zjw5a3xpma@nullprogram.com>
In-Reply-To: <df749edc-0413-4735-9cf2-c77db202cc6e@app.fastmail.com>

Thanks, Pavel, and good point! I've added a note to my article. It's been 
three years, and I never made significant use of libwinsane, in large part 
because of this issue. It's an unsatisfactory solution in general, though 
convenient for a quick port.

Some good news: Your example will no longer demonstrate the problem in the
next x64 w64devkit release. I've enabled Unicode in 64-bit builds, and so
shell behavior no longer depends on the console code page. Most other
included software still does, though, so you'd only need to change your
example slightly to reproduce it. The fundamental problem doesn't change.

> compiling the whole thing with gcc produces a malformed binary

That's not too surprising, particularly with windres involved. If it's
LLVM versus Binutils, generally it's a Binutils bug, so bfd rather than
LLVM windres. That's where I'd look first. This year I observed a similar
incompatibility between Binutils import libraries and MSVC link.exe:

https://github.com/skeeto/w64devkit/issues/135

It seems mixing and matching toolchains doesn't produce robust results 
unless there's a hard module boundary mediating them. I bet hardly anyone 
is doing this, so it doesn't get noticed and fixed.