~mpu/qbe

Seemingly incorrect codegen for arm64-apple

Details
Message ID
<5a825d5d-285c-44a1-970c-1d0f1d3daa22@gmail.com>
DKIM signature
pass
Download raw message
I wrote this simple program that allocates memory on the stack with a 
specified size in another function and returns a pointer to that memory:

```elle
external fn printf(string formatter, ...);

fn doSomething(long size) {
     int items[size];
     items[0] = 100;
     return items;
}

pub fn main() {
     int *res = doSomething(10);
     printf("%d", res[0]); // Expected: 100
     return 0;
}
```

It compiles into QBE fine.

```qbe
function l $doSomething(l %size.2) {
@start
	%tmp.2.3 =l extsw 4
	%tmp.5 =l copy %size.2
	%tmp.6 =l copy %tmp.2.3
	%tmp.7 =l mul %tmp.5, %tmp.6
	%items.8 =l alloc8 %tmp.7
	%tmp.8.9 =l extsw 0
	%tmp.11 =l copy 4
	%tmp.12 =l copy %tmp.8.9
	%tmp.13 =l mul %tmp.11, %tmp.12
	%tmp.14 =l copy %items.8
	%tmp.15 =l copy %tmp.13
	%tmp.16 =l add %tmp.14, %tmp.15
	storew 100, %tmp.16
	%r.v16.17 =l copy %items.8
	ret %r.v16.17
}
export function w $main() {
@start
	%tmp.19.20 =l call $doSomething(l 10)
	%res.18 =l copy %tmp.19.20
	%tmp.21.22 =l extsw 0
	%tmp.24 =l copy 4
	%tmp.25 =l copy %tmp.21.22
	%tmp.26 =l mul %tmp.24, %tmp.25
	%tmp.27 =l copy %res.18
	%tmp.28 =l copy %tmp.26
	%tmp.29 =l add %tmp.27, %tmp.28
	%tmp.30.31 =w loadl %tmp.29
	%tmp.32.33 =w call $printf(l $main.21, ..., w %tmp.30.31)
	%r.v33.34 =w copy 0
	ret %r.v33.34
}
data $main.21 = { b "%d", b 0 }
```

The issue is that the assembly generated by QBE contains this 
instruction:
```s
and x1, x1, #1048575, lsl #12
```
which attaches an `lsl #12` shift to the immediate operand of `and` 
within a single instruction.

The assembler rejects this instruction on arm64 when building with 
clang:
```text
dist/concat/out.s:10:24: error: invalid operand for instruction
  and x1, x1, #1048575, lsl #12
                        ^
```

This can be worked around manually by splitting it into two instructions:
```s
lsl x1, x1, #12
and x1, x1, #1048575
```

and now the program compiles and seemingly runs correctly. I can 
obviously do this manually during the build process, but I thought I 
should raise this here too just in case it wasn't reported before.
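
A note on the split above: it is not obviously equivalent to what the single instruction was meant to compute. The sketch below assumes the `, lsl #12` suffix has the meaning it has for `add`/`sub` immediates, where the shift applies to the constant and not to the register. Under that assumption, the two forms give different results. The function names are hypothetical and only model the arithmetic on 64-bit values.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit x register


def and_with_shifted_immediate(x: int, imm: int, shift: int) -> int:
    """Assumed intent: x & (imm << shift), the shift applies to the constant."""
    return x & ((imm << shift) & MASK64)


def shift_then_and(x: int, imm: int, shift: int) -> int:
    """The manual split: lsl x, x, #shift followed by and x, x, #imm."""
    return ((x << shift) & MASK64) & imm


x = 0x12345678
print(hex(and_with_shifted_immediate(x, 1048575, 12)))  # 0x12345000
print(hex(shift_then_and(x, 1048575, 12)))              # 0x78000
```

If the assumption holds, the split version may assemble and appear to work while still computing a different value, so it is best treated as a stopgap.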

It also seems that this instruction is only emitted when the size used 
to create the buffer is passed in as a function parameter. For example, 
this function:

```elle
external fn printf(string formatter, ...);

pub fn main() {
     long size = 10;
     int items[size];
     items[0] = 100;
     printf("%d", items[0]);
     return 0;
}
```

which compiles into this:
```qbe
export function w $main() {
@start
	%size.2 =l copy 10
	%tmp.2.3 =l extsw 4
	%tmp.5 =l copy %size.2
	%tmp.6 =l copy %tmp.2.3
	%tmp.7 =l mul %tmp.5, %tmp.6
	%items.8 =l alloc8 %tmp.7
	%tmp.8.9 =l extsw 0
	%tmp.11 =l copy 4
	%tmp.12 =l copy %tmp.8.9
	%tmp.13 =l mul %tmp.11, %tmp.12
	%tmp.14 =l copy %items.8
	%tmp.15 =l copy %tmp.13
	%tmp.16 =l add %tmp.14, %tmp.15
	storew 100, %tmp.16
	%tmp.17.18 =l extsw 0
	%tmp.20 =l copy 4
	%tmp.21 =l copy %tmp.17.18
	%tmp.22 =l mul %tmp.20, %tmp.21
	%tmp.23 =l copy %items.8
	%tmp.24 =l copy %tmp.22
	%tmp.25 =l add %tmp.23, %tmp.24
	%tmp.26.27 =w loadl %tmp.25
	%tmp.28.29 =w call $printf(l $main.17, ..., w %tmp.26.27)
	%r.v29.30 =w copy 0
	ret %r.v29.30
}
data $main.17 = { b "%d", b 0 }
```

does not actually emit this instruction and so the program compiles 
correctly.
Details
Message ID
<39d54e7a-0590-490d-8cdc-37e6a0478b88@gmail.com>
In-Reply-To
<5a825d5d-285c-44a1-970c-1d0f1d3daa22@gmail.com> (view parent)
I did this in my build process temporarily:

```makefile
$(TMP_PATH)/out.tmp3.s: $(TMP_PATH)/out.tmp2.s
	sed -E 's/and\t(.*), #(.*), lsl (.*)/lsl\t\1, \3\n\tand\t\1, #\2/g' $< > $@
```

but I would love to know whether this is an issue I caused due to 
incompetence or due to QBE screwing up codegen.
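
Since `\t` and `\n` escapes in `sed` expressions may be handled differently across `sed` implementations, the same textual rewrite can be sketched in Python. This is only a port of the substitution above, with the same pattern and replacement. The function name `split_shifted_and` is hypothetical, and the sketch makes no claim that the resulting two-instruction sequence computes the same value as the original instruction was meant to.

```python
import re

# Same pattern and replacement as the sed expression above.
PATTERN = re.compile(r"and\t(.*), #(.*), lsl (.*)")
REPLACEMENT = r"lsl\t\1, \3\n\tand\t\1, #\2"


def split_shifted_and(asm_text: str) -> str:
    """Rewrite every `and ..., #imm, lsl #n` line into an lsl/and pair."""
    return PATTERN.sub(REPLACEMENT, asm_text)


sample = "\tand\tx1, x1, #1048575, lsl #12\n"
print(split_shifted_and(sample), end="")
```

Lines that do not match the pattern pass through unchanged, as they do with the `sed` version.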
Details
Message ID
<79dd8faf-0b18-4344-be87-b72f475ec4af@yyny.dev>
In-Reply-To
<39d54e7a-0590-490d-8cdc-37e6a0478b88@gmail.com> (view parent)
> I would love to know whether this is an issue I caused due to 
> incompetence or due to QBE screwing up codegen.

AND with an immediate does not support shifting, so definitely a QBE 
codegen bug:

https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/AND--immediate-?lang=en
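
For context, the A64 `AND` (immediate) form encodes its constant as a bitmask immediate: an element of 2, 4, 8, 16, 32, or 64 bits, made of a rotated contiguous run of ones, replicated across the register. There is no separate shift field. The following sketch checks whether a 64-bit constant fits that form. The helper name `is_bitmask_immediate` is hypothetical and is not part of QBE or of any assembler.

```python
MASK64 = (1 << 64) - 1


def is_bitmask_immediate(value: int) -> bool:
    """True if `value` fits the 64-bit A64 logical-immediate form."""
    value &= MASK64
    if value == 0 or value == MASK64:
        return False  # all-zeros and all-ones are not encodable
    for size in (2, 4, 8, 16, 32, 64):
        elem_mask = (1 << size) - 1
        elem = value & elem_mask
        # The element must repeat across all 64 bits.
        replicated = 0
        for i in range(0, 64, size):
            replicated |= elem << i
        if replicated != value:
            continue
        # The element must be a rotation of a contiguous run of ones.
        for rot in range(size):
            rotated = ((elem >> rot) | (elem << (size - rot))) & elem_mask
            is_low_run = (rotated & (rotated + 1)) == 0  # form 2**k - 1
            if rotated != 0 and rotated != elem_mask and is_low_run:
                return True
    return False


print(is_bitmask_immediate(1048575 << 12))  # True
print(is_bitmask_immediate(0x5))            # False
```

By this check, the constant `1048575 << 12` (`0xfffff000`) is itself representable, so if the shift was meant to apply to the constant, a single `and` with the pre-shifted value would be one valid way to express it.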
Details
Message ID
<45e463dc-49ae-48d5-9d1d-d94e661ce6ed@app.fastmail.com>
In-Reply-To
<39d54e7a-0590-490d-8cdc-37e6a0478b88@gmail.com> (view parent)
On Thu, Jun 20, 2024, at 22:03, Rosie wrote:
> but I would love to know whether this is an issue I caused due to 
> incompetence or due to QBE screwing up codegen.

qbe should not generate invalid asm. That being said, the
way you allocate memory on the stack of a function and
return it is unsafe. You should consider that all the
pointers you got from an "alloc" instruction are invalid
once the function has returned.

I will try to fix the asm generation bug shortly.
Details
Message ID
<60e59fea-e8c6-43c9-8f6c-15b3f34575a4@gmail.com>
In-Reply-To
<39d54e7a-0590-490d-8cdc-37e6a0478b88@gmail.com> (view parent)
> the way you allocate memory on the stack of a function and return it 
> is unsafe.

I know, a better approach would probably have been to just allocate on 
the heap with malloc. However, in a function like this:

```elle
fn concat(int size, ...) {
     variadic args[size * #size(string)];
     defer free(args);

     string strings[size];
     int sizes[size];
     long length = 0;

     // Collect the strings and the final string length
     for int i = 0; i < size; i++ {
         strings[i] = args yield string;
         sizes[i] = strlen(strings[i]);
         length += sizes[i];
     }

     string result = malloc((length + 1) * #size(char));
     int index = 0;

     // Construct the final string
     for int i = 0; i < size; i++ {
         string current = strings[i];

         for int j = 0; j < sizes[i]; j++ {
             result[index] = current[j];
             index++;
         }
     }

     // Include null terminator
     result[index] = '\0';
     return result;
}
```

which does not return stack allocated memory, the incorrect instruction 
is still emitted.

(The QBE generated by the compiler is quite long so I'll refrain from 
sending that in this reply)
Details
Message ID
<91494689-8019-4584-9b48-be7c8a3d9c4a@app.fastmail.com>
In-Reply-To
<60e59fea-e8c6-43c9-8f6c-15b3f34575a4@gmail.com> (view parent)
On Fri, Jun 21, 2024, at 11:24, Rosie wrote:
> which does not return stack allocated memory, the incorrect instruction 
> is still emitted.

Oh yes sure, I'm not saying qbe is right.