~mpu/qbe

17 3

possible bug in rv64 output

Gregory (tim) Kelly <gtkelly@zerowasteplanet.com>
Details
Message ID
<644148A9.3070609@zerowasteplanet.com>
DKIM signature
missing
Download raw message
% uname -a
FreeBSD fbsd_clean 13.1-RELEASE FreeBSD 13.1-RELEASE 
releng/13.1-n250148-fc952ac2212 GENERIC amd64

% ./qbe -t rv64 test/sum.ssa
....

	xor t1, t1, 1


Although sum.ssa passes the check, the above line is ambiguous.  It uses 
the form xor, which expects three registers, but has an immediate, which 
requires xori.  I do not see xori in rv64/emit.c, nor does the xor have 
a %k modifier (i.e., xor%k).

If this is left up to the assembler, and it accepts it, I can see it 
being interpreted as register 1 instead of the number 1.

(I do not have an assembler to test this.)

tim

-- 
"Chile será la tumba del neoliberalismo!
(Chile will be the death of neoliberalism!)"
		-- Progressive International
Details
Message ID
<64414ED2.30306@dialectronics.com>
In-Reply-To
<644148A9.3070609@zerowasteplanet.com> (view parent)
DKIM signature
missing
Download raw message
Apologies on that.  I posted from the wrong email address.

This code narrows it down

% cat test/xor.ssa
export
function w $sum(l %arr, w %num) {
@start
	%n1 =w xor %num, 1
         ret %n1
}

% ./qbe -t rv64 test/xor.ssa
.text
.globl sum
sum:
	sd fp, -16(sp)
	sd ra, -8(sp)
	add fp, sp, -16
	add sp, sp, -16
	xor a0, a1, 1
	add sp, fp, 16
	ld ra, 8(fp)
	ld fp, 0(fp)
	ret
.type sum, @function
.size sum, .-sum
/* end function sum */


There does not appear to be a Oxori or equivalent instruction in the il.

(Also, this is a leaf function and does not need a prologue or epilog, 
which add seven instructions.)

tim

-- 
"We march, y'all mad. We sit down y'all mad.
We speak up, y'all mad. We die y'all silent."
		-- sidewalk graffiti in Minneapolis, MN
Details
Message ID
<22af2564-c048-4c20-93ab-5f5d75008c5e@app.fastmail.com>
In-Reply-To
<64414ED2.30306@dialectronics.com> (view parent)
DKIM signature
missing
Download raw message
About assembly issues: if the assembler eats it,
I'm happy enough. It's not uncommon that assembler
actually do a little bit of work on their own and
we are fine relying on it. For example, the amd64
backend never emits movabsq instructions and lets
the assembler use them when necessary.
Details
Message ID
<644181C3.4020508@dialectronics.com>
In-Reply-To
<22af2564-c048-4c20-93ab-5f5d75008c5e@app.fastmail.com> (view parent)
DKIM signature
missing
Download raw message
Quentin Carbonneaux wrote:
> About assembly issues: if the assembler eats it,
> I'm happy enough. It's not uncommon that assembler
> actually do a little bit of work on their own and
> we are fine relying on it. For example, the amd64
> backend never emits movabsq instructions and lets
> the assembler use them when necessary.

I have attached a patch to address this for add(i), or(i), xor(i), and 
possibly sll(i)*.  I am having some difficulty coming up with a full 
test case, as ssa keeps removing instructions, but

function w $immed(l %arr, w %num) {
@start
         %n1 =w or %num, 1
         %n1 =w or %num, %n1

         %n2 =w xor %num, 1
         %n2 =w xor %num, %n2

         %n3 =w add %num, 1
         %n3 =w add %num, %n2

         %n4 =w sub %num, 1
         %n4 =w sub %num, %n3

         ret %n4
}

yields

  % ./qbe -t rv64 test/i.ssa
.text
immed:
	sd fp, -16(sp)
	sd ra, -8(sp)
	add fp, sp, -16
	add sp, sp, -16
	xori t0, a1, 1
	xor t0, a1, t0
	addw t0, a1, t0
	subw a0, a1, t0
	add sp, fp, 16
	ld ra, 8(fp)
	ld fp, 0(fp)
	ret
.type immed, @function
.size immed, .-immed
/* end function immed */

Note that the prologue and epilogue are hard-coded for add and not addi. 
   I can fix this in another patch.

More testing is needed, but I do not know the IL well enough to come up 
with robust test cases.

I would not count on an assembler making the correct assumption about 
whether the third value is a register or an immediate.  One of them will 
decide to do it a certain way, then others will do the same and 
eventually we have one and only choice.

tim

-- 
"American exceptionalism."  - one-liner by U.S. President Barack Obama
Details
Message ID
<64418513.3080204@dialectronics.com>
In-Reply-To
<22af2564-c048-4c20-93ab-5f5d75008c5e@app.fastmail.com> (view parent)
DKIM signature
missing
Download raw message
Oops.  Left off immediates for Kl.  Patch fixed and attached.

Quentin Carbonneaux wrote:
> About assembly issues: if the assembler eats it,
> I'm happy enough. It's not uncommon that assembler
> actually do a little bit of work on their own and
> we are fine relying on it. For example, the amd64
> backend never emits movabsq instructions and lets
> the assembler use them when necessary.

-- 
"I propose to take our countrymen's claims of American exceptionalism
seriously, which is to say I propose subjecting our country to an
exceptional moral standard."
		-- Te-Nehisi Coates
Details
Message ID
<6442B4DC.705@dialectronics.com>
In-Reply-To
<22af2564-c048-4c20-93ab-5f5d75008c5e@app.fastmail.com> (view parent)
DKIM signature
missing
Download raw message
Thinking about this a little more, I don't think my patch solves the 
problem completely.  It will work for scalars/immediates that are less 
or equal to than 2^12 (4096) but won't for stuff like 0xFF00000000. 
That will need some shifting in registers.

When I did the rv32 backend to pcc, this could not be handled in a 
template.  I had to test the size of the scalar and if necessary add in 
an additional instruction.

For now, please ignore the patch, as it is incomplete.

(I ended up looking at this because I was trying to look at the 
ramifications of a hardwired zero register vs. using xor of something 
with itself.)

tim

-- 
"It is difficult to get a man to understand something
when his salary depends on him not understanding it."
		-- Upton Sinclair
Details
Message ID
<b4df4462-3fa9-4711-8dfd-6244c31e3ac3@app.fastmail.com>
In-Reply-To
<6442B4DC.705@dialectronics.com> (view parent)
DKIM signature
missing
Download raw message
On Fri, Apr 21, 2023, at 18:07, Tim Kelly wrote:
> Thinking about this a little more, I don't think my patch solves the 
> problem completely.  It will work for scalars/immediates that are less 
> or equal to than 2^12 (4096) but won't for stuff like 0xFF00000000. 
> That will need some shifting in registers.

That is precisely the kind of problems we
handle during instruction selection (isel.c).
Check it out, maybe it does the right
thing already.
Details
Message ID
<6443AD6A.8010206@dialectronics.com>
In-Reply-To
<b4df4462-3fa9-4711-8dfd-6244c31e3ac3@app.fastmail.com> (view parent)
DKIM signature
missing
Download raw message
I'm not following your response.  As I said in my initial post

xor t1, t1, 1

is not a legal instruction on RV32 or RV64.  It combines two different 
forms.  The xor instruction expects three registers.  "1" is not a 
register.  If "1" is to be a scalar/constant, then the xori instruction 
must be selected.

The opcodes:

xor  = 0110011
xori = 0010011

Looking at it closer, I do not find that add, sub and the shifts have 
"i" variants in the emit.c omap struct.

Yes, preinspection is necessary, to properly load a scalar 2^12 (4096) 
or greater (using one or two additional instructions), but if the 
scalar/constant is less than 4095, it is unnecessary to load it into a 
register (and penalizes the CPU).  Hence the need to select both xor and 
xori (and add or addi, sub or subi, etc).

tim

Quentin Carbonneaux wrote:
> On Fri, Apr 21, 2023, at 18:07, Tim Kelly wrote:
>> Thinking about this a little more, I don't think my patch solves the 
>> problem completely.  It will work for scalars/immediates that are less 
>> or equal to than 2^12 (4096) but won't for stuff like 0xFF00000000. 
>> That will need some shifting in registers.
> 
> That is precisely the kind of problems we
> handle during instruction selection (isel.c).
> Check it out, maybe it does the right
> thing already.

-- 
They who can give up essential liberty to obtain a little temporary 
safety, deserve neither liberty nor safety.

			-- Benjamin Franklin
Details
Message ID
<6443DA4A.2090502@dialectronics.com>
In-Reply-To
<b4df4462-3fa9-4711-8dfd-6244c31e3ac3@app.fastmail.com> (view parent)
DKIM signature
missing
Download raw message
Quentin Carbonneaux wrote:
> That is precisely the kind of problems we
> handle during instruction selection (isel.c).
> Check it out, maybe it does the right
> thing already.

Ok, I think I follow what you are saying a bit more.  We were looking at 
it from different ends.  You were saying it should already work for 
large scalars, which I had not tested.

Yes, it actually does work without any changes for scalars >= 4096, by 
emitting a pseudoinstruction li (leaving it up to the assembler to 
decide if this can be done with one or two instructions).

The issue is scalars < 4096.  The emit omap templates do not have an 
ability to append the i to the instruction in this case, and the 
add/sub/or/xor falls through, which is not correct.  My patch addresses 
this.

Original behavior

#cat test/xor.ssa

export
function w $sum(l %arr, w %num) {
@start
         %n1 =w xor %num, 1
         ret %n1
}


# ./qbe -t rv64 test/xor.ssa
.text
.globl sum
sum:
	sd fp, -16(sp)
	sd ra, -8(sp)
	add fp, sp, -16
	add sp, sp, -16
	xor a0, a1, 1
	add sp, fp, 16
	ld ra, 8(fp)
	ld fp, 0(fp)
	ret
/* end function sum */

.section .note.GNU-stack,"",@progbits


As I indicated in my previous post, the xor is in an ambiguous form.

After patch


# ./qbe -t rv64 test/xor.ssa
.text
.globl sum
sum:
	sd fp, -16(sp)
	sd ra, -8(sp)
	add fp, sp, -16
	add sp, sp, -16
	xori a0, a1, 1
	add sp, fp, 16
	ld ra, 8(fp)
	ld fp, 0(fp)
	ret
/* end function sum */

.section .note.GNU-stack,"",@progbit




Scalars >= 4096 are unaffected because the temp isel already moved them 
into a register.

#cat test/xor.ssa

export
function w $sum(l %arr, w %num) {
@start
         %n1 =w xor %num, 8096
         ret %n1
}

# ./qbe -t rv64 test/xor.ssa
.text
.globl sum
sum:
	sd fp, -16(sp)
	sd ra, -8(sp)
	add fp, sp, -16
	add sp, sp, -16
	li t0, 8096
	xor a0, a1, t0
	add sp, fp, 16
	ld ra, 8(fp)
	ld fp, 0(fp)
	ret
/* end function sum */

.section .note.GNU-stack,"",@progbits


So, pending some further testing, I believe my second patch is correct. 
  Testing needs to be done on an rv64 system, which I can not do.  If 
the original behavior was passing tests, I would be concerned that the 
assembler is changing code and the original emit is relying on behavior 
specific to the assembler.

(Again, the prologue and epilogue are hardcoded, so the incorrect use of 
add instead of addi needs to be fixed in a later patch, as well as even 
the presence of either on a leaf function.)

tim

-- 
No one is born hating another person because of the color of his skin,
or his background, or his religion.  People must learn to hate, and if
they can learn to hate, they can be taught to love, for love comes more
naturally to the human heart than its opposite.
		-- Nelson Mandela
Details
Message ID
<CAGw6cBuH1trVdCXDaqAWMxw+q4NGzeZnWyO8Lpgig6=ss2jeww@mail.gmail.com>
In-Reply-To
<644181C3.4020508@dialectronics.com> (view parent)
DKIM signature
missing
Download raw message
On 2023-04-20, Tim Kelly <gtkelly@dialectronics.com> wrote:
> I would not count on an assembler making the correct assumption about
> whether the third value is a register or an immediate.  One of them will
> decide to do it a certain way, then others will do the same and
> eventually we have one and only choice.

GNU as treats xor with an immediate third operand as an alias for xori
(see [0]). QBE outputs gas assembly, so I don't see an issue with
using this alias if it simplifies the code. Are you aware of any
assemblers that claim to support gas syntax, but don't support this
alias?

[0] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=opcodes/riscv-opc.c;h=d9d69cda548bc577062e4923b00e2eb9d085c20c;hb=HEAD#l479
Details
Message ID
<644442E0.8050508@dialectronics.com>
In-Reply-To
<CAGw6cBuH1trVdCXDaqAWMxw+q4NGzeZnWyO8Lpgig6=ss2jeww@mail.gmail.com> (view parent)
DKIM signature
missing
Download raw message
Michael Forney wrote:
> GNU as treats xor with an immediate third operand as an alias for xori
> (see [0]). QBE outputs gas assembly, so I don't see an issue with
> using this alias if it simplifies the code. Are you aware of any
> assemblers that claim to support gas syntax, but don't support this
> alias?

I don't pay attention to GNU.  They don't pay attention to independent 
standards and warps them the same as Microsoft does.  If I want to 
adhere to what GNU does, I'd just use gcc.

The RISC-V standard calls for add/sub/or/xor/sh* to have three registers 
and addi/subi/ori/xori/sh*i with two registers and the immediate for 
scalars < 4096.

tim

-- 
"Chile será la tumba del neoliberalismo!
(Chile will be the death of neoliberalism!)"
		-- Progressive International
Details
Message ID
<CAGw6cBsX5KA5q7NkvgLPG4hcP6JQQfEmZ0BSKgSO+6LWn+FkZg@mail.gmail.com>
In-Reply-To
<644442E0.8050508@dialectronics.com> (view parent)
DKIM signature
missing
Download raw message
On 2023-04-22, Tim Kelly <gtkelly@dialectronics.com> wrote:
> I don't pay attention to GNU.  They don't pay attention to independent
> standards and warps them the same as Microsoft does.  If I want to
> adhere to what GNU does, I'd just use gcc.

Just curious, which assembler do you plan to use with QBE?

> The RISC-V standard calls for add/sub/or/xor/sh* to have three registers
> and addi/subi/ori/xori/sh*i with two registers and the immediate for
> scalars < 4096.

I'm talking about the assembly language syntax, not the ISA. QBE
outputs gas syntax (GNU as), which parses xor with an immediate as an
alias for xori. It also supports several pseudo instructions like li,
which also don't exist in the ISA.

If you want to add support for some other risc-v assembly syntax,
that's fine. But I don't see a problem with using a feature of gas
assembly syntax in QBE's code that produces gas assembly syntax.

If there exists another assembler that claims to support gas syntax,
but doesn't support this alias, that'd be a reason to avoid using it,
but you haven't yet told us which one you're using that has a problem
with QBE's output.
Details
Message ID
<64444A2D.1070509@dialectronics.com>
In-Reply-To
<CAGw6cBsX5KA5q7NkvgLPG4hcP6JQQfEmZ0BSKgSO+6LWn+FkZg@mail.gmail.com> (view parent)
DKIM signature
missing
Download raw message
Michael Forney wrote:
> On 2023-04-22, Tim Kelly <gtkelly@dialectronics.com> wrote:
>> I don't pay attention to GNU.  They don't pay attention to independent
>> standards and warps them the same as Microsoft does.  If I want to
>> adhere to what GNU does, I'd just use gcc.
> 
> Just curious, which assembler do you plan to use with QBE?

That really isn't relevant and distracts from the correctness of the 
emitted Assembly by QBE.

> I'm talking about the assembly language syntax, not the ISA. QBE
> outputs gas syntax (GNU as), which parses xor with an immediate as an
> alias for xori. It also supports several pseudo instructions like li,
> which also don't exist in the ISA.

li is a documented pseudoinstruction in the RISC-v Reader, written by 
Andrew Waterman and David Patterson, two of the original designers of 
the RISC-V standard.  li is also a previously recognized 
pseudoinstrucion in the PowerPC architecture.

> If you want to add support for some other risc-v assembly syntax,
> that's fine. But I don't see a problem with using a feature of gas
> assembly syntax in QBE's code that produces gas assembly syntax.
> 
> If there exists another assembler that claims to support gas syntax,
> but doesn't support this alias, that'd be a reason to avoid using it,
> but you haven't yet told us which one you're using that has a problem
> with QBE's output.

That really isn't relevant and distracts from the correctness of the 
emitted Assembly by QBE.

tim

-- 
"American exceptionalism."  - one-liner by U.S. President Barack Obama
Details
Message ID
<CAGw6cBvvodPe7Yt_tC6adiGq=hDCPXjJ7OvcUEWjertW_p4Sww@mail.gmail.com>
In-Reply-To
<64444A2D.1070509@dialectronics.com> (view parent)
DKIM signature
missing
Download raw message
On 2023-04-22, Tim Kelly <gtkelly@dialectronics.com> wrote:
> That really isn't relevant and distracts from the correctness of the
> emitted Assembly by QBE.

You talk about correctness of emitted assembly, as if assembly was a
well-defined standard language that all assemblers follow.
Unfortunately, this is not the case. The risc-v assembly programmer's
manual does *not* describe a complete assembly language. It doesn't
say what comments look like, how to emit data, constant syntax,
directives for section, alignment, symbol visibility, etc.

The flavor of assembly language syntax that QBE produces *is*
relevant. We have to pick some syntax to produce, and there are many
incompatible choices here (gas, nasm, msvc, plan9, etc). Currently,
QBE only supports gas syntax. Personally, I'd like to see QBE gain
support for plan9 assembly syntax, and this would involve writing code
to produce this syntax.

I don't see a problem with adding support for other assemblers with
QBE. But you seem to be interested in a hypothetical assembler that
supports gas syntax *except* for the convenient immediate aliases for
risc-v, but also say it's not relevant whether such an assembler
exists.
Details
Message ID
<CAGw6cBtusn4ENSxR9zWoD19K3OJZTUkNPTk529EHLWXKFQ5ESg@mail.gmail.com>
In-Reply-To
<CAGw6cBvvodPe7Yt_tC6adiGq=hDCPXjJ7OvcUEWjertW_p4Sww@mail.gmail.com> (view parent)
DKIM signature
missing
Download raw message
On 2023-04-22, Michael Forney <mforney@mforney.org> wrote:
> The risc-v assembly programmer's
> manual does *not* describe a complete assembly language. It doesn't
> say what comments look like, how to emit data, constant syntax,
> directives for section, alignment, symbol visibility, etc.

Just to correct myself here. It does look like that document does
describe a brief overview of these directives describing the gas
syntax. But it also says things like

> I think it's probably better to beef up the binutils documentation rather than duplicating it here.

implying that the document is describing the specifics of athe ssembly
language for use with GNU binutils.

I haven't looked to closely at your patch, but if it doesn't break
anything and is unintrusive, then I think it'd be fine to apply. But I
still don't really see what problem you're trying to solve.
Details
Message ID
<644456CC.4070200@dialectronics.com>
In-Reply-To
<CAGw6cBvvodPe7Yt_tC6adiGq=hDCPXjJ7OvcUEWjertW_p4Sww@mail.gmail.com> (view parent)
DKIM signature
missing
Download raw message
Michael Forney wrote:
 > I don't see a problem with adding support for other assemblers with
 > QBE. But you seem to be interested in a hypothetical assembler that
 > supports gas syntax *except* for the convenient immediate aliases for
 > risc-v, but also say it's not relevant whether such an assembler
 > exists.

You are conflating syntax with behavior.  QBE might emit gas syntax, but 
your position requires gas behavior.  My patch removes the requirement 
that QBE users use assemblers that reproduce gas behavior.  It is 
comparable to QBE compiling -std=c99 as opposed to incorporating 
gnu-extensions.

tim

-- 
"Chile será la tumba del neoliberalismo!
(Chile will be the death of neoliberalism!)"
		-- Progressive International
Details
Message ID
<CAGw6cBu=MDdRWHZvtMPaeS5AFU8mHLp04KE9mjBP1WOwokxQRw@mail.gmail.com>
In-Reply-To
<644456CC.4070200@dialectronics.com> (view parent)
DKIM signature
missing
Download raw message
On 2023-04-22, Tim Kelly <gtkelly@dialectronics.com> wrote:
> You are conflating syntax with behavior.  QBE might emit gas syntax, but
> your position requires gas behavior.  My patch removes the requirement
> that QBE users use assemblers that reproduce gas behavior.  It is
> comparable to QBE compiling -std=c99 as opposed to incorporating
> gnu-extensions.

It is syntax, though. gas parses the line `xor t1, t1, 1` as syntactic
sugar for xori. On closer look, these aliases are indeed described in
the risc-v assembly manual [0], but rather than listing all the
aliases explicitly, it instructs you to look for the ALIAS lines in
opcodes/riscv-opc.c in binutils, which is what I linked earlier.

-std=c99 vs -std=gnu99 is not really the same thing. There is a
standard ISO/IEC 9899:1999 describing C99, but no standard assembly
syntax upon which gas extends.

[0] https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md#instruction-aliases
Details
Message ID
<64445F60.8010505@dialectronics.com>
In-Reply-To
<CAGw6cBu=MDdRWHZvtMPaeS5AFU8mHLp04KE9mjBP1WOwokxQRw@mail.gmail.com> (view parent)
DKIM signature
missing
Download raw message
Michael Forney wrote:
> It is syntax, though. gas parses the line `xor t1, t1, 1` as syntactic
> sugar for xori. 

Is there an equivalent of diabetes for programmers that rely on 
"syntactic sugar" to solve their laziness?  The fix took a handful of 
lines of code (most of which were adding %i to the omap field).

> On closer look, these aliases are indeed described in
> the risc-v assembly manual [0], but rather than listing all the
> aliases explicitly, it instructs you to look for the ALIAS lines in
> opcodes/riscv-opc.c in binutils, which is what I linked earlier.

I tune out as soon as you mention GPL/GNU/binutils.  My toolchain is 
completely gnu/gpl/binutils-free.  If you want GNU to set QBE policy, 
then we've reached an end of the conversation.

tim

-- 
In assuming that highly-intelligent children will find their own
way, we create highly-intelligent adults with no debt to society.
Reply to thread Export thread (mbox)