~emersion/soju-dev


Index out of Range

Details
Message ID
<87blajjprb.fsf@posteo.net>
Hi, I've just updated soju and got this error message after
(accidentally) running /list on IRCNet:

panic: runtime error: index out of range [15] with length 15

goroutine 5 [running]:
git.sr.ht/~emersion/soju.partialCasemap(0x96d3b0, 0xc0003aa046, 0xf, 0xc0002d2c00, 0x414148) 
        /home/phi/soju/irc.go:505 +0x247
git.sr.ht/~emersion/soju.(*downstreamConn).marshalEntity(0xc000092a20, 0xc000206140, 0xc0003aa046, 0xf, 0xc0002d2d30, 0x4138d6)
        /home/phi/soju/downstream.go:214 +0x77
git.sr.ht/~emersion/soju.(*upstreamConn).handleMessage.func11(0xc000092a20)                  
        /home/phi/soju/upstream.go:1084 +0xc5
git.sr.ht/~emersion/soju.(*upstreamConn).forEachDownstreamByID.func1(0xc000092a20)           
        /home/phi/soju/upstream.go:214 +0x41
git.sr.ht/~emersion/soju.(*network).forEachDownstream.func1(0xc000092a20)                    
        /home/phi/soju/user.go:143 +0x47
git.sr.ht/~emersion/soju.(*user).forEachDownstream(0xc000200000, 0xc0002d2d68)               
        /home/phi/soju/user.go:392 +0x52
git.sr.ht/~emersion/soju.(*network).forEachDownstream(0xc000206140, 0xc0002d2da0)            
        /home/phi/soju/user.go:139 +0x68
git.sr.ht/~emersion/soju.(*upstreamConn).forEachDownstream(...)
        /home/phi/soju/upstream.go:206
git.sr.ht/~emersion/soju.(*upstreamConn).forEachDownstreamByID(0xc000214420, 0x1, 0xc0002d37e0)
        /home/phi/soju/upstream.go:210 +0x6a
git.sr.ht/~emersion/soju.(*upstreamConn).handleMessage(0xc000214420, 0xc0003a44c0, 0x1, 0x0) 
        /home/phi/soju/upstream.go:1080 +0x255d
git.sr.ht/~emersion/soju.(*user).run(0xc000200000)
        /home/phi/soju/user.go:506 +0x74f
git.sr.ht/~emersion/soju.(*Server).addUserLocked.func1(0xc000200000, 0xc0000c6370)           
        /home/phi/soju/server.go:142 +0x2f
created by git.sr.ht/~emersion/soju.(*Server).addUserLocked
        /home/phi/soju/server.go:141 +0x175

I tried to look into it myself, and it seems the case-mapping returns a
string longer than the actual string. But looking at the case mapping
functions, that shouldn't happen (nor is it intended, if I'm
understanding the code correctly). In GDB I confirm that the string

    "#hajd\372szoboszl\363"

was turned into

    "#hajd\357\277\275szoboszl\357\277\275"

using casemapRFC1459 if I am not mistaken. Before I dive too deep into
the codebase, I wanted to report the issue here, in case anyone here
knows what is going on.

My Go version is go1.14.
Details
Message ID
<20210412225503.151002a5@vroom.localdomain>
In-Reply-To
<87blajjprb.fsf@posteo.net> (view parent)
Hi, thank you for reporting this bug.

On Mon, 12 Apr 2021 20:56:24 +0200, Philip K. wrote:
> I tried to look into it myself, and it seems the case-mapping returns
> a string longer than the actual string. But looking at the case
> mapping functions, that shouldn't happen (nor is it intended, if I'm
> understanding the code correctly). In GDB I confirm that the string
> 
>     "#hajd\372szoboszl\363"
> 
> was turned into
> 
>     "#hajd\357\277\275szoboszl\357\277\275"
> 
> using casemapRFC1459 if I am not mistaken. Before I dive too deep into
> the codebase, I wanted to report the issue here, in case anyone here
> knows what is going on.

The casemapping code works on runes: it uses Go's "for ... := range
string" loop, string indexing and strings.Builder.WriteRune(), so it
only works with valid UTF-8.

Because "#hajd\372szoboszl\363" is not UTF-8, invalid parts are
replaced with "\357\277\275", U+FFFD, �, the replacement character.
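
The length change is easy to reproduce in isolation. The following is a
minimal sketch, not soju's actual code: rebuildWithRunes is a made-up
helper that copies a string rune by rune, the way a rune-based
casemapping loop would.

```go
package main

import (
	"fmt"
	"strings"
)

// rebuildWithRunes is a hypothetical helper (not part of soju). It copies s
// rune by rune. Ranging over a string decodes UTF-8, so every invalid byte
// is yielded as U+FFFD, and WriteRune encodes that rune as three bytes.
func rebuildWithRunes(s string) string {
	var sb strings.Builder
	for _, r := range s {
		sb.WriteRune(r)
	}
	return sb.String()
}

func main() {
	// Same bytes as the channel name from the report (\372 = 0xfa, \363 = 0xf3).
	name := "#hajd\xfaszoboszl\xf3"
	out := rebuildWithRunes(name)

	// Two invalid bytes each grow from 1 to 3 bytes: 15 becomes 19.
	fmt.Println(len(name), len(out))
}
```

Any index computed on the rebuilt string can therefore point past the
end of the original 15-byte name.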

Working on byte slices instead (and then converting to string) should
fix the bug.  But it means putting invalid UTF-8 into a type that is
expected to contain valid UTF-8.

I'll send a patch tomorrow if there's no objection to this.

Hubert
Details
Message ID
<ZNAZM5prcPFvSlaQgheVDjQyxQCko3S_o0-_0DKJ2kSTGBq322VHLwHCvFAEYVwgX0lfPLuARo-in8w4ExqCQyycSSntla8VJG3y0Ejq-mM=@emersion.fr>
In-Reply-To
<20210412225503.151002a5@vroom.localdomain> (view parent)
On Monday, April 12th, 2021 at 10:54 PM, Hubert Hirtz <hubert@hirtz.pm> wrote:

> Because "#hajd\372szoboszl\363" is not UTF-8, invalid parts are
> replaced with "\357\277\275", U+FFFD, �, the replacement character.
> Working on byte slices instead (and then converting to string) should
> fix the bug. But it means putting invalid UTF-8 into a type that is
> expected to contain valid UTF-8.

Can we maybe convert the name to a []rune instead?

    nameRunes := []rune(name)
    r = nameRunes[i]
Details
Message ID
<20210413092459.2ea2bb98@vroom.localdomain>
In-Reply-To
<ZNAZM5prcPFvSlaQgheVDjQyxQCko3S_o0-_0DKJ2kSTGBq322VHLwHCvFAEYVwgX0lfPLuARo-in8w4ExqCQyycSSntla8VJG3y0Ejq-mM=@emersion.fr> (view parent)
On Tue, 13 Apr 2021 05:56:25 +0000, Simon Ser wrote:
> Can we maybe convert the name to a []rune instead?
> 
>     nameRunes := []rune(name)
>     r = nameRunes[i]

This will fix the panic, but will change invalid bytes to the
replacement character (because of the string to []rune conversion [0]).
I don't think this is wanted here.

[0] https://play.golang.org/p/KOwD4_XpPNR
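
For readers who cannot open the playground link, the effect of the
conversion can be sketched as follows. This is an illustration written
for this thread and is not necessarily what the linked snippet contains;
roundTrip is a made-up name.

```go
package main

import "fmt"

// roundTrip is a hypothetical helper: it converts a string to []rune and
// back. The string to []rune conversion decodes UTF-8, so each invalid
// byte becomes the rune U+FFFD and the original byte value is lost.
func roundTrip(s string) string {
	return string([]rune(s))
}

func main() {
	name := "#hajd\xfaszoboszl\xf3"
	back := roundTrip(name)

	fmt.Println(len([]rune(name)))     // one rune per character: 15
	fmt.Println(len(name), len(back))  // byte lengths differ: 15 19
	fmt.Println(back == name)          // false, the name was altered
}
```

Indexing the rune slice stays in bounds, which is why the panic goes
away, but the name relayed downstream no longer matches the bytes sent
by the server.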
Details
Message ID
<y1shGfa9YeXA2ivhOEVevugEf-wkaRJ2VwkjUdlfTZ7iwQUTMyZVjI6cSH1Gyb3F4fKTffJ6xEMoZZpIAUFbSIDG9pKtm1oiGj8605gF3FM=@emersion.fr>
In-Reply-To
<20210413092459.2ea2bb98@vroom.localdomain> (view parent)
On Tuesday, April 13th, 2021 at 9:24 AM, Hubert Hirtz <hubert@hirtz.pm> wrote:

> On Tue, 13 Apr 2021 05:56:25 +0000, Simon Ser wrote:
>
> > Can we maybe convert the name to a []rune instead?
> >
> >     nameRunes := []rune(name)
> >     r = nameRunes[i]
> >
>
> This will fix the panic, but will change invalid bytes to the
> replacement character (because of the string to []rune conversion [0]).
> I don't think this is wanted here.

Right. I guess we can just relay the invalid UTF-8. Case-mapping
operates on bytes only anyway.
Details
Message ID
<yEhfi1ntb-TxTED6OYrBJZRNb3IQtz1XQ240mZVermDc_vACuFsLmN_9yhakq6zGY_rSVKCXYPfGrnYd1kUw2GU5Y0d_nF-y16b1CdtFLWU=@emersion.fr>
In-Reply-To
<y1shGfa9YeXA2ivhOEVevugEf-wkaRJ2VwkjUdlfTZ7iwQUTMyZVjI6cSH1Gyb3F4fKTffJ6xEMoZZpIAUFbSIDG9pKtm1oiGj8605gF3FM=@emersion.fr> (view parent)
This panic should be fixed with 70e5ed05b6fc ("Make casemapping work
over bytes instead of runes").
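
For illustration, a byte-wise mapping in the spirit of that commit title
could look like the sketch below. It is not the code from the commit:
casemapBytes is a made-up name, and the folding direction (treating
{}|^ as the lowercase forms of []\~) is an assumption based on the usual
description of the rfc1459 casemapping.

```go
package main

import "fmt"

// casemapBytes is a hypothetical byte-wise RFC 1459 style casemapping.
// It only rewrites single ASCII bytes, so the output always has the same
// length as the input and bytes >= 0x80 pass through untouched, whether
// or not they form valid UTF-8.
func casemapBytes(name string) string {
	b := []byte(name) // copies the bytes without any UTF-8 decoding
	for i, c := range b {
		switch {
		case 'A' <= c && c <= 'Z':
			b[i] = c + ('a' - 'A')
		case c == '[':
			b[i] = '{'
		case c == ']':
			b[i] = '}'
		case c == '\\':
			b[i] = '|'
		case c == '~':
			b[i] = '^'
		}
	}
	return string(b)
}

func main() {
	name := "#Hajd\xfaSzoboszl\xf3"
	mapped := casemapBytes(name)
	fmt.Println(len(name) == len(mapped)) // true: length is preserved
	fmt.Printf("%x\n", mapped)            // hex dump of the mapped bytes
}
```

Because the mapped name keeps the original length, an index that is
valid for one string is valid for the other.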