From: Simon Ser
To: Hubert Hirtz
Cc: ~emersion/soju-dev@lists.sr.ht
Date: Wed, 17 Feb 2021 14:33:20 +0000
Subject: Re: [PATCH v6] Implement casemapping
In-Reply-To: <20210126121607.14029-1-hubert@hirtz.pm>

On Tuesday, January 26th, 2021 at 1:16 PM, Hubert Hirtz wrote:

> TL;DR: supports for casemapping, now logs are saved in
> casemapped/canonical/tolower form
> (eg. in the #channel directory instead of #Channel... or something)
>
> == What is casemapping? ==
>
> see
>
> == Casemapping and multi-upstream ==
>
> Since each upstream does not necessarily use the same casemapping, and
> since casemappings cannot coexist [0],
>
> 1. soju must also update the database accordingly to upstreams'
>    casemapping, otherwise it will end up inconsistent,
> 2. soju must "normalize" entity names and expose only one casemapping
>    that is a subset of all supported casemappings (here, ascii).
>
> [0] On some upstreams, "emersion[m]" and "emersion{m}" refer to the same
> user (upstreams that advertise rfc1459 for example), while on others
> (upstreams that advertise ascii) they don't.
>
> == Storage changes ==
>
> Once upstream's casemapping is known (default to rfc1459), entity names
> in map keys are made into casemapped form, for upstreamConn,
> upstreamChannel and network.
>
> downstreamConn advertises "CASEMAPPING=ascii", and always casemap map
> keys with ascii.
>
> Some functions require the caller to casemap their argument (to avoid
> needless calls to casemapping functions).

As discussed elsewhere, I think it would be nice to have something like
this:

	// casemapMap is a map that stores case-mapped values.
	type casemapMap struct {
		m  map[string]casemapEntry // indexed by canonical name
		cm casemapping
	}

	type casemapEntry struct {
		original string
		value    interface{}
	}

	// Get returns the value with the given name. The name is canonicalized
	// before accessing the map.
	func (cm *casemapMap) Get(name string) interface{}

	// UpdateCasemapping changes the case-mapping used by the map.
	func (cm *casemapMap) UpdateCasemapping(newCasemap casemapping)

> Log directories are casemapped.

Is this desirable? Can upstream servers change the original name while
still referring to the same nick/channel?

I guess we don't want to store downstream messages which use a different
case-mapping in a different log directory. So we would need to convert
downstream names to original upstream names if we want to do this.

> == Message forwarding and casemapping ==
>
> When relaying entity names from downstreams to upstreams, soju uses the
> upstream casemapping, in order to not get in the way of the user. This

You mean soju uses the name as sent by downstream?

> does not brings any issue, as long as soju replies with the ascii
> casemapping in mind (solves point 1.).
>
> When relaying entity names from upstreams with non-ascii casemappings,
> soju *partially* casemap them: it only change the case of characters
> which are not ascii letters. ASCII case is thus kept intact, while
> special symbols like []{} are the same every time soju sends them to
> downstreams (solves point 2.).

As discussed elsewhere, we should use the original upstream names.

> == Casemapping changes ==
>
> Casemapping changes are not fully supported by this patch and will
> result in loss of history. This is a limitation of the protocol and
> should be solved by the RENAME spec.

[…]

> @@ -902,7 +913,7 @@ func (dc *downstreamConn) welcome() error {
>  	dc.SendMessage(&irc.Message{
>  		Prefix: dc.srv.prefix(),
>  		Command: irc.RPL_MYINFO,
> -		Params: []string{dc.nick, dc.srv.Hostname, "soju", "aiwroO", "OovaimnqpsrtklbeI"},
> +		Params: []string{dc.nick, dc.srv.Hostname, "soju", "oOwari", "OovaimnqpsrtklbeI"},

Why is this necessary? Does the order matter?

>  	})
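
To make the casemapMap suggestion a bit more concrete, here is a rough,
self-contained sketch of how it could behave. Only casemapMap,
casemapEntry, Get and UpdateCasemapping come from the outline earlier in
this mail. Everything else is an assumption made for illustration:
modelling casemapping as a func(string) string, the newCasemapMap and Set
helpers, and the two casemapping functions. It is not meant as the final
soju implementation.

```go
package main

import "fmt"

// casemapping canonicalizes an entity name. Modelling it as a plain
// function is an assumption of this sketch.
type casemapping func(name string) string

// casemapASCII folds A-Z to a-z and leaves every other byte untouched.
func casemapASCII(name string) string {
	b := []byte(name)
	for i, c := range b {
		if 'A' <= c && c <= 'Z' {
			b[i] = c + 'a' - 'A'
		}
	}
	return string(b)
}

// casemapRFC1459 folds like ASCII and additionally treats []\~ as the
// upper-case forms of {}|^.
func casemapRFC1459(name string) string {
	b := []byte(casemapASCII(name))
	for i, c := range b {
		switch c {
		case '[':
			b[i] = '{'
		case ']':
			b[i] = '}'
		case '\\':
			b[i] = '|'
		case '~':
			b[i] = '^'
		}
	}
	return string(b)
}

// casemapEntry keeps the name as first seen next to the stored value.
type casemapEntry struct {
	original string
	value    interface{}
}

// casemapMap stores values indexed by canonical name.
type casemapMap struct {
	m  map[string]casemapEntry
	cm casemapping
}

// newCasemapMap is a hypothetical constructor.
func newCasemapMap(cm casemapping) *casemapMap {
	return &casemapMap{m: make(map[string]casemapEntry), cm: cm}
}

// Set is a hypothetical helper: it stores value under the canonical form
// of name and remembers the original spelling.
func (m *casemapMap) Set(name string, value interface{}) {
	m.m[m.cm(name)] = casemapEntry{original: name, value: value}
}

// Get canonicalizes name and returns the stored value, or nil if absent.
func (m *casemapMap) Get(name string) interface{} {
	entry, ok := m.m[m.cm(name)]
	if !ok {
		return nil
	}
	return entry.value
}

// UpdateCasemapping re-keys every entry from its original name. If two
// originals collide under the new casemapping, one of them is dropped;
// which one is unspecified in this sketch.
func (m *casemapMap) UpdateCasemapping(newCasemap casemapping) {
	rekeyed := make(map[string]casemapEntry, len(m.m))
	for _, entry := range m.m {
		rekeyed[newCasemap(entry.original)] = entry
	}
	m.m, m.cm = rekeyed, newCasemap
}

func main() {
	users := newCasemapMap(casemapRFC1459)
	users.Set("Emersion[m]", "some value")
	fmt.Println(users.Get("emersion{M}")) // same user under rfc1459

	users.UpdateCasemapping(casemapASCII)
	fmt.Println(users.Get("emersion{m}")) // a different name under ascii
	fmt.Println(users.Get("EMERSION[M]")) // still the same user
}
```

Because each entry keeps its original name, the map can be rebuilt when
the upstream advertises a different CASEMAPPING, and the original
spelling stays available for things like log directory names.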