~emersion/soju-dev


[PATCH v6] Implement casemapping

Details
Message ID
<20210126121607.14029-1-hubert@hirtz.pm>
Patch: +290 -88
TL;DR: adds support for casemapping; logs are now saved in
casemapped (canonical, lowercased) form, e.g. in the #channel directory
instead of #Channel.

== What is casemapping? ==

see <https://modern.ircdocs.horse/#casemapping-parameter>

== Casemapping and multi-upstream ==

Since upstreams do not necessarily use the same casemapping, and since
casemappings cannot coexist [0],

1. soju must update the database according to each upstream's
   casemapping, otherwise it will end up inconsistent,
2. soju must "normalize" entity names and expose only one casemapping
   that is a subset of all supported casemappings (here, ascii).

[0] On some upstreams, "emersion[m]" and "emersion{m}" refer to the same
user (upstreams that advertise rfc1459 for example), while on others
(upstreams that advertise ascii) they don't.
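
To make footnote [0] concrete, here is a minimal standalone sketch (not
part of the patch) comparing the two nicks under both casemappings.  The
function bodies are an assumption: they mirror the mappings used by
casemapASCII and casemapRFC1459 in the diff below, rewritten with
strings.Map for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// casemapASCII lowercases only the letters A-Z.
func casemapASCII(name string) string {
	return strings.Map(func(r rune) rune {
		if 'A' <= r && r <= 'Z' {
			return r + ('a' - 'A')
		}
		return r
	}, name)
}

// casemapRFC1459 additionally folds {}\~ onto []|^, using the same
// canonical direction as the patch.
func casemapRFC1459(name string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case 'A' <= r && r <= 'Z':
			return r + ('a' - 'A')
		case r == '{':
			return '['
		case r == '}':
			return ']'
		case r == '\\':
			return '|'
		case r == '~':
			return '^'
		}
		return r
	}, name)
}

func main() {
	a, b := "emersion[m]", "emersion{m}"
	// Under ascii the two nicks are distinct users.
	fmt.Println(casemapASCII(a) == casemapASCII(b)) // false
	// Under rfc1459 they refer to the same user.
	fmt.Println(casemapRFC1459(a) == casemapRFC1459(b)) // true
}
```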

== Storage changes ==

Once the upstream's casemapping is known (defaulting to rfc1459), entity
names used as map keys are converted to casemapped form, for upstreamConn,
upstreamChannel and network.

downstreamConn advertises "CASEMAPPING=ascii", and always casemaps map
keys with ascii.

Some functions require the caller to casemap their arguments (to avoid
needless calls to casemapping functions).

Log directories are casemapped.
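
As an illustration of the storage scheme, the following sketch indexes
members by casemapped nick while keeping the original spelling in the
value, so that lookups are case-insensitive but display names are
preserved.  The member and memberSet types are hypothetical, loosely
modelled on upstreamMember from the diff; the ascii casemapping is used
only to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// member keeps the nickname exactly as the server sent it.
// (Hypothetical type, modelled on upstreamMember in the patch.)
type member struct {
	Nick string
}

// casemapASCII lowercases only the letters A-Z.
func casemapASCII(name string) string {
	return strings.Map(func(r rune) rune {
		if 'A' <= r && r <= 'Z' {
			return r + ('a' - 'A')
		}
		return r
	}, name)
}

// memberSet is keyed by casemapped nickname.
type memberSet map[string]member

// add stores the member under its casemapped nick.
func (s memberSet) add(nick string) {
	s[casemapASCII(nick)] = member{Nick: nick}
}

// lookup casemaps the argument before indexing the map.
func (s memberSet) lookup(nick string) (member, bool) {
	m, ok := s[casemapASCII(nick)]
	return m, ok
}

func main() {
	members := memberSet{}
	members.add("Emersion")

	m, ok := members.lookup("EMERSION")
	fmt.Println(m.Nick, ok) // Emersion true
}
```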

== Message forwarding and casemapping ==

When relaying entity names from downstreams to upstreams, soju uses the
upstream casemapping, in order to not get in the way of the user.  This
does not cause any issue, as long as soju replies with the ascii
casemapping in mind (solves point 1.).

When relaying entity names from upstreams with non-ascii casemappings,
soju *partially* casemaps them: it only changes the case of characters
which are not ascii letters.  ASCII case is thus kept intact, while
special symbols like []{} are the same every time soju sends them to
downstreams (solves point 2.).
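
A rough sketch of partial casemapping, assuming the full casemapping only
substitutes single-byte ASCII characters (true for ascii, rfc1459 and
rfc1459-strict).  It follows the idea of partialCasemap in the diff but
works on bytes rather than runes; that byte-wise loop is a choice made
for this example, not what the patch does.

```go
package main

import (
	"fmt"
	"strings"
)

// casemapRFC1459 folds A-Z onto a-z and {}\~ onto []|^, using the
// same canonical direction as the patch.
func casemapRFC1459(name string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case 'A' <= r && r <= 'Z':
			return r + ('a' - 'A')
		case r == '{':
			return '['
		case r == '}':
			return ']'
		case r == '\\':
			return '|'
		case r == '~':
			return '^'
		}
		return r
	}, name)
}

// partialCasemap applies the full casemapping, then restores the
// original byte wherever the result is an ASCII letter.  Only the
// special symbols end up normalized.
func partialCasemap(full func(string) string, name string) string {
	out := []byte(full(name))
	for i := 0; i < len(out) && i < len(name); i++ {
		if 'a' <= out[i] && out[i] <= 'z' {
			out[i] = name[i]
		}
	}
	return string(out)
}

func main() {
	// Letter case is preserved, braces are folded onto brackets.
	fmt.Println(partialCasemap(casemapRFC1459, "Emersion{M}")) // Emersion[M]
}
```

With this, a downstream that compares names using the ascii casemapping
sees "Emersion[M]" no matter whether the upstream sent "Emersion{M}" or
"Emersion[M]".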

== Casemapping changes ==

Casemapping changes are not fully supported by this patch and will
result in loss of history.  This is a limitation of the protocol and
should be solved by the RENAME spec.
---
 bridge.go     |   4 +-
 downstream.go |  63 ++++++++++++-----
 irc.go        |  92 ++++++++++++++++++++++++-
 service.go    |   6 +-
 upstream.go   | 183 +++++++++++++++++++++++++++++++++++---------------
 user.go       |  30 +++++++--
 6 files changed, 290 insertions(+), 88 deletions(-)

diff --git a/bridge.go b/bridge.go
index f707e33..45584d8 100644
--- a/bridge.go
@@ -54,8 +54,8 @@ func sendNames(dc *downstreamConn, ch *upstreamChannel) {
	maxLength := maxMessageLength - len(emptyNameReply.String())

	var buf strings.Builder
	for nick, memberships := range ch.Members {
		s := memberships.Format(dc) + dc.marshalEntity(ch.conn.network, nick)
	for _, member := range ch.Members {
		s := member.Memberships.Format(dc) + dc.marshalEntity(ch.conn.network, member.Nick)

		n := buf.Len() + 1 + len(s)
		if buf.Len() != 0 && n > maxLength {
diff --git a/downstream.go b/downstream.go
index b69a475..d1afe43 100644
--- a/downstream.go
+++ b/downstream.go
@@ -89,6 +89,7 @@ type downstreamConn struct {
	registered  bool
	user        *user
	nick        string
	nickCM      string // casemapped nickname
	rawUsername string
	networkName string
	clientName  string
@@ -162,16 +163,20 @@ func (dc *downstreamConn) upstream() *upstreamConn {
	return dc.network.conn
}

// isOurNick checks whether the given nick is ours on the given network,
// according to its casemapping.
//
// Therefore nick should not be casemapped beforehand.
func isOurNick(net *network, nick string) bool {
	// TODO: this doesn't account for nick changes
	if net.conn != nil {
		return nick == net.conn.nick
		return net.casemap(nick) == net.conn.nickCM
	}
	// We're not currently connected to the upstream connection, so we don't
	// know whether this name is our nickname. Best-effort: use the network's
	// configured nickname and hope it was the one being used when we were
	// connected.
	return nick == net.Nick
	return net.casemap(nick) == net.casemap(net.Nick)
}

// marshalEntity converts an upstream entity name (ie. channel or nick) into a
@@ -183,6 +188,7 @@ func (dc *downstreamConn) marshalEntity(net *network, name string) string {
	if isOurNick(net, name) {
		return dc.nick
	}
	name = partialCasemap(net.casemap, name)
	if dc.network != nil {
		if dc.network != net {
			panic("soju: tried to marshal an entity for another network")
@@ -196,6 +202,7 @@ func (dc *downstreamConn) marshalUserPrefix(net *network, prefix *irc.Prefix) *i
	if isOurNick(net, prefix.Name) {
		return dc.prefix()
	}
	prefix.Name = partialCasemap(net.casemap, prefix.Name)
	if dc.network != nil {
		if dc.network != net {
			panic("soju: tried to marshal a user prefix for another network")
@@ -412,13 +419,15 @@ func (dc *downstreamConn) handleMessageUnregistered(msg *irc.Message) error {
				Params:  []string{dc.nick, nick, "contains illegal characters"},
			}}
		}
		if nick == serviceNick {
		nickCM := casemapASCII(nick)
		if nickCM == serviceNickCM {
			return ircError{&irc.Message{
				Command: irc.ERR_NICKNAMEINUSE,
				Params:  []string{dc.nick, nick, "Nickname reserved for bouncer service"},
			}}
		}
		dc.nick = nick
		dc.nickCM = nickCM
	case "USER":
		if err := parseMessageParams(msg, &dc.rawUsername, nil, nil, &dc.realname); err != nil {
			return err
@@ -733,6 +742,7 @@ func (dc *downstreamConn) updateNick() {
			Params:  []string{uc.nick},
		})
		dc.nick = uc.nick
		dc.nickCM = casemapASCII(dc.nick)
	}
}

@@ -878,6 +888,7 @@ func (dc *downstreamConn) welcome() error {

	isupport := []string{
		fmt.Sprintf("CHATHISTORY=%v", dc.srv.HistoryLimit),
		"CASEMAPPING=ascii",
	}

	if uc := dc.upstream(); uc != nil && uc.networkName != "" {
@@ -902,7 +913,7 @@ func (dc *downstreamConn) welcome() error {
	dc.SendMessage(&irc.Message{
		Prefix:  dc.srv.prefix(),
		Command: irc.RPL_MYINFO,
		Params:  []string{dc.nick, dc.srv.Hostname, "soju", "aiwroO", "OovaimnqpsrtklbeI"},
		Params:  []string{dc.nick, dc.srv.Hostname, "soju", "oOwari", "OovaimnqpsrtklbeI"},
	})
	// TODO: other RPL_ISUPPORT tokens
	dc.SendMessage(&irc.Message{
@@ -1104,6 +1115,7 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				nick = unmarshaledNick
			}
		}
		nickCM := casemapASCII(nick)

		if strings.ContainsAny(nick, illegalNickChars) {
			return ircError{&irc.Message{
@@ -1111,7 +1123,7 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				Params:  []string{dc.nick, rawNick, "contains illegal characters"},
			}}
		}
		if nick == serviceNick {
		if nickCM == serviceNickCM {
			return ircError{&irc.Message{
				Command: irc.ERR_NICKNAMEINUSE,
				Params:  []string{dc.nick, rawNick, "Nickname reserved for bouncer service"},
@@ -1147,6 +1159,7 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				Params:  []string{nick},
			})
			dc.nick = nick
			dc.nickCM = casemapASCII(nick)
		}
	case "JOIN":
		var namesStr string
@@ -1164,6 +1177,7 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
			if err != nil {
				return err
			}
			upstreamNameCM := uc.network.casemap(upstreamName)

			var key string
			if len(keys) > i {
@@ -1181,7 +1195,7 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {

			var ch *Channel
			var ok bool
			if ch, ok = uc.network.channels[upstreamName]; ok {
			if ch, ok = uc.network.channels[upstreamNameCM]; ok {
				// Don't clear the channel key if there's one set
				// TODO: add a way to unset the channel key
				if key != "" {
@@ -1313,7 +1327,9 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
			modeStr = msg.Params[1]
		}

		if name == dc.nick {
		nameCM := casemapASCII(name)

		if nameCM == dc.nickCM {
			if modeStr != "" {
				dc.forEachUpstream(func(uc *upstreamConn) {
					uc.SendMessageLabeled(dc.id, &irc.Message{
@@ -1351,7 +1367,8 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				Params:  params,
			})
		} else {
			ch, ok := uc.channels[upstreamName]
			upstreamNameCM := uc.network.casemap(upstreamName)
			ch, ok := uc.channels[upstreamNameCM]
			if !ok {
				return ircError{&irc.Message{
					Command: irc.ERR_NOSUCHCHANNEL,
@@ -1400,7 +1417,8 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				Params:  []string{upstreamChannel, topic},
			})
		} else { // getting topic
			ch, ok := uc.channels[upstreamChannel]
			upstreamChannelCM := uc.network.casemap(upstreamChannel)
			ch, ok := uc.channels[upstreamChannelCM]
			if !ok {
				return ircError{&irc.Message{
					Command: irc.ERR_NOSUCHCHANNEL,
@@ -1471,7 +1489,8 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				return err
			}

			ch, ok := uc.channels[upstreamChannel]
			upstreamChannelCM := uc.network.casemap(upstreamChannel)
			ch, ok := uc.channels[upstreamChannelCM]
			if ok {
				sendNames(dc, ch)
			} else {
@@ -1495,8 +1514,9 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {

		// TODO: support WHO masks
		entity := msg.Params[0]
		entityCM := casemapASCII(entity)

		if entity == dc.nick {
		if entityCM == dc.nickCM {
			// TODO: support AWAY (H/G) in self WHO reply
			dc.SendMessage(&irc.Message{
				Prefix:  dc.srv.prefix(),
@@ -1510,7 +1530,7 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
			})
			return nil
		}
		if entity == serviceNick {
		if entityCM == serviceNickCM {
			dc.SendMessage(&irc.Message{
				Prefix:  dc.srv.prefix(),
				Command: irc.RPL_WHOREPLY,
@@ -1561,7 +1581,9 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
			mask = mask[:i]
		}

		if mask == dc.nick {
		maskCM := casemapASCII(mask)

		if maskCM == dc.nickCM {
			dc.SendMessage(&irc.Message{
				Prefix:  dc.srv.prefix(),
				Command: irc.RPL_WHOISUSER,
@@ -1609,7 +1631,8 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
		tags := copyClientTags(msg.Tags)

		for _, name := range strings.Split(targetsStr, ",") {
			if name == serviceNick {
			nameCM := casemapASCII(name)
			if nameCM == serviceNickCM {
				if dc.caps["echo-message"] {
					echoTags := tags.Copy()
					echoTags["time"] = irc.TagValue(time.Now().UTC().Format(serverTimeLayout))
@@ -1629,7 +1652,9 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				return err
			}

			if upstreamName == "NickServ" {
			upstreamNameCM := uc.network.casemap(upstreamName)

			if upstreamNameCM == "nickserv" {
				dc.handleNickServPRIVMSG(uc, text)
			}

@@ -1654,7 +1679,7 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
				Command: "PRIVMSG",
				Params:  []string{upstreamName, text},
			}
			uc.produce(upstreamName, echoMsg, dc)
			uc.produce(upstreamNameCM, echoMsg, dc)

			uc.updateChannelAutoDetach(upstreamNameCM)
		}
@@ -1786,12 +1811,14 @@ func (dc *downstreamConn) handleMessageRegistered(msg *irc.Message) error {
			}}
		}

		entityCM := uc.network.casemap(entity)

		var history []*irc.Message
		switch subcommand {
		case "BEFORE":
			history, err = store.LoadBeforeTime(uc.network, entity, timestamp, limit)
			history, err = store.LoadBeforeTime(uc.network, entityCM, timestamp, limit)
		case "AFTER":
			history, err = store.LoadAfterTime(uc.network, entity, timestamp, limit)
			history, err = store.LoadAfterTime(uc.network, entityCM, timestamp, limit)
		default:
			// TODO: support LATEST, BETWEEN
			return ircError{&irc.Message{
diff --git a/irc.go b/irc.go
index 9a09da4..26cf699 100644
--- a/irc.go
+++ b/irc.go
@@ -118,12 +118,13 @@ outer:
					return nil, fmt.Errorf("malformed modestring %q: missing mode argument for %c%c", modeStr, plusMinus, mode)
				}
				member := arguments[nextArgument]
				if _, ok := ch.Members[member]; ok {
				memberCM := ch.conn.network.casemap(member)
				if _, ok := ch.Members[memberCM]; ok {
					if plusMinus == '+' {
						ch.Members[member].Add(ch.conn.availableMemberships, membership)
						ch.Members[memberCM].Memberships.Add(ch.conn.availableMemberships, membership)
					} else {
						// TODO: for upstreams without multi-prefix, query the user modes again
						ch.Members[member].Remove(membership)
						ch.Members[memberCM].Memberships.Remove(membership)
					}
				}
				needMarshaling[nextArgument] = struct{}{}
@@ -391,3 +392,88 @@ func parseCTCPMessage(msg *irc.Message) (cmd string, params string, ok bool) {

	return cmd, params, true
}

type casemapping func(string) string

// casemapASCII of name is the canonical representation of name according to the
// ascii casemapping.
func casemapASCII(name string) string {
	var sb strings.Builder
	sb.Grow(len(name))
	for _, r := range name {
		if 'A' <= r && r <= 'Z' {
			r += 'a' - 'A'
		}
		sb.WriteRune(r)
	}
	return sb.String()
}

// casemapRFC1459 of name is the canonical representation of name according to the
// rfc1459 casemapping.
func casemapRFC1459(name string) string {
	var sb strings.Builder
	sb.Grow(len(name))
	for _, r := range name {
		if 'A' <= r && r <= 'Z' {
			r += 'a' - 'A'
		} else if r == '{' {
			r = '['
		} else if r == '}' {
			r = ']'
		} else if r == '\\' {
			r = '|'
		} else if r == '~' {
			r = '^'
		}
		sb.WriteRune(r)
	}
	return sb.String()
}

// casemapRFC1459Strict of name is the canonical representation of name
// according to the rfc1459-strict casemapping.
func casemapRFC1459Strict(name string) string {
	var sb strings.Builder
	sb.Grow(len(name))
	for _, r := range name {
		if 'A' <= r && r <= 'Z' {
			r += 'a' - 'A'
		} else if r == '{' {
			r = '['
		} else if r == '}' {
			r = ']'
		} else if r == '\\' {
			r = '|'
		}
		sb.WriteRune(r)
	}
	return sb.String()
}

func parseCasemappingToken(tokenValue string) (casemap casemapping, ok bool) {
	switch tokenValue {
	case "ascii":
		casemap = casemapASCII
	case "rfc1459":
		casemap = casemapRFC1459
	case "rfc1459-strict":
		casemap = casemapRFC1459Strict
	default:
		return nil, false
	}
	return casemap, true
}

func partialCasemap(higher casemapping, name string) string {
	nameFullyCM := higher(name)
	var sb strings.Builder
	sb.Grow(len(name))
	for i, r := range nameFullyCM {
		if 'a' <= r && r <= 'z' {
			r = rune(name[i])
		}
		sb.WriteRune(r)
	}
	return sb.String()
}
diff --git a/service.go b/service.go
index 1c239e1..76d8a20 100644
--- a/service.go
+++ b/service.go
@@ -27,6 +27,7 @@ import (
)

const serviceNick = "BouncerServ"
const serviceNickCM = "bouncerserv"
const serviceRealname = "soju bouncer service"

var servicePrefix = &irc.Prefix{
@@ -767,8 +768,9 @@ func handleServiceChannelUpdate(dc *downstreamConn, params []string) error {
	if err != nil {
		return fmt.Errorf("unknown channel %q", name)
	}
	upstreamNameCM := uc.network.casemap(upstreamName)

	ch, ok := uc.network.channels[upstreamName]
	ch, ok := uc.network.channels[upstreamNameCM]
	if !ok {
		return fmt.Errorf("unknown channel %q", name)
	}
@@ -777,7 +779,7 @@ func handleServiceChannelUpdate(dc *downstreamConn, params []string) error {
		return err
	}

	uc.updateChannelAutoDetach(upstreamName)
	uc.updateChannelAutoDetach(upstreamNameCM)

	if err := dc.srv.db.StoreChannel(uc.network.ID, ch); err != nil {
		return fmt.Errorf("failed to update channel: %v", err)
diff --git a/upstream.go b/upstream.go
index f0fdf63..6822ade 100644
--- a/upstream.go
+++ b/upstream.go
@@ -40,6 +40,11 @@ func (err registrationError) Error() string {
	return fmt.Sprintf("registration error: %v", string(err))
}

type upstreamMember struct {
	Nick        string
	Memberships *memberships
}

type upstreamChannel struct {
	Name         string
	conn         *upstreamConn
@@ -49,7 +54,7 @@ type upstreamChannel struct {
	Status       channelStatus
	modes        channelModes
	creationTime string
	Members      map[string]*memberships
	Members      map[string]upstreamMember
	complete     bool
	detachTimer  *time.Timer
}
@@ -84,9 +89,11 @@ type upstreamConn struct {
	availableChannelModes map[byte]channelModeType
	availableChannelTypes string
	availableMemberships  []membership
	advertisedCasemap     bool

	registered    bool
	nick          string
	nickCM        string // casemapped nickname
	username      string
	realname      string
	modes         userModes
@@ -218,6 +225,7 @@ func (uc *upstreamConn) forEachDownstreamByID(id uint64, f func(*downstreamConn)
	})
}

// name must be casemapped.
func (uc *upstreamConn) getChannel(name string) (*upstreamChannel, error) {
	ch, ok := uc.channels[name]
	if !ok {
@@ -409,11 +417,14 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			}
		}

		if msg.Prefix.Name == serviceNick {
		prefixNameCM := uc.network.casemap(msg.Prefix.Name)
		entityCM := uc.network.casemap(entity)

		if prefixNameCM == serviceNickCM {
			uc.logger.Printf("skipping %v from soju's service: %v", msg.Command, msg)
			break
		}
		if entity == serviceNick {
		if entityCM == serviceNickCM {
			uc.logger.Printf("skipping %v to soju's service: %v", msg.Command, msg)
			break
		}
@@ -421,9 +432,9 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		if msg.Prefix.User == "" && msg.Prefix.Host == "" { // server message
			uc.produce("", msg, nil)
		} else { // regular user message
			target := entity
			if target == uc.nick {
				target = msg.Prefix.Name
			target := entityCM
			if target == uc.nickCM {
				target = prefixNameCM
			}

			if ch, ok := uc.network.channels[target]; ok {
@@ -431,7 +442,7 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
					uc.handleDetachedMessage(msg.Prefix.Name, text, ch)
				}

				highlight := msg.Prefix.Name != uc.nick && isHighlight(text, uc.nick)
				highlight := prefixNameCM != uc.nickCM && isHighlight(text, uc.nick)
				if ch.DetachOn == FilterMessage || ch.DetachOn == FilterDefault || (ch.DetachOn == FilterHighlight && highlight) {
					uc.updateChannelAutoDetach(target)
				}
@@ -603,6 +614,10 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			dc.updateSupportedCaps()
		})

		// Heuristic: we can join the channels before receiving the
		// eventual CASEMAPPING 005 token, because upstream *should*
		// respond to the JOINs after sending 005 (and after sending the
		// MOTD otherwise).
		if len(uc.network.channels) > 0 {
			var channels, keys []string
			for _, ch := range uc.network.channels {
@@ -637,6 +652,14 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			}
			if !negate {
				switch parameter {
				case "CASEMAPPING":
					casemap, ok := parseCasemappingToken(value)
					if !ok {
						casemap = casemapRFC1459
					}
					uc.network.updateCasemapping(casemap)
					uc.nickCM = uc.network.casemap(uc.nick)
					uc.advertisedCasemap = true
				case "CHANMODES":
					parts := strings.SplitN(value, ",", 5)
					if len(parts) < 4 {
@@ -679,6 +702,13 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
				// TODO: handle ISUPPORT negations
			}
		}
	case irc.RPL_ENDOFMOTD, irc.ERR_NOMOTD:
		if !uc.advertisedCasemap {
			// upstream did not send a CASEMAPPING token, thus
			// default to rfc1459.
			uc.network.updateCasemapping(casemapRFC1459)
			uc.nickCM = uc.network.casemap(uc.nick)
		}
	case "BATCH":
		var tag string
		if err := parseMessageParams(msg, &tag); err != nil {
@@ -723,18 +753,22 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}

		prefixNameCM := uc.network.casemap(msg.Prefix.Name)
		newNickCM := uc.network.casemap(newNick)

		me := false
		if msg.Prefix.Name == uc.nick {
		if prefixNameCM == uc.nickCM {
			uc.logger.Printf("changed nick from %q to %q", uc.nick, newNick)
			me = true
			uc.nick = newNick
			uc.nickCM = newNickCM
		}

		for _, ch := range uc.channels {
			if memberships, ok := ch.Members[msg.Prefix.Name]; ok {
				delete(ch.Members, msg.Prefix.Name)
				ch.Members[newNick] = memberships
				uc.appendLog(ch.Name, msg)
		for chName, ch := range uc.channels {
			if member, ok := ch.Members[prefixNameCM]; ok {
				delete(ch.Members, prefixNameCM)
				ch.Members[newNickCM] = member
				uc.appendLog(chName, msg)
			}
		}

@@ -757,13 +791,16 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}

		prefixNameCM := uc.network.casemap(msg.Prefix.Name)

		for _, ch := range strings.Split(channels, ",") {
			if msg.Prefix.Name == uc.nick {
			chCM := uc.network.casemap(ch)
			if prefixNameCM == uc.nickCM {
				uc.logger.Printf("joined channel %q", ch)
				uc.channels[ch] = &upstreamChannel{
				uc.channels[chCM] = &upstreamChannel{
					Name:    ch,
					conn:    uc,
					Members: make(map[string]*memberships),
					Members: make(map[string]upstreamMember),
				}
				uc.updateChannelAutoDetach(chCM)

@@ -772,16 +809,19 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
					Params:  []string{ch},
				})
			} else {
				ch, err := uc.getChannel(ch)
				ch, err := uc.getChannel(chCM)
				if err != nil {
					return err
				}
				ch.Members[msg.Prefix.Name] = &memberships{}
				ch.Members[prefixNameCM] = upstreamMember{
					Nick:        msg.Prefix.Name,
					Memberships: &memberships{},
				}
			}

			chMsg := msg.Copy()
			chMsg.Params[0] = ch
			uc.produce(ch, chMsg, nil)
			uc.produce(chCM, chMsg, nil)
		}
	case "PART":
		if msg.Prefix == nil {
@@ -793,24 +833,27 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}

		prefixNameCM := uc.network.casemap(msg.Prefix.Name)

		for _, ch := range strings.Split(channels, ",") {
			if msg.Prefix.Name == uc.nick {
			chCM := uc.network.casemap(ch)
			if prefixNameCM == uc.nickCM {
				uc.logger.Printf("parted channel %q", ch)
				if uch, ok := uc.channels[ch]; ok {
					delete(uc.channels, ch)
				if uch, ok := uc.channels[chCM]; ok {
					delete(uc.channels, chCM)
					uch.updateAutoDetach(0)
				}
			} else {
				ch, err := uc.getChannel(ch)
				ch, err := uc.getChannel(chCM)
				if err != nil {
					return err
				}
				delete(ch.Members, msg.Prefix.Name)
				delete(ch.Members, prefixNameCM)
			}

			chMsg := msg.Copy()
			chMsg.Params[0] = ch
			uc.produce(ch, chMsg, nil)
			uc.produce(chCM, chMsg, nil)
		}
	case "KICK":
		if msg.Prefix == nil {
@@ -822,36 +865,41 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}

		if user == uc.nick {
		channelCM := uc.network.casemap(channel)
		userCM := uc.network.casemap(user)

		if userCM == uc.nickCM {
			uc.logger.Printf("kicked from channel %q by %s", channel, msg.Prefix.Name)
			delete(uc.channels, channelCM)
		} else {
			ch, err := uc.getChannel(channel)
			ch, err := uc.getChannel(channelCM)
			if err != nil {
				return err
			}
			delete(ch.Members, user)
			delete(ch.Members, userCM)
		}

		uc.produce(channel, msg, nil)
		uc.produce(channelCM, msg, nil)
	case "QUIT":
		if msg.Prefix == nil {
			return fmt.Errorf("expected a prefix")
		}

		if msg.Prefix.Name == uc.nick {
		prefixNameCM := uc.network.casemap(msg.Prefix.Name)

		if prefixNameCM == uc.nickCM {
			uc.logger.Printf("quit")
		}

		for _, ch := range uc.channels {
			if _, ok := ch.Members[msg.Prefix.Name]; ok {
				delete(ch.Members, msg.Prefix.Name)
		for chName, ch := range uc.channels {
			if _, ok := ch.Members[prefixNameCM]; ok {
				delete(ch.Members, prefixNameCM)

				uc.appendLog(ch.Name, msg)
				uc.appendLog(chName, msg)
			}
		}

		if msg.Prefix.Name != uc.nick {
		if prefixNameCM != uc.nickCM {
			uc.forEachDownstream(func(dc *downstreamConn) {
				dc.SendMessage(dc.marshalMessage(msg, uc.network))
			})
@@ -861,7 +909,8 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		if err := parseMessageParams(msg, nil, &name, &topic); err != nil {
			return err
		}
		ch, err := uc.getChannel(name)
		nameCM := uc.network.casemap(name)
		ch, err := uc.getChannel(nameCM)
		if err != nil {
			return err
		}
@@ -879,7 +928,8 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		if err := parseMessageParams(msg, &name); err != nil {
			return err
		}
		ch, err := uc.getChannel(name)
		nameCM := uc.network.casemap(name)
		ch, err := uc.getChannel(nameCM)
		if err != nil {
			return err
		}
@@ -890,21 +940,23 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		} else {
			ch.Topic = ""
		}
		uc.produce(ch.Name, msg, nil)
		uc.produce(nameCM, msg, nil)
	case "MODE":
		var name, modeStr string
		if err := parseMessageParams(msg, &name, &modeStr); err != nil {
			return err
		}

		nameCM := uc.network.casemap(name)

		if !uc.isChannel(name) { // user mode change
			if name != uc.nick {
			if nameCM != uc.nickCM {
				return fmt.Errorf("received MODE message for unknown nick %q", name)
			}
			return uc.modes.Apply(modeStr)
			// TODO: notify downstreams about user mode change?
		} else { // channel mode change
			ch, err := uc.getChannel(name)
			ch, err := uc.getChannel(nameCM)
			if err != nil {
				return err
			}
@@ -914,12 +966,12 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
				return err
			}

			uc.appendLog(ch.Name, msg)
			uc.appendLog(nameCM, msg)

			if ch, ok := uc.network.channels[name]; !ok || !ch.Detached {
			if ch, ok := uc.network.channels[nameCM]; !ok || !ch.Detached {
				uc.forEachDownstream(func(dc *downstreamConn) {
					params := make([]string, len(msg.Params))
					params[0] = dc.marshalEntity(uc.network, name)
					params[0] = dc.marshalEntity(uc.network, ch.Name)
					params[1] = modeStr

					copy(params[2:], msg.Params[2:])
@@ -956,12 +1008,13 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		if err := parseMessageParams(msg, nil, &channel); err != nil {
			return err
		}
		channelCM := uc.network.casemap(channel)
		modeStr := ""
		if len(msg.Params) > 2 {
			modeStr = msg.Params[2]
		}

		ch, err := uc.getChannel(channel)
		ch, err := uc.getChannel(channelCM)
		if err != nil {
			return err
		}
@@ -972,11 +1025,11 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}
		if firstMode {
			if c, ok := uc.network.channels[channel]; !ok || !c.Detached {
			if c, ok := uc.network.channels[channelCM]; !ok || !c.Detached {
				modeStr, modeParams := ch.modes.Format()

				uc.forEachDownstream(func(dc *downstreamConn) {
					params := []string{dc.nick, dc.marshalEntity(uc.network, channel), modeStr}
					params := []string{dc.nick, dc.marshalEntity(uc.network, ch.Name), modeStr}
					params = append(params, modeParams...)

					dc.SendMessage(&irc.Message{
@@ -993,7 +1046,9 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}

		ch, err := uc.getChannel(channel)
		channelCM := uc.network.casemap(channel)

		ch, err := uc.getChannel(channelCM)
		if err != nil {
			return err
		}
@@ -1014,7 +1069,8 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		if err := parseMessageParams(msg, nil, &name, &who, &timeStr); err != nil {
			return err
		}
		ch, err := uc.getChannel(name)
		nameCM := uc.network.casemap(name)
		ch, err := uc.getChannel(nameCM)
		if err != nil {
			return err
		}
@@ -1069,11 +1125,13 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}

		ch, ok := uc.channels[name]
		nameCM := uc.network.casemap(name)

		ch, ok := uc.channels[nameCM]
		if !ok {
			// NAMES on a channel we have not joined, forward to downstream
			uc.forEachDownstreamByID(downstreamID, func(dc *downstreamConn) {
				// ch is nil in this branch (we have not joined the channel),
				// so marshal the name received from upstream.
				channel := dc.marshalEntity(uc.network, name)
				members := splitSpace(members)
				for i, member := range members {
					memberships, nick := uc.parseMembershipPrefix(member)
@@ -1098,7 +1156,11 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {

		for _, s := range splitSpace(members) {
			memberships, nick := uc.parseMembershipPrefix(s)
			ch.Members[nick] = memberships
			nickCM := uc.network.casemap(nick)
			ch.Members[nickCM] = upstreamMember{
				Nick:        nick,
				Memberships: memberships,
			}
		}
	case irc.RPL_ENDOFNAMES:
		var name string
@@ -1106,11 +1168,13 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
			return err
		}

		ch, ok := uc.channels[name]
		nameCM := uc.network.casemap(name)

		ch, ok := uc.channels[nameCM]
		if !ok {
			// NAMES on a channel we have not joined, forward to downstream
			uc.forEachDownstreamByID(downstreamID, func(dc *downstreamConn) {
				// ch is nil in this branch (we have not joined the channel),
				// so marshal the name received from upstream.
				channel := dc.marshalEntity(uc.network, name)

				dc.SendMessage(&irc.Message{
					Prefix:  dc.srv.prefix(),
@@ -1126,7 +1190,7 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		}
		ch.complete = true

		if c, ok := uc.network.channels[name]; !ok || !c.Detached {
		if c, ok := uc.network.channels[nameCM]; !ok || !c.Detached {
			uc.forEachDownstream(func(dc *downstreamConn) {
				forwardChannel(dc, ch)
			})
@@ -1403,7 +1467,7 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
		// Ignore
	case irc.RPL_LUSERCLIENT, irc.RPL_LUSEROP, irc.RPL_LUSERUNKNOWN, irc.RPL_LUSERCHANNELS, irc.RPL_LUSERME:
		// Ignore
	case irc.RPL_MOTDSTART, irc.RPL_MOTD, irc.RPL_ENDOFMOTD:
	case irc.RPL_MOTDSTART, irc.RPL_MOTD:
		// Ignore
	case irc.RPL_LISTSTART:
		// Ignore
@@ -1447,7 +1511,8 @@ func (uc *upstreamConn) handleMessage(msg *irc.Message) error {
}

func (uc *upstreamConn) handleDetachedMessage(sender string, text string, ch *Channel) {
	highlight := sender != uc.nick && isHighlight(text, uc.nick)
	senderCM := uc.network.casemap(sender)
	highlight := senderCM != uc.nickCM && isHighlight(text, uc.nick)
	if ch.RelayDetached == FilterMessage || ((ch.RelayDetached == FilterHighlight || ch.RelayDetached == FilterDefault) && highlight) {
		uc.forEachDownstream(func(dc *downstreamConn) {
			if highlight {
@@ -1569,6 +1634,7 @@ func splitSpace(s string) []string {

func (uc *upstreamConn) register() {
	uc.nick = uc.network.Nick
	uc.nickCM = uc.network.casemap(uc.network.Nick)
	uc.username = uc.network.Username
	if uc.username == "" {
		uc.username = uc.nick
@@ -1668,6 +1734,8 @@ func (uc *upstreamConn) SendMessageLabeled(downstreamID uint64, msg *irc.Message
//
// The internal message ID is returned. If the message isn't recorded in the
// log file, an empty string is returned.
//
// entity must be casemapped.
func (uc *upstreamConn) appendLog(entity string, msg *irc.Message) (msgID string) {
	if uc.user.msgStore == nil {
		return ""
@@ -1718,6 +1786,8 @@ func (uc *upstreamConn) appendLog(entity string, msg *irc.Message) (msgID string
//
// If origin is not nil and origin doesn't support echo-message, the message is
// forwarded to all connections except origin.
//
// target must be casemapped.
func (uc *upstreamConn) produce(target string, msg *irc.Message, origin *downstreamConn) {
	var msgID string
	if target != "" {
@@ -1759,6 +1829,7 @@ func (uc *upstreamConn) updateAway() {
	uc.away = away
}

// name must be casemapped.
func (uc *upstreamConn) updateChannelAutoDetach(name string) {
	if uch, ok := uc.channels[name]; ok {
		if ch, ok := uc.network.channels[name]; ok && !ch.Detached {
diff --git a/user.go b/user.go
index 2864910..a19225b 100644
--- a/user.go
+++ b/user.go
@@ -66,15 +66,18 @@ type network struct {

	conn           *upstreamConn
	channels       map[string]*Channel
	history        map[string]*networkHistory // indexed by entity
	history        map[string]*networkHistory // indexed by casemapped entity
	offlineClients map[string]struct{}        // indexed by client name
	lastError      error
	casemap        casemapping
}

func newNetwork(user *user, record *Network, channels []Channel) *network {
	m := make(map[string]*Channel, len(channels))
	for _, ch := range channels {
		ch := ch
		// Don't casemap channel names yet, we don't know the
		// casemapping the server is going to use.
		m[ch.Name] = &ch
	}

@@ -85,6 +88,7 @@ func newNetwork(user *user, record *Network, channels []Channel) *network {
		channels:       m,
		history:        make(map[string]*networkHistory),
		offlineClients: make(map[string]struct{}),
		casemap:        casemapRFC1459,
	}
}

@@ -188,8 +192,9 @@ func (net *network) detach(ch *Channel) {
	ch.Detached = true
	net.user.srv.Logger.Printf("network %q: detaching channel %q", net.GetName(), ch.Name)

	chNameCM := net.casemap(ch.Name)
	if net.conn != nil {
		if uch, ok := net.conn.channels[ch.Name]; ok {
		if uch, ok := net.conn.channels[chNameCM]; ok {
			uch.updateAutoDetach(0)
		}
	}
@@ -213,10 +218,11 @@ func (net *network) attach(ch *Channel) {
	net.user.srv.Logger.Printf("network %q: attaching channel %q", net.GetName(), ch.Name)

	var uch *upstreamChannel
	chNameCM := net.casemap(ch.Name)
	if net.conn != nil {
		uch = net.conn.channels[ch.Name]
		uch = net.conn.channels[chNameCM]

		net.conn.updateChannelAutoDetach(ch.Name)
		net.conn.updateChannelAutoDetach(chNameCM)
	}

	net.forEachDownstream(func(dc *downstreamConn) {
@@ -230,14 +236,15 @@ func (net *network) attach(ch *Channel) {
			forwardChannel(dc, uch)
		}

		if net.history[ch.Name] != nil {
		if net.history[chNameCM] != nil {
			dc.sendNetworkHistory(net)
		}
	})
}

func (net *network) deleteChannel(name string) error {
	ch, ok := net.channels[name]
	nameCM := net.casemap(name)
	ch, ok := net.channels[nameCM]
	if !ok {
		return fmt.Errorf("unknown channel %q", name)
	}
@@ -250,10 +257,19 @@ func (net *network) deleteChannel(name string) error {
	if err := net.user.srv.db.DeleteChannel(ch.ID); err != nil {
		return err
	}
	delete(net.channels, name)
	delete(net.channels, nameCM)
	return nil
}

func (net *network) updateCasemapping(newCasemap casemapping) {
	net.casemap = newCasemap
	newChannels := make(map[string]*Channel, len(net.channels))
	for _, c := range net.channels {
		newChannels[net.casemap(c.Name)] = c
	}
	net.channels = newChannels
}

type user struct {
	User
	srv *Server
-- 
2.30.0
Details
Message ID
<FTJc4Si1Ws5_lE8hFWjwoCtzuS5WcR34gqoPTP3ptz8wOJZJw2d42Uncn13lWByXCelt3Z5pOzTbb_M_daErxdn6UW2umEOaQZSVAh9n3aU=@emersion.fr>
In-Reply-To
<20210126121607.14029-1-hubert@hirtz.pm>
On Tuesday, January 26th, 2021 at 1:16 PM, Hubert Hirtz <hubert@hirtz.pm> wrote:

> TL;DR: supports for casemapping, now logs are saved in
> casemapped/canonical/tolower form
> (eg. in the #channel directory instead of #Channel... or something)
>
> == What is casemapping? ==
>
> see <https://modern.ircdocs.horse/#casemapping-parameter>
>
> == Casemapping and multi-upstream ==
>
> Since each upstream does not necessarily use the same casemapping, and
> since casemappings cannot coexist [0],
>
> 1. soju must also update the database accordingly to upstreams'
>    casemapping, otherwise it will end up inconsistent,
> 2. soju must "normalize" entity names and expose only one casemapping
>    that is a subset of all supported casemappings (here, ascii).
>
> [0] On some upstreams, "emersion[m]" and "emersion{m}" refer to the same
> user (upstreams that advertise rfc1459 for example), while on others
> (upstreams that advertise ascii) they don't.
>
> == Storage changes ==
>
> Once upstream's casemapping is known (default to rfc1459), entity names
> in map keys are made into casemapped form, for upstreamConn,
> upstreamChannel and network.
>
> downstreamConn advertises "CASEMAPPING=ascii", and always casemap map
> keys with ascii.
>
> Some functions require the caller to casemap their argument (to avoid
> needless calls to casemapping functions).

As discussed elsewhere, I think it would be nice to have something like
this:

    // casemapMap is a map that stores case-mapped values.
    type casemapMap struct {
        m  map[string]casemapEntry // indexed by canonical name
        cm casemapping
    }

    type casemapEntry struct {
        original string
        value    interface{}
    }

    // Get returns the value with the given name. The name is canonicalized
    // before accessing the map.
    func (cm *casemapMap) Get(name string) interface{}
    // UpdateCasemapping changes the case-mapping used by the map.
    func (cm *casemapMap) UpdateCasemapping(newCasemap casemapping)

> Log directories are casemapped.

Is this desirable? Can upstream servers change the original name while
still referring to the same nick/channel?

I guess we don't want to store downstream messages which use a different
case-mapping in a different log directory. So we would need to convert
downstream names to original upstream names if we want to do this.

> == Message forwarding and casemapping ==
>
> When relaying entity names from downstreams to upstreams, soju uses the
> upstream casemapping, in order to not get in the way of the user.  This

You mean soju uses the name as sent by downstream?

> does not brings any issue, as long as soju replies with the ascii
> casemapping in mind (solves point 1.).
>
> When relaying entity names from upstreams with non-ascii casemappings,
> soju *partially* casemap them: it only change the case of characters
> which are not ascii letters.  ASCII case is thus kept intact, while
> special symbols like []{} are the same every time soju sends them to
> downstreams (solves point 2.).

As discussed elsewhere, we should use the original upstream names.

> == Casemapping changes ==
>
> Casemapping changes are not fully supported by this patch and will
> result in loss of history.  This is a limitation of the protocol and
> should be solved by the RENAME spec.

[…]

> @@ -902,7 +913,7 @@ func (dc *downstreamConn) welcome() error {
>  	dc.SendMessage(&irc.Message{
>  		Prefix:  dc.srv.prefix(),
>  		Command: irc.RPL_MYINFO,
> -		Params:  []string{dc.nick, dc.srv.Hostname, "soju", "aiwroO", "OovaimnqpsrtklbeI"},
> +		Params:  []string{dc.nick, dc.srv.Hostname, "soju", "oOwari", "OovaimnqpsrtklbeI"},

Why is this necessary? Does the order matter?

>  	})
Details
Message ID
<20210218021136.466fd035@acer.home>
In-Reply-To
<FTJc4Si1Ws5_lE8hFWjwoCtzuS5WcR34gqoPTP3ptz8wOJZJw2d42Uncn13lWByXCelt3Z5pOzTbb_M_daErxdn6UW2umEOaQZSVAh9n3aU=@emersion.fr>
> > Log directories are casemapped.  
> 
> Is this desirable? Can upstream servers change the original name while
> still referring to the same nick/channel?
> 
> I guess we don't want to store downstream messages which use a different
> case-mapping in a different log directory. So we would need to convert
> downstream names to original upstream names if we want to do this.

Yes, this is desirable. Channels can change case depending on who joins
them first, and soju must always provide history even if the server
uses different cases on different re-connections.  We could use the
form we first encountered, but there's no point in that; it's simpler
to casemap them all.

> > == Message forwarding and casemapping ==
> >
> > When relaying entity names from downstreams to upstreams, soju uses the
> > upstream casemapping, in order to not get in the way of the user.  This  
> 
> You mean soju uses the name as sent by downstream?

This paragraph is about point 1 above: updating the database even when
downstream doesn't use the same casemapped form of a name, e.g.:

    D: JOIN #soju{m}/fn
    D: PART #soju[m]/fn

given "fn" is an upstream with the rfc1459 casemapping (which considers
{} and [] to be equivalent). "Using the upstream casemapping" here
means figuring out that downstream wants to part "#soju{m}" (even
though it typed "#soju[m]" in the second command), so the database
entry for this channel must be dropped.
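
A small sketch of the comparison described above. The function names are stand-ins rather than the ones in the patch, and the rfc1459 variant here only folds `[`, `]` and `\` to `{`, `}` and `|`; handling of `~` and `^` is deliberately left out.

```go
package main

import "fmt"

// casemapASCII lowercases ASCII letters only.
func casemapASCII(name string) string {
	b := []byte(name)
	for i, c := range b {
		if 'A' <= c && c <= 'Z' {
			b[i] = c + 'a' - 'A'
		}
	}
	return string(b)
}

// casemapRFC1459Sketch lowercases ASCII letters and additionally treats
// { } | as the lowercase forms of [ ] \ (simplified: ~ and ^ are left
// untouched in this sketch).
func casemapRFC1459Sketch(name string) string {
	b := []byte(casemapASCII(name))
	for i, c := range b {
		switch c {
		case '[':
			b[i] = '{'
		case ']':
			b[i] = '}'
		case '\\':
			b[i] = '|'
		}
	}
	return string(b)
}

func main() {
	joined, parted := "#soju{m}", "#soju[m]"
	// Under the rfc1459-style mapping both names share one key, so the
	// PART can find the entry created by the JOIN.
	fmt.Println(casemapRFC1459Sketch(joined) == casemapRFC1459Sketch(parted)) // prints true
	// Under ascii they remain two distinct channels.
	fmt.Println(casemapASCII(joined) == casemapASCII(parted)) // prints false
}
```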

> > does not brings any issue, as long as soju replies with the ascii
> > casemapping in mind (solves point 1.).
> >
> > When relaying entity names from upstreams with non-ascii casemappings,
> > soju *partially* casemap them: it only change the case of characters
> > which are not ascii letters.  ASCII case is thus kept intact, while
> > special symbols like []{} are the same every time soju sends them to
> > downstreams (solves point 2.).  
> 
> As discussed elsewhere, we should use the original upstream names.

This is easily doable with your casemapMap idea, and I have a nearly
ready revision of this patch that uses it.