Thanks for the thorough review! Even in beta, Slidge is awesome.
> Supposedly, contacts are bound to a unique session instance, and thus
> are never shared between different sessions. They are discriminated by
> the stanzas "from" attribute and dispatched to the appropriate session
> (see `BaseGateway.__register_handlers()`). I might be missing
> something here; have you witnessed some sort of leaking between
> different sessions?
My understanding is that, since each session can populate the XMPP
account's roster individually and there's no way of differentiating
between the "source" session or account for a legacy contact in XMPP,
it's unclear how a user might choose to send a message from one account
or the other.
That is to say, let's say I have two accounts on WhatsApp, one called
"Alex", and the other one called "Super-Secret Alter-Ego" (oh well, the
secret's out now). Let's say that, for some reason, I have the same
contact, Jane, on both my WhatsApp clients, but they don't know both
accounts are used by me.
Linking both of these accounts/devices into Slidge will work fine for
all other contacts, but presumably the selection of which session will
be used to send to Jane is random, and perhaps down to which session
gets to re-sync the roster last. In effect, sending a message from my
XMPP client to Jane's WhatsApp client might end up showing up as coming
from my "Alter-Ego" account when I thought I was sending as "Alex". I'm
guessing there's similar considerations for contact history, where
messages from the other session might appear in the same thread but as
coming from a different account.
That said, I haven't done any work to confirm these issues and any
potential fixes, but I'm entirely basing this on prior understanding of
how XMPP-to-legacy-protocol mapping works. Maybe resources on the legacy
user JID will work, but do clients allow you to choose which resource to
send from (or differentiate history based on the resource)?
> A dirty workaround could be to re-fetch avatars only on message
> exchange. Something like `Contact.avatar_fetched = False` on
> `__init__` and then wrap `Contact.send_text()` and
> `Session.send_text()` with a `if not c.avatar_fetched:
> c.fetch_avatar()`. But really, we need a way to cache contacts'
> avatars in core.
WhatsApp does actually send a `PushName` event when an incoming message
is sent and remote changes have been made to the contact information
that aren't reflected in our local DB -- we already use this to some
extent, and it does help, but of course a generic solution would work
best as a future improvement.
> I think the Pillow python lib (already a "core" dependency, because
> XMPP avatars need to be PNG) can be used to convert to JPEG when needed.
I'll take a look at pillow for image uploads at least, though I'm
actually sort of ambivalent about this since we'd be converting a
lossless format to a lossy one -- I find the default behaviour of
WhatsApp here to be somewhat disagreeable, that is. Less so for other
lossy formats though (iOS takes photos in HEIC format by default, which
not all devices support).
The biggest deal for me is audio uploads, which most Android phones
capture as `m4a`, which isn't really configurable, and which isn't
compatible with WhatsApp either; some people I know swear by them
(annoyingly enough), and I suspect that not having the convenient little
player in the chat thread will be a bummer. Perhaps there's some easy
win here but I suspect including an `ffmpeg` dependency will be the only
solution.
> It seems to work fine EXCEPT on initial launch, because the
> "generated" dir is empty and only modifications in .go files trigger a
> build. The workaround right now is to just touch a .go file and wait,
> but I think a cleaner fix is to have an initial build on startup if
> the generated dir is empty or inexistent. (in the watcher.py script
> maybe?).
That's a good point, I'll see about fixing the `watcher.py` script for
this (or, otherwise, fix the Dockerfile for local development).
Broader local build issues will likely be fixed by work I'll need to do
on Debian packaging, which I'll need anyways for my own purposes.
> I guess I need to add a Basession.shutdown() method, so that you don't
> have to repeat the content of BaseGateway.shutdown(). I also need that
> for the skype plugin, it's going to be DRYer this way. This can be
> done later.
I saw that you already did this, and it's exactly what I need!
Essentially, `Session.shutdown()` here would just disconnect the
WebSocket connection, and not log out, contrary to some other plugins
which can re-authenticate based on registration form data.
> I'm currently factoring a lot of what's in `LegacyContact` as mixins
> (so they can be reused for MUC participants...), and I really need to
> make these carbon methods homogeneous to their "normal" counterparts.
> Not sure yet about the right design to do this yet, but getting there.
That's, again, awesome -- these methods look rather similar from the
outside, and have a few similarities internally, so it makes sense to
merge. For the purposes of WhatsApp, carbons are essentially a boolean
value difference away from regular messages; the JID of the contact is
the same regardless. Not sure if that's the case for other plugins.
> Why not import at the beginning of the file? Any reason for this?
Ah, this is simple copy-pasta on rails, no real reason. I'll move to the
top.
> Other points:
All good points -- I'm still trying wrapping my mind around what
idiomatic Python code looks like, so these pointers help a lot. I think
I'll need to check my MyPy setup as well, as CI is reporting issues that
my editor is not.
I'm going to stop rebasing now that we have a proper branch on Slidge to
look at -- the last major rebase I did was splitting up the files, but
subsequent work will be done in its own commits.
Tests are a bit of a thorn, as I hate not knowing whether refactoring
one part breaks another. Unit tests wouldn't do well here, AFAIK, as a
lot actually depends on the unwritten contract between `whatsmeow` and
the WhatsApp Web API. It doesn't help that `whatsmeow` itself contains
almost zero tests, which probably comes down to the same issue.
I'll look at adding tests on the Python side as you said, though, and
maybe even look to add some mocking/integration scaffolding on the Go
side to at least test the Go-Python contract. I suspect anything more
than what you've raised (or anything I can steal from any of your other
plugins) will land down the line.
Thanks again for the review, and for your patience in guiding me here --
it's definitely been a lot of fun up 'till now! Ideally I'll fix within
this or next week, after which we can look to merge.
- Alex
On 04/12/2022 12:52, Alex wrote:
> That is to say, let's say I have two accounts on WhatsApp, one called
> "Alex", and the other one called "Super-Secret Alter-Ego"
Alex, what did you do‽ This is a public mailing list!
> Linking both of these accounts/devices into Slidge
Slidge will not let you do that. One slidge account=one bare JID, and
there's no way of "legacy multi accounting" with a single JID. I you
really want to do that, just create another JID? Gajim, Dino,
Conversations [...] all allow XMPP multi accounts in a unified view.
> Maybe resources on the legacy user JID will work, but do clients allow
> you to choose which resource to send from (or differentiate history
> based on the resource)?
I don't think so, but clients *could*. Another XMPP account/JID sounds
like a reasonable option if you really want to multi legacy account with
slidge. I guess we could do something based on the resource if we really
wanted it, but I am not sure it's worth the added complexity. Is that
something you need?
> remote changes [...] that aren't reflected in our local DB [...] a
> generic solution would work best
Yes. I think this should be part of a refactor that also changes the
user store implementation and introduces a better way to store
persistent data on disk. I think SQLite would be a reasonable choice?
I'm really not experienced with DBs at all, so I'm happy with any
suggestion as to what to use here.
> converting a lossless format to a lossy one
It is a dilemma indeed. Let the admins decide? It would even be better
to have parameters like this at the user level, configurable via adhoc
commands. Slidge core definitely needs some generic stuff for user
preferences.
> audio uploads [...] people I know swear by them (annoyingly enough)
My god. They must hate you, I don't see any other explanation.
> and I suspect that not having the convenient little player in the chat
> thread will be a bummer.
I agree. Ideally, I'd like people on the other end to not be able to
tell I'm not using the official whatsapp client. But you decide what you
want to do with the plugin you own ;)
> Perhaps there's some easy win here but I suspect including an `ffmpeg`
> dependency will be the only solution.
I think it's a reasonable (optional) dependency? To limit CPU usage, I
guess we could later make sure we only use a single thread at once for
transcoding (with a queue or something).
Don't worry too much about tests. I wish I could be more picky about it,
but it wouldn't be fair given that some plugins don't have a single
automated test ATM.
Alex, thanks again. A plugin using a go lib to the most popular IM
walled garden? A huge win for slidge!
-- Nicoco
Circling back here after fighting with Thunderbird for a few days:
> Slidge will not let you do that. One slidge account=one bare JID, and
> there's no way of "legacy multi accounting" with a single JID. I you
> really want to do that, just create another JID? Gajim, Dino,
> Conversations [...] all allow XMPP multi accounts in a unified view.
> [...]
> I don't think so, but clients *could*. Another XMPP account/JID sounds
> like a reasonable option if you really want to multi legacy account
> with slidge. I guess we could do something based on the resource if we
> really wanted it, but I am not sure it's worth the added complexity.
> Is that something you need?
That makes sense -- I think I misunderstood how multi-device linking
works for Signal, and thought you were able to link multiple devices
(that might each come with their own roster) into one XMPP JID. I agree
this isn't really a use-case that we should support.
To clarify though, the WhatsApp plugin for Slidge will allow for
multiple concurrent sessions on the same bridge for different JIDs --
clients can register and have their authentication handled individually.
It's the ability to register multiple WhatsApp sessions for a single JID
that was in question here.
> It is a dilemma indeed. Let the admins decide? It would even be better
> to have parameters like this at the user level, configurable via adhoc
> commands. Slidge core definitely needs some generic stuff for user
> preferences.
I think I'll leave this question open until some later date, as this
might be a high effort/low payoff kind of thing that I'm not sure will
help improve bridging with WhatsApp in a significant way. I'm sure
someone will be kind enough to open an issue if they feel differently! :)
Having a single, shared `ffmpeg` thread is a great idea, as this will
help with tuning things down the line. One annoyance with the WhatsApp
plugin and GoPy in particular is that converting Python `buffer` types
to Go `[]byte` types (array/slice of bytes) is *extremely* slow, as
there's no way around doing this other than appending to the Go slice
byte-by-byte, which means calling the FFI for every byte. This is the
reason I don't process outgoing attachments on the Slidge side, but pass
the URL down to Go and download and process there (which necessitates
setting a configuration value for ignoring SSL failures for local dev)
-- it took approximately 20 seconds to process a 1.5MB byte buffer for
sending through to the FFI.
I'll see about opening a GoPy issue that covers this, but it might be a
blocker for me doing any large file processing in Python for use in Go.
> Don't worry too much about tests. I wish I could be more picky about
> it, but it wouldn't be fair given that some plugins don't have a
> single automated test ATM.
I might still be able to get something in! I'd have to come up with a
`MockGateway` on the Go side, but it's mostly just busywork and not too
complex.
> Alex, thanks again. A plugin using a go lib to the most popular IM
> walled garden? A huge win for slidge!
I'm glad you announced Slidge when you did and I didn't end up putting
time in an (extraneous) alternative -- XMPP has really been starved for
choice in terms of bridges for a long time!
- Alex
On 04/12/2022 13:25, Nicolas Cedilnik wrote:
> On 04/12/2022 12:52, Alex wrote:
>> That is to say, let's say I have two accounts on WhatsApp, one called
>> "Alex", and the other one called "Super-Secret Alter-Ego"
> Alex, what did you do‽ This is a public mailing list!
>> Linking both of these accounts/devices into Slidge
> Slidge will not let you do that. One slidge account=one bare JID, and
> there's no way of "legacy multi accounting" with a single JID. I you
> really want to do that, just create another JID? Gajim, Dino,
> Conversations [...] all allow XMPP multi accounts in a unified view.
>> Maybe resources on the legacy user JID will work, but do clients
>> allow you to choose which resource to send from (or differentiate
>> history based on the resource)?
> I don't think so, but clients *could*. Another XMPP account/JID sounds
> like a reasonable option if you really want to multi legacy account
> with slidge. I guess we could do something based on the resource if we
> really wanted it, but I am not sure it's worth the added complexity.
> Is that something you need?
>> remote changes [...] that aren't reflected in our local DB [...] a
>> generic solution would work best
> Yes. I think this should be part of a refactor that also changes the
> user store implementation and introduces a better way to store
> persistent data on disk. I think SQLite would be a reasonable choice?
> I'm really not experienced with DBs at all, so I'm happy with any
> suggestion as to what to use here.
>> converting a lossless format to a lossy one
> It is a dilemma indeed. Let the admins decide? It would even be better
> to have parameters like this at the user level, configurable via adhoc
> commands. Slidge core definitely needs some generic stuff for user
> preferences.
>> audio uploads [...] people I know swear by them (annoyingly enough)
> My god. They must hate you, I don't see any other explanation.
>> and I suspect that not having the convenient little player in the
>> chat thread will be a bummer.
> I agree. Ideally, I'd like people on the other end to not be able to
> tell I'm not using the official whatsapp client. But you decide what
> you want to do with the plugin you own ;)
>> Perhaps there's some easy win here but I suspect including an
>> `ffmpeg` dependency will be the only solution.
> I think it's a reasonable (optional) dependency? To limit CPU usage, I
> guess we could later make sure we only use a single thread at once for
> transcoding (with a queue or something).
>
> Don't worry too much about tests. I wish I could be more picky about
> it, but it wouldn't be fair given that some plugins don't have a
> single automated test ATM.
>
> Alex, thanks again. A plugin using a go lib to the most popular IM
> walled garden? A huge win for slidge!
>
> -- Nicoco
>