~mariusor/activitypub-go

Gitea & go-ap

Loïc Dachary <loic@dachary.org>
Message ID
<a4bb4473-1379-453a-93aa-0c6aa8af49e4@dachary.org>
Bonjour,

This is a followup discussion of the topic started on mastodon at https://mastodon.social/@humanetech/108284656261295851

To be continued!

-- 
Loïc Dachary, Artisan Logiciel Libre
Loïc Dachary <loic@dachary.org>
Message ID
<5d55a78c-c8b2-af1b-67b6-43b2ffac1fe6@dachary.org>
In-Reply-To
<a4bb4473-1379-453a-93aa-0c6aa8af49e4@dachary.org> (view parent)
I see you've been using generics for go-ap even before 1.18 was released, back in November 2021.

https://github.com/go-ap/activitypub/blob/master/object.go#L136-L142

I'm yet to figure out how unsafe is used. Interesting mix of reflect/unsafe/generics: great source of inspiration.

On 11/05/2022 21:53, Loïc Dachary wrote:
> Bonjour,
> 
> This is a followup discussion of the topic started on mastodon at https://mastodon.social/@humanetech/108284656261295851
> 
> To be continued!
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
Loïc Dachary <loic@dachary.org>
Message ID
<86496df9-43f3-40af-f609-4c20c969f7d4@dachary.org>
In-Reply-To
<5d55a78c-c8b2-af1b-67b6-43b2ffac1fe6@dachary.org> (view parent)
So, back in 2019 you started the ToObject function which is the base of a dynamic hierarchy of types (Actor, Activity, etc.)

https://github.com/go-ap/activitypub/blob/master/object.go#L442-L506

which is further derived, for instance for Activity into IntransitiveActivity, Question

https://github.com/go-ap/activitypub/blob/master/activity.go#L736-L760

so you effectively hand-crafted a kind of object class hierarchy that bypasses Go's strong typing. This is all based on https://pkg.go.dev/unsafe, which looks scary at first glance. But I assume it is kind of OK, since everything is done within the go-ap package in a tightly controlled environment and is not exposed to the caller of the library.

Am I getting close? Or did I deeply misunderstand something?


On 11/05/2022 22:06, Loïc Dachary wrote:
> I see you've been using generics for go-ap even before 1.18 was released, back in November 2021.
> 
> https://github.com/go-ap/activitypub/blob/master/object.go#L136-L142
> 
> I'm yet to figure out how unsafe is used. Interesting mix of reflect/unsafe/generics: great source of inspiration.

-- 
Loïc Dachary, Artisan Logiciel Libre
Marius Orcsik
Message ID
<20220512063936.isysabbkbyecfpj4@slate>
In-Reply-To
<a4bb4473-1379-453a-93aa-0c6aa8af49e4@dachary.org> (view parent)
On 22-05-11 21:53:23, Loïc Dachary wrote:
> Bonjour,
>

Hello Loïc, thank you for your interest.

I will follow up with a more detailed email soon, to (hopefully) answer
some of your questions.

I will also take the opportunity to use that email as a basis for
detailing the Go-ActivityPub architecture in the wiki[1].

/Marius

[1] https://man.sr.ht/~mariusor/go-activitypub/
Marius Orcsik
Message ID
<20220512072502.mltep3wqjk2v5ste@slate>
In-Reply-To
<86496df9-43f3-40af-f609-4c20c969f7d4@dachary.org> (view parent)
Hi again Loïc,

If you don't mind, instead of answering your questions one by one I'll
try to offer a broad picture description of how I made the design
decisions that went into the go-ap/activitypub package:

The main problem with implementing the ActivityPub spec in Go stems from
the very dynamic nature of JSON-LD and the limited (in this respect)
features of the Go type system.

This incompatibility arises because an ActivityPub spec
compliant object property can have any of the following values:

* an IRI which can be dereferenced to an ActivityPub object
* a full ActivityPub object
* an array of ActivityPub objects
* an array of IRIs to ActivityPub objects.

I think the go-fed package solved this problem by using
//go:generate to create all the plumbing types, methods and functionality
for every object defined by ActivityPub. This of course has a pretty
serious impact on binary size and compilation times, which I
believe you have already noticed.

Instead of doing that (which, honestly, didn't occur to me when I started
to work on go-ap), and because the Go type system can't express that union
explicitly, I had to rely on a number of artifices to define all of
the ActivityPub types while still keeping a "clean" public API.

So I implemented these four "meta-types" independently and gave them
unifying behaviour through an interface, ObjectOrLink[0], which exposes
a restricted number of methods. The building blocks of the go-ap
type system are therefore the *IRI* (basically a glorified string holding an
internationalized URI), the *Object* (and all its compatible subtypes),
the *IRI* slice and the *Item* slice. (Item is an alias for the
ObjectOrLink interface.)

Basically if any type implements it[1], it can be used with the
go-activitypub packages.
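
A minimal sketch of that idea, assuming heavily simplified types: the
Item, IRI, Object, IRIs and ItemCollection definitions below are
illustrative stand-ins written for this example, not the definitions
from the go-ap/activitypub package, and the two-method interface is
hypothetical. It only shows how one interface lets a single property
hold any of the four shapes listed earlier.

```go
package main

import "fmt"

// Item is a hypothetical, reduced stand-in for the ObjectOrLink interface.
type Item interface {
	IsLink() bool
	IsCollection() bool
}

// IRI is a string holding a reference to an object.
type IRI string

func (IRI) IsLink() bool       { return true }
func (IRI) IsCollection() bool { return false }

// Object is a full object with only two illustrative properties.
type Object struct {
	ID   IRI
	Type string
}

func (*Object) IsLink() bool       { return false }
func (*Object) IsCollection() bool { return false }

// IRIs is a slice of references.
type IRIs []IRI

func (IRIs) IsLink() bool       { return false }
func (IRIs) IsCollection() bool { return true }

// ItemCollection is a slice of arbitrary items.
type ItemCollection []Item

func (ItemCollection) IsLink() bool       { return false }
func (ItemCollection) IsCollection() bool { return true }

// Note has one property that may hold any of the four shapes.
type Note struct {
	ID           IRI
	AttributedTo Item
}

// describe reports which of the four shapes an Item holds.
func describe(it Item) string {
	switch v := it.(type) {
	case IRI:
		return "IRI " + string(v)
	case *Object:
		return "object " + string(v.ID)
	case IRIs:
		return fmt.Sprintf("%d IRIs", len(v))
	case ItemCollection:
		return fmt.Sprintf("%d items", len(v))
	default:
		return "unknown"
	}
}

func main() {
	values := []Item{
		IRI("https://example.com/~alice"),
		&Object{ID: "https://example.com/~bob", Type: "Person"},
		IRIs{"https://example.com/~alice", "https://example.com/~bob"},
		ItemCollection{IRI("https://example.com/~alice"), &Object{ID: "https://example.com/~bob"}},
	}
	for _, v := range values {
		n := Note{ID: "https://example.com/notes/1", AttributedTo: v}
		fmt.Println(describe(n.AttributedTo))
	}
}
```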

Because I tried to follow the Go interface guidelines, I kept this
interface as minimal as possible, which led to a very limited
guaranteed API for the package.

To work around that, I had to implement convenience functions that
allow a developer to assert the interface instances to actually
meaningful ActivityPub structs. They are the functions starting with
"OnXXX" and "ToXXX" in the package, and they make it possible to treat
any struct that implements "ObjectOrLink" as an "XXX" ActivityPub object,
where "XXX" can be an "Activity", an "Actor", an "Object", a "Link", and so on.
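
A rough sketch of the shape such helpers can take, under stated
assumptions: toActivity and onActivity below are hypothetical names
written for this example, the types are simplified stand-ins, and the
conversion uses an ordinary checked type assertion rather than the
"unsafe" based conversion the package relies on. It only illustrates the
calling pattern of getting from a minimal interface to a concrete
struct.

```go
package main

import (
	"errors"
	"fmt"
)

// Item is a hypothetical minimal interface, standing in for ObjectOrLink.
type Item interface {
	GetType() string
}

// Object and Activity are simplified stand-ins for the package's structs.
type Object struct {
	Type string
	Name string
}

func (o *Object) GetType() string { return o.Type }

type Activity struct {
	Type  string
	Name  string
	Actor Item
}

func (a *Activity) GetType() string { return a.Type }

var errNotActivity = errors.New("item is not an Activity")

// toActivity converts an Item to *Activity, or fails if it holds something else.
func toActivity(it Item) (*Activity, error) {
	if a, ok := it.(*Activity); ok && a != nil {
		return a, nil
	}
	return nil, errNotActivity
}

// onActivity runs fn on the Item only if it can be treated as an Activity.
func onActivity(it Item, fn func(*Activity) error) error {
	a, err := toActivity(it)
	if err != nil {
		return err
	}
	return fn(a)
}

func main() {
	var it Item = &Activity{
		Type:  "Like",
		Name:  "a like",
		Actor: &Object{Type: "Person", Name: "alice"},
	}

	// The callback receives the concrete struct, so all fields are reachable.
	err := onActivity(it, func(a *Activity) error {
		fmt.Println(a.Type, "by", a.Actor.GetType())
		return nil
	})
	fmt.Println("activity error:", err)

	// A plain Object is rejected instead of being misread as an Activity.
	err = onActivity(&Object{Type: "Note"}, func(*Activity) error { return nil })
	fmt.Println("object error:", err)
}
```

The callback form keeps the error handling in one place: the caller's
function only runs when the conversion succeeded.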

This is incidentally the mechanism through which we allow extending the
default AP vocabulary by other packages. Other developers can create
their own type "YYY", implement the interface and add their "OnYYY"
functionality[2].

The first caveat about this type of logic is that it relies heavily
(at least currently) on the memory layout of each type being identical
for the common properties. The properties need to have the **same
order** and the **same types**. Basically, the existing "ToXXX" functions
work as a cast does in plain C: they take the pointer the interface
holds and convert it to the desired type (using the "unsafe" package),
relying on the fact that the common properties of the Objects are at the
same offsets from the pointer[3].

I hope this is helpful.

/Marius

[0] https://pkg.go.dev/github.com/go-ap/activitypub#ObjectOrLink

[1] The interface matches the ActivityStreams separation between Object
and Link compatible structs:
https://www.w3.org/TR/activitystreams-vocabulary/#object-types

[2] This also requires overriding the default typer function,
which is used to return the correct type based on the "YYY.Type"
property:
https://pkg.go.dev/github.com/go-ap/activitypub#GetItemByType

[3] This behaviour is risky and it could break without
warning if the Go dev team changes how structs are laid out in memory at a
later date. I hope I'll find a cleaner way to implement this in the
future, but for now it serves its purpose.
Loïc Dachary <loic@dachary.org>
Message ID
<906ee1e7-a119-3f9f-7918-0103626aed71@dachary.org>
In-Reply-To
<20220512072502.mltep3wqjk2v5ste@slate> (view parent)
This is enlightening and will save code reading time for other Gitea developers.

As a long-time C developer I'm not shocked by how unsafe imposes a requirement on the types describing the vocabulary: struct member alignment is not my favorite activity but it is occasionally useful. And imposing this same requirement when adding new types, which will be necessary for forgefed, also sounds reasonable.

But I've never used unsafe and I may be ignorant of the reasons why this might not be such a good idea. The first thing that comes to mind is that relying on the undocumented (is it undocumented really?) internal layout of the type could mean that in a future release of Go the assumptions you made do not hold true and the entire construct collapses? Or am I seeing monsters where there are none?

On 12/05/2022 08:25, Marius Orcsik wrote:
> [...]

-- 
Loïc Dachary, Artisan Logiciel Libre
Marius Orcsik
Message ID
<20220512085628.qpjkckwdtvzexwha@slate>
In-Reply-To
<906ee1e7-a119-3f9f-7918-0103626aed71@dachary.org> (view parent)
On 22-05-12 09:06:59, Loïc Dachary wrote:
> But I've never used unsafe and I may be ignorant of the reasons why this might not be such a good idea. The first thing that comes to mind is that relying on the undocumented (is it undocumented really?) internal layout of the type could mean that in a future release of Go the assumptions you made do not hold true and the entire construct collapses? Or am I seeing monsters where there are none?

The main reason is stated in the documentation for unsafe.Pointer[1]:

> The following patterns involving Pointer are valid. Code not using
> these patterns is likely to be invalid today or to become invalid in
> the future. Even the valid patterns below come with important caveats.
> Running "go vet" can help find uses of Pointer that do not conform to
> these patterns, but silence from "go vet" is not a guarantee that the
> code is valid.
> (1) Conversion of a *T1 to Pointer to *T2.
> Provided that T2 is no larger than T1 and that the two share an
> equivalent memory layout, this conversion allows reinterpreting data of
> one type as data of another type.

The types in the activitypub package are not all the same size: Activity
types have extra properties, Actor types have extra properties, and
Collection types also have extra properties. Link types have a
completely different layout altogether.

So it's important that, up to the minimal size (which for Object types
is the size of the Object struct, and for Links is the size of
the Link struct; these two are not compatible with each other),
they have the same memory layout, so that we can convert from the larger type
to the smaller one. Converting the other way (from a smaller Object to a
larger Activity, for example) should not be done, as the "bottom" of the
struct would point to invalid or unowned memory.

[1] https://pkg.go.dev/unsafe#Pointer