~ivilata/gwit-spec


URL retrieval/transmission

Details
Message ID
<b4d26038-f2cf-477e-8358-a5aa7805734a@zaclys.net>
DKIM signature
pass
Hi !

I needed to share a link to a post on my gwit site.

The spec defines how to compose/follow a link in the context of a gwit
site (through the well-known site), but not outside a gwit site (e.g. sharing
by email, HTTP, etc.).

I suggest a syntax similar to GET requests parameters, using the 
character ? as a separator.

The link MAY contain the address of a remote for the site, in the form
of a URL-encoded (percent-encoded) string, provided after the ? character.

This would give :

gwit://[<VERSION>@]<SITE><PATH>[#<FRAGMENT>][?<URL_ENCODED_REMOTE_ADDRESS>]

This would give, in my case :

gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi?remote=https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git
Details
Message ID
<ZweaKB7LWIrJvM3Q@sax>
In-Reply-To
<b4d26038-f2cf-477e-8358-a5aa7805734a@zaclys.net> (view parent)
Hi matograine, thanks for the proposal and sorry for the delay.  More below.

matograine (2024-10-01 20:37:51 +0200) wrote:

> […] I have needed to share the link of a post on my gwit site.
> 
> The spec defines how to compose/follow a link in the context of a gwit site
> (through the well-known site), but not outside a gwit site (e.g. sharing by
> email, HTTP, etc.).
> 

Actually the Well-Known URI is rather intended for discovering whether a site
that you're already accessing by other means (HTTP) is also available via
gwit.  The most similar thing currently supported for sharing the link as you
mentioned would be for the email containing the gwit link to attach or inline
a site introduction.

But yes, that's cumbersome and I therefore see the value of your proposal.  If
I get it correctly, the gwit client receiving the "extended" gwit URI should
be able to split the remote URI out of it and use it to set up an initial
retrieval of the site if it didn't know about it in advance, right?

> I suggest a syntax similar to GET requests parameters, using the character ?
> as a separator.
> 
> The link MAY contain the address of a remote for the site, in the form of a
> URL-encoded (percent-encoded) string, provided after the ? character.
> 
> This would give :
> 
> gwit://[<VERSION>@]<SITE><PATH>[#<FRAGMENT>][?<URL_ENCODED_REMOTE_ADDRESS>]
> 
> This would give, in my case :
> 
> gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi?remote=https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git

The proposal reminds me of how Magnet URIs include tracker or source info, or
how some PyPI URLs included an `#md5=…` fragment for client-side hash
verification.

We should be careful not to break RFC 3986 (URI Generic Syntax) here: query
strings come between path and fragment, otherwise the query is part of the
fragment (i.e. it comes after the `#`).  gwit URIs don't support query strings
and that's a whole can of worms not worth opening here in my opinion.
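The ordering rule can be checked with a generic RFC 3986 style parser.  What
follows is only a sketch using Python's standard `urllib.parse`, not anything
the gwit spec prescribes; the fragment name `section` is a made-up placeholder.

```python
from urllib.parse import urlsplit

BASE = "gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi"
REMOTE = "remote=https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git"

# "?" placed before "#": parsed as a query, followed by the fragment.
query_first = urlsplit(BASE + "?" + REMOTE + "#section")

# "?" placed after "#": everything after "#" belongs to the fragment,
# so a generic parser sees no query at all.
fragment_first = urlsplit(BASE + "#section?" + REMOTE)

print(repr(query_first.query), repr(query_first.fragment))
print(repr(fragment_first.query), repr(fragment_first.fragment))
```

In the second form the whole tail ends up in `fragment`, so a client would
have to split it by itself.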

But using the fragment may make sense, as its info doesn't alter the identity
of the resource being retrieved (RFC section 3.4), and it's supposed to
somehow be removed by the client from the request (not that there are requests
as such in gwit, though).  So an option would be to allow a query-like tail
that would be removed by the client, e.g. (note the `#`):

    gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi#?remote=https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git
    becomes
    gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi

Or:

    gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi#…?<existing stuff>&remote=https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git
    becomes
    gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi#…?<existing stuff>

One could argue that that may interfere with some use of fragments for certain
document types (though none comes to my mind).  Of course the alternative
brings us back to query strings:

    gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi?remote=https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git#…?<existing stuff>

Note how the existing fragment had to be moved to the end, which may not be
evident when creating the URI, besides breaking the RFC's statement that the
query applies to the resource and not the whole URI…  However, using the
fragment also abuses the RFC's stated purpose (section 3.5) of identifying
something within the resource.  But gwit already makes a "creative" use of the
user information part of the URI for specifying site versions anyway, so it
wouldn't be a first… 😁

So I lean towards the fragment approach, though it may deserve some more
thinking…

Cheers,

-- 
Ivan Vilata i Balaguer -- https://elvil.net/
Details
Message ID
<91bf13db-e9cf-4a4d-b255-3184e153826e@zaclys.net>
In-Reply-To
<ZweaKB7LWIrJvM3Q@sax> (view parent)
Sender timestamp
1737613867
Hi Ivan !

I am reposting our private discussion:

 > matograine@zaclys.net (2025-01-15) wrote :
 >> Using the fragment instead of a query string seems relevant to me.
 >>
 >> Moreover, since we don't want the protocol to be extended, we do not
 >> need the part `remote=` (this opens the door to query-string like
 >> fragment processing).
 >> Fragment could be split on `?`, the first part being an indication on
 >> which part of the page to show, the second part being the remote to
 >> use. This would give such URIs :
 >>
 >> => gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi#?https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git
 >>
 >> => gwit://16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi#<existing stuff>?https%3A%2F%2Fframagit.org%2Fmatograine%2Fgwitsite.git

On 22/01/2025 at 17:02, Ivan Vilata i Balaguer wrote:
 > The `remote=` was mostly there for readability, but you have a very
 > good point about non-extendability!
 >
 > Checking section 3.5 of RFC3986, fragments may contain path components
 > (`pchar` category), `/` and `?`.  This means two things:
 >
 > 1. There's no need to percent-encode many chars in the remote URL.
 > The ones above may be written as `https://framagit.org/matograine
 > /gwitsite.git`. Way more readable. 🙂
 >
 > 2. Sadly, a `?` may appear in the remote URL, making the splitting of 
 > the fragment more involved if we use `?` to delimit the remote URL.
 >
 > Instead, we may try to use a separator string that wouldn't be usually
 > found in URLs (nor fragments); we can count on the two extra
 > characters `/?` allowed in fragments, which will BTW be used with the 
 > usual restrictions in the remote URL.  So with `/` we may use `///` as
 > a separator (`//` is obviously too frequent); with `?` we may use `??`.
 >
 > Path components also allow `:` and `@`;
 > I've seen chains of `:` in some URLs, but never chains of `@`, so `@@`
 > may be another option (I like that it looks like a pair of eyes, or a
 > "double at" for pointing at a location).
 >
 > Though I like `@@` best, `??` may be the safest option, so your
 > example would become:
 >
 > => gwit://0x16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi#existing-stuff??https://framagit.org/matograine/gwitsite.git
 >
 > A client parsing that would get the fragment, cut it on the last `??`,
 > use the URL to its right as a remote, and leave the stuff to its left
 > as the fragment (removing it altogether if left empty).
 >
 > How does this look to you?

This solution seems fine !

I prefer ?? because I understand it as "Where do I find this ?"

So, when a gwit client meets such a URI:

- if it does not already know the site, it uses the provided remote

- if it already knows the site, it MAY use the provided remote but MAY
as well use already-known remotes. The client MAY have an interface so
the user can indicate which remote(s) to use.
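One possible reading of these two rules, as a small Python sketch.  The
function name `choose_remotes` and the flag `use_link_remote` (standing in for
the user's choice) are hypothetical, and a real client may well order or
filter remotes differently:

```python
def choose_remotes(known_remotes, link_remote, use_link_remote=False):
    """Return the list of remotes a client could try, in order.

    known_remotes:   remotes the client already has for the site (may be empty)
    link_remote:     remote extracted from the shared link, or None
    use_link_remote: hypothetical user preference for an already-known site
    """
    if not known_remotes:
        # Unknown site: the remote from the link is the only way in.
        return [link_remote] if link_remote else []
    remotes = list(known_remotes)
    # Known site: the link's remote is optional (MAY), added only on request.
    if use_link_remote and link_remote and link_remote not in remotes:
        remotes.append(link_remote)
    return remotes


link = "https://framagit.org/matograine/gwitsite.git"
print(choose_remotes([], link))
print(choose_remotes(["https://example.org/mirror.git"], link))
print(choose_remotes(["https://example.org/mirror.git"], link, use_link_remote=True))
```

The `example.org` mirror is a placeholder for an already-known remote.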

Sincerely,

Matograine
Details
Message ID
<Z5fZav8V7dOROj14@sax>
In-Reply-To
<91bf13db-e9cf-4a4d-b255-3184e153826e@zaclys.net> (view parent)
Sender timestamp
1738008442
matograine (2025-01-23 06:31:07 +0100) wrote:

> On 22/01/2025 at 17:02, Ivan Vilata i Balaguer wrote:
> >
> > Instead, we may try to use a separator string that wouldn't be usually
> > found in URLs (nor fragments); we can count on the two extra characters
> > `/?` allowed in fragments, which will BTW be used with the usual
> > restrictions in the remote URL.  So with `/` we may use `///` as a
> > separator (`//` is obviously too frequent); with `?` we may use `??`.
> >
> > Path components also allow `:` and `@`; I've seen chains of `:` in some
> > URLs, but never chains of `@`, so `@@` may be another option (I like that
> > it looks like a pair of eyes, or a "double at" for pointing at a
> > location).
> >
> > Though I like `@@` best, `??` may be the safest option, so your
> > example would become:
> >
> > => gwit://0x16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi#existing-stuff??https://framagit.org/matograine/gwitsite.git
> >
> > A client parsing that would get the fragment, cut it on the last `??`,
> > use the URL to its right as a remote, and leave the stuff to its left
> > as the fragment (removing it altogether if left empty).
> >
> > How does this look to you?
> 
> This solution seems fine !
> 
> I prefer ?? because I understand it as "Where do I find this ?"
> 
> So, when a gwit client meets such a URI:
> 
> - if it does not already know the site, it uses the provided remote
> 
> - if it already knows the site, it MAY use the provided remote but MAY as
> well use already-known remotes. The client MAY have an interface so the user
> can indicate which remote(s) to use.

Yes, I guess it makes sense that every client SHOULD split the fragment on
`??`, but they MAY implement the handling of the remote as they see fit.
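For reference, the splitting step discussed in this thread (cut the fragment
on the last `??`, keep the left part as the fragment, drop the fragment if it
ends up empty) could look roughly like this in Python.  This is only a sketch
of the idea, not spec text; the name `split_remote` is hypothetical, and
treating an empty right-hand side as "no remote" is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

SEPARATOR = "??"


def split_remote(uri):
    """Return (uri_without_remote, remote), with remote None if absent."""
    parts = urlsplit(uri)
    # rpartition cuts on the LAST separator, so a "?" inside the remote
    # URL itself does not confuse the split.
    fragment, sep, remote = parts.fragment.rpartition(SEPARATOR)
    if not sep:
        return uri, None
    # urlunsplit omits the "#" entirely when the remaining fragment is empty.
    cleaned = urlunsplit(parts._replace(fragment=fragment))
    return cleaned, remote or None  # assumption: empty remote means none


BASE = "gwit://0x16C8A566BB88303C2513CF6328996D46E0440E85/blog/2024/offlirsoch.gmi"
REMOTE = "https://framagit.org/matograine/gwitsite.git"

print(split_remote(BASE + "#existing-stuff??" + REMOTE))
print(split_remote(BASE + "#??" + REMOTE))
print(split_remote(BASE + "#existing-stuff"))
```

What the client then does with the returned remote is left open, as noted
above.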

I'll add an issue to the spec describing this feature request as a reminder.

Thanks for your suggestions!

-- 
Ivan Vilata i Balaguer -- https://elvil.net/