~adnano/go-gemini-devel

8 2

Potential tweaks for go-gemini

Kaleb Elwert <kaleb@coded.io>
Details
Message ID
<98b866dc-2de7-4075-a21a-fbcb0a2d99d5@www.fastmail.com>
DKIM signature
pass
Download raw message
I actually started work on a separate go-gemini package (at https://github.com/go-gemini/gemini so gopkg.in/gemini could be used - I'm happy to give this up if you want it) and ran across a number of things that might be worth considering here. My package isn't anywhere near as complete as this, so it would be nice to drop it in favor of this implementation.

* It would probably make more sense for Request to implement io.WriterTo rather than overloading the Write method.
* I found it easier in parsing to read until \n and check that the string ended in \r\n. It would let you avoid a separate call to Read for the next byte.
* Certain calls on a bufio.Reader won't increase the internal buffer (ReadSlice in particular). This can be used to limit read data to a certain number of bytes, making it possible to properly limit both Requests and Responses.
* It may make sense to call utf8.Valid to check any read lines. It's possible for a bad actor to send a malformed line.
* The ResponseWriter interface seems bigger than it needs to be. In particular, Conn, Close, Flush, Hijack, and reset seemed a bit outside the scope of that type. In HTTP, Hijack is mostly used for Websockets and HTTP/2, both streaming protocols which don't really line up with Gemini very well. I do see that Hijack is in place for the TimeoutHandler. This could be implemented without Hijack by wrapping the ResponseWriter which would be more viable if you moved back to the minimal ResponseWriter implementation. I'm happy to clarify if that isn't clear. The only things that I ended up with in my implementation were WriteStatus(status int, meta string) and io.Writer.
* TLS property from ResponseWriter would make more sense on the Request type. It's not really a property of the ResponseWriter.

If you're interested in them but don't have the time, I'd be happy to submit patches for most of those.

Cheers,
Kaleb Elwert
Details
Message ID
<6c791f81-7bbc-4c12-ad93-0fd885a5b40c@www.fastmail.com>
In-Reply-To
<98b866dc-2de7-4075-a21a-fbcb0a2d99d5@www.fastmail.com> (view parent)
DKIM signature
pass
Download raw message
Sorry, my client missed the autowrap - here's the same text with autowrap re-enabled and some extra spacing.

I actually started work on a separate go-gemini package (at
https://github.com/go-gemini/gemini so gopkg.in/gemini could be used - I'm happy
to give this up if you want it) and ran across a number of things that might be
worth considering here. My package isn't anywhere near as complete as this, so
it would be nice to drop it in favor of this implementation.

* It would probably make more sense for Request to implement io.WriterTo rather
  than overloading the Write method.

* I found it easier in parsing to read until \n and check that the string ended
  in \r\n. It would let you avoid a separate call to Read for the next byte.
  
* Certain calls on a bufio.Reader won't increase the internal buffer (ReadSlice
  in particular). This can be used to limit read data to a certain number of
  bytes, making it possible to properly limit both Requests and Responses.

* It may make sense to call utf8.Valid to check any read lines. It's possible
  for a bad actor to send a malformed line.
  
* The ResponseWriter interface seems bigger than it needs to be. In
  particular, Conn, Close, Flush, Hijack, and reset seemed a bit outside
  the scope of that type. In HTTP, Hijack is mostly used for Websockets
  and HTTP/2, both streaming protocols which don't really line up with
  Gemini very well. I do see that Hijack is in place for the
  TimeoutHandler. This could be implemented without Hijack by wrapping
  the ResponseWriter which would be more viable if you moved back to the
  minimal ResponseWriter implementation. I'm happy to clarify if that
  isn't clear. The only things that I ended up with in my implementation
  were WriteStatus(status int, meta string) and io.Writer.
  
* TLS property from ResponseWriter would make more sense on the Request type.
  It's not really a property of the ResponseWriter.

If you're interested in them but don't have the time, I'd be happy to submit
patches for most of those.

Cheers,
Kaleb Elwert

On Wed, Feb 24, 2021, at 12:41 AM, Kaleb Elwert wrote:
> I actually started work on a separate go-gemini package (at 
> https://github.com/go-gemini/gemini so gopkg.in/gemini could be used - 
> I'm happy to give this up if you want it) and ran across a number of 
> things that might be worth considering here. My package isn't anywhere 
> near as complete as this, so it would be nice to drop it in favor of 
> this implementation.
> 
> * It would probably make more sense for Request to implement 
> io.WriterTo rather than overloading the Write method.
> * I found it easier in parsing to read until \n and check that the 
> string ended in \r\n. It would let you avoid a separate call to Read 
> for the next byte.
> * Certain calls on a bufio.Reader won't increase the internal buffer 
> (ReadSlice in particular). This can be used to limit read data to a 
> certain number of bytes, making it possible to properly limit both 
> Requests and Responses.
> * It may make sense to call utf8.Valid to check any read lines. It's 
> possible for a bad actor to send a malformed line.
> * The ResponseWriter interface seems bigger than it needs to be. In 
> particular, Conn, Close, Flush, Hijack, and reset seemed a bit outside 
> the scope of that type. In HTTP, Hijack is mostly used for Websockets 
> and HTTP/2, both streaming protocols which don't really line up with 
> Gemini very well. I do see that Hijack is in place for the 
> TimeoutHandler. This could be implemented without Hijack by wrapping 
> the ResponseWriter which would be more viable if you moved back to the 
> minimal ResponseWriter implementation. I'm happy to clarify if that 
> isn't clear. The only things that I ended up with in my implementation 
> were WriteStatus(status int, meta string) and io.Writer.
> * TLS property from ResponseWriter would make more sense on the Request 
> type. It's not really a property of the ResponseWriter.
> 
> If you're interested in them but don't have the time, I'd be happy to 
> submit patches for most of those.
> 
> Cheers,
> Kaleb Elwert
>
Details
Message ID
<C9HRQCF1LDQ2.1AIJ03N6EI6EU@nitro>
In-Reply-To
<6c791f81-7bbc-4c12-ad93-0fd885a5b40c@www.fastmail.com> (view parent)
DKIM signature
pass
Download raw message
Thanks for the feedback!

On Wed Feb 24, 2021 at 3:45 AM EST, Kaleb Elwert wrote:
> * It would probably make more sense for Request to implement io.WriterTo rather
> than overloading the Write method.

I named it Write because the net/http Request has a Write method. I'm
willing to change this.

> * I found it easier in parsing to read until \n and check that the string ended
> in \r\n. It would let you avoid a separate call to Read for the next byte.

This would break the practice of using multiple \n before a final \r\n.
I'm fine with breaking this practice, as in my opinion Meta should only
be a single line. I haven't encountered any multi-line input prompts on
my time in Gemini space.

> * Certain calls on a bufio.Reader won't increase the internal buffer (ReadSlice
> in particular). This can be used to limit read data to a certain number of
> bytes, making it possible to properly limit both Requests and Responses.

Could you explain how exactly this can limit Requests and Responses?
Requests are already limited to 1026 bytes by the server, and Responses
are currently unlimited when streamed by the client. Clients can wrap
the response in an io.LimitedReader if they want to limit the response
size.

> * It may make sense to call utf8.Valid to check any read lines. It's
> possible for a bad actor to send a malformed line.

Would sending an invalid line cause an issue? I would rather avoid the
overhead of utf8.Valid.

> * The ResponseWriter interface seems bigger than it needs to be. In
> particular, Conn, Close, Flush, Hijack, and reset seemed a bit outside
> the scope of that type. In HTTP, Hijack is mostly used for Websockets
> and HTTP/2, both streaming protocols which don't really line up with
> Gemini very well. I do see that Hijack is in place for the
> TimeoutHandler. This could be implemented without Hijack by wrapping
> the ResponseWriter which would be more viable if you moved back to the
> minimal ResponseWriter implementation. I'm happy to clarify if that
> isn't clear. The only things that I ended up with in my implementation
> were WriteStatus(status int, meta string) and io.Writer.

You're right, the ResponseWriter interface has grown some methods.
Conn and Hijack are not exactly necessary. I would remove Hijack and
move Conn to Request. However, Flush and Close are somewhat necessary to
work around limitations of the ResponseWriter. Flush allows for the
implementation of streaming responses (see examples/stream.go) and Close
allows the Handler to signal that they have finished without returning.

> * TLS property from ResponseWriter would make more sense on the Request
> type. It's not really a property of the ResponseWriter.

You're right, it does make more sense on the Request type. I'll move it.
Details
Message ID
<2c0f8ea6-ac55-493d-b62e-94e2c7985098@www.fastmail.com>
In-Reply-To
<C9HRQCF1LDQ2.1AIJ03N6EI6EU@nitro> (view parent)
DKIM signature
pass
Download raw message
Thanks for the thought-out response!

On Wed, Feb 24, 2021, at 4:43 AM, Adnan Maolood wrote:

> I named it Write because the net/http Request has a Write method. I'm
> willing to change this.

Ah, I didn't realize that. The current design probably makes sense then.
 
> > * I found it easier in parsing to read until \n and check that the string ended
> > in \r\n. It would let you avoid a separate call to Read for the next byte.
> 
> This would break the practice of using multiple \n before a final \r\n.
> I'm fine with breaking this practice, as in my opinion Meta should only
> be a single line. I haven't encountered any multi-line input prompts on
> my time in Gemini space.

I wasn't aware that was valid. This is definitely a "thanks, I hate it" moment for me.
 
> > * Certain calls on a bufio.Reader won't increase the internal buffer (ReadSlice
> > in particular). This can be used to limit read data to a certain number of
> > bytes, making it possible to properly limit both Requests and Responses.
> 
> Could you explain how exactly this can limit Requests and Responses?
> Requests are already limited to 1026 bytes by the server, and Responses
> are currently unlimited when streamed by the client. Clients can wrap
> the response in an io.LimitedReader if they want to limit the response
> size.

Sure, I was talking about the Response headers. It's possible for a bad server
to send a malformed header (extremely long with no newline) and it would crash
the client.
 
> Would sending an invalid line cause an issue? I would rather avoid the
> overhead of utf8.Valid.

I don't particularly know, it's probably fine.
 
> You're right, the ResponseWriter interface has grown some methods.
> Conn and Hijack are not exactly necessary. I would remove Hijack and
> move Conn to Request. However, Flush and Close are somewhat necessary to
> work around limitations of the ResponseWriter. Flush allows for the
> implementation of streaming responses (see examples/stream.go) and Close
> allows the Handler to signal that they have finished without returning.

If the underlying writer doesn't use bufio, removing Flush and Close shouldn't
be an issue. The client would be responsible for using .Write directly and
flushing wouldn't be necessary.

Is there ever an instance where you'd want to close the underlying connection
without returning? I believe the net/http package makes it clear that the
connection is done when the handler returns. If you need to spin off a
long-running task, or something that continues after the connection closes, you
can always fire off a goroutine.
Details
Message ID
<C9HZFK67KAF6.1I1UGLF1GQ5WR@nitro>
In-Reply-To
<2c0f8ea6-ac55-493d-b62e-94e2c7985098@www.fastmail.com> (view parent)
DKIM signature
pass
Download raw message
On Wed Feb 24, 2021 at 12:54 PM EST, Kaleb Elwert wrote:
> > > * I found it easier in parsing to read until \n and check that the string ended
> > > in \r\n. It would let you avoid a separate call to Read for the next byte.
> > 
> > This would break the practice of using multiple \n before a final \r\n.
> > I'm fine with breaking this practice, as in my opinion Meta should only
> > be a single line. I haven't encountered any multi-line input prompts on
> > my time in Gemini space.
>
> I wasn't aware that was valid. This is definitely a "thanks, I hate it"
> moment for me.

I haven't seen this behavior in the wild, but I did bring it up on the
Gemini mailing list. Someone then commented that they thought it was a
common way to create multi-line input prompts.

Relevant discussion:
https://lists.orbitalfox.eu/archives/gemini/2020/003065.html
https://lists.orbitalfox.eu/archives/gemini/2020/003067.html

If it were up to me I would make all requests and response headers end
with '\n' only, and so this behavior would be invalid.

> > Could you explain how exactly this can limit Requests and Responses?
> > Requests are already limited to 1026 bytes by the server, and Responses
> > are currently unlimited when streamed by the client. Clients can wrap
> > the response in an io.LimitedReader if they want to limit the response
> > size.
>
> Sure, I was talking about the Response headers. It's possible for a bad server
> to send a malformed header (extremely long with no newline) and it would crash
> the client.

I think we can use an io.LimitedReader to read the response header than.

> > You're right, the ResponseWriter interface has grown some methods.
> > Conn and Hijack are not exactly necessary. I would remove Hijack and
> > move Conn to Request. However, Flush and Close are somewhat necessary to
> > work around limitations of the ResponseWriter. Flush allows for the
> > implementation of streaming responses (see examples/stream.go) and Close
> > allows the Handler to signal that they have finished without returning.
>
> If the underlying writer doesn't use bufio, removing Flush and Close shouldn't
> be an issue. The client would be responsible for using .Write directly and
> flushing wouldn't be necessary.

Note that the net/http ResponseWriter implements the optional interface
http.Flusher. The net/http ResponseWriters use a bufio.Writer under the
hood.

The reason I have not made it an optional interface is because the Go
developers themselves regret that decision. It is better for
ResponseWriter implementations to only have to implement one interface.
See https://github.com/golang/go/issues/16100#issuecomment-327235587
and https://github.com/golang/go/issues/16100#issuecomment-327230028
I have not yet decided if ResponseWriter should be a struct or an
interface, and whether or not it should have an unexported method.
Using a struct or an interface with an unexported method would allow us
to extend the ResponseWriter interface over time, but it would make
custom ResponseWriter implementations impossible for the struct and
somewhat more difficult for the interface (implementors must embed an
existing ResponseWriter).

> Is there ever an instance where you'd want to close the underlying connection
> without returning? I believe the net/http package makes it clear that
> the connection is done when the handler returns. If you need to spin off a
> long-running task, or something that continues after the connection closes, you
> can always fire off a goroutine.

It can be useful for implementing manual timeouts.
See https://github.com/golang/go/issues/16100
and https://github.com/golang/go/issues/16100#issuecomment-309229988
We also expose the underlying net.Conn in Request.Conn so that Handlers
can set their own read and write deadlines.

I'd rather not inherit these problems from net/http's design. I think it
makes sense to deviate from net/http here since we are not limited by
backwards compatibility.
Details
Message ID
<0d65d143-7bfc-4ca0-a162-7809b59f8e8f@www.fastmail.com>
In-Reply-To
<C9HZFK67KAF6.1I1UGLF1GQ5WR@nitro> (view parent)
DKIM signature
pass
Download raw message
On Wed, Feb 24, 2021, at 10:45 AM, Adnan Maolood wrote:
> I haven't seen this behavior in the wild, but I did bring it up on the
> Gemini mailing list. Someone then commented that they thought it was a
> common way to create multi-line input prompts.
>
> Relevant discussion:
> https://lists.orbitalfox.eu/archives/gemini/2020/003065.html
> https://lists.orbitalfox.eu/archives/gemini/2020/003067.html
> 
> If it were up to me I would make all requests and response headers end
> with '\n' only, and so this behavior would be invalid.

Yeah, strictly conforming to the spec means your implementation is probably
better. I was basing mine off of how I've handled IRC in the past which doesn't
allow \n anywhere in the lines (and some servers/clients don't even send the
full \r\n).

> I think we can use an io.LimitedReader to read the response header than.

That makes sense - I don't know why I was under the impression that an
io.LimitedReader used a bufio.Reader under the hood.

> The reason I have not made it an optional interface is because the Go
> developers themselves regret that decision. It is better for
> ResponseWriter implementations to only have to implement one interface.
> See https://github.com/golang/go/issues/16100#issuecomment-327235587
> and https://github.com/golang/go/issues/16100#issuecomment-327230028
> I have not yet decided if ResponseWriter should be a struct or an
> interface, and whether or not it should have an unexported method.
> Using a struct or an interface with an unexported method would allow us
> to extend the ResponseWriter interface over time, but it would make
> custom ResponseWriter implementations impossible for the struct and
> somewhat more difficult for the interface (implementors must embed an
> existing ResponseWriter).

I'm not a huge fan of making it a struct or requiring embedding - one of the
powerful features of making it into an interface is that you can change out
the implementation or wrap it in middleware. It's how chi implements their
logger middleware. Maybe that would work if it's only a marker method, but
the current version exposes an internal API (reset) via this interface which makes
it impossible to wrap properly.

Admittedly, it *may* be possible to make it concrete if all the functionality needed
by other people is exposed here (as an example I'd need BytesWritten,
WrittenStatus and WrittenMeta and maybe HasWritten to properly implement a
logger middleware). If it's concrete, I don't see as much of an issue with making
it fairly complex.
 
> > Is there ever an instance where you'd want to close the underlying connection
> > without returning? I believe the net/http package makes it clear that
> > the connection is done when the handler returns. If you need to spin off a
> > long-running task, or something that continues after the connection closes, you
> > can always fire off a goroutine.
> 
> It can be useful for implementing manual timeouts.
> See https://github.com/golang/go/issues/16100
> and https://github.com/golang/go/issues/16100#issuecomment-309229988
> We also expose the underlying net.Conn in Request.Conn so that Handlers
> can set their own read and write deadlines.

Though doesn't this make it much easier to mess up the common case? If you're
serving a response and the server does not close the connection when the handler
is done, you would have to make sure every Handler explicitly calls writer.Close() or
risk having a hanging connection to a misbehaving client.

> I'd rather not inherit these problems from net/http's design. I think it
> makes sense to deviate from net/http here since we are not limited by
> backwards compatibility.

Interesting, I've never used those other interfaces, but it seems like some of them
(Flusher in particular seems to only needed because in http the Writer is backed by a
bufio.Writer, Hijacker is mostly used for upgrading connections from HTTP/1 to
WebSockets and HTTP/2) are for functionalities which this package doesn't really
need because gemini is a much simpler protocol. Maybe it's still a good idea, and
I'll accept whatever you decide, but I don't personally see the value there when
a handler could already wrap the RequestWriter in a bufio.Writer if they really
need that functionality.

I definitely agree with keeping them all in one interface (chi has to do some extra
work to make sure they're wrapping all the relevant interfaces if they exist), I'm
just not sure I agree with the value those additional methods provide.

https://github.com/go-chi/chi/blob/32e49a13fe6ec25275edf723814166dc092fe69f/middleware/wrap_writer.go
Details
Message ID
<C9I1JEU0L528.1DUNJRVK7Z62R@nitro>
In-Reply-To
<0d65d143-7bfc-4ca0-a162-7809b59f8e8f@www.fastmail.com> (view parent)
DKIM signature
pass
Download raw message
On Wed Feb 24, 2021 at 3:14 PM EST, Kaleb Elwert wrote:
> I'm not a huge fan of making it a struct or requiring embedding - one of the
> powerful features of making it into an interface is that you can change out
> the implementation or wrap it in middleware. It's how chi implements their
> logger middleware. Maybe that would work if it's only a marker method, but
> the current version exposes an internal API (reset) via this interface which
> makes it impossible to wrap properly.

I have removed the unexported methods before releasing v0.1.15. So it is
now possible to implement ResponseWriter without embedding.

> Admittedly, it *may* be possible to make it concrete if all the functionality
> needed by other people is exposed here (as an example I'd need BytesWritten,
> WrittenStatus and WrittenMeta and maybe HasWritten to properly implement
> a logger middleware). If it's concrete, I don't see as much of an issue
> with making it fairly complex.

I may eventually go this route in a future version. It definitely would
need some thought and design, though.

The fasthttp package has a RequestCtx struct with *lots* of methods, and
passes a RequestCtx to Handlers instead of a ResponseWriter and Request. 
See https://godocs.io/github.com/valyala/fasthttp#RequestCtx
We probably don't need that many methods, but we could adopt a similar
approach.

> > > Is there ever an instance where you'd want to close the underlying connection
> > > without returning? I believe the net/http package makes it clear that
> > > the connection is done when the handler returns. If you need to spin off a
> > > long-running task, or something that continues after the connection closes, you
> > > can always fire off a goroutine.
> > 
> > It can be useful for implementing manual timeouts.
> > See https://github.com/golang/go/issues/16100
> > and https://github.com/golang/go/issues/16100#issuecomment-309229988
> > We also expose the underlying net.Conn in Request.Conn so that Handlers
> > can set their own read and write deadlines.
>
> Though doesn't this make it much easier to mess up the common case? If you're
> serving a response and the server does not close the connection when the handler
> is done, you would have to make sure every Handler explicitly calls writer.Close()
> or risk having a hanging connection to a misbehaving client.

The Server still closes the connection after returning, Close just allows the
Handler to close the connection before returning. Close is not really
necessary since you can set the read and write deadline on the
Request.Conn, so I am fine with removing it.

> Interesting, I've never used those other interfaces, but it seems like some of them
> (Flusher in particular seems to only needed because in http the Writer is backed by a
> bufio.Writer, Hijacker is mostly used for upgrading connections from HTTP/1 to
> WebSockets and HTTP/2) are for functionalities which this package doesn't really
> need because gemini is a much simpler protocol. Maybe it's still a good idea, and
> I'll accept whatever you decide, but I don't personally see the value there when
> a handler could already wrap the RequestWriter in a bufio.Writer if they really
> need that functionality.

Since you can set the read and write deadline on the Conn, I am fine
with removing Close in a future version. However, buffered I/O is more
efficient than using a raw io.Writer, so I am reluctant to remove Flush.
I also don't want to make Flush part of an optional interface.
ResponseWriters that don't support flushing can just return nil.

> I definitely agree with keeping them all in one interface (chi has to do some extra
> work to make sure they're wrapping all the relevant interfaces if they exist), I'm
> just not sure I agree with the value those additional methods provide.
>
> https://github.com/go-chi/chi/blob/32e49a13fe6ec25275edf723814166dc092fe69f/middleware/wrap_writer.go

Yes, this is why we want to avoid optional interfaces.
Details
Message ID
<28d63eb6-5c7d-44da-9454-479b503458a6@www.fastmail.com>
In-Reply-To
<C9I1JEU0L528.1DUNJRVK7Z62R@nitro> (view parent)
DKIM signature
pass
Download raw message
On Wed, Feb 24, 2021, at 12:24 PM, Adnan Maolood wrote:
> I have removed the unexported methods before releasing v0.1.15. So it is
> now possible to implement ResponseWriter without embedding.

I missed that. Awesome!

> The Server still closes the connection after returning, Close just allows the
> Handler to close the connection before returning. Close is not really
> necessary since you can set the read and write deadline on the
> Request.Conn, so I am fine with removing it.

Ah, that makes way more sense.
 
> Since you can set the read and write deadline on the Conn, I am fine
> with removing Close in a future version. However, buffered I/O is more
> efficient than using a raw io.Writer, so I am reluctant to remove Flush.
> I also don't want to make Flush part of an optional interface.
> ResponseWriters that don't support flushing can just return nil.

My main reason for removing them was to keep the interface simple, but if there's
a reason to use those methods, it makes sense to keep them.

Thanks for taking the time to respond! Looking forward to building more stuff with this.
Details
Message ID
<C9KJSZEABVEW.8Z4K2SD6L0BM@nitro>
In-Reply-To
<C9HZFK67KAF6.1I1UGLF1GQ5WR@nitro> (view parent)
DKIM signature
pass
Download raw message
On Wed Feb 24, 2021 at 1:45 PM EST, Adnan Maolood wrote:
> I think we can use an io.LimitedReader to read the response header than.

Limiting the read of the Response header turned out to be more
complicated than I anticipated. We can't simply use an io.LimitedReader,
nor can we limit the buffer size and reset the buffer size later on
without potentially losing some read data.

For now it is probably best to use a context with a timeout and pass
that context to Client.Do. It's not usually in the server's interest to
send very long response headers to the client, so this shouldn't be much
of a problem.
Reply to thread Export thread (mbox)