~migadu/alps-devel

This thread contains a patchset. You're looking at the original emails, but you may wish to use the patch review UI. Review patch
5 3

[PATCH] base: select first text part as the best one

Details
Message ID
<20231223021009.631639-1-m@rako.space>
DKIM signature
missing
Download raw message
Patch: +5 -1
As soon as we have found a "best" part to display, stop recursing into
the whole bodystructure
---
 plugins/base/imap.go | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/plugins/base/imap.go b/plugins/base/imap.go
index e9636e3..2e85287 100644
--- a/plugins/base/imap.go
+++ b/plugins/base/imap.go
@@ -211,6 +211,10 @@ func (msg *IMAPMessage) TextPart() *IMAPPartNode {
	var best *IMAPPartNode
	isTextPlain := false
	msg.BodyStructure.Walk(func(path []int, part imap.BodyStructure) bool {
		if best != nil {
			return false
		}

		singlePart, ok := part.(*imap.BodyStructureSinglePart)
		if !ok {
			return true
@@ -232,7 +236,7 @@ func (msg *IMAPMessage) TextPart() *IMAPPartNode {
				best = newIMAPPartNode(msg, path, singlePart)
			}
		}
		return true
		return best == nil
	})

	return best
-- 
2.43.0
Details
Message ID
<VUYGRWkCPjqwbt3uKNE_yKHM-wD7k-6y3MczjQGSZhJj9-fj91gHxOG_eOYf_zva37XLEY-CV9gIqCOCacUzJgPi7gMY_0vNkjiixE9ixeU=@emersion.fr>
In-Reply-To
<20231223021009.631639-1-m@rako.space> (view parent)
DKIM signature
missing
Download raw message
Hm, but with this we'd stop iterating at the first text/html part even
if there is a text/plain part after…

We probably want to add isTextPlain to the check as well.
Details
Message ID
<CXVSVXXOY6MZ.1N0YPV0384E9X@lahaute>
In-Reply-To
<VUYGRWkCPjqwbt3uKNE_yKHM-wD7k-6y3MczjQGSZhJj9-fj91gHxOG_eOYf_zva37XLEY-CV9gIqCOCacUzJgPi7gMY_0vNkjiixE9ixeU=@emersion.fr> (view parent)
DKIM signature
missing
Download raw message
> Hm, but with this we'd stop iterating at the first text/html part even
> if there is a text/plain part after…
> 
> We probably want to add isTextPlain to the check as well.

We need to decide what the heuristic is: is it "first text/plain part, and if none take text/html" ?

I receive some mails from a mailing list that roughly have this structure:

1. alternative
  1.1 text/plain    (content)
  1.1 text/html     (same content but in html)
2. text/plain      (footer of the mailing list)

with the previous code the last part was chosen. I fear that even if the structure is like this:

1. text/html
2. text/plain

Favoring text/plain will display the last part, but because it's not an alternative it might be a completely different content.

Looking back, maybe the correct question is different: "what do we want to display ?". In other softwares multiple parts are displayed one after the other. Is that not what we want to show in the end ?
Details
Message ID
<nYYpMQWEt7rgk5yKR3b_7u8FloYvG7p5JgN8ztcSYfoekXsm0vISY_xpmXmPfQLvdB6XsbGVuy7RunAw3clckw5CLX3n7CU7HnoeLPq4364=@emersion.fr>
In-Reply-To
<CXVSVXXOY6MZ.1N0YPV0384E9X@lahaute> (view parent)
DKIM signature
missing
Download raw message
On Saturday, December 23rd, 2023 at 16:24, m@rako.space <m@rako.space> wrote:

> > Hm, but with this we'd stop iterating at the first text/html part even
> 
> > if there is a text/plain part after…
> > 
> > We probably want to add isTextPlain to the check as well.
> 
> We need to decide what the heuristic is: is it "first text/plain part, and if none take text/html" ?

Yeah, I think so. Maybe there's an opportunity to simplify some code here,
by introducing a function which matches exactly a text/* MIME type without any
fallback logic like:

    func (msg *IMAPMessage) inlineTextPart(subtype string) *IMAPPartNode

And then using that to implement TextPart and HTMLPart, e.g.

    if part := msg.inlineTextPart("plain"); part != nil {
        return part
    }
    return msg.inlineTextPart("html")

> I receive some mails from a mailing list that roughly have this structure:
> 
> 1. alternative
> 1.1 text/plain (content)
> 1.1 text/html (same content but in html)
> 2. text/plain (footer of the mailing list)
> 
> with the previous code the last part was chosen. I fear that even if the structure is like this:
> 
> 1. text/html
> 2. text/plain
> 
> Favoring text/plain will display the last part, but because it's not an alternative it might be a completely different content.
> 
> Looking back, maybe the correct question is different: "what do we want to display ?". In other softwares multiple parts are displayed one after the other. Is that not what we want to show in the end ?

It really depends what the MIME structure looks like. Is the root message MIME
type "multipart/mixed"? If so, theorically we should display each part after the
other, but that sounds complicated/annoying to do (especially since some parts
may not be text, e.g. images and other MIME types).

But for some other root MIME types like multipart/alternative, we should pick a
single part.
Details
Message ID
<25XVARMTRGTV6.35MWZV2X7PRQY@rako.space>
In-Reply-To
<nYYpMQWEt7rgk5yKR3b_7u8FloYvG7p5JgN8ztcSYfoekXsm0vISY_xpmXmPfQLvdB6XsbGVuy7RunAw3clckw5CLX3n7CU7HnoeLPq4364=@emersion.fr> (view parent)
DKIM signature
pass
Download raw message
Simon Ser <contact@emersion.fr> wrote:
> Yeah, I think so. Maybe there's an opportunity to simplify some code here,
> by introducing a function which matches exactly a text/* MIME type without any
> fallback logic like:
> 
>     func (msg *IMAPMessage) inlineTextPart(subtype string) *IMAPPartNode
> 
> And then using that to implement TextPart and HTMLPart, e.g.
> 
>     if part := msg.inlineTextPart("plain"); part != nil {
>         return part
>     }
>     return msg.inlineTextPart("html")

That is actually way cleaner, but is maybe a bit too specific

> It really depends what the MIME structure looks like. Is the root message MIME
> type "multipart/mixed"? If so, theorically we should display each part after the
> other, but that sounds complicated/annoying to do (especially since some parts
> may not be text, e.g. images and other MIME types).
> 
> But for some other root MIME types like multipart/alternative, we should pick a
> single part.

The more I think about it and actually use alps, the more I think this is what we need to do. We could take some inspiration from mblaze (https://github.com/leahneukirchen/mblaze) that tries to render all parts (respecting alternative), with user-defined renderers if needed, and prefixes parts with a small description (the sequential index, the mime type, the size, the command that was used to render)

Maybe something like:

	type BodyStructure interface {
		// Reuse existing

		// Render decides the best rendering strategy for the given structure
		// and writes string readable by the user to the writer
		// If it is text, it should probably write it
		// If it is a pdf, it should probably say so with the name
		Render(w io.Writer)
	}

	func (msg *IMAPMessage) Render(w io.Writer) {
		msg.renderHeader(w)
		fmt.Fprintf(w, "\n")
		msg.BodyStructure.Render(w)
	}

	func (bs BodyStructureSinglePart) Render(w io.Writer) {
		_, part, _ := getMessagePart(blablabla)
		if bs.Text != nil {
			part.WriteTo(w)
			return
		}

		if bs.Filename() != nil {
			fmt.Fprintf(w, "A %s file called %s available at ", bs.MediaType(), bs.FileName(), part.URL())
			return
		}

		// ...
		
	}

	func (bs BodyStructureMultiPart) Render(w io.Writer) {
		if bs.Subtype == "mixed" {
			bs.Walk(func (path []int, part BodyStructure) bool {
				part.Render(w)
				fmt.Fprintf(w, "\n")
				return false
			})
			return
		}

		if bs.Subtype == "alternative" {
			var part BodyStructure
			if p := bs.inlineTextPart("plain"); p != nil {
				part = p
			} else if p := bs.inlineTextPart("html"); p != nil {
				part = p
			} else {
				// take the first one and leave
				bs.Walk(func (path []int, p BodyStructure) bool {
					part = p
					return false
				})
			}
			part.Render(w) 
		}
	}

Problem: this puts all the logic of rendering into the imap layer, where it should more probably be in the routes phase. That's visible in the way an attachment should be displayed: we probably want to show a link, but we don't want to hardcode that. 

-- 
Matthieu Rakotojaona
Details
Message ID
<ivzidOtayX-wQgrYvXoZhQHg45tSS-ScR10liJZuKE3y-UAeZS7A_-CPi9DYqR43I4sumNnwHlwmRnIuhztuA7QTZHHVNYycl2PL7sQ8tRw=@emersion.fr>
In-Reply-To
<25XVARMTRGTV6.35MWZV2X7PRQY@rako.space> (view parent)
DKIM signature
pass
Download raw message
I don't think this can work with the way we render HTML vs plaintext.
I'd rather avoid multipart/mixed rendering for now.
Reply to thread Export thread (mbox)