~rjarry/aerc-devel

This thread contains a patchset. You're looking at the original emails, but you may wish to use the patch review UI. Review patch
9 3

[PATCH v3] maildir: fix data race in maildir worker

Details
Message ID
<20220306090634.53120-1-w@104d.net>
DKIM signature
missing
Download raw message
Patch: +82 -25
Fix a data race due dirInfo pointer being read in the main goroutine
by NewMessageStore and written in the anonymous goroutine launched in
Worker.getDirectoryInfo.

To address the issue raised in https://todo.sr.ht/~rjarry/aerc/16, we
use readdir(3) once, parse and cache its results, this replaces
go-maildir library Dir.Flags based  stat(3) and filepath.Glob
causing the issue when N (emails) is large.

Signed-off-by: wagner riffel <w@104d.net>
---

All previous versions of this patch were broken because it's missing a
seek to reset readdir position to the beginning, I found this nicer
workaround, which is also faster than the previous one, but always
falls back to go-maildir Flags if it can't parse a flag, (in my tests,
it could parse all emails in generated and my personal maildir)
currently the main loop is running 400,000 emails in 165~175ms.

 worker/maildir/worker.go | 107 ++++++++++++++++++++++++++++++---------
 1 file changed, 82 insertions(+), 25 deletions(-)

diff --git a/worker/maildir/worker.go b/worker/maildir/worker.go
index 6ff66b2..df37291 100644
--- a/worker/maildir/worker.go
+++ b/worker/maildir/worker.go
@@ -9,6 +9,8 @@ import (
	"net/url"
	"os"
	"path/filepath"
	"sort"
	"strings"

	"github.com/emersion/go-maildir"
	"github.com/fsnotify/fsnotify"
@@ -120,6 +122,36 @@ func (w *Worker) err(msg types.WorkerMessage, err error) {
	}, nil)
}

func splitMaildirFile(name string) (uniq string, flags []maildir.Flag, err error) {
	i := strings.LastIndexByte(name, ':')
	if i < 0 {
		return "", nil, &maildir.MailfileError{Name: name}
	}
	info := name[i+1:]
	uniq = name[:i]
	if len(info) < 2 {
		return "", nil, &maildir.FlagError{Info: info, Experimental: false}
	}
	if info[1] != ',' || info[0] != '2' {
		return "", nil, &maildir.FlagError{Info: info, Experimental: false}
	}
	if info[0] == '1' {
		return "", nil, &maildir.FlagError{Info: info, Experimental: true}
	}
	flags = []maildir.Flag(info[2:])
	sort.Slice(flags, func(i, j int) bool { return info[i] < info[j] })
	return uniq, flags, nil
}

func dirFiles(name string) ([]string, error) {
	dir, err := os.Open(filepath.Join(name, "cur"))
	if err != nil {
		return nil, err
	}
	defer dir.Close()
	return dir.Readdirnames(-1)
}

func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
	dirInfo := &models.DirectoryInfo{
		Name:     name,
@@ -136,6 +168,21 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
	}

	dir := w.c.Dir(name)
	var keyFlags map[string][]maildir.Flag
	files, err := dirFiles(string(dir))
	if err == nil {
		keyFlags = make(map[string][]maildir.Flag, len(files))
		for _, v := range files {
			key, flags, err := splitMaildirFile(v)
			if err != nil {
				w.worker.Logger.Printf("%q: error parsing flags (%q): %v", v, key, err)
				continue
			}
			keyFlags[key] = flags
		}
	} else {
		w.worker.Logger.Printf("disabled flags cache: %q: %v", dir, err)
	}

	uids, err := w.c.UIDs(dir)
	if err != nil {
@@ -144,38 +191,48 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
	}

	dirInfo.Exists = len(uids)

	go func() {
		info := dirInfo
		for _, uid := range uids {
			message, err := w.c.Message(dir, uid)
			if err != nil {
				w.worker.Logger.Printf("could not get message: %v", err)
				continue
	for _, uid := range uids {
		message, err := w.c.Message(dir, uid)
		if err != nil {
			w.worker.Logger.Printf("could not get message: %v", err)
			continue
		}
		var flags []maildir.Flag
		if keyFlags != nil {
			ok := false
			flags, ok = keyFlags[message.key]
			if !ok {
				w.worker.Logger.Printf("message (key=%q uid=%d) not found in map cache", message.key, message.uid)
				flags, err = message.Flags()
				if err != nil {
					w.worker.Logger.Printf("could not get flags: %v", err)
					continue
				}
			}
			flags, err := message.Flags()
		} else {
			flags, err = message.Flags()
			if err != nil {
				w.worker.Logger.Printf("could not get flags: %v", err)
				continue
			}
			seen := false
			for _, flag := range flags {
				if flag == maildir.FlagSeen {
					seen = true
				}
			}
			if !seen {
				info.Unseen++
			}
			if w.c.IsRecent(uid) {
				info.Recent++
		}
		seen := false
		for _, flag := range flags {
			if flag == maildir.FlagSeen {
				seen = true
				break
			}
		}
		info.Unseen += info.Recent
		info.Exists += info.Recent
		info.AccurateCounts = true
	}()

		if !seen {
			dirInfo.Unseen++
		}
		if w.c.IsRecent(uid) {
			dirInfo.Recent++
		}
	}
	dirInfo.Unseen += dirInfo.Recent
	dirInfo.Exists += dirInfo.Recent
	dirInfo.AccurateCounts = true
	return dirInfo
}

--
2.34.1
Details
Message ID
<CICQBP4CC4D5.1E5GQKH5PT0MW@bisio>
In-Reply-To
<20220306090634.53120-1-w@104d.net> (view parent)
DKIM signature
missing
Download raw message
The error flooding has disappeared, thanks!

However, I'm attaching two logs produced after opening my "sent" folder (~20K
messages), one with the Maildir interface and other with notmuch. As you can
see, there is a lag of ~20s. In both cases, the lag occurs after the call to
types.FetchDirectoryContents, which stalls silently and then produce errors
related to the mime structure of some of my mails.

Is this an expected behaviour? I'm not surprised that some of my mails have a
wrong structure, but other softwares do not complain of this and gets to open
my folder much faster (especially notmuch).

On Sun Mar 6, 2022 at 10:06 AM CET, wagner riffel wrote:
> Fix a data race due dirInfo pointer being read in the main goroutine
> by NewMessageStore and written in the anonymous goroutine launched in
> Worker.getDirectoryInfo.
>
> To address the issue raised in https://todo.sr.ht/~rjarry/aerc/16, we
> use readdir(3) once, parse and cache its results, this replaces
> go-maildir library Dir.Flags based  stat(3) and filepath.Glob
> causing the issue when N (emails) is large.
>
> Signed-off-by: wagner riffel <w@104d.net>
> ---
>
> All previous versions of this patch were broken because it's missing a
> seek to reset readdir position to the beginning, I found this nicer
> workaround, which is also faster than the previous one, but always
> falls back to go-maildir Flags if it can't parse a flag, (in my tests,
> it could parse all emails in generated and my personal maildir)
> currently the main loop is running 400,000 emails in 165~175ms.
>
>  worker/maildir/worker.go | 107 ++++++++++++++++++++++++++++++---------
>  1 file changed, 82 insertions(+), 25 deletions(-)
>
> diff --git a/worker/maildir/worker.go b/worker/maildir/worker.go
> index 6ff66b2..df37291 100644
> --- a/worker/maildir/worker.go
> +++ b/worker/maildir/worker.go
> @@ -9,6 +9,8 @@ import (
>  	"net/url"
>  	"os"
>  	"path/filepath"
> +	"sort"
> +	"strings"
>
>  	"github.com/emersion/go-maildir"
>  	"github.com/fsnotify/fsnotify"
> @@ -120,6 +122,36 @@ func (w *Worker) err(msg types.WorkerMessage, err error) {
>  	}, nil)
>  }
>
> +func splitMaildirFile(name string) (uniq string, flags []maildir.Flag, err error) {
> +	i := strings.LastIndexByte(name, ':')
> +	if i < 0 {
> +		return "", nil, &maildir.MailfileError{Name: name}
> +	}
> +	info := name[i+1:]
> +	uniq = name[:i]
> +	if len(info) < 2 {
> +		return "", nil, &maildir.FlagError{Info: info, Experimental: false}
> +	}
> +	if info[1] != ',' || info[0] != '2' {
> +		return "", nil, &maildir.FlagError{Info: info, Experimental: false}
> +	}
> +	if info[0] == '1' {
> +		return "", nil, &maildir.FlagError{Info: info, Experimental: true}
> +	}
> +	flags = []maildir.Flag(info[2:])
> +	sort.Slice(flags, func(i, j int) bool { return info[i] < info[j] })
> +	return uniq, flags, nil
> +}
> +
> +func dirFiles(name string) ([]string, error) {
> +	dir, err := os.Open(filepath.Join(name, "cur"))
> +	if err != nil {
> +		return nil, err
> +	}
> +	defer dir.Close()
> +	return dir.Readdirnames(-1)
> +}
> +
>  func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
>  	dirInfo := &models.DirectoryInfo{
>  		Name:     name,
> @@ -136,6 +168,21 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
>  	}
>
>  	dir := w.c.Dir(name)
> +	var keyFlags map[string][]maildir.Flag
> +	files, err := dirFiles(string(dir))
> +	if err == nil {
> +		keyFlags = make(map[string][]maildir.Flag, len(files))
> +		for _, v := range files {
> +			key, flags, err := splitMaildirFile(v)
> +			if err != nil {
> +				w.worker.Logger.Printf("%q: error parsing flags (%q): %v", v, key, err)
> +				continue
> +			}
> +			keyFlags[key] = flags
> +		}
> +	} else {
> +		w.worker.Logger.Printf("disabled flags cache: %q: %v", dir, err)
> +	}
>
>  	uids, err := w.c.UIDs(dir)
>  	if err != nil {
> @@ -144,38 +191,48 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
>  	}
>
>  	dirInfo.Exists = len(uids)
> -
> -	go func() {
> -		info := dirInfo
> -		for _, uid := range uids {
> -			message, err := w.c.Message(dir, uid)
> -			if err != nil {
> -				w.worker.Logger.Printf("could not get message: %v", err)
> -				continue
> +	for _, uid := range uids {
> +		message, err := w.c.Message(dir, uid)
> +		if err != nil {
> +			w.worker.Logger.Printf("could not get message: %v", err)
> +			continue
> +		}
> +		var flags []maildir.Flag
> +		if keyFlags != nil {
> +			ok := false
> +			flags, ok = keyFlags[message.key]
> +			if !ok {
> +				w.worker.Logger.Printf("message (key=%q uid=%d) not found in map cache", message.key, message.uid)
> +				flags, err = message.Flags()
> +				if err != nil {
> +					w.worker.Logger.Printf("could not get flags: %v", err)
> +					continue
> +				}
>  			}
> -			flags, err := message.Flags()
> +		} else {
> +			flags, err = message.Flags()
>  			if err != nil {
>  				w.worker.Logger.Printf("could not get flags: %v", err)
>  				continue
>  			}
> -			seen := false
> -			for _, flag := range flags {
> -				if flag == maildir.FlagSeen {
> -					seen = true
> -				}
> -			}
> -			if !seen {
> -				info.Unseen++
> -			}
> -			if w.c.IsRecent(uid) {
> -				info.Recent++
> +		}
> +		seen := false
> +		for _, flag := range flags {
> +			if flag == maildir.FlagSeen {
> +				seen = true
> +				break
>  			}
>  		}
> -		info.Unseen += info.Recent
> -		info.Exists += info.Recent
> -		info.AccurateCounts = true
> -	}()
> -
> +		if !seen {
> +			dirInfo.Unseen++
> +		}
> +		if w.c.IsRecent(uid) {
> +			dirInfo.Recent++
> +		}
> +	}
> +	dirInfo.Unseen += dirInfo.Recent
> +	dirInfo.Exists += dirInfo.Recent
> +	dirInfo.AccurateCounts = true
>  	return dirInfo
>  }
>
> --
> 2.34.1
Details
Message ID
<CICQEX8K24UR.VPDOXM635XEO@bisio>
In-Reply-To
<CICQBP4CC4D5.1E5GQKH5PT0MW@bisio> (view parent)
DKIM signature
missing
Download raw message
Here are the forgotten attachments. Sorry about it.


On Sun Mar 6, 2022 at 11:58 AM CET, inwit wrote:
> The error flooding has disappeared, thanks!
>
> However, I'm attaching two logs produced after opening my "sent" folder (~20K
> messages), one with the Maildir interface and other with notmuch. As you can
> see, there is a lag of ~20s. In both cases, the lag occurs after the call to
> types.FetchDirectoryContents, which stalls silently and then produce errors
> related to the mime structure of some of my mails.
>
> Is this an expected behaviour? I'm not surprised that some of my mails have a
> wrong structure, but other softwares do not complain of this and gets to open
> my folder much faster (especially notmuch).
>
> On Sun Mar 6, 2022 at 10:06 AM CET, wagner riffel wrote:
> > Fix a data race due dirInfo pointer being read in the main goroutine
> > by NewMessageStore and written in the anonymous goroutine launched in
> > Worker.getDirectoryInfo.
> >
> > To address the issue raised in https://todo.sr.ht/~rjarry/aerc/16, we
> > use readdir(3) once, parse and cache its results, this replaces
> > go-maildir library Dir.Flags based  stat(3) and filepath.Glob
> > causing the issue when N (emails) is large.
> >
> > Signed-off-by: wagner riffel <w@104d.net>
> > ---
> >
> > All previous versions of this patch were broken because it's missing a
> > seek to reset readdir position to the beginning, I found this nicer
> > workaround, which is also faster than the previous one, but always
> > falls back to go-maildir Flags if it can't parse a flag, (in my tests,
> > it could parse all emails in generated and my personal maildir)
> > currently the main loop is running 400,000 emails in 165~175ms.
> >
> >  worker/maildir/worker.go | 107 ++++++++++++++++++++++++++++++---------
> >  1 file changed, 82 insertions(+), 25 deletions(-)
> >
> > diff --git a/worker/maildir/worker.go b/worker/maildir/worker.go
> > index 6ff66b2..df37291 100644
> > --- a/worker/maildir/worker.go
> > +++ b/worker/maildir/worker.go
> > @@ -9,6 +9,8 @@ import (
> >  	"net/url"
> >  	"os"
> >  	"path/filepath"
> > +	"sort"
> > +	"strings"
> >
> >  	"github.com/emersion/go-maildir"
> >  	"github.com/fsnotify/fsnotify"
> > @@ -120,6 +122,36 @@ func (w *Worker) err(msg types.WorkerMessage, err error) {
> >  	}, nil)
> >  }
> >
> > +func splitMaildirFile(name string) (uniq string, flags []maildir.Flag, err error) {
> > +	i := strings.LastIndexByte(name, ':')
> > +	if i < 0 {
> > +		return "", nil, &maildir.MailfileError{Name: name}
> > +	}
> > +	info := name[i+1:]
> > +	uniq = name[:i]
> > +	if len(info) < 2 {
> > +		return "", nil, &maildir.FlagError{Info: info, Experimental: false}
> > +	}
> > +	if info[1] != ',' || info[0] != '2' {
> > +		return "", nil, &maildir.FlagError{Info: info, Experimental: false}
> > +	}
> > +	if info[0] == '1' {
> > +		return "", nil, &maildir.FlagError{Info: info, Experimental: true}
> > +	}
> > +	flags = []maildir.Flag(info[2:])
> > +	sort.Slice(flags, func(i, j int) bool { return info[i] < info[j] })
> > +	return uniq, flags, nil
> > +}
> > +
> > +func dirFiles(name string) ([]string, error) {
> > +	dir, err := os.Open(filepath.Join(name, "cur"))
> > +	if err != nil {
> > +		return nil, err
> > +	}
> > +	defer dir.Close()
> > +	return dir.Readdirnames(-1)
> > +}
> > +
> >  func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
> >  	dirInfo := &models.DirectoryInfo{
> >  		Name:     name,
> > @@ -136,6 +168,21 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
> >  	}
> >
> >  	dir := w.c.Dir(name)
> > +	var keyFlags map[string][]maildir.Flag
> > +	files, err := dirFiles(string(dir))
> > +	if err == nil {
> > +		keyFlags = make(map[string][]maildir.Flag, len(files))
> > +		for _, v := range files {
> > +			key, flags, err := splitMaildirFile(v)
> > +			if err != nil {
> > +				w.worker.Logger.Printf("%q: error parsing flags (%q): %v", v, key, err)
> > +				continue
> > +			}
> > +			keyFlags[key] = flags
> > +		}
> > +	} else {
> > +		w.worker.Logger.Printf("disabled flags cache: %q: %v", dir, err)
> > +	}
> >
> >  	uids, err := w.c.UIDs(dir)
> >  	if err != nil {
> > @@ -144,38 +191,48 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
> >  	}
> >
> >  	dirInfo.Exists = len(uids)
> > -
> > -	go func() {
> > -		info := dirInfo
> > -		for _, uid := range uids {
> > -			message, err := w.c.Message(dir, uid)
> > -			if err != nil {
> > -				w.worker.Logger.Printf("could not get message: %v", err)
> > -				continue
> > +	for _, uid := range uids {
> > +		message, err := w.c.Message(dir, uid)
> > +		if err != nil {
> > +			w.worker.Logger.Printf("could not get message: %v", err)
> > +			continue
> > +		}
> > +		var flags []maildir.Flag
> > +		if keyFlags != nil {
> > +			ok := false
> > +			flags, ok = keyFlags[message.key]
> > +			if !ok {
> > +				w.worker.Logger.Printf("message (key=%q uid=%d) not found in map cache", message.key, message.uid)
> > +				flags, err = message.Flags()
> > +				if err != nil {
> > +					w.worker.Logger.Printf("could not get flags: %v", err)
> > +					continue
> > +				}
> >  			}
> > -			flags, err := message.Flags()
> > +		} else {
> > +			flags, err = message.Flags()
> >  			if err != nil {
> >  				w.worker.Logger.Printf("could not get flags: %v", err)
> >  				continue
> >  			}
> > -			seen := false
> > -			for _, flag := range flags {
> > -				if flag == maildir.FlagSeen {
> > -					seen = true
> > -				}
> > -			}
> > -			if !seen {
> > -				info.Unseen++
> > -			}
> > -			if w.c.IsRecent(uid) {
> > -				info.Recent++
> > +		}
> > +		seen := false
> > +		for _, flag := range flags {
> > +			if flag == maildir.FlagSeen {
> > +				seen = true
> > +				break
> >  			}
> >  		}
> > -		info.Unseen += info.Recent
> > -		info.Exists += info.Recent
> > -		info.AccurateCounts = true
> > -	}()
> > -
> > +		if !seen {
> > +			dirInfo.Unseen++
> > +		}
> > +		if w.c.IsRecent(uid) {
> > +			dirInfo.Recent++
> > +		}
> > +	}
> > +	dirInfo.Unseen += dirInfo.Recent
> > +	dirInfo.Exists += dirInfo.Recent
> > +	dirInfo.AccurateCounts = true
> >  	return dirInfo
> >  }
> >
> > --
> > 2.34.1
Details
Message ID
<CICYK9I5P8FD.2PR1YH5IMT2SZ@pampas>
In-Reply-To
<CICQBP4CC4D5.1E5GQKH5PT0MW@bisio> (view parent)
DKIM signature
missing
Download raw message
On Sun Mar 6, 2022 at 11:58 AM CET, inwit wrote:
> However, I'm attaching two logs produced after opening my "sent" folder (~20K
> messages), one with the Maildir interface and other with notmuch. As you can
> see, there is a lag of ~20s. In both cases, the lag occurs after the call to
> types.FetchDirectoryContents, which stalls silently and then produce errors
> related to the mime structure of some of my mails.
>
> Is this an expected behaviour? I'm not surprised that some of my mails have a
> wrong structure, but other softwares do not complain of this and gets to open
> my folder much faster (especially notmuch).
>

Those errors seems unrelated with this changes, 20s lag is very
surprising, can you confirm that they only happen after applying this
patch?
Details
Message ID
<CID01WIMLU3J.2BR6J4PDYH976@bisio>
In-Reply-To
<CICYK9I5P8FD.2PR1YH5IMT2SZ@pampas> (view parent)
DKIM signature
missing
Download raw message
On Sun Mar 6, 2022 at 6:25 PM CET, wagner riffel wrote:
> Those errors seems unrelated with this changes, 20s lag is very surprising,
> can you confirm that they only happen after applying this patch?
Indeed no. They happened just the same before your patch. I'm very sorry for
not being clear before about this. I guess I was hoping that maybe this could
ring some bell, since you know that part of the code. I wish I knew how to
debug this. I guess there's some spurious processing going on on those silent
20s, but I really don't have a clue about how would I look into it.
Details
Message ID
<CID2B6CLJ0HE.1GNEHNF8PJRTI@pampas>
In-Reply-To
<CID01WIMLU3J.2BR6J4PDYH976@bisio> (view parent)
DKIM signature
missing
Download raw message
On Sun Mar 6, 2022 at 7:35 PM CET, inwit wrote:
>
> Indeed no. They happened just the same before your patch. I'm very sorry for
> not being clear before about this. I guess I was hoping that maybe this could
> ring some bell, since you know that part of the code. I wish I knew how to
> debug this. I guess there's some spurious processing going on on those silent
> 20s, but I really don't have a clue about how would I look into it.

There's any open issue for this? If not I think you should create one
and move this topic to there, what I did to debug was use go
runtime/pprof, I suggest you do the same, apply the patch attached,
which simply profiles for 5 seconds, saves the pprof data to aerc.prof
and exits aerc, then you can investigate with "go tool pprof" command,
if you have no familiarity with its command line idiom, I suggest you
use -http=":8080" option (go tool pprof -http=":8080" aerc.prof), that
serves a web interface for the profiled data.

ps: if you have problem analyzing this data, I'm not sure if it's safe
to share the prof file, I suggest not doing so, since it can also dump
memory which might have sensitive data, instead you can use "go tool
pprof -(png|svg|pdf) aerc.prof" to create an image with the call
graph, which only aerc program flow.
Details
Message ID
<CID3Q5TQIKPR.2NZT6G6DCN45V@bisio>
In-Reply-To
<CID2B6CLJ0HE.1GNEHNF8PJRTI@pampas> (view parent)
DKIM signature
missing
Download raw message
On Sun Mar 6, 2022 at 9:21 PM CET, wagner riffel wrote:
> ps: if you have problem analyzing this data, I'm not sure if it's safe to
> share the prof file, I suggest not doing so, since it can also dump memory
> which might have sensitive data, instead you can use "go tool pprof
> -(png|svg|pdf) aerc.prof" to create an image with the call graph, which only
> aerc program flow.
I've captured my screen at the important bit [0]. I'm sharing it because I
don't really feel that I can understand it fully. 

However, indeed, it seems that the bulk of the time aerc spends on my message
base is parsing multipart messages using the go-maildir library. This is
somehow coherent with the fact that the errors I'm observing in the logs have
all to do with mime-related questions. However, I'm confused with the fact that
it is spending almost one third of the time in textproto.(*Part).Close. 

I don't even know if this is a bug in aerc or not, that's why I'm hesitant to
report it as such. I've reported at go-maildir, however [1].

[0] https://transfer.sh/8PyiDz/Screenshot_2022-03-06_22-06-40.png
[1] https://github.com/emersion/go-maildir/issues/12
Details
Message ID
<CID7BMO3RPYE.ZB05GUNQ235K@pampas>
In-Reply-To
<CID3Q5TQIKPR.2NZT6G6DCN45V@bisio> (view parent)
DKIM signature
missing
Download raw message
On Sun Mar 6, 2022 at 10:28 PM CET, inwit wrote:
> I'm confused with the fact that
> it is spending almost one third of the time in textproto.(*Part).Close.

The x-axis of this graph is not time, it's just sorted by name, the
y-axis is the stack depth, so which function that took majority of
time was syscall.Read, not textproto.(*Part).Close, you can read this
graph as Run calls handleAction, handleActions calls handleMessage ...
NextPart calls Close which calls io.Copy ... until syscall.Read, which
is the one that uses most of time, it looks like a similar problem as
getDirectoryInfo, it's doing way too much i/o.

> I don't even know if this is a bug in aerc or not, that's why I'm hesitant to
> report it as such. I've reported at go-maildir, however [1].

I think it's fine even if it's not a bug in aerc to at least document
that it exists.

-w
Details
Message ID
<CIDGR6ZQ9IKL.3028AAD3FHD7X@bisio>
In-Reply-To
<CID7BMO3RPYE.ZB05GUNQ235K@pampas> (view parent)
DKIM signature
missing
Download raw message
On Mon Mar 7, 2022 at 1:17 AM CET, wagner riffel wrote:
> The x-axis of this graph is not time, it's just sorted by name, the y-axis is
> the stack depth, so which function that took majority of time was
> syscall.Read, not textproto.(*Part).Close, you can read this graph as Run
> calls handleAction, handleActions calls handleMessage ...  NextPart calls
> Close which calls io.Copy ... until syscall.Read, which is the one that uses
> most of time, it looks like a similar problem as getDirectoryInfo, it's doing
> way too much i/o.
Thanks, Wagner, that was helpful. However, I was reading the times from the
tooltips and the message at the top of the orange bars, not the x-axis: in my
capture, textproto.(*Part).Close is marked as consuming 36.27% of time, 3.71s.

> I think it's fine even if it's not a bug in aerc to at least document that it
> exists.
I'll do that. Thanks again!
Details
Message ID
<CIDIL2ZVR2D1.TYG1OVVRF2IA@dave>
In-Reply-To
<20220306090634.53120-1-w@104d.net> (view parent)
DKIM signature
missing
Download raw message
wagner riffel, Mar 06, 2022 at 10:06:
> Fix a data race due dirInfo pointer being read in the main goroutine
> by NewMessageStore and written in the anonymous goroutine launched in
> Worker.getDirectoryInfo.
>
> To address the issue raised in https://todo.sr.ht/~rjarry/aerc/16, we
> use readdir(3) once, parse and cache its results, this replaces
> go-maildir library Dir.Flags based  stat(3) and filepath.Glob
> causing the issue when N (emails) is large.
>
> Signed-off-by: wagner riffel <w@104d.net>
> ---
>
> All previous versions of this patch were broken because it's missing a
> seek to reset readdir position to the beginning, I found this nicer
> workaround, which is also faster than the previous one, but always
> falls back to go-maildir Flags if it can't parse a flag, (in my tests,
> it could parse all emails in generated and my personal maildir)
> currently the main loop is running 400,000 emails in 165~175ms.

Thanks wagner.

Acked-by: Robin Jarry <robin@jarry.cc>

To git.sr.ht:~rjarry/aerc
   c1636f8d757f..55ae3d2cab84  master -> master

In the future, please prefix your patches with PATCH aerc so that
a build is triggered when you send one on the mailing list:

$ git config format.subjectPrefix "PATCH aerc"
$ git config sendemail.to "~rjarry/aerc-devel@lists.sr.ht"
Reply to thread Export thread (mbox)