Fix a data race due dirInfo pointer being read in the main goroutine
by NewMessageStore and written in the anonymous goroutine launched in
Worker.getDirectoryInfo.
To address the issue raised in https://todo.sr.ht/~rjarry/aerc/16, we
use readdir(3) once, parse and cache its results, this replaces
go-maildir library Dir.Flags based stat(3) and filepath.Glob
causing the issue when N (emails) is large.
Signed-off-by: wagner riffel <w@104d.net>
---
All previous versions of this patch were broken because it's missing a
seek to reset readdir position to the beginning, I found this nicer
workaround, which is also faster than the previous one, but always
falls back to go-maildir Flags if it can't parse a flag, (in my tests,
it could parse all emails in generated and my personal maildir)
currently the main loop is running 400,000 emails in 165~175ms.
worker/maildir/worker.go | 107 ++++++++++++++++++++++++++++++---------
1 file changed, 82 insertions(+), 25 deletions(-)
diff --git a/worker/maildir/worker.go b/worker/maildir/worker.go
index 6ff66b2..df37291 100644
--- a/worker/maildir/worker.go+++ b/worker/maildir/worker.go
@@ -9,6 +9,8 @@ import (
"net/url"
"os"
"path/filepath"
+ "sort"+ "strings" "github.com/emersion/go-maildir"
"github.com/fsnotify/fsnotify"
@@ -120,6 +122,36 @@ func (w *Worker) err(msg types.WorkerMessage, err error) {
}, nil)
}
+func splitMaildirFile(name string) (uniq string, flags []maildir.Flag, err error) {+ i := strings.LastIndexByte(name, ':')+ if i < 0 {+ return "", nil, &maildir.MailfileError{Name: name}+ }+ info := name[i+1:]+ uniq = name[:i]+ if len(info) < 2 {+ return "", nil, &maildir.FlagError{Info: info, Experimental: false}+ }+ if info[1] != ',' || info[0] != '2' {+ return "", nil, &maildir.FlagError{Info: info, Experimental: false}+ }+ if info[0] == '1' {+ return "", nil, &maildir.FlagError{Info: info, Experimental: true}+ }+ flags = []maildir.Flag(info[2:])+ sort.Slice(flags, func(i, j int) bool { return info[i] < info[j] })+ return uniq, flags, nil+}++func dirFiles(name string) ([]string, error) {+ dir, err := os.Open(filepath.Join(name, "cur"))+ if err != nil {+ return nil, err+ }+ defer dir.Close()+ return dir.Readdirnames(-1)+}+func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
dirInfo := &models.DirectoryInfo{
Name: name,
@@ -136,6 +168,21 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
}
dir := w.c.Dir(name)
+ var keyFlags map[string][]maildir.Flag+ files, err := dirFiles(string(dir))+ if err == nil {+ keyFlags = make(map[string][]maildir.Flag, len(files))+ for _, v := range files {+ key, flags, err := splitMaildirFile(v)+ if err != nil {+ w.worker.Logger.Printf("%q: error parsing flags (%q): %v", v, key, err)+ continue+ }+ keyFlags[key] = flags+ }+ } else {+ w.worker.Logger.Printf("disabled flags cache: %q: %v", dir, err)+ } uids, err := w.c.UIDs(dir)
if err != nil {
@@ -144,38 +191,48 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {
}
dirInfo.Exists = len(uids)
-- go func() {- info := dirInfo- for _, uid := range uids {- message, err := w.c.Message(dir, uid)- if err != nil {- w.worker.Logger.Printf("could not get message: %v", err)- continue+ for _, uid := range uids {+ message, err := w.c.Message(dir, uid)+ if err != nil {+ w.worker.Logger.Printf("could not get message: %v", err)+ continue+ }+ var flags []maildir.Flag+ if keyFlags != nil {+ ok := false+ flags, ok = keyFlags[message.key]+ if !ok {+ w.worker.Logger.Printf("message (key=%q uid=%d) not found in map cache", message.key, message.uid)+ flags, err = message.Flags()+ if err != nil {+ w.worker.Logger.Printf("could not get flags: %v", err)+ continue+ } }
- flags, err := message.Flags()+ } else {+ flags, err = message.Flags() if err != nil {
w.worker.Logger.Printf("could not get flags: %v", err)
continue
}
- seen := false- for _, flag := range flags {- if flag == maildir.FlagSeen {- seen = true- }- }- if !seen {- info.Unseen++- }- if w.c.IsRecent(uid) {- info.Recent+++ }+ seen := false+ for _, flag := range flags {+ if flag == maildir.FlagSeen {+ seen = true+ break }
}
- info.Unseen += info.Recent- info.Exists += info.Recent- info.AccurateCounts = true- }()-+ if !seen {+ dirInfo.Unseen+++ }+ if w.c.IsRecent(uid) {+ dirInfo.Recent+++ }+ }+ dirInfo.Unseen += dirInfo.Recent+ dirInfo.Exists += dirInfo.Recent+ dirInfo.AccurateCounts = true return dirInfo
}
--
2.34.1
The error flooding has disappeared, thanks!
However, I'm attaching two logs produced after opening my "sent" folder (~20K
messages), one with the Maildir interface and other with notmuch. As you can
see, there is a lag of ~20s. In both cases, the lag occurs after the call to
types.FetchDirectoryContents, which stalls silently and then produce errors
related to the mime structure of some of my mails.
Is this an expected behaviour? I'm not surprised that some of my mails have a
wrong structure, but other softwares do not complain of this and gets to open
my folder much faster (especially notmuch).
On Sun Mar 6, 2022 at 10:06 AM CET, wagner riffel wrote:
> Fix a data race due dirInfo pointer being read in the main goroutine> by NewMessageStore and written in the anonymous goroutine launched in> Worker.getDirectoryInfo.>> To address the issue raised in https://todo.sr.ht/~rjarry/aerc/16, we> use readdir(3) once, parse and cache its results, this replaces> go-maildir library Dir.Flags based stat(3) and filepath.Glob> causing the issue when N (emails) is large.>> Signed-off-by: wagner riffel <w@104d.net>> --->> All previous versions of this patch were broken because it's missing a> seek to reset readdir position to the beginning, I found this nicer> workaround, which is also faster than the previous one, but always> falls back to go-maildir Flags if it can't parse a flag, (in my tests,> it could parse all emails in generated and my personal maildir)> currently the main loop is running 400,000 emails in 165~175ms.>> worker/maildir/worker.go | 107 ++++++++++++++++++++++++++++++---------> 1 file changed, 82 insertions(+), 25 deletions(-)>> diff --git a/worker/maildir/worker.go b/worker/maildir/worker.go> index 6ff66b2..df37291 100644> --- a/worker/maildir/worker.go> +++ b/worker/maildir/worker.go> @@ -9,6 +9,8 @@ import (> "net/url"> "os"> "path/filepath"> + "sort"> + "strings">> "github.com/emersion/go-maildir"> "github.com/fsnotify/fsnotify"> @@ -120,6 +122,36 @@ func (w *Worker) err(msg types.WorkerMessage, err error) {> }, nil)> }>> +func splitMaildirFile(name string) (uniq string, flags []maildir.Flag, err error) {> + i := strings.LastIndexByte(name, ':')> + if i < 0 {> + return "", nil, &maildir.MailfileError{Name: name}> + }> + info := name[i+1:]> + uniq = name[:i]> + if len(info) < 2 {> + return "", nil, &maildir.FlagError{Info: info, Experimental: false}> + }> + if info[1] != ',' || info[0] != '2' {> + return "", nil, &maildir.FlagError{Info: info, Experimental: false}> + }> + if info[0] == '1' {> + return "", nil, &maildir.FlagError{Info: info, Experimental: true}> + }> + flags = []maildir.Flag(info[2:])> + sort.Slice(flags, func(i, j int) bool { return info[i] < info[j] })> + return uniq, flags, nil> +}> +> +func dirFiles(name string) ([]string, error) {> + dir, err := os.Open(filepath.Join(name, "cur"))> + if err != nil {> + return nil, err> + }> + defer dir.Close()> + return dir.Readdirnames(-1)> +}> +> func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {> dirInfo := &models.DirectoryInfo{> Name: name,> @@ -136,6 +168,21 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {> }>> dir := w.c.Dir(name)> + var keyFlags map[string][]maildir.Flag> + files, err := dirFiles(string(dir))> + if err == nil {> + keyFlags = make(map[string][]maildir.Flag, len(files))> + for _, v := range files {> + key, flags, err := splitMaildirFile(v)> + if err != nil {> + w.worker.Logger.Printf("%q: error parsing flags (%q): %v", v, key, err)> + continue> + }> + keyFlags[key] = flags> + }> + } else {> + w.worker.Logger.Printf("disabled flags cache: %q: %v", dir, err)> + }>> uids, err := w.c.UIDs(dir)> if err != nil {> @@ -144,38 +191,48 @@ func (w *Worker) getDirectoryInfo(name string) *models.DirectoryInfo {> }>> dirInfo.Exists = len(uids)> -> - go func() {> - info := dirInfo> - for _, uid := range uids {> - message, err := w.c.Message(dir, uid)> - if err != nil {> - w.worker.Logger.Printf("could not get message: %v", err)> - continue> + for _, uid := range uids {> + message, err := w.c.Message(dir, uid)> + if err != nil {> + w.worker.Logger.Printf("could not get message: %v", err)> + continue> + }> + var flags []maildir.Flag> + if keyFlags != nil {> + ok := false> + flags, ok = keyFlags[message.key]> + if !ok {> + w.worker.Logger.Printf("message (key=%q uid=%d) not found in map cache", message.key, message.uid)> + flags, err = message.Flags()> + if err != nil {> + w.worker.Logger.Printf("could not get flags: %v", err)> + continue> + }> }> - flags, err := message.Flags()> + } else {> + flags, err = message.Flags()> if err != nil {> w.worker.Logger.Printf("could not get flags: %v", err)> continue> }> - seen := false> - for _, flag := range flags {> - if flag == maildir.FlagSeen {> - seen = true> - }> - }> - if !seen {> - info.Unseen++> - }> - if w.c.IsRecent(uid) {> - info.Recent++> + }> + seen := false> + for _, flag := range flags {> + if flag == maildir.FlagSeen {> + seen = true> + break> }> }> - info.Unseen += info.Recent> - info.Exists += info.Recent> - info.AccurateCounts = true> - }()> -> + if !seen {> + dirInfo.Unseen++> + }> + if w.c.IsRecent(uid) {> + dirInfo.Recent++> + }> + }> + dirInfo.Unseen += dirInfo.Recent> + dirInfo.Exists += dirInfo.Recent> + dirInfo.AccurateCounts = true> return dirInfo> }>> --> 2.34.1
On Sun Mar 6, 2022 at 11:58 AM CET, inwit wrote:
> However, I'm attaching two logs produced after opening my "sent" folder (~20K> messages), one with the Maildir interface and other with notmuch. As you can> see, there is a lag of ~20s. In both cases, the lag occurs after the call to> types.FetchDirectoryContents, which stalls silently and then produce errors> related to the mime structure of some of my mails.>> Is this an expected behaviour? I'm not surprised that some of my mails have a> wrong structure, but other softwares do not complain of this and gets to open> my folder much faster (especially notmuch).>
Those errors seems unrelated with this changes, 20s lag is very
surprising, can you confirm that they only happen after applying this
patch?
On Sun Mar 6, 2022 at 6:25 PM CET, wagner riffel wrote:
> Those errors seems unrelated with this changes, 20s lag is very surprising,> can you confirm that they only happen after applying this patch?
Indeed no. They happened just the same before your patch. I'm very sorry for
not being clear before about this. I guess I was hoping that maybe this could
ring some bell, since you know that part of the code. I wish I knew how to
debug this. I guess there's some spurious processing going on on those silent
20s, but I really don't have a clue about how would I look into it.
On Sun Mar 6, 2022 at 7:35 PM CET, inwit wrote:
>> Indeed no. They happened just the same before your patch. I'm very sorry for> not being clear before about this. I guess I was hoping that maybe this could> ring some bell, since you know that part of the code. I wish I knew how to> debug this. I guess there's some spurious processing going on on those silent> 20s, but I really don't have a clue about how would I look into it.
There's any open issue for this? If not I think you should create one
and move this topic to there, what I did to debug was use go
runtime/pprof, I suggest you do the same, apply the patch attached,
which simply profiles for 5 seconds, saves the pprof data to aerc.prof
and exits aerc, then you can investigate with "go tool pprof" command,
if you have no familiarity with its command line idiom, I suggest you
use -http=":8080" option (go tool pprof -http=":8080" aerc.prof), that
serves a web interface for the profiled data.
ps: if you have problem analyzing this data, I'm not sure if it's safe
to share the prof file, I suggest not doing so, since it can also dump
memory which might have sensitive data, instead you can use "go tool
pprof -(png|svg|pdf) aerc.prof" to create an image with the call
graph, which only aerc program flow.
On Sun Mar 6, 2022 at 9:21 PM CET, wagner riffel wrote:
> ps: if you have problem analyzing this data, I'm not sure if it's safe to> share the prof file, I suggest not doing so, since it can also dump memory> which might have sensitive data, instead you can use "go tool pprof> -(png|svg|pdf) aerc.prof" to create an image with the call graph, which only> aerc program flow.
I've captured my screen at the important bit [0]. I'm sharing it because I
don't really feel that I can understand it fully.
However, indeed, it seems that the bulk of the time aerc spends on my message
base is parsing multipart messages using the go-maildir library. This is
somehow coherent with the fact that the errors I'm observing in the logs have
all to do with mime-related questions. However, I'm confused with the fact that
it is spending almost one third of the time in textproto.(*Part).Close.
I don't even know if this is a bug in aerc or not, that's why I'm hesitant to
report it as such. I've reported at go-maildir, however [1].
[0] https://transfer.sh/8PyiDz/Screenshot_2022-03-06_22-06-40.png
[1] https://github.com/emersion/go-maildir/issues/12
On Sun Mar 6, 2022 at 10:28 PM CET, inwit wrote:
> I'm confused with the fact that> it is spending almost one third of the time in textproto.(*Part).Close.
The x-axis of this graph is not time, it's just sorted by name, the
y-axis is the stack depth, so which function that took majority of
time was syscall.Read, not textproto.(*Part).Close, you can read this
graph as Run calls handleAction, handleActions calls handleMessage ...
NextPart calls Close which calls io.Copy ... until syscall.Read, which
is the one that uses most of time, it looks like a similar problem as
getDirectoryInfo, it's doing way too much i/o.
> I don't even know if this is a bug in aerc or not, that's why I'm hesitant to> report it as such. I've reported at go-maildir, however [1].
I think it's fine even if it's not a bug in aerc to at least document
that it exists.
-w
On Mon Mar 7, 2022 at 1:17 AM CET, wagner riffel wrote:
> The x-axis of this graph is not time, it's just sorted by name, the y-axis is> the stack depth, so which function that took majority of time was> syscall.Read, not textproto.(*Part).Close, you can read this graph as Run> calls handleAction, handleActions calls handleMessage ... NextPart calls> Close which calls io.Copy ... until syscall.Read, which is the one that uses> most of time, it looks like a similar problem as getDirectoryInfo, it's doing> way too much i/o.
Thanks, Wagner, that was helpful. However, I was reading the times from the
tooltips and the message at the top of the orange bars, not the x-axis: in my
capture, textproto.(*Part).Close is marked as consuming 36.27% of time, 3.71s.
> I think it's fine even if it's not a bug in aerc to at least document that it> exists.
I'll do that. Thanks again!
wagner riffel, Mar 06, 2022 at 10:06:
> Fix a data race due dirInfo pointer being read in the main goroutine> by NewMessageStore and written in the anonymous goroutine launched in> Worker.getDirectoryInfo.>> To address the issue raised in https://todo.sr.ht/~rjarry/aerc/16, we> use readdir(3) once, parse and cache its results, this replaces> go-maildir library Dir.Flags based stat(3) and filepath.Glob> causing the issue when N (emails) is large.>> Signed-off-by: wagner riffel <w@104d.net>> --->> All previous versions of this patch were broken because it's missing a> seek to reset readdir position to the beginning, I found this nicer> workaround, which is also faster than the previous one, but always> falls back to go-maildir Flags if it can't parse a flag, (in my tests,> it could parse all emails in generated and my personal maildir)> currently the main loop is running 400,000 emails in 165~175ms.
Thanks wagner.
Acked-by: Robin Jarry <robin@jarry.cc>
To git.sr.ht:~rjarry/aerc
c1636f8d757f..55ae3d2cab84 master -> master
In the future, please prefix your patches with PATCH aerc so that
a build is triggered when you send one on the mailing list:
$ git config format.subjectPrefix "PATCH aerc"
$ git config sendemail.to "~rjarry/aerc-devel@lists.sr.ht"