~andrewrk/ziglang

5 3

Memory errors with ChildProcess.exec

Details
Message ID
<20200502222033.GA22505@Kepler>
DKIM signature
missing
Download raw message
Hello all,

I have been working on a small, personal project as a way to get to 
learn Zig. I've run into a problem that I'm completely vexed by, so I'm 
turning to the community for help.

I am using std.ChildProcess.exec to spawn a grep subprocess that 
searches for a given pattern through all files found in a directory.  
Here is a minimum working example that displays the error I am seeing

     https://paste.sr.ht/~gpanders/33cada72ac71ec869ee4e1e8221e7e3cbee547d4

Before the call to std.ChildProcess.exec, I have confirmed that all of 
the pathnames in the ArrayList look okay. I can iterate through them and 
print them to stdout and all looks well.

However, the child process fails because when the child process itself 
runs the filenames are corrupted. For context, I am running this command 
in a directory containing a subdirectory called "files" containing 
several text files. Below is the output of running the program:

     ExecResult{ .term = Term{ .Exited = 2 }, .stdout = 20191213092520_George_Perkins_Marsh.txt
     20191228173202_The_evolving_politics_of_Abraham_Lincoln.txt
     , .stderr = grep: 20200209203150_Intem|+͊r\v:d0g+=d\035v: No such file or directory
     grep: L O\022}'N\035\002l"\016Pcz.k;L   i: No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: 202ż\026Cez_\032*ҕ&^J+\027E%ż\026Cez_\032*ҕ&L\020
     8*\026: No such file or directory
     grep: 8*\026: No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: : No such file or directory
     grep: \001ݬ:\024v\      M\006OtȘB: No such file or directory
      }

You can see that some of the filenames appear correctly, but that grep 
is also reporting errors because it cannot find filenames full of 
garbage.

So that means somewhere between calling std.ChildProcess.exec and the 
child process actually executing, something is happening to the memory 
to corrupt those filename strings.

An interesting data point is that if I move the logic out of the 
`getFiles` function into `main` directly, this problem does not occur.

Can anyone explain what is going on here? There is likely something 
obvious I'm missing as I'm still new to Zig.

Thanks for your help,

Greg
Details
Message ID
<CAEnbY+fepcvAp5KN9dB7c1QVTheSHX=Exk=g9tL0_WSwJ2tKdw@mail.gmail.com>
In-Reply-To
<20200502222033.GA22505@Kepler> (view parent)
DKIM signature
pass
Download raw message
On Sun, 3 May 2020 at 08:20, Greg Anders <greg@gpanders.com> wrote:
>
> Hello all,
>
> I have been working on a small, personal project as a way to get to
> learn Zig. I've run into a problem that I'm completely vexed by, so I'm
> turning to the community for help.
>
> I am using std.ChildProcess.exec to spawn a grep subprocess that
> searches for a given pattern through all files found in a directory.
> Here is a minimum working example that displays the error I am seeing
>
>      https://paste.sr.ht/~gpanders/33cada72ac71ec869ee4e1e8221e7e3cbee547d4
>
> Before the call to std.ChildProcess.exec, I have confirmed that all of
> the pathnames in the ArrayList look okay. I can iterate through them and
> print them to stdout and all looks well.
>
> However, the child process fails because when the child process itself
> runs the filenames are corrupted. For context, I am running this command
> in a directory containing a subdirectory called "files" containing
> several text files. Below is the output of running the program:
>
>      ExecResult{ .term = Term{ .Exited = 2 }, .stdout = 20191213092520_George_Perkins_Marsh.txt
>      20191228173202_The_evolving_politics_of_Abraham_Lincoln.txt
>      , .stderr = grep: 20200209203150_Intem|+͊r\v:d0g+=d\035v: No such file or directory
>      grep: L O\022}'N\035\002l"\016Pcz.k;L   i: No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: 202ż\026Cez_\032*ҕ&^J+\027E%ż\026Cez_\032*ҕ&L\020
>      8*\026: No such file or directory
>      grep: 8*\026: No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: : No such file or directory
>      grep: \001ݬ:\024v\      M\006OtȘB: No such file or directory
>       }
>
> You can see that some of the filenames appear correctly, but that grep
> is also reporting errors because it cannot find filenames full of
> garbage.
>
> So that means somewhere between calling std.ChildProcess.exec and the
> child process actually executing, something is happening to the memory
> to corrupt those filename strings.
>
> An interesting data point is that if I move the logic out of the
> `getFiles` function into `main` directly, this problem does not occur.
>
> Can anyone explain what is going on here? There is likely something
> obvious I'm missing as I'm still new to Zig.
>
> Thanks for your help,
>
> Greg

The problem is that `entry.name` is only valid inside of your loop.
you need to do:
-        try entries.append(entry.name);
+        try entries.append(try std.mem.dupe(allocator, u8, entry.name));

Also note that you didn't close `dir` after you're done with it.
I'd also add a `defer` in your `getFiles` function to clean up the
arraylist contents.
Details
Message ID
<14edbee7-97c2-73be-5c43-0e4e8e01ea14@ziglang.org>
In-Reply-To
<CAEnbY+fepcvAp5KN9dB7c1QVTheSHX=Exk=g9tL0_WSwJ2tKdw@mail.gmail.com> (view parent)
DKIM signature
pass
Download raw message
On 5/3/20 12:27 AM, Daurnimator wrote:
> 
> The problem is that `entry.name` is only valid inside of your loop.
> you need to do:
> -        try entries.append(entry.name);
> +        try entries.append(try std.mem.dupe(allocator, u8, entry.name));

Be careful - if the allocator being used here is not an arena allocator,
this will leak memory if the inner `try` fails. Better would be:

const copied_name = try std.mem.dupe(allocator, u8, entry.name);
{
    errdefer allocator.free(copied_name);
    try entries.append(copied_name);
}
Details
Message ID
<CAEnbY+e_EpnBV99e=iaXo6SUK65QiYcjVbGd4UAB2ebUcpHowA@mail.gmail.com>
In-Reply-To
<14edbee7-97c2-73be-5c43-0e4e8e01ea14@ziglang.org> (view parent)
DKIM signature
pass
Download raw message
On Sun, 3 May 2020 at 15:18, Andrew Kelley <andrew@ziglang.org> wrote:
> On 5/3/20 12:27 AM, Daurnimator wrote:
> >
> > The problem is that `entry.name` is only valid inside of your loop.
> > you need to do:
> > -        try entries.append(entry.name);
> > +        try entries.append(try std.mem.dupe(allocator, u8, entry.name));
>
> Be careful - if the allocator being used here is not an arena allocator,
> this will leak memory if the inner `try` fails. Better would be:
>
> const copied_name = try std.mem.dupe(allocator, u8, entry.name);
> {
>     errdefer allocator.free(copied_name);
>     try entries.append(copied_name);
> }

The original code was using an arena allocator :)

However, I'd recommend a slightly different change to make things fail-safe:

    try entries.ensureCapacity(entries.items.len + 1);
    entries.appendAssumeCapacity(try std.mem.dupe(allocator, u8, entry.name));
Details
Message ID
<69879ca8-252a-7326-6a21-5102e3940aa7@ziglang.org>
In-Reply-To
<CAEnbY+e_EpnBV99e=iaXo6SUK65QiYcjVbGd4UAB2ebUcpHowA@mail.gmail.com> (view parent)
DKIM signature
pass
Download raw message
On 5/3/20 2:17 AM, Daurnimator wrote:
> On Sun, 3 May 2020 at 15:18, Andrew Kelley <andrew@ziglang.org> wrote:
>> On 5/3/20 12:27 AM, Daurnimator wrote:
>>>
>>> The problem is that `entry.name` is only valid inside of your loop.
>>> you need to do:
>>> -        try entries.append(entry.name);
>>> +        try entries.append(try std.mem.dupe(allocator, u8, entry.name));
>>
>> Be careful - if the allocator being used here is not an arena allocator,
>> this will leak memory if the inner `try` fails. Better would be:
>>
>> const copied_name = try std.mem.dupe(allocator, u8, entry.name);
>> {
>>     errdefer allocator.free(copied_name);
>>     try entries.append(copied_name);
>> }
> 
> The original code was using an arena allocator :)
> 
> However, I'd recommend a slightly different change to make things fail-safe:
> 
>     try entries.ensureCapacity(entries.items.len + 1);
>     entries.appendAssumeCapacity(try std.mem.dupe(allocator, u8, entry.name));
> 

Ahh, that's even better!
Details
Message ID
<20200503142738.GB36254@Kepler>
In-Reply-To
<69879ca8-252a-7326-6a21-5102e3940aa7@ziglang.org> (view parent)
DKIM signature
missing
Download raw message
Excellent, thank you both for your help! I knew it would be something 
obvious; I never even suspected `entry.name`.

Everything is working now. Much appreciated.

Greg
Reply to thread Export thread (mbox)