~sircmpwn/sr.ht-discuss

Connection issues to build environment using Arch Linux image

Message-ID: <CPZ93DR2LPBD.14MIXCEMBT0MG@arch>

Hi,

I am having issues with the Arch Linux build environment provided by
builds.sr.ht. The problem is that the build fails at the same point in
the process every time, with the following message:

> Connection to localhost closed by remote host

An example build is https://builds.sr.ht/~ineptattech/job/927081.
Unfortunately I do not have a simple example to reproduce this issue. I
wonder whether the environment is low on memory and is killing processes
such as sshd, in which case I can reduce the parallelism of the build.
Or perhaps there is another issue? This started happening recently when
I updated the version of an AUR package I am building. Note that the
package builds fine on my personal computer. In addition, manifests for
other packages build fine with the Arch Linux image, so this seems
specific to the manifest I linked to.
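
For what it's worth, if memory pressure turns out to be the cause, I
would cap the job count in the build task with something like this (a
rough sketch only, assuming the PKGBUILD honors MAKEFLAGS):

	# hypothetical task: limit make to two parallel jobs
	export MAKEFLAGS="-j2"
	makepkg --noconfirm --syncdeps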

Thanks,
Tim

Message-ID: <CPZG333M3XUR.2PTHLX1IIEEPT@taiga>
In-Reply-To: <CPZ93DR2LPBD.14MIXCEMBT0MG@arch>

I could not really say what's going on here without a more minimal
reproduction. I would be surprised if it was the OOM killer taking down
sshd, but that does seem somewhat plausible. sshd doesn't use a lot of
RAM, so I don't think Linux would put it high up on the chopping block.

It could also be that the whole VM was OOM-killed on the host side,
which might happen in the case of high disk usage.

Message-ID: <CQ2U0TSFWC6H.2SEKQ1Z5F1MTA@arch>
In-Reply-To: <CPZG333M3XUR.2PTHLX1IIEEPT@taiga>

I may drop this as I am not sure what a useful minimal reproduction
would be. However, I am curious whether you can publish the memory and
CPU resources allocated to each build environment. I am interested in
this because the number of CPU threads dictates the number of jobs in
my build. Perhaps this would be useful for others to know.

I have another ask. My understanding is that if the OOM killer starts
killing various processes, there is no good way to set the environment
back to a normal state without restarting. With that said, would it be
possible to set up the sshd service to restart if it is killed? This
would be useful for debugging. Alternatively, could you provide some
feedback in the build failure message that indicates whether the OOM
killer interfered with the build? Or maybe a more useful message would
be a general indication of whether there was some issue with the
environment rather than the build itself.
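
To make the ask concrete, I was imagining something along these lines
baked into the image (just a sketch of what I mean, not a tested
change):

	# hypothetical drop-in so sshd is restarted if it gets killed,
	# e.g. by a SIGKILL from the OOM killer
	sudo mkdir -p /etc/systemd/system/sshd.service.d
	printf '[Service]\nRestart=on-failure\nRestartSec=2\n' |
		sudo tee /etc/systemd/system/sshd.service.d/restart.conf
	sudo systemctl daemon-reload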

Message-ID: <CQ2U2RZFV6PO.UZNT37N78HPT@taiga>
In-Reply-To: <CQ2U0TSFWC6H.2SEKQ1Z5F1MTA@arch>

On Fri Jan 27, 2023 at 9:33 AM CET, Tim Lagnese wrote:
> I may drop this as I am not sure what a useful minimal reproduction
> would be. However, I am curious whether you can publish the memory and
> CPU resources allocated to each build environment. I am interested in
> this because the number of CPU threads dictates the number of jobs in
> my build. Perhaps this would be useful for others to know.

You can easily ascertain these from within the build environment. These
specs are not set in stone; we may adjust them later, so it's best if
users rely on observed specs rather than documented ones.
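
For example, a task at the top of your manifest can just print what the
VM sees (a quick sketch, not anything official or guaranteed):

	# report CPU threads and memory visible inside the build VM
	nproc
	free -h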

> I have another ask. My understanding is that if the OOM killer starts
> killing various processes, there is no good way to set the environment
> back to a normal state without restarting. With that said, would it be
> possible to set up the sshd service to restart if it is killed? This
> would be useful for debugging. Alternatively, could you provide some
> feedback in the build failure message that indicates whether the OOM
> killer interfered with the build? Or maybe a more useful message would
> be a general indication of whether there was some issue with the
> environment rather than the build itself.

The OOM killer prefers to kill processes using a large amount of memory
first, so it's unlikely that sshd was killed. I couldn't speculate
further until we have a reproduction to work with. And no, it's not
really feasible to automatically determine whether the OOM killer was
responsible for a build failure.
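
If you do suspect the OOM killer, you can check the kernel log by hand
from an SSH session into the failed build (a sketch, assuming the VM
itself survived the event):

	# look for OOM killer activity in the kernel ring buffer
	sudo dmesg | grep -iE 'out of memory|oom-kill' || echo "no OOM events logged"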

Message-ID: <CQ343JR27OKS.SLHIKQZRMIBR@hades.moritz.sh>
In-Reply-To: <CPZG333M3XUR.2PTHLX1IIEEPT@taiga>

I may have something that could provide additional information:

When connecting to the runner from Arch Linux (OpenSSH 9.1p1-3), the
connection is closed. The same thing also happens when using hut as a
wrapper. Full log attached.

Log:
	Connected to build job #929530 (failed): https://builds.sr.ht/~poldi1405/job/929530
	Your VM will be terminated 4 hours from now, or when you log out.

	debug3: receive packet: type 96
	debug2: channel 0: rcvd eof
	debug2: channel 0: output open -> drain
	debug2: channel 0: obuf empty
	debug2: chan_shutdown_write: channel 0: (i0 o1 sock -1 wfd 5 efd 6 [write])
	debug2: channel 0: output drain -> closed
	debug3: receive packet: type 98
	debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
	debug3: receive packet: type 98
	debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
	debug2: channel 0: rcvd eow
	debug2: chan_shutdown_read: channel 0: (i0 o3 sock -1 wfd 4 efd 6 [write])
	debug2: channel 0: input open -> closed
	debug3: receive packet: type 97
	debug2: channel 0: rcvd close
	debug3: channel 0: will not send data after close
	debug2: channel 0: almost dead
	debug2: channel 0: gc: notify user
	debug2: channel 0: gc: user detached
	debug2: channel 0: send close
	debug3: send packet: type 97
	debug2: channel 0: is dead
	debug2: channel 0: garbage collecting
	debug1: channel 0: free: client-session, nchannels 1
	debug3: channel 0: status: The following connections are open:
	  #0 client-session (t4 r0 i3/0 o3/0 e[write]/0 fd -1/-1/6 sock -1 cc -1 io 0x00/0x00)

	debug3: send packet: type 1
	Connection to yui.runners.sr.ht closed.
	Transferred: sent 2496, received 2888 bytes, in 1.0 seconds
	Bytes per second: sent 2591.3, received 2998.3
	debug1: Exit status 0
-- 
Moritz Poldrack
https://moritz.sh