> Access to the hardware is provided through normal files, and
> per-process namespaces do not require special permissions to modify
> mountpoints. Making a container is thus trivial: just unmount all of
> the hardware you don’t want the sandboxed program to have access to.
> Done.

I'm not a Plan 9 expert, but I'm pretty sure that's not the case. If you check `ns`, all the files corresponding to hardware are provided from special directories corresponding to device trees, for example `#d`. Even if you unmount them from your namespace - the "sandboxed" program can mount them right back, by e.g. running the commands as listed in ns's output.

As far as I can tell, the only way to prevent that is the noattach flag (set by RFNOMNT), which completely disables mount(). Obviously, that breaks a lot of programs. Even if you don't mind programs randomly breaking because you've disabled a core feature, the

> You don't even have to be root.

part is starting to get sketchy too. Not having RFNOMNT is basically the same as having root. Once you sandbox a program, it's no longer "root", and it can no longer mount(), making its sandboxing capabilities much more limited.

But again - I'm definitely not an expert on this. If I'm wrong, I'd love to get corrected by an actual Plan 9 user/developer.

relevant functions:
namec in 9/port/chan.c
bindmount in 9/port/sysfile.c

On a similar note:

> Want to forward a TCP port? Write an implementation of /net/tcp which
> is limited to whatever ports you need — perhaps with just a hundred
> lines of shell scripting — and mount it into the namespace.

I always found that kinda weird. Why should you have to implement a whole TCP implementation to forward a single port? Wouldn't it be more natural for the TCP implementation to accept the address/port in the path? For example,

    open("/net/listen/0.0.0.0/tcp/80")

Then you can manage all of a program's privileges as a simple list of paths, which is simpler to reason about.
I've implemented that in my toy OS (basically Plan9 with containers) and it seems to work fine. I'm probably missing something, though.
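
To make the "privileges as a simple list of paths" idea concrete, here is a rough sketch in Python. It is only a toy model, not code from any real kernel: the `ALLOWED` list and the `is_allowed` helper are hypothetical names made up for illustration, and `..` is resolved purely lexically, which a real namespace with binds may not do.

```python
import posixpath

# Hypothetical privilege list for one sandboxed program. Each entry grants
# access to that exact path and to everything below it.
ALLOWED = [
    "/net/listen/0.0.0.0/tcp/80",
    "/tmp/scratch",
]


def is_allowed(path, allowed=ALLOWED):
    """Return True if `path` falls under one of the granted prefixes."""
    if not path.startswith("/"):
        return False  # this sketch only handles absolute paths
    # Collapse "." and ".." lexically so ".." cannot escape a granted prefix.
    clean = posixpath.normpath(path)
    # normpath keeps a leading "//" as-is; fold it down to a single slash.
    clean = "/" + clean.lstrip("/")
    for prefix in allowed:
        prefix = posixpath.normpath(prefix)
        # Match on whole path components, so ".../tcp/80" does not also
        # grant ".../tcp/8080".
        if clean == prefix or clean.startswith(prefix.rstrip("/") + "/"):
            return True
    return False
```

With a check like this, granting port 80 is one line in the list, and auditing what a program may touch means reading that list.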
Hi,

> Even if you unmount them from your namespace - the "sandboxed" program
> can mount them right back, by e.g. running the commands as listed in
> ns's output.
> As far as I can tell, the only way to prevent that is the noattach flag
> (set by RFNOMNT), which completely disables mount(). Obviously, that
> breaks a lot of programs.

9front has features for more advanced sandboxing, including auth/box, which allows specifying a full list of allowed drivers, and constructing an arbitrary sandbox.

Moreover, quoth rfork(2):

    RFNOMNT   If set, subsequent mounts into the new name space
              are disallowed. All pathnames starting with #
              besides those used to access pipe(3), dup(3),
              env(3), cons(3), and proc(3) can not be walked.

RFNOMNT doesn't break core features; if you have something already mounted, it remains usable, and e.g. pipe (which is implemented via the #| device) is explicitly allowed.

Accidentally replied off list initially, apologies for the double-response :/

- Noam Preil
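
For readers following along, the rfork(2) rule quoted above can be modelled in a few lines of Python. This is a sketch of the rule as the man page states it, not the kernel's code: `can_walk` and `NOATTACH_OK` are hypothetical names, the device letters are the usual ones for those five drivers, and anything beyond the `#` check (malformed names, other restrictions) is deliberately left out.

```python
# Device letters for the drivers that rfork(2) lists as still reachable
# after RFNOMNT: '|' pipe(3), 'd' dup(3), 'e' env(3), 'c' cons(3), 'p' proc(3).
NOATTACH_OK = set("|decp")


def can_walk(path, noattach):
    """Model only the '#' rule from rfork(2); everything else is assumed fine."""
    if not path.startswith("#"):
        # Ordinary names resolve through whatever is already in the
        # namespace, so existing mounts stay usable.
        return True
    if not noattach:
        return True
    # After RFNOMNT only the listed device letters may follow the '#'.
    return path[1:2] in NOATTACH_OK
```

So under this model a process with the flag set can still open `#|` to create a pipe and still use `/dev/cons` if it was bound beforehand, but it cannot reach a disk or network device by its `#` name.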