~awesomekling/serenityos-dev

1

Design Question: How to plumb Userspace<T> through Kernel/FileSystem/BlockBasedFS?

Details
Message ID
<CAFOmKHfpcE_h=j9ZwkHpMrjNq7P0R2jCKzHn1H=ieOPa1-egMA@mail.gmail.com>
DKIM signature
pass
Download raw message
I was partying on more Userspace<T> changes, this time pushing the
type system down through sys$write(..) / FileDescription::write(...) /
File::write and all the derived implementations. However I ran into a
stumbling block, which leads to a potentially interesting design
problem.

I naively assumed all writes would come from userspace, which is why I
started pushing Userspace<T> all the way down into the Device/File
API. However that assumption is violated in a few places, for example
by BlockBasedFS::flush_specific_block_If_needed().

255 void BlockBasedFS::flush_specific_block_if_needed(unsigned index)
256 {
257     LOCKER(m_lock);
258     if (!cache().is_dirty())
259         return;
260     cache().for_each_entry([&](CacheEntry& entry) {
261         if (entry.is_dirty && entry.block_index == index) {
262             u32 base_offset = static_cast<u32>(entry.block_index)
* static_cast<u32>(block_size());
263             file_description().seek(base_offset, SEEK_SET);
264             // FIXME: Should this error path be surfaced somehow?
265             (void)file_description().write(entry.data, block_size());
266             entry.is_dirty = false;
267         }
268     });
269 }

The implementation copies the user space buffer contents to a KBuffer,
which CacheEntry.data points to.
So this write's input buffer is a Kernel address, not a Userspace<T>,
and worse yet, it starts the write at
the FileDescription level, which is the start of the inheritance
tree.. how do we model this in the code since
we are trying to force the interface to take a Userspace<const u8*>
instead of a naked const u8*?

Options:

- Invent a variant of Userspace<T>, like UserspaceOrKernel<T> which
holds the metadata about which one it is, in addition to the address
of the pointer.
  Then we could share code, but dynamically branch where needed.

- Implement both parallel paths through the entire object hierarchy,
unifying at some common place at each override implementation of
File::write(...).
  This seems a bit drastic, but maybe it's mostly busy work and we
won't end up with a lot of code dupe in the end.

- Figure out some other way for BlockBasedFS (and potentially other
similar cases I have yet to notice) to flush without needing to start
the operation
  at the FileDescription::write level and instead only push the
changes lower. This seems attractive, but it seems almost inevitable
that I'm going to run
  into some other path which does something similar, once I make my
way past this.

- Do nothing, don't touch File.h and stop propagating Userspace<T>
down into subsystems. We would validate at syscall boundary and leave
it at that.

Thoughts?

Thanks!
- Brian
Details
Message ID
<CABXa4C+i_Q5YfiqaAuu6MWXTT56Jq_ZB36DeO5p00dBJGenaew@mail.gmail.com>
In-Reply-To
<CAFOmKHfpcE_h=j9ZwkHpMrjNq7P0R2jCKzHn1H=ieOPa1-egMA@mail.gmail.com> (view parent)
DKIM signature
pass
Download raw message
Indeed, Userspace<T> doesn't work everywhere in the kernel since some
interfaces are used both by syscall implementations and by other kernel parts.

Another alternative you didn't bring up could be to have syscalls use temporary
kernel buffers when calling these interfaces and then copy out of these buffers
to the final userspace buffer at the syscall handler level. There's an extra
copy here but maybe it's not the end of the world, and it would allow us to
keep the "inner" parts of the kernel free from worrying about this.

/Andreas
Reply to thread Export thread (mbox)