~vdupras/duskos-discuss

Roadmap Update

Details
Message ID
<1a6b5f15-a54a-4f28-8b7c-a9da2009a4c5@www.fastmail.com>
Hello all,

The C compiler is getting good. There's still a lot of work to do, but there's
also a lot that has been done already.

Focusing on C compiler completeness is a bit boring and I thought it would be
more fun to, instead, focus on a more holistic goal (in the sense that reaching
it requires more than just working on the C compiler): run Collapse OS' CVM
from within Dusk.

The CVM itself is a bit much (I'd need structs, which I'd like to tackle
later), but I thought that I could start with porting tools/blkpack.c. Just
getting this to run presents some interesting challenges because many different
classes of things are missing. First and foremost...

## I/O API

Dusk represents an opportunity for blank slate design, with many new
opportunities due to Forth's semantics. blkunpack, like many, many programs,
uses stdin, stdout and stderr, which need to be re-routable if I want to keep
UNIX stream flexibility.

If I was set on recreating UNIX semantics, I'd begin designing some sort of file
descriptor system around some kind of read() and write(). But since we're
calling our programs from a Forth interpreter, the operator has the ability to
play with words directly, so why bother with file descriptors?

What I'm set on trying first is to add a new sys/stream (or sys/io? I don't
know) subsystem that defines "stdin", a "key"-like word, and "stdout", an
"emit"-like word. They would be simple aliases to key and emit.

I'm not sure yet about stderr. It's generally used for logging to the console or
to a log file, so I'm tempted to have programs use "emit" directly. If the sysop
needs to redirect this output to some place else, she has tools to redirect
"emit" before calling her word. Requires some thinking.

That would allow us to plug them into the...

## Files API

Dusk will need a way for the sysop to interact with files. At its core, I don't
have better ideas than the UNIX way of doing it, that is: open() returns a
descriptor, you can read, write, seek, then you close it.

But when it comes to interacting with other words, I'd like to make things
convenient for the sysop. I'm not sure at all where I want to go with this, but
my brain wanders around the concept of "file selection". It goes somewhat like
this:

The file selection is a "rolling" array of path references with convenience
words to access the Nth selected word. By convention, the last selected path
is the destination and the 2nd last is the source (or, if processes only need
a source, then it's the last selected path). You select paths with ":". Example:

:path/to/source :path/to/dest streamcopy

Now, this is where things get fuzzy. Ideally, this command above is all it
would take for "streamcopy" to understand what I want to do. The "streams" API
could have some convenience words that check if the file selection is empty,
and if it's not, open/create the files in the selection, plug them into
stdin/stdout, and execute. I'm not sure yet how it will pan out, but that's
where I'm headed.
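
To make the "file selection" idea a bit more concrete, here is a minimal sketch
in C of a rolling array of path references with the "last is destination, 2nd
last is source" convention. Every name in it (fsel_select, fsel_nth, FSEL_SIZE
and so on) is hypothetical and made up for illustration; nothing like it exists
in Dusk, and the capacity of 4 is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

#define FSEL_SIZE 4   /* hypothetical capacity of the rolling array */

/* Rolling array of path references; the oldest entry gets overwritten. */
const char *fsel_paths[FSEL_SIZE];
unsigned    fsel_count = 0;   /* total number of selections made so far */

/* What ":" would do: remember one more path. */
void fsel_select(const char *path)
{
    assert(path != NULL);
    fsel_paths[fsel_count % FSEL_SIZE] = path;
    fsel_count++;
}

/* Nth most recent selection: n = 0 is the last one, n = 1 the one before.
   Returns NULL when that slot was never filled or was already overwritten. */
const char *fsel_nth(unsigned n)
{
    if (n >= fsel_count || n >= FSEL_SIZE) return NULL;
    return fsel_paths[(fsel_count - 1 - n) % FSEL_SIZE];
}

/* Convention from the text: last = destination, 2nd last = source. */
const char *fsel_dest(void)   { return fsel_nth(0); }
const char *fsel_source(void) { return fsel_nth(1); }
```

With this model, ":path/to/source :path/to/dest" amounts to two fsel_select()
calls, after which a word like "streamcopy" could ask for fsel_source() and
fsel_dest(). An empty selection shows up as NULL, which is the check the
convenience words would need before deciding whether to open anything.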

## C stdlib

Dusk's C compiler doesn't have to be ANSI compliant. The idea with choosing C
is only that it makes porting UNIX applications easier.

My intention with the C stdlib is to consider that any compromise I make towards
UNIX will ease the porting effort, but compromise the system's aesthetic
integrity (and thus possibly its power/simplicity ratio, which is the whole
point of this project). This tradeoff has to be evaluated on a case-by-case
basis.

blkpack uses:

* putchar
* sscanf
* getline
* fprintf

putchar() maps directly to the stdout() word outlined in the streams API above.

I'll probably bite the bullet of C string formatting (scanf, printf) and
implement something similar, but I'll probably cut corners in the formatting
options department.

getline() will probably be implemented in a similar fashion, but without a file
argument (it will use stdin()). It will be the same for every word of this type,
that is: always use stdin/stdout.

What to do when a word needs to operate on a wider file selection? It will take
care of overriding/restoring stdin/stdout during the process.
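
As a rough illustration of the alias idea and of the override/restore step, here
is a minimal sketch in C. It models "emit" as a plain function and "stdout" as a
rebindable function pointer. All names (io_stdout, dusk_putchar, run_captured,
capture_emit) are hypothetical stand-ins, not actual Dusk words, and the capture
buffer is only there to have somewhere to redirect to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef void (*out_word)(int c);

/* Stand-in for the kernel's "emit": writes to the console. */
void console_emit(int c) { putchar(c); }

/* The "stdout" alias: initially just another name for emit. */
out_word io_stdout = console_emit;

/* putchar() as seen by C code maps straight onto the alias. */
void dusk_putchar(int c) { io_stdout(c); }

/* Alternative destination used while redirected: a memory buffer. */
char   capture_buf[64];
size_t capture_len = 0;

void capture_emit(int c)
{
    assert(capture_len < sizeof capture_buf - 1);
    capture_buf[capture_len++] = (char)c;
    capture_buf[capture_len] = '\0';
}

/* A word that needs its own destination overrides the alias, runs the
   inner word, then restores the previous binding. */
void run_captured(void (*word)(void))
{
    out_word saved = io_stdout;
    io_stdout = capture_emit;
    word();
    io_stdout = saved;
}

/* Example word that only knows about putchar. */
void say_hello(void)
{
    const char *s = "hello";
    while (*s) dusk_putchar(*s++);
}
```

Calling run_captured(say_hello) leaves "hello" in capture_buf and puts io_stdout
back to console_emit afterwards. The state lives in one global binding rather
than in a descriptor passed around, which is the "embrace statefulness" angle.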

I'm not sure yet where to go exactly with this, but the mantra "embrace
statefulness" seems right to me.

## Discussion welcome

The roadmap above isn't set in stone. It's just a rough idea of where I think
I want to go. Discussion is encouraged.

Regards,
Virgil
Details
Message ID
<fc8ec725-bba7-9958-87ac-0b81022d7f56@envs.net>
In-Reply-To
<1a6b5f15-a54a-4f28-8b7c-a9da2009a4c5@www.fastmail.com>

On 2022-06-27 07:47, Virgil Dupras wrote:
> Hello all,
>
> The C compiler is getting good. There's still a lot of work to do, but there's
> also a lot that has been done already.
>
> Focusing on C compiler completeness is a bit boring and I thought it would be
> more fun to, instead, focus on a more holistic goal (in the sense that reaching
> it requires more than just working on the C compiler): run Collapse OS' CVM
> from within Dusk.
>
> The CVM itself is a bit much (I'd need structs, which I'd like to tackle
> later), but I thought that I could start with porting tools/blkpack.c. Only to
> have this run presents some interesting challenges because many different
> classes of things are missing. First and foremost...
>
> ## I/O API
>
> Dusk represents an opportunity for blank slate design, with many new
> opportunities due to Forth's semantics. blkunpack, like many, many programs,
> use stdin, stdout, stderr, which needs to be re-routable if I want to keep UNIX
> stream flexibility.
>
> If I was set on recreating UNIX semantics, I'd begin designing some sort of file
> descriptor system around some kind of read() and write(). But since we're
> calling our programs from a Forth interpreter, the operator has the ability to
> play with words directly, so why bother with file descriptors?
>
> What I'm set on trying first is to add a new sys/stream (or sys/io? I don't
> know) subsystem that define "stdin", a "key"-like word and "stdout", a
> "emit"-like word. They would be simple aliases to key and emit.

this sounds very similar to the `iin<`, `in<`, and `cc<` that we already have.

also, how will EOF be represented?

>
> I'm not sure yet about stderr. It's generally used for logging to the console or
> to a log file, so I'm tempted to have programs use "emit" directly. If the sysop
> needs to redirect this output to some place else, she has tools to redirect
> "emit" before calling her word. Requires some thinking.
>
> That would allow us to plug them into the...
>
> ## Files API
>
> Dusk will need a way for the sysop to interact with files. At its core, I don't
> have better ideas than the UNIX way of doing it, that is: open() returns a
> descriptor, you can read, write, seek, then you close it.
>
> But when it comes to interacting with other words, I'd like to make things
> convenient for the sysop. I'm not sure at all where I want to go with this, but
> my brain wanders around the concept of "file selection". It goes somewhat like
> this:
>
> The file selection is a "rolling" array of path references with convenience
> words to access the Nth selected word. By convention, the last selected path
> is the destination and the 2nd last is the source (or, if processes only need
> a source, then it's the last selected path). You select paths with ":". Example:
>
> :path/to/source :path/to/dest streamcopy
>
> Now, this is where things get fuzzy. Ideally, this command above is all it
> would take for "streamcopy" to understand that I want to do. The "streams" API
> could have some convenience words that check if the file selection is empty,
> and if it's not, open/create the files in the selection, plug them into
> stdin/stdout, and execute. I'm not sure yet how it will pan out, but that's
> where I'm headed.

Personally, I think this is somewhat in conflict with the idea above about using
words directly.  I understand that they could exist as separate subsystems/layers,
but they feel like they represent different ideas of how the system should work.

I'm not quite sure where this approach would have an advantage over just like,
having convenience words that parse a filename and assign it to stdin/stdout.

Another thing that could be useful are combinators that would take a path,
open it, make that file stdin/stdout, call a word, close the file, then restore
the previous value for that stream.  This type of combinator is quite common in
LISP languages.
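
A minimal sketch of such a combinator in C, for illustration only. It assumes
the current input is a global that words read through; the names cur_in,
in_char, with_input_from and count_bytes are all hypothetical, and C's stdio
stands in for whatever file API the system ends up with.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model: the current input stream is a global that words read
   through, standing in for a redirectable "stdin" word. */
FILE *cur_in = NULL;

int in_char(void)
{
    assert(cur_in != NULL);
    return fgetc(cur_in);
}

/* Combinator: open `path`, make it the current input, call `word`, close the
   file, then restore whatever input was current before.
   Returns 0 on success, -1 if the file could not be opened. */
int with_input_from(const char *path, void (*word)(void))
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;

    FILE *saved = cur_in;
    cur_in = f;
    word();
    cur_in = saved;

    fclose(f);
    return 0;
}

/* Example word: counts the bytes available on the current input. */
long byte_count = 0;

void count_bytes(void)
{
    byte_count = 0;
    while (in_char() != EOF) byte_count++;
}
```

Something like with_input_from("some/file", count_bytes) then runs count_bytes
against the file and leaves the previous input in place when it returns. The
called word never sees a path or a descriptor.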


> I'm not sure yet where to go exactly with this, but the mantra "embrace
> statefulness" seems right to me.
>
>
I find that is generally a good idea for forth systems.
Details
Message ID
<8c901760-cd53-49b9-a720-260a24108bc6@www.fastmail.com>
In-Reply-To
<fc8ec725-bba7-9958-87ac-0b81022d7f56@envs.net>
On Mon, Jun 27, 2022, at 1:27 PM, binarycat wrote:
> On 2022-06-27 07:47, Virgil Dupras wrote:
>> Hello all,
>>
>> The C compiler is getting good. There's still a lot of work to do, but there's
>> also a lot that has been done already.
>>
>> Focusing on C compiler completeness is a bit boring and I thought it would be
>> more fun to, instead, focus on a more holistic goal (in the sense that reaching
>> it requires more than just working on the C compiler): run Collapse OS' CVM
>> from within Dusk.
>>
>> The CVM itself is a bit much (I'd need structs, which I'd like to tackle
>> later), but I thought that I could start with porting tools/blkpack.c. Only to
>> have this run presents some interesting challenges because many different
>> classes of things are missing. First and foremost...
>>
>> ## I/O API
>>
>> Dusk represents an opportunity for blank slate design, with many new
>> opportunities due to Forth's semantics. blkunpack, like many, many programs,
>> use stdin, stdout, stderr, which needs to be re-routable if I want to keep UNIX
>> stream flexibility.
>>
>> If I was set on recreating UNIX semantics, I'd begin designing some sort of file
>> descriptor system around some kind of read() and write(). But since we're
>> calling our programs from a Forth interpreter, the operator has the ability to
>> play with words directly, so why bother with file descriptors?
>>
>> What I'm set on trying first is to add a new sys/stream (or sys/io? I don't
>> know) subsystem that define "stdin", a "key"-like word and "stdout", a
>> "emit"-like word. They would be simple aliases to key and emit.
>
> this sounds very similar to the `iin<`, `in<`, and `cc<` that we already have.

Yes it does :) This new system would replace these. For example, "cc<" wouldn't
exist anymore because, under a new system with proper convenience words for
managing stdin/stdout/stderr, uses of "cc<" would become references to
"stdin".

When I wrote the text above, I had forgotten about the "iin<"/"in<"
distinction, so that's one more thing to consider when fleshing out the system.

> also, how will EOF be represented?

My first idea would be a global flag for checking if stdin is EOF. Right now,
the "in<" words use a "c-or-0" scheme for detecting EOF, but that's going to
break as soon as we try to process binary data. It has to be changed at some
point.
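
To show why "c-or-0" breaks on binary data and how a global flag avoids it,
here is a small sketch in C. The in-memory source and every name in it
(src_open, in_c_or_0, in_flagged, in_eof) are hypothetical and only model the
two schemes; they are not the actual "in<" words.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory input source, standing in for a file. */
const unsigned char *src_data = NULL;
size_t src_len = 0;
size_t src_pos = 0;

/* Global flag, set once a read is attempted past the end of the input. */
int in_eof = 0;

void src_open(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    src_data = data;
    src_len = len;
    src_pos = 0;
    in_eof = 0;
}

/* "c-or-0" scheme: returns the next byte, or 0 at end of input.
   A genuine 0 byte is indistinguishable from the end. */
int in_c_or_0(void)
{
    return src_pos < src_len ? src_data[src_pos++] : 0;
}

/* Flag scheme: the return value is only a byte; the caller checks in_eof. */
int in_flagged(void)
{
    if (src_pos >= src_len) { in_eof = 1; return 0; }
    return src_data[src_pos++];
}

/* Count bytes until the end, as each scheme perceives it. */
size_t count_c_or_0(void)
{
    size_t n = 0;
    while (in_c_or_0() != 0) n++;
    return n;
}

size_t count_flagged(void)
{
    size_t n = 0;
    for (;;) {
        (void)in_flagged();
        if (in_eof) break;
        n++;
    }
    return n;
}
```

On the three bytes 0x41 0x00 0x42, count_c_or_0() stops after the first byte
because the embedded zero looks like EOF, while count_flagged() sees all three.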

>>
>> I'm not sure yet about stderr. It's generally used for logging to the console or
>> to a log file, so I'm tempted to have programs use "emit" directly. If the sysop
>> needs to redirect this output to some place else, she has tools to redirect
>> "emit" before calling her word. Requires some thinking.
>>
>> That would allow us to plug them into the...
>>
>> ## Files API
>>
>> Dusk will need a way for the sysop to interact with files. At its core, I don't
>> have better ideas than the UNIX way of doing it, that is: open() returns a
>> descriptor, you can read, write, seek, then you close it.
>>
>> But when it comes to interacting with other words, I'd like to make things
>> convenient for the sysop. I'm not sure at all where I want to go with this, but
>> my brain wanders around the concept of "file selection". It goes somewhat like
>> this:
>>
>> The file selection is a "rolling" array of path references with convenience
>> words to access the Nth selected word. By convention, the last selected path
>> is the destination and the 2nd last is the source (or, if processes only need
>> a source, then it's the last selected path). You select paths with ":". Example:
>>
>> :path/to/source :path/to/dest streamcopy
>>
>> Now, this is where things get fuzzy. Ideally, this command above is all it
>> would take for "streamcopy" to understand that I want to do. The "streams" API
>> could have some convenience words that check if the file selection is empty,
>> and if it's not, open/create the files in the selection, plug them into
>> stdin/stdout, and execute. I'm not sure yet how it will pan out, but that's
>> where I'm headed.
>
> Personally, I think this is somewhat in conflict with the idea above 
> about using
> words directly.  I understand that they could exist as separate 
> subsystems/layers,
> but they feel like they represent different ideas of how the system 
> should work.
>
> I'm not quite sure where this approach would have an advantage over just like,
> having convenience words that parse a filename and assign it to stdin/stdout.
>
> Another thing that could be useful are combinators that would take a path,
> open it, make that file stdin/stdout, call a word, close the file, then restore
> the previous value for that stream.  This type of combinator is quite common in
> LISP languages.

Yeah, you're probably right about this. This idea needs more work.

>> I'm not sure yet where to go exactly with this, but the mantra "embrace
>> statefulness" seems right to me.
>>
>>
> I find that is generally a good idea for forth systems.
Details
Message ID
<b8d2bbd1-c164-49ed-bab0-7c27e8f15149@www.fastmail.com>
In-Reply-To
<fc8ec725-bba7-9958-87ac-0b81022d7f56@envs.net>
Roadmap update update

Latest commits show some progress w.r.t. the road I was set to take, but there
are going to be some adjustments.

First, as you mentioned, binarycat, there can't be a separate mainloop in an
eventual sys/console, for the exact reason you mentioned. The worst part of
hitting this wall is that it's now the 3rd time I do it. I made two previous
exploratory attempts of this type in Collapse OS, but I always forget about the
trickiness of ":". In my defense, I hadn't completely forgotten about it, but
with my general roadmap being fuzzy enough, I thought I could go through that
wall. No, that wall is unbreakable.

As of now, this has been done:

* Have "f<<" run its own interpret loop and thus simplify this part of the code.
* This allows us to get rid of "iin<". Now, there's only "in<" and "emit" in the
  kernel (and while we're at it, is it really important to continue to
  distinguish between "in<" and "key"? maybe not...)
* Define a new "stdin" ( -- c ) word in a new "lib/io" unit.
* Have the C compiler use stdin instead of defining its own I/O related words.

Now, about what's ahead. I'm still aiming at the "porting blkpack.c" goal. On
the CC side, I think all I need is:

* implement for loops
* allow C code to call system words (right now, it can only call functions
  declared in the same C unit)

After that, I think I can start porting (operation during which I will probably
encounter many compiler bugs, because so far all this compiler has been
compiling is toy code).

However, that's not all that will be needed. blkpack's purpose is to compile
an unpacked blkfs into a packed one. This means two things:

1. The ability to create new files and write to them.
2. The ability to verify that the generated contents are correct.

Regarding #2, it means the implementation of a checksum function, probably
crc32. Then, all I need to do is to compute the checksum of my result file and
compare it with the reference (the "blkfs" computed by Collapse OS' blkpack
under a POSIX system).
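
For reference, here is a minimal bitwise sketch of the common CRC-32 (reflected
polynomial 0xEDB88320, the variant used by zlib and PNG) in C. Which CRC variant
would actually be used is an assumption on this example's part; the function
names crc32_update and crc32_matches are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Feed `len` bytes into a running CRC-32. Start with crc = 0; the result of
   one call can be passed to the next to checksum data in chunks. */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Compare the checksum of a result buffer against a reference value. */
int crc32_matches(const unsigned char *buf, size_t len, uint32_t expected)
{
    return crc32_update(0, buf, len) == expected;
}
```

The standard check value for this variant is 0xCBF43926 for the ASCII string
"123456789", which makes for a convenient first test. A table-driven version
would be faster, but the bitwise loop is shorter and needs no precomputed data.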

Regarding #1, the straightforward way to proceed would be to add more "lnxcall"
wrappers. However, I don't want to do that because it makes me paddle the wrong
way: Dusk OS has to cut its ties to the Linux kernel, not strengthen them.

So, I think I want to tackle the filesystem problem right away, that is:

1. select a filesystem that Dusk is going to boot from.
2. have the Makefile create a filesystem from "fs/" and embed it into the dusk
   binary.
3. implement words to replace calls to open(2) and read(2) in boot.fs

I'm leaning toward FAT16 for its combination of simplicity and ubiquity.
Details
Message ID
<87edz75gyl.fsf@valley.vpn>
In-Reply-To
<fc8ec725-bba7-9958-87ac-0b81022d7f56@envs.net>
Hi all,

So after reading the kernel and boot files I feel slightly more
comfortable starting to comment on things.  I'll try to substantiate as
much as I can despite my lack of experience with designing forth
systems.

For starters, thanks for the updates and congrats for the progress
you've made!

For the I/O API, I am wondering if it'd be interesting to think about 9p
at this point. I've never looked in depth at 9p, but I've used Plan 9
daily for a few months and the way everything I/O feels seamless must
mean something.  I know it was initially designed with networking in mind,
but there might definitely be a few things we could draw inspiration
from in there.

> What I'm set on trying first is to add a new sys/stream (or sys/io? I
> don't know) subsystem that define "stdin", a "key"-like word and
> "stdout", a "emit"-like word. They would be simple aliases to key and
> emit.

Sounds interesting, especially with the way aliases are done it could
give rise to a pretty simple way of handling stdio.

> I'm not sure yet about stderr. It's generally used for logging to the
> console or to a log file, so I'm tempted to have programs use "emit"
> directly. If the sysop needs to redirect this output to some place
> else, she has tools to redirect "emit" before calling her
> word. Requires some thinking.

The main idea behind stderr is to avoid cluttering stdout with needless
things when you work with UNIX pipes or need to do some processing on
the output of programs while still being able to let the user know of
what is going on.  The idea of just using emit sounds like it would
work, but having stderr could provide a more obvious and straightforward
way to do some logging.  The risk with redirecting emit is that, by
mistake, the programmer may mess up stdout which is initially an alias
to emit if I understand correctly.

One last thing regarding I/O and streams: without straying too far into
UNIX dogma, we should seriously think about pipes.  It is imo one of
the most elegant and simple inventions they made on UNIX, so if there is
one lesson to draw from this system, I would say it's this.  Maybe
thinking about this would bring to light a nice way to handle the
streamcopy example while staying consistent with the I/O system.

> I'm not sure yet where to go exactly with this, but the mantra
> "embrace statefulness" seems right to me.

Like binarycat, I think it sounds like a good idea to me.

>> also, how will EOF be represented?
>
> My first idea would be a global flag for checking if stdin is
> EOF. Right now, the "in<" words use a "c-or-0" scheme for detecting
> EOF, but that's going to break as soon as we try to process binary
> data. It has to be changed at some point.

A flag sounds good and coherent with the statefulness idea.  I really
have no idea how this would work or if it's even relevant, but throwing
this out there: Would an interrupt be of any use when reaching the EOF?

> First, as you mentioned, binarycat, there can't be a separate mainloop
> in an eventual sys/console, for the exact reason you mentioned. The
> worst part of hitting this wall is that it's now the 3rd time I do
> it. I made two previous exploratory attempt of this type in Collapse
> OS, but I always forget about the trickiness of ":". To my defense, I
> hadn't completely forgot about it, but with my general roadmap being
> fuzzy enough, I thought I could go through that wall.  No, that wall
> is unbreakable.

Not exactly sure what the issue is here, but if it's about the colon and
compilation, would ideas from colorforth be of interest here?  From what
I understand, hinting at whether a word is to be defined, compiled,
executed, etc. apparently helps in simplifying the compiler.  We don't even
need to embrace the full color thing for that as there are ideas of
colorless colorforth, although I am not so sure it is very readable.
One such forth is avrforth <http://krue.net/avrforth/>, honestly it was
not too bad working with it.

While I speak of colorforth I'd like to take the opportunity to ask: if
I understand correctly, we make use of this areg, which I first saw in
colorforth. Would it make sense, like in colorforth, to dedicate a
register to it?  (EDX in cf: <https://colorforth.github.io/forth.html>)
I remember that in one of Chuck's talks/interviews, he mentioned the
performance and simplicity gains of using the areg. (I think the talk
was called 1xForth or something like that)

> Regarding #1, the straightforward way to proceed would be to add more
> "lnxcall" wrappers. However, I don't want to do that because it makes
> me paddle the wrong way: Dusk OS has to cut its ties to the Linux
> kernel, not strengthen them.

I am not familiar with blkpack.c but I agree that we should minimize
"lnxcall" calls as much as possible.

> So, I think I want to tackle the filesystem problem right away, that is:
>
> 1. select a filesystem that Dusk is going to boot from.
> 2. have the Makefile create a filesystem from "fs/" and embed it into
>    the dusk binary.
> 3. implement words to replace calls to open(2) and read(2) in boot.fs
>
> I'm leaning on FAT16 for its combination of simplicity and ubiquity.

I would suggest FAT32, as FAT16 is limited to 4G and I believe that most
"modern" hardware usually has bigger drives than 4G.  Now, it also
depends on those ideas of drives: working with mounts and multiple
filesystems would probably still allow us to use the space that's
available while using FAT16.

I remember implementing a FAT32 driver back in school and I don't think
it was too bad, maybe my memory is embellishing things.  Although feel
free to discard my suggestion if the FAT16 implementation is much
simpler.

I just realized I wrote a lot of different things, not totally related
with the roadmap, I hope it's fine to post it in this thread rather than
make a new one for all this info.

Cheers,
-- 
nature
Details
Message ID
<dad915ad-d0ae-4064-b16f-a226796bd604@www.fastmail.com>
In-Reply-To
<87edz75gyl.fsf@valley.vpn>
On Wed, Jun 29, 2022, at 3:17 PM, nature wrote:
> Hi all,
>
> So after reading the kernel and boot files I feel slightly more
> comfortable starting to comment on things.  I'll try to substantiate as
> much as I can despite my lack of experience with designing forth
> systems.
>
> For starters, thanks for the updates and congrats for the progress
> you've made!
>
> For the I/O API, I am wondering if it'd be interesting to think about 9p
> at that point, I've never looked in depth at 9p, but I've used Plan 9
> for a few months daily and the way everything i/o feels seemless must
> mean something.  I know it was meant with networking in mind initially,
> but there might definitely be a few things we could draw inspiration
> from in there.

No, I haven't looked at Plan 9 closely for the exact reason you mentioned. As
simple as Plan 9 is, if it's built around networking, then I suppose its core
abstractions don't fit with my goals with Dusk.

>> What I'm set on trying first is to add a new sys/stream (or sys/io? I
>> don't know) subsystem that define "stdin", a "key"-like word and
>> "stdout", a "emit"-like word. They would be simple aliases to key and
>> emit.
>
> Sounds interesting, especially with the way aliases are done it could
> give rise to a pretty simple way of handling stdio.
>
>> I'm not sure yet about stderr. It's generally used for logging to the
>> console or to a log file, so I'm tempted to have programs use "emit"
>> directly. If the sysop needs to redirect this output to some place
>> else, she has tools to redirect "emit" before calling her
>> word. Requires some thinking.
>
> The main idea behin stderr is to avoid cluttering stdout with needless
> things when you work with UNIX pipes or need to do some processing on
> the output of programs while still being able to let the user know of
> what is going on.  The idea of just using emit sounds like it would
> work, but having stderr could provide a more obvious and straightforward
> way to do some logging.  The risk with redirecting emit is that, by
> mistake, the programmer may mess up stdout which is initially an alias
> to emit if I understand correctly.
>
> One last thing regarding I/O and streams, without staying too much into
> the UNIX dogma, we should seriously think about pipes.  It is imo one of
> the most elegant and simple invention they made on UNIX, so if there is
> one lesson to draw from this system, I would say it's this.  Maybe
> thinking about this would bring to light a nice way to handle the
> streamcopy example while staying consistent with the I/O system.

Yes, the concept of piping is among my design concerns. The thing with pipes
as they are done in UNIX is that they require concurrency, which is at odds
with "embrace statefulness".

I think it's possible to achieve something similar to piping with a set of
"filters" that we could apply to stdin and stdout.

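One way to picture such a "filter", as a minimal sketch in C under the
assumption that stdout is a rebindable word: the filter takes the place of
stdout, transforms each character and forwards it to whatever stdout was
before. No second process is involved. All names (io_stdout, upper_emit,
push_upper_filter, sink_emit) are hypothetical, and the uppercase transform is
just a placeholder for a real filter.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef void (*out_word)(int c);

/* Final destination: a memory buffer standing in for the console or a file. */
char   sink_buf[64];
size_t sink_len = 0;

void sink_emit(int c)
{
    assert(sink_len < sizeof sink_buf - 1);
    sink_buf[sink_len++] = (char)c;
    sink_buf[sink_len] = '\0';
}

/* The redirectable "stdout", and the word the filter forwards to. */
out_word io_stdout = sink_emit;
out_word upper_next = sink_emit;

/* The filter itself: transform one character, pass it down the chain. */
void upper_emit(int c)
{
    upper_next(toupper((unsigned char)c));
}

/* Install the filter in front of whatever stdout currently is. */
void push_upper_filter(void)
{
    assert(io_stdout != upper_emit);   /* installing twice would loop forever */
    upper_next = io_stdout;
    io_stdout = upper_emit;
}

/* Remove the filter and restore the previous stdout. */
void pop_upper_filter(void)
{
    io_stdout = upper_next;
}

/* Example word that only knows about stdout. */
void print_str(const char *s)
{
    while (*s) io_stdout((unsigned char)*s++);
}
```

Pushing the filter, printing "pipe", popping it and printing "!" leaves "PIPE!"
in the sink. Each character goes through the whole chain synchronously, so the
producer and the filter never need to run concurrently.
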
>> I'm not sure yet where to go exactly with this, but the mantra
>> "embrace statefulness" seems right to me.
>
> Like binarycat, I think it sounds like a good idea to me.
>
>>> also, how will EOF be represented?
>>
>> My first idea would be a global flag for checking if stdin is
>> EOF. Right now, the "in<" words use a "c-or-0" scheme for detecting
>> EOF, but that's going to break as soon as we try to process binary
>> data. It has to be changed at some point.
>
> A flag sounds good and coherent with the statefulness idea.  I really
> have no idea how this would work or if it's even relevant, but throwing
> this out there: Would an interrupt be of any use when reaching the EOF?
>
>> First, as you mentioned, binarycat, there can't be a separate mainloop
>> in an eventual sys/console, for the exact reason you mentioned. The
>> worst part of hitting this wall is that it's now the 3rd time I do
>> it. I made two previous exploratory attempt of this type in Collapse
>> OS, but I always forget about the trickiness of ":". To my defense, I
>> hadn't completely forgot about it, but with my general roadmap being
>> fuzzy enough, I thought I could go through that wall.  No, that wall
>> is unbreakable.
>
> Not exactly sure what is the issue here, but if it's about the colon and
> compilation, would ideas from colorforth be of interest here?  From what
> I understand, hinting at if a word is to be defined, compiled, executed,
> etc...  Apparently helps in simplifying the compiler.  We don't even
> need to embrace the full color thing for that as there are ideas of
> colorless colorforth, although I am not so sure it is very readable.
> One such forth is avrforth <http://krue.net/avrforth/>, honestly it was
> not too bad working with it.

There's no issue per se. The mainloop in Dusk is fine as it is. It's just that
I had imagined a structure that turned out to be inconsistent and impossible.
But it's fine to just keep things as they are now.

> While I speak of colorforth I'd like to take the opportunity to ask, if
> I understand correctly we make use of this areg and I first saw this in
> colorforth, would it make sense, like in colorforth to dedicate a
> register to it?  (EDX in cf: <https://colorforth.github.io/forth.html>)
> I remember in one of the talk/interview of Chuck, he mentionned the
> performance and simplicity gains of using the areg. (I think the talk
> was called 1xForth or something like that)

Yes, it's very possible to dedicate a CPU register to the A register. I do so
in Collapse OS. However, doing so takes away a register that the C compiler
could otherwise allocate.

To have native words with direct access to a hardcoded memory location is
already a huge speedup compared to a regular "value". I think that the
additional gain to be had with assigning a register to it wouldn't outweigh
the loss on the C compiler part. But that's something that can be explored and
benchmarked...

>> Regarding #1, the straightforward way to proceed would be to add more
>> "lnxcall" wrappers. However, I don't want to do that because it makes
>> me paddle the wrong way: Dusk OS has to cut its ties to the Linux
>> kernel, not strengthen them.
>
> I am not familiar with blkpack.c but I agree that we should minimize
> "lnxcall" calls as much as possible.
>
>> So, I think I want to tackle the filesystem problem right away, that is:
>>
>> 1. select a filesystem that Dusk is going to boot from.
>> 2. have the Makefile create a filesystem from "fs/" and embed it into
>>    the dusk binary.
>> 3. implement words to replace calls to open(2) and read(2) in boot.fs
>>
>> I'm leaning on FAT16 for its combination of simplicity and ubiquity.
>
> I would suggest FAT32, as FAT16 is limited to 4G and I believe that most
> "modern" hardware usually have bigger drives than 4G.  Now it also
> depends on those ideas of Drives or working with mounts and multiple
> filesystems would probably still allow to use the space that's available
> while using FAT16.
>
> I remember implementing a FAT32 driver back in school and I don't think
> it was too bad, maybe my memory is embellishing things.  Although feel
> free to discard my suggestion if the FAT16 implementation is much
> simpler.

My preliminary idea is to have the "boot" partition not contain much else than
the system itself. The user will probably want to put their files in another
filesystem. Therefore, the 4GB limit is not a problem.

Moreover, from what I understand of the specs, you can't implement only FAT32
because under a certain drive size, the FAT *has* to be FAT16. Therefore, if
I choose FAT32, it will strictly be more complex than if I choose FAT16.
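
For context, a small sketch in C of how the FAT type follows from the cluster
count. The thresholds (4085 and 65525) are the ones given in Microsoft's FAT
specification as far as I recall it; the function names are hypothetical, and
the data sector count is taken as a given here, whereas a real driver would
derive it from the boot sector fields.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { FAT_TYPE_12, FAT_TYPE_16, FAT_TYPE_32 } fat_type;

/* Number of clusters in the data region of the volume. */
uint32_t fat_cluster_count(uint32_t data_sectors, uint32_t sectors_per_cluster)
{
    assert(sectors_per_cluster != 0);
    return data_sectors / sectors_per_cluster;
}

/* The FAT type is decided by the cluster count alone, not by a label on disk:
   small volumes end up as FAT12 or FAT16 whether the driver likes it or not. */
fat_type fat_type_from_clusters(uint32_t cluster_count)
{
    if (cluster_count < 4085u)  return FAT_TYPE_12;
    if (cluster_count < 65525u) return FAT_TYPE_16;
    return FAT_TYPE_32;
}
```

A volume with 65524 clusters is FAT16 and one with 65525 is FAT32, so a driver
that must handle arbitrary volume sizes cannot get away with knowing FAT32
alone.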

Because booting the FS from nothing comes with tricky bootstrapping constraints,
I prefer to err on the side of simplicity.

> I just realized I wrote a lot of different things, not totally related
> with the roadmap, I hope it's fine to post it in this thread rather than
> make a new one for all this info.
>
> Cheers,
> -- 
> nature
Details
Message ID
<89dd2acd-400f-21b6-112d-a4596b79cff8@envs.net>
In-Reply-To
<b8d2bbd1-c164-49ed-bab0-7c27e8f15149@www.fastmail.com>

On 2022-06-29 12:46, Virgil Dupras wrote:
> Roadmap update update
>
> Latest commits show some progress w.r.t the road I was set to take, but there's
> going to be some adjustments.
>
> First, as you mentioned, binarycat, there can't be a separate mainloop in an
> eventual sys/console, for the exact reason you mentioned. The worst part of
> hitting this wall is that it's now the 3rd time I do it. I made two previous
> exploratory attempt of this type in Collapse OS, but I always forget about the
> trickiness of ":". To my defense, I hadn't completely forgot about it, but with
> my general roadmap being fuzzy enough, I thought I could go through that wall.
> No, that wall is unbreakable.
>
> As of now, this has been done:
>
> * Have "f<<" run its own interpret loop and thus simplify this part of the code.
> * This allows us to get rid of "iin<". Now, there's only "in<" and "emit" in the
>    kernel (and wihle we're at it, is it really important to continue to
>    distinguish between "in<" and "key"? maybe not...)
> * Define a new "stdin" ( -- c ) word in a new "lib/io" unit.
> * Have the C compiler use stdin instead of defining its own I/O related words.
I would say that having 2 input sources like this is useful in a few ways
(the first one that comes to mind is pagers, which we will no doubt need).
Additionally, line-buffered vs unbuffered input can be quite useful.
Merging them might be fine in some cases, but in other cases may make
things more complicated.  Although, with "stdin", that makes 3 input
sources, so perhaps there is room to remove one of them.
>
> Now, about what's ahead. I'm still aiming at the "porting blkpack.c" goal. On
> the CC side, I think all I need is:
>
> * implement for loops
> * allow C code to call system words (right now, it can only call functions
>    declared in the same C unit)
>
> After that, I think I can start porting (operation during which I will probably
> encounter many compiler bugs, because so far all this compiler has been
> compiling is toy code).
>
> However, that's not all that will be needed. blkpack's purpose is to compile
> an unpacked blkfs into a packed one. This means two things:
>
> 1. The ability to create new files and write to them.
> 2. The ability to verify that the contents generated is correct.
>
> Regarding #2, it means the implementation of a checksum function, probably
> crc32. Then, all I need to do is to compute the checksum of my result file and
> compare it with the reference (the "blkfs" computed by Collapse OS' blkpack
> under a POSIX system).
>
> Regarding #1, the straightforward way to proceed would be to add more "lnxcall"
> wrappers. However, I don't want to do that because it makes me paddle the wrong
> way: Dusk OS has to cut its ties to the Linux kernel, not strengthen them.
>
> So, I think I want to tackle the filesystem problem right away, that is:
>
> 1. select a filesystem that Dusk is going to boot from.
> 2. have the Makefile create a filesystem from "fs/" and embed it into the dusk
>     binary.
> 3. implement words to replace calls to open(2) and read(2) in boot.fs
>
> I'm leaning on FAT16 for its combination of simplicity and ubiquity.
I was thinking about implementing FAT32 for a while, that's the main
reason I brought up the drive subsystem in the first place.  If you think
FAT16 would be simpler, then go ahead.

I do think some sort of copy-on-write filesystem would eventually be a
good idea. At some point we will probably have to abandon git, and a nightly
backup system similar to plan9's cwfs would be a much simpler alternative.

Having good local backups will be very important in a post-internet world.