~sircmpwn/public-inbox


Fwd: [BARE] prefix VarInt

Details
Message ID
<ad963600-1b86-1968-c83b-e33473f3b4d3@gmail.com>
Hi,

I've just found out about BARE and would like to offer 
some of my observations as I've seen that feedback is 
(still?) welcome as the specs aren't yet finalized.

Regarding variable-length integer encoding in BARE,
there's an alternative to (U)LEB128 encoding that is
simpler and faster. A little-endian prefix VarInt encodes
a 64-bit integer in 1–9 bytes and can be implemented
without a loop.

Here's an example implementation (not made by me):

     - https://docs.rs/vint64/latest/vint64/
     - https://github.com/iqlusioninc/veriform/tree/develop/rust/vint64
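
As a rough illustration of the loop-free idea, here is a minimal Python sketch. It assumes the layout described in the vint64 documentation: the number of trailing zero bits in the first byte gives the number of additional bytes, and a first byte of 0x00 means a full 8-byte value follows. The function names are hypothetical, and this is not the vint64 code itself.

```python
from typing import Tuple


def encode_prefix_varint(value: int) -> bytes:
    """Encode an unsigned 64-bit integer as a little-endian prefix varint."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value does not fit in 64 bits")
    # Each encoded byte carries 7 payload bits; zero still needs one byte.
    bits = max(value.bit_length(), 1)
    length = (bits + 6) // 7
    if length > 8:
        # 9-byte case: a zero marker byte followed by the raw 64-bit value.
        return b"\x00" + value.to_bytes(8, "little")
    # Append a 1 marker bit, then (length - 1) zero bits, below the value.
    encoded = ((value << 1) | 1) << (length - 1)
    return encoded.to_bytes(length, "little")


def decode_prefix_varint(data: bytes) -> Tuple[int, int]:
    """Decode one varint from the start of data; return (value, bytes_used)."""
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first == 0:
        if len(data) < 9:
            raise ValueError("truncated input")
        return int.from_bytes(data[1:9], "little"), 9
    # Isolate the lowest set bit; its position is trailing zeros + 1.
    length = (first & -first).bit_length()
    if len(data) < length:
        raise ValueError("truncated input")
    # Shift out the marker bits to recover the value.
    return int.from_bytes(data[:length], "little") >> length, length
```

Under these assumptions, `encode_prefix_varint(128)` yields the two bytes `02 02`, and the decoder learns the total length from the first byte alone, without checking a continuation bit on every byte as LEB128 does.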


Another thing regarding varints is that a signed
integer type is often used to store only positive values,
which makes storing them as a zigzag-encoded sint less
efficient. Protobuf has three types of varints for that
purpose (int, sint, uint), where int is not zigzag
encoded. To make it clearer, it could be named
something like `pint` (positive int). So the three
varint types could be:

     - pint (positive signed int, encodes negative numbers as two’s complement)
     - sint (signed int, uses zigzag encoding)
     - uint (unsigned int)
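
To make the size difference concrete, here is a small Python sketch that compares encoded lengths. It assumes LEB128 as the underlying byte encoding with 64-bit values, and `pint` is only the hypothetical type proposed above; the helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def uleb128_len(value: int) -> int:
    """Number of bytes an unsigned value needs in LEB128 (7 bits per byte)."""
    length = 1
    while value >= 0x80:
        value >>= 7
        length += 1
    return length


def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 63)) & MASK64


def sint_len(n: int) -> int:
    """Encoded size when the value is zigzag encoded first."""
    return uleb128_len(zigzag(n))


def pint_len(n: int) -> int:
    """Encoded size when the two's complement bit pattern is stored as is."""
    return uleb128_len(n & MASK64)
```

With these definitions, the values 64 to 127 take two bytes as sint but one byte as pint, because zigzag doubles every non-negative value. The cost appears on the other side: -1 takes one byte as sint but ten bytes as pint.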

These are just some of my observations. I understand
they might not be a good fit for BARE's vision, which is
fine. I just thought to offer some food for thought.

Best, zig
Details
Message ID
<CAPUJBFJyMuJGX2EEnYV-AMFp=QJmEP-LPXDcv+j5mUKWDDm9EQ@mail.gmail.com>
In-Reply-To
<ad963600-1b86-1968-c83b-e33473f3b4d3@gmail.com> (view parent)
Hello zig, thank you for your feedback.

> Regarding variable-length integer encoding in BARE, there's an
> alternative to (U)LEB128 encoding that is simpler and faster.

Variable-length integer encoding was discussed in the following threads:
- https://lists.sr.ht/~sircmpwn/public-inbox/%3C1dfb75c3-86a3-276a-ca9a-ba9da0df376d%40elvinger.fr%3E
- https://lists.sr.ht/~sircmpwn/public-inbox/%3CCAFFTG-a-Vci%2BkS_d_%3DuaX8kyxszqtB-79pcb5580Z9xw72V5kw%40mail.gmail.com%3E

Please note that changing the encoding would break current
implementations, which is a strong argument against it.

> Another thing regarding varints is that a signed integer type is often
> used to store only positive values, which makes storing them as a
> zigzag-encoded sint less efficient.

I am sorry, but I probably misunderstand. Why would positive integers
be stored as int when there is uint?
Details
Message ID
<f72e66e1-8ae9-f414-8651-16110c5b8249@gmail.com>
In-Reply-To
<CAPUJBFJyMuJGX2EEnYV-AMFp=QJmEP-LPXDcv+j5mUKWDDm9EQ@mail.gmail.com> (view parent)
Hello Jiri, thank you for your reply.

> Please note that changing the encoding would break current
> implementations, which is a strong argument against it.

I agree that breaking current implementations only
because of this wouldn't be reasonable, as it's just a
minor improvement. It would only make sense to consider
this in the event of a new specification with breaking
changes, if that ever happens.

> I am sorry, but I probably misunderstand. Why would positive integers
> be stored as int when there is uint?

Just to avoid casting in user code from ints to uints
and back to ints when interacting with BARE-generated
objects. I don't have real statistics, but I would say
that positive integers stored as ints are probably
extremely common, especially if the value range of the
positive integer is small. This too would be only a
minor improvement.