~arestifo/crystal-cbor

crystal-cbor: Compute the number of object properties to serialize. v1 APPLIED

~karchnu: 3
 Compute the number of object properties to serialize.
 Removing "_ " in diagnostics, since map sizes are right this time.
 Unmapped values are taken into account in the object mapping size.

 3 files changed, 50 insertions(+), 22 deletions(-)
#339211 .build.yml success
crystal-cbor/patches/.build.yml: SUCCESS in 26s

[Compute the number of object properties to serialize.][0] from [~karchnu][1]

[0]: https://lists.sr.ht/~arestifo/crystal-cbor/patches/14897
[1]: mailto:karchnu@karchnu.fr

✓ #339211 SUCCESS crystal-cbor/patches/.build.yml https://builds.sr.ht/~arestifo/job/339211
Hello all.


Just want to say a few things.

First, thanks for the effort, this is a great start for this kind of library.

Second, I'm working on this CBOR library, and its integration into the language has to be almost as smooth as the JSON mapping for my needs.
Let's try to get it working!

Third, this is a first time for me on this platform, I hope you'll be indulgent. :)
My first proposal include only the first commit, which didn't pass the tests.
This proposal fixes everything.
Finally, I saw a few other problems in the library:
- floating values can be trimmed into 16-bit floating values when possible.
  That is what is done in the cbor.me website, this is great to save a few bytes.
- decoding has errors when working with unknown sizes of maps. Hash of hashes fails, for example.
  I put a second "location" in the House example class, and after a to_cbor (formerly producing unbound size maps), the from_cbor crashed.
  As a side-effect, this current proposal actually fixes this case. Still, unknown sizes of maps are a problem.
- unmapped values could include anything as keys. I'm not sure why we don't want to have integers as keys for example.
- use_cbor_discriminator really is important in practice for me. I suppose it is important for a lot of cases.
  So I'll give it a shot.

I hope my patches will be useful.

Thanks!
Hey!

I just wanted to drop-in quickly to say that I don't have much time this
week to look a the code, but I'm very grateful for the contributions!

This weeked I'll give the patches a proper review.
Next
Just to let you know, I don't push new PRs now because I wait for you to read my previous ones, but I implemented CBOR::Any.
It is really useful to have the same API than JSON.

I still didn't merge my code for Float16.
I had more useful stuff to do, but it will surely be ready once we get the other PRs accepted.


After that I will review the current code to check for possible security breaches.
I will use this library for production, and I have to be confident that _at least_ the simplest attacks are mitigated.

Export patchset (mbox)
How do I use this?

Copy & paste the following snippet into your terminal to import this patchset into git:

curl -s https://lists.sr.ht/~arestifo/crystal-cbor/patches/14897/mbox | git am -3
Learn more about email & git

[PATCH crystal-cbor 1/3] Compute the number of object properties to serialize. Export this patch

From: Karchnu <karchnu@karchnu.fr>

---
When the number of object properties to add is pre-computed before
serialization, the serialization is more compact and decoding is a bit
faster.

I took care of nil values that should not be part of the
encoding.

 src/cbor/serializable.cr | 51 +++++++++++++++++++++++++++-------------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/src/cbor/serializable.cr b/src/cbor/serializable.cr
index 7ebd7d2..77d0689 100644
--- a/src/cbor/serializable.cr
+++ b/src/cbor/serializable.cr
@@ -136,6 +136,7 @@ module CBOR
      end

      private def self.new_from_cbor_decoder(decoder : ::CBOR::Decoder)
        # puts "self.new_from_cbor_decoder"
        instance = allocate
        instance.initialize(__decoder_for_cbor_serializable: decoder)
        GC.add_finalizer(instance) if instance.responds_to?(:finalize)
@@ -268,9 +269,27 @@ module CBOR
          {% end %}
        {% end %}

        cbor.object do
          {% for name, value in properties %}
        # Compute the size of the final list of properties to serialize.
        # This allows a more compact encoding, and a faster decoding.
        nb_properties_to_serialize = 0
        {% for name, value in properties %}
            _{{name}} = @{{name}}
          {% unless value[:emit_null] %}
            unless _{{name}}.nil?
              nb_properties_to_serialize += 1
            end
          {% else %}
            nb_properties_to_serialize += 1
          {% end %} # macro unless value[:emit_null]
        {% end %} # macro for properties



        {% if properties.size > 0 %}
          cbor.write_object_start nb_properties_to_serialize

          {% for name, value in properties %}
              _{{name}} = @{{name}}

            {% unless value[:emit_null] %}
              unless _{{name}}.nil?
@@ -279,23 +298,23 @@ module CBOR
              # Write the key of the map
              cbor.write({{value[:key]}})

              {% if value[:converter] %}
                if _{{name}}
                  {{ value[:converter] }}.to_cbor(_{{name}}, cbor)
                else
                  cbor.write(nil, use_undefined: value[:nil_as_undefined])
                end
              {% else %}
                _{{name}}.to_cbor(cbor)
              {% end %}
            {% if value[:converter] %}
              if _{{name}}
                {{ value[:converter] }}.to_cbor(_{{name}}, cbor)
              else
                cbor.write(nil, use_undefined: value[:nil_as_undefined])
              end
            {% else %} # macro if value[:converter]
              _{{name}}.to_cbor(cbor)
            {% end %} # macro if value[:converter]

            {% unless value[:emit_null] %}
              end
            {% end %}
          {% end %}
              end # unless _{{name}}.nil?
            {% end %} # macro unless value[:emit_null]
          {% end %} # macro for properties
          on_to_cbor(cbor)
        end
      {% end %}
        {% end %} # macro if properties.size > 0
      {% end %} # begin
    end

    module Unmapped
-- 
2.26.2
Thanks for your contributions!

I already applied this changes this time around to speed-up the process
given the amount of contributions you submitted and that I fell a bit
behind, I hope to be able to be a bit more responsinve in the next
weeks.

Just a couple of comments that I already addressed:
> + # puts "self.new_from_cbor_decoder"
I don't know if it's on purpose, but please avoid leving commented-out
logs.
> + {% else %} # macro if value[:converter]
> + {% end %} # macro if value[:converter]
> + {% end %} # macro if properties.size > 0
> + {% end %} # begin
I'm personally not a fan of this kind of comments. I understand they
might help in understanidng the code, but I always read the comments and
I find those kind of comments add stutter and distract me from the
"flow".

Thanks again for your contributions!

A version of those 3 patches, with the related adjustements applied has
been pushed as commit 307c57879a539bca9b5d85ec627b9de3eb1af7db

  - Alberto

[PATCH crystal-cbor 2/3] Removing "_ " in diagnostics, since map sizes are right this time. Export this patch

From: Karchnu <karchnu@karchnu.fr>

---
 spec/cbor/serializable_spec.cr | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/spec/cbor/serializable_spec.cr b/spec/cbor/serializable_spec.cr
index ff64361..16e2c8b 100644
--- a/spec/cbor/serializable_spec.cr
+++ b/spec/cbor/serializable_spec.cr
@@ -66,7 +66,7 @@ end

describe CBOR::Serializable do
  describe "rfc examples" do
    describe %(example {_ "a": 1, "b": [_ 2, 3]}) do
    describe %(example {"a": 1, "b": [2, 3]}) do
      it "decodes from cbor" do
        result = ExampleA.from_cbor(Bytes[0xbf, 0x61, 0x61, 0x01, 0x61, 0x62, 0x9f, 0x02, 0x03, 0xff, 0xff])

@@ -75,7 +75,7 @@ describe CBOR::Serializable do
      end
    end

    describe %(example {_ "Fun": true, "Amt": -2}) do
    describe %(example {"Fun": true, "Amt": -2}) do
      it "decodes from cbor" do
        result = ExampleB.from_cbor(Bytes[0xbf, 0x63, 0x46, 0x75, 0x6e, 0xf5, 0x63, 0x41, 0x6d, 0x74, 0x21, 0xff])

@@ -84,7 +84,7 @@ describe CBOR::Serializable do
      end
    end

    describe %(example ["a", {_ "b": "c"}]) do
    describe %(example ["a", {"b": "c"}]) do
      it "decodes from cbor" do
        result = Array(String | ExampleC).from_cbor(Bytes[0x82, 0x61, 0x61, 0xbf, 0x61, 0x62, 0x61, 0x63, 0xff])

@@ -137,7 +137,7 @@ describe CBOR::Serializable do

      it "encodes to CBOR" do
        cbor = House.from_cbor(bytes).to_cbor
        CBOR::Diagnostic.to_s(cbor).should eq(%({_ "address": "Crystal Road 1234", "location": {_ "lat": 12.3, "lng": 34.5}}))
        CBOR::Diagnostic.to_s(cbor).should eq(%({"address": "Crystal Road 1234", "location": {"lat": 12.3, "lng": 34.5}}))
      end
    end

@@ -168,7 +168,7 @@ describe CBOR::Serializable do
      it "encodes to CBOR" do
        cbor = Array(House).from_cbor(bytes).to_cbor

        CBOR::Diagnostic.to_s(cbor).should eq(%([{_ "address": "Crystal Road 1234", "location": {_ "lat": 12.3, "lng": 34.5}}]))
        CBOR::Diagnostic.to_s(cbor).should eq(%([{"address": "Crystal Road 1234", "location": {"lat": 12.3, "lng": 34.5}}]))
      end
    end

@@ -185,7 +185,7 @@ describe CBOR::Serializable do
        res.a.should eq(1)
        res.cbor_unmapped.should eq({"b" => 2})

        CBOR::Diagnostic.to_s(res.to_cbor).should eq(%({_ "a": 1, "b": 2}))
        CBOR::Diagnostic.to_s(res.to_cbor).should eq(%({"a": 1, "b": 2}))
      end
    end
  end
-- 
2.26.2

[PATCH crystal-cbor 3/3] Unmapped values are taken into account in the object mapping size. Export this patch

From: Karchnu <karchnu@karchnu.fr>

---
 src/cbor/serializable.cr | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/cbor/serializable.cr b/src/cbor/serializable.cr
index 77d0689..b1fe894 100644
--- a/src/cbor/serializable.cr
+++ b/src/cbor/serializable.cr
@@ -244,6 +244,10 @@ module CBOR
      raise ::CBOR::SerializationError.new("Unknown CBOR attribute: #{key}", self.class.to_s, nil)
    end

    protected def get_cbor_unmapped
      {} of String => ::CBOR::Type
    end

    protected def on_to_cbor(cbor : ::CBOR::Encoder)
    end

@@ -283,6 +287,7 @@ module CBOR
          {% end %} # macro unless value[:emit_null]
        {% end %} # macro for properties

        nb_properties_to_serialize += get_cbor_unmapped.size


        {% if properties.size > 0 %}
@@ -329,6 +334,10 @@ module CBOR
        end
      end

      protected def get_cbor_unmapped
        cbor_unmapped
      end

      protected def on_to_cbor(cbor : ::CBOR::Encoder)
        cbor_unmapped.each do |key, value|
          cbor.write(key)
-- 
2.26.2
crystal-cbor/patches/.build.yml: SUCCESS in 26s

[Compute the number of object properties to serialize.][0] from [~karchnu][1]

[0]: https://lists.sr.ht/~arestifo/crystal-cbor/patches/14897
[1]: mailto:karchnu@karchnu.fr

✓ #339211 SUCCESS crystal-cbor/patches/.build.yml https://builds.sr.ht/~arestifo/job/339211
Hello all.


Just want to say a few things.

First, thanks for the effort, this is a great start for this kind of library.

Second, I'm working on this CBOR library, and its integration into the language has to be almost as smooth as the JSON mapping for my needs.
Let's try to get it working!

Third, this is a first time for me on this platform, I hope you'll be indulgent. :)
My first proposal include only the first commit, which didn't pass the tests.
This proposal fixes everything.

Finally, I saw a few other problems in the library:
- floating values can be trimmed into 16-bit floating values when possible.
  That is what is done in the cbor.me website, this is great to save a few bytes.
- decoding has errors when working with unknown sizes of maps. Hash of hashes fails, for example.
  I put a second "location" in the House example class, and after a to_cbor (formerly producing unbound size maps), the from_cbor crashed.
  As a side-effect, this current proposal actually fixes this case. Still, unknown sizes of maps are a problem.
- unmapped values could include anything as keys. I'm not sure why we don't want to have integers as keys for example.
- use_cbor_discriminator really is important in practice for me. I suppose it is important for a lot of cases.
  So I'll give it a shot.


I hope my patches will be useful.

Thanks!