~karchnu: 3 Compute the number of object properties to serialize. Removing "_ " in diagnostics, since map sizes are right this time. Unmapped values are taken into account in the object mapping size. 3 files changed, 50 insertions(+), 22 deletions(-)
crystal-cbor/patches/.build.yml: SUCCESS in 26s [Compute the number of object properties to serialize.][0] from [~karchnu][1] [0]: https://lists.sr.ht/~arestifo/crystal-cbor/patches/14897 [1]: mailto:karchnu@karchnu.fr ✓ #339211 SUCCESS crystal-cbor/patches/.build.yml https://builds.sr.ht/~arestifo/job/339211
Hello all. Just want to say a few things. First, thanks for the effort, this is a great start for this kind of library. Second, I'm working on this CBOR library, and its integration into the language has to be almost as smooth as the JSON mapping for my needs. Let's try to get it working! Third, this is a first time for me on this platform, I hope you'll be indulgent. :) My first proposal include only the first commit, which didn't pass the tests. This proposal fixes everything.
I marked the first patch as rejected, so we can keep the conversation going in this patch.going in this patch. There was a problem with from_cbor on serialized objects with to_cbor. Here is a minimal working example of the problem: https://paste.ofcode.org/fjhC7SacFbkqSddxDuMy5Y First, a JSON representation, a CBOR representation after a to_cbor, and finally a CBOR representation after a from_cbor. The last one lacks the "location2" attribute. The current patch solves this by fixing the size of encoded hashes, but decoding unbounded hashes still is a problem. Will try to fix it maybe this weekend.
Finally, I saw a few other problems in the library: - floating values can be trimmed into 16-bit floating values when possible.
16 bit floats have been a huge issue. The main problem is that the language doesn't have any Float16 prmitives (I opened an issue asking if it was possible to implement[0] and they tried, but it's too complicated and not important enough right now) The current implementation[1] took me forever to "get right" and I don't feel right doing the work to transform a 32-big Float into a 16 bit one as well. I'm definetely up for contributions on this front :)I just made a f32_to_f16 function, with an implementation in full Crystal: https://paste.ofcode.org/HuRyTp9Z3uDgRe3PSWUvZP Better have a function now than waiting for the team to get it in standard lib. :) I return an enum to know if there is a problem with the conversion (such as overflow, underflow, infinite, etc.), better than a raise (too slow). I will do a proper patchset in the morning. Still not entirely sure how to put this function into the library, I'll figure it out with enough sleep I guess. The main thing to test is that we don't lose information, otherwise we have to stick to the 32-bit format.[0]: https://github.com/crystal-lang/crystal/issues/9172 [1]: https://git.sr.ht/~arestifo/crystal-cbor/tree/master/src/cbor/type/float_16.cr
That is what is done in the cbor.me website, this is great to save a few bytes. - decoding has errors when working with unknown sizes of maps. Hash of hashes fails, for example. I put a second "location" in the House example class, and after a to_cbor (formerly producing unbound size maps), the from_cbor crashed. As a side-effect, this current proposal actually fixes this case. Still, unknown sizes of maps are a problem.
I'll have to look into this deeper, supposedly there are some tests to ensure such maps work.
- unmapped values could include anything as keys. I'm not sure why we don't want to have integers as keys for example.
This is supposed to work, for example there are some tests that use UInts for the keys[0] [0]: https://git.sr.ht/~arestifo/crystal-cbor/tree/master/spec/cbor/from_cbor_spec.cr#L33-34
- use_cbor_discriminator really is important in practice for me. I suppose it is important for a lot of cases. So I'll give it a shot.
Many thanks! You're not the first one asking for this feature! I'll review your patches during the weekend with more calm :)Great!- Alberto
I hope my patches will be useful. Thanks!
Hey! I just wanted to drop-in quickly to say that I don't have much time this week to look a the code, but I'm very grateful for the contributions! This weeked I'll give the patches a proper review.
Just to let you know, I don't push new PRs now because I wait for you to read my previous ones, but I implemented CBOR::Any. It is really useful to have the same API than JSON.
Sorry for the delay and thank you so much for your contributions!
I still didn't merge my code for Float16. I had more useful stuff to do, but it will surely be ready once we get the other PRs accepted. After that I will review the current code to check for possible security breaches. I will use this library for production, and I have to be confident that _at least_ the simplest attacks are mitigated.
I think this library would be a good candidate for some fuzzy testing. However I've never done it and I'm not sure I have the time to get into it right now. - Alberto
Copy & paste the following snippet into your terminal to import this patchset into git:
curl -s https://lists.sr.ht/~arestifo/crystal-cbor/patches/14897/mbox | git am -3Learn more about email & git
From: Karchnu <karchnu@karchnu.fr> --- When the number of object properties to add is pre-computed before serialization, the serialization is more compact and decoding is a bit faster. I took care of nil values that should not be part of the encoding. src/cbor/serializable.cr | 51 +++++++++++++++++++++++++++------------- 1 file changed, 35 insertions(+), 16 deletions(-) diff --git a/src/cbor/serializable.cr b/src/cbor/serializable.cr index 7ebd7d2..77d0689 100644 --- a/src/cbor/serializable.cr +++ b/src/cbor/serializable.cr @@ -136,6 +136,7 @@ module CBOR end private def self.new_from_cbor_decoder(decoder : ::CBOR::Decoder) + # puts "self.new_from_cbor_decoder" instance = allocate instance.initialize(__decoder_for_cbor_serializable: decoder) GC.add_finalizer(instance) if instance.responds_to?(:finalize) @@ -268,9 +269,27 @@ module CBOR {% end %} {% end %} - cbor.object do - {% for name, value in properties %} + # Compute the size of the final list of properties to serialize. + # This allows a more compact encoding, and a faster decoding. + nb_properties_to_serialize = 0 + {% for name, value in properties %} _{{name}} = @{{name}} + {% unless value[:emit_null] %} + unless _{{name}}.nil? + nb_properties_to_serialize += 1 + end + {% else %} + nb_properties_to_serialize += 1 + {% end %} # macro unless value[:emit_null] + {% end %} # macro for properties + + + + {% if properties.size > 0 %} + cbor.write_object_start nb_properties_to_serialize + + {% for name, value in properties %} + _{{name}} = @{{name}} {% unless value[:emit_null] %} unless _{{name}}.nil? @@ -279,23 +298,23 @@ module CBOR # Write the key of the map cbor.write({{value[:key]}}) - {% if value[:converter] %} - if _{{name}} - {{ value[:converter] }}.to_cbor(_{{name}}, cbor) - else - cbor.write(nil, use_undefined: value[:nil_as_undefined]) - end - {% else %} - _{{name}}.to_cbor(cbor) - {% end %} + {% if value[:converter] %} + if _{{name}} + {{ value[:converter] }}.to_cbor(_{{name}}, cbor) + else + cbor.write(nil, use_undefined: value[:nil_as_undefined]) + end + {% else %} # macro if value[:converter] + _{{name}}.to_cbor(cbor) + {% end %} # macro if value[:converter] {% unless value[:emit_null] %} - end - {% end %} - {% end %} + end # unless _{{name}}.nil? + {% end %} # macro unless value[:emit_null] + {% end %} # macro for properties on_to_cbor(cbor) - end - {% end %} + {% end %} # macro if properties.size > 0 + {% end %} # begin end module Unmapped -- 2.26.2
Thanks for your contributions! I already applied this changes this time around to speed-up the process given the amount of contributions you submitted and that I fell a bit behind, I hope to be able to be a bit more responsinve in the next weeks. Just a couple of comments that I already addressed: > + # puts "self.new_from_cbor_decoder" I don't know if it's on purpose, but please avoid leving commented-out logs. > + {% else %} # macro if value[:converter] > + {% end %} # macro if value[:converter] > + {% end %} # macro if properties.size > 0 > + {% end %} # begin I'm personally not a fan of this kind of comments. I understand they might help in understanidng the code, but I always read the comments and I find those kind of comments add stutter and distract me from the "flow". Thanks again for your contributions! A version of those 3 patches, with the related adjustements applied has been pushed as commit 307c57879a539bca9b5d85ec627b9de3eb1af7db - Alberto
From: Karchnu <karchnu@karchnu.fr> --- spec/cbor/serializable_spec.cr | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/spec/cbor/serializable_spec.cr b/spec/cbor/serializable_spec.cr index ff64361..16e2c8b 100644 --- a/spec/cbor/serializable_spec.cr +++ b/spec/cbor/serializable_spec.cr @@ -66,7 +66,7 @@ end describe CBOR::Serializable do describe "rfc examples" do - describe %(example {_ "a": 1, "b": [_ 2, 3]}) do + describe %(example {"a": 1, "b": [2, 3]}) do it "decodes from cbor" do result = ExampleA.from_cbor(Bytes[0xbf, 0x61, 0x61, 0x01, 0x61, 0x62, 0x9f, 0x02, 0x03, 0xff, 0xff]) @@ -75,7 +75,7 @@ describe CBOR::Serializable do end end - describe %(example {_ "Fun": true, "Amt": -2}) do + describe %(example {"Fun": true, "Amt": -2}) do it "decodes from cbor" do result = ExampleB.from_cbor(Bytes[0xbf, 0x63, 0x46, 0x75, 0x6e, 0xf5, 0x63, 0x41, 0x6d, 0x74, 0x21, 0xff]) @@ -84,7 +84,7 @@ describe CBOR::Serializable do end end - describe %(example ["a", {_ "b": "c"}]) do + describe %(example ["a", {"b": "c"}]) do it "decodes from cbor" do result = Array(String | ExampleC).from_cbor(Bytes[0x82, 0x61, 0x61, 0xbf, 0x61, 0x62, 0x61, 0x63, 0xff]) @@ -137,7 +137,7 @@ describe CBOR::Serializable do it "encodes to CBOR" do cbor = House.from_cbor(bytes).to_cbor - CBOR::Diagnostic.to_s(cbor).should eq(%({_ "address": "Crystal Road 1234", "location": {_ "lat": 12.3, "lng": 34.5}})) + CBOR::Diagnostic.to_s(cbor).should eq(%({"address": "Crystal Road 1234", "location": {"lat": 12.3, "lng": 34.5}})) end end @@ -168,7 +168,7 @@ describe CBOR::Serializable do it "encodes to CBOR" do cbor = Array(House).from_cbor(bytes).to_cbor - CBOR::Diagnostic.to_s(cbor).should eq(%([{_ "address": "Crystal Road 1234", "location": {_ "lat": 12.3, "lng": 34.5}}])) + CBOR::Diagnostic.to_s(cbor).should eq(%([{"address": "Crystal Road 1234", "location": {"lat": 12.3, "lng": 34.5}}])) end end @@ -185,7 +185,7 @@ describe CBOR::Serializable do res.a.should eq(1) res.cbor_unmapped.should eq({"b" => 2}) - CBOR::Diagnostic.to_s(res.to_cbor).should eq(%({_ "a": 1, "b": 2})) + CBOR::Diagnostic.to_s(res.to_cbor).should eq(%({"a": 1, "b": 2})) end end end -- 2.26.2
From: Karchnu <karchnu@karchnu.fr> --- src/cbor/serializable.cr | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/cbor/serializable.cr b/src/cbor/serializable.cr index 77d0689..b1fe894 100644 --- a/src/cbor/serializable.cr +++ b/src/cbor/serializable.cr @@ -244,6 +244,10 @@ module CBOR raise ::CBOR::SerializationError.new("Unknown CBOR attribute: #{key}", self.class.to_s, nil) end + protected def get_cbor_unmapped + {} of String => ::CBOR::Type + end + protected def on_to_cbor(cbor : ::CBOR::Encoder) end @@ -283,6 +287,7 @@ module CBOR {% end %} # macro unless value[:emit_null] {% end %} # macro for properties + nb_properties_to_serialize += get_cbor_unmapped.size {% if properties.size > 0 %} @@ -329,6 +334,10 @@ module CBOR end end + protected def get_cbor_unmapped + cbor_unmapped + end + protected def on_to_cbor(cbor : ::CBOR::Encoder) cbor_unmapped.each do |key, value| cbor.write(key) -- 2.26.2
builds.sr.ht <builds@sr.ht>crystal-cbor/patches/.build.yml: SUCCESS in 26s [Compute the number of object properties to serialize.][0] from [~karchnu][1] [0]: https://lists.sr.ht/~arestifo/crystal-cbor/patches/14897 [1]: mailto:karchnu@karchnu.fr ✓ #339211 SUCCESS crystal-cbor/patches/.build.yml https://builds.sr.ht/~arestifo/job/339211
Hello all. Just want to say a few things. First, thanks for the effort, this is a great start for this kind of library. Second, I'm working on this CBOR library, and its integration into the language has to be almost as smooth as the JSON mapping for my needs. Let's try to get it working! Third, this is a first time for me on this platform, I hope you'll be indulgent. :) My first proposal include only the first commit, which didn't pass the tests. This proposal fixes everything. Finally, I saw a few other problems in the library: - floating values can be trimmed into 16-bit floating values when possible. That is what is done in the cbor.me website, this is great to save a few bytes. - decoding has errors when working with unknown sizes of maps. Hash of hashes fails, for example. I put a second "location" in the House example class, and after a to_cbor (formerly producing unbound size maps), the from_cbor crashed. As a side-effect, this current proposal actually fixes this case. Still, unknown sizes of maps are a problem. - unmapped values could include anything as keys. I'm not sure why we don't want to have integers as keys for example. - use_cbor_discriminator really is important in practice for me. I suppose it is important for a lot of cases. So I'll give it a shot. I hope my patches will be useful. Thanks!