#jackdaw
2019-08-01
cddr 18:08:56

Hey @sam.brauer, another thought I had about this PR https://github.com/FundingCircle/jackdaw/pull/174.... You mention that the change is required to make it easier for consumers to deal with unions. It occurred to me a while ago that another way of implementing the serde (at least the deserialization side) would be to generate the JSON representation of the Avro message, and then just (json/parse-string) that. The JSON representation of an Avro message includes tags to indicate which branch of a union is selected. So e.g. with a schema like

{:type   "record",
 :name   "myrecord",
 :fields [{:name "myunion",
           :type [{:type   "record",
                   :name   "recordtype1",
                   :fields [{:name "field1", :type "string"}]}
                  {:type   "record",
                   :name   "recordtype2",
                   :fields [{:name "field2", :type "string"}]}]}]}
the message you'd get would look something like
{"myunion" {"recordtype1" {"field1" "foo"}}}
or
{"myunion" {"recordtype2" {"field2" "bar"}}}
This would have the benefit of making the code you write to consume these messages more interoperable with the standard kafka-avro-console-consumer, which prints the messages in this way. I don't think we could change the existing avro serde to do this (or at least if we did, we'd have to make it something you'd opt in to), but there's nothing to stop you from making a "better-avro-serde" package that provides a way to create a Serde instance, and then just setting the :value-serde in your topic-config to one of those. Food for thought anyway....
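
For illustration, consumer code for the tagged form could dispatch on the single key under "myunion". This is only a rough sketch: `union-tag` and `handle-myunion` are hypothetical names (not part of jackdaw), and it assumes the message has already been parsed into a map with string keys, as in the examples above.

```clojure
;; Sketch only: `union-tag` and `handle-myunion` are hypothetical names.
;; Assumes the message was already parsed into a map with string keys.

(defn union-tag
  "Returns the branch name of a tagged union value, e.g.
  {\"recordtype1\" {...}} -> \"recordtype1\". Returns nil for a nil value."
  [tagged-value]
  (some-> tagged-value keys first))

;; Dispatch on whichever branch tag is present under "myunion".
(defmulti handle-myunion
  (fn [message] (union-tag (get message "myunion"))))

(defmethod handle-myunion "recordtype1" [message]
  (str "type1: " (get-in message ["myunion" "recordtype1" "field1"])))

(defmethod handle-myunion "recordtype2" [message]
  (str "type2: " (get-in message ["myunion" "recordtype2" "field2"])))

;; Anything else (including a missing or nil union) is treated as an error.
(defmethod handle-myunion :default [message]
  (throw (ex-info "Unrecognised union branch" {:message message})))
```

With the two example messages, `(handle-myunion {"myunion" {"recordtype1" {"field1" "foo"}}})` would take the "recordtype1" branch and the second message the "recordtype2" branch. If the schema had a namespace, the tag would presumably be the full name of the record rather than the short one.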

sbrauer 18:08:46

Interesting idea to chew on. I'm familiar with that type-tagged style that kafka-avro-console-consumer and some other tools use for union branches. Certainly don't want to change the existing avro serde to output that format (by default), but perhaps it could make sense as an option like you said.

sbrauer 19:08:27

The idea in my PR was just a proposal. I'm not dead set on it. So far when I've had a union of records I've sort of sniffed out which type I have in a specific message by checking for a field that I'd only expect in a specific branch record (so in your example I would check for :field1 or :field2). A heavier (but perhaps better) approach might be to conform to a spec. However I found myself wishing that the record name was "just" available in the value, thus my initial PR.
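
To make the two approaches concrete, here is a minimal sketch using the example schema from earlier. The names `sniff-branch` and `::myunion` are made up for illustration, and it assumes the deserialized value is a plain map with keyword keys and that clojure.spec.alpha (Clojure 1.9+) is available.

```clojure
(require '[clojure.spec.alpha :as s])

;; Approach 1: sniff the branch by checking for a field that only
;; one branch record is expected to have.
(defn sniff-branch [value]
  (cond
    (contains? value :field1) :recordtype1
    (contains? value :field2) :recordtype2
    :else                     :unknown))

;; Approach 2: conform to a spec. s/or tags the result with the
;; name of the first branch that matches.
(s/def ::field1 string?)
(s/def ::field2 string?)
(s/def ::recordtype1 (s/keys :req-un [::field1]))
(s/def ::recordtype2 (s/keys :req-un [::field2]))
(s/def ::myunion (s/or :recordtype1 ::recordtype1
                       :recordtype2 ::recordtype2))

(sniff-branch {:field1 "foo"})        ;; => :recordtype1
(s/conform ::myunion {:field2 "bar"}) ;; => [:recordtype2 {:field2 "bar"}]
```

Both versions have to infer the branch from the shape of the data, so they get ambiguous if two branch records share the same fields. That is the case where having the record name available in the value would help.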

sbrauer 19:08:01

No hard feelings if you'd prefer I close the PR.

cddr 19:08:49

I’d keep it open for now. I need to check with my colleagues.

sbrauer 19:08:47

Sounds good. Thanks for reaching out.