This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-08-15
I'm generating an Avro schema from a SHACL model, and an interesting question has come up: do I choose `null` or `[]` to represent absence?
To get to the point: say I have a resource `Sensor`, which can have zero or more `measurement`s. I see two ways to encode "there are no measurements": 1. I make its type optional and use `null`, or 2. I use the empty array `[]` to do that.
(Note that I've assumed a closed world here. Under the open world assumption there would've been a clear distinction: `[]` encodes the statement "there are no measurements", and `null` encodes "I don't know of any measurements".)
I was wondering if others have input here: am I correct in seeing semantic equivalence there under the closed world assumption? And if so, what reasons would I have to choose either option? Thanks!
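For concreteness, the two encodings in question could look roughly like this, written as Python dicts that serialize to Avro's JSON schema format. This is only a sketch: the `double` item type and the helper `sensor_schema` are assumptions made for illustration, not part of the generator being discussed.

```python
import json

# Option 1: nullable array. "No measurements" is encoded as null.
# In an Avro union, the default must match the first branch, so "null" goes first.
nullable_field = {
    "name": "measurements",
    "type": ["null", {"type": "array", "items": "double"}],
    "default": None,
}

# Option 2: plain array. "No measurements" is encoded as [].
array_field = {
    "name": "measurements",
    "type": {"type": "array", "items": "double"},
    "default": [],
}


def sensor_schema(field):
    """Wrap a single field in a hypothetical Sensor record schema."""
    return {"type": "record", "name": "Sensor", "fields": [field]}


if __name__ == "__main__":
    print(json.dumps(sensor_schema(nullable_field), indent=2))
    print(json.dumps(sensor_schema(array_field), indent=2))
```

Note that option 1 still allows `[]` as a value, so it ends up with two ways to say "zero measurements".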
Yes, I believe you have semantic equivalence.
To do it in an open world you’d need a collection, or an explicit “not exists” value to assert.
To do it in a closed world, you can just make it optional, which is what an Avro `null` does. The empty collection is OK, since being explicit like that means the same thing under both assumptions.
Though, I should check in to make sure… you said that you want to “represent absence”. Are you representing that the value is unset, or are you representing that the value is unknown? I’m presuming the former, but I shouldn’t make assumptions 🙂
Okay, glad to see my reasoning verified at least 🙂.
Regarding your question: good to check that, particularly since choosing the words "represent absence" was the hardest part of phrasing the question 😄. Anyways, I'm not sure I understand the difference between the two interpretations you suggest (within a closed world that is). Could you elaborate? To add a little context that might help: the code I'm writing aims to be a general generator of Avro schemas from a given SHACL model. That means I'm trying to map SHACL concepts onto Avro ones in a meaningful way (SHACL is of course more expressive), but also a practical way (the Avro schemas are going to be actually used, so things like schema evolution matter to the point where I might sacrifice some "transformational purity" if that makes sense)
My point is: all I get from a SHACL model is cardinality constraints, and all I have in Avro is the possibility of making a type nullable or not. At that degree of generality I have to make a choice whether I map SHACL's `minCardinality = 0; maxCardinality > 1` to a nullable array or not.
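The choice being described can be sketched as a small mapping function. Everything here is hypothetical: the name `avro_type_for`, the `nullable_arrays` switch, and the use of `None` for "no upper bound" are assumptions for illustration. The parameters `min_count` and `max_count` stand in for the cardinality constraints (SHACL's `sh:minCount` and `sh:maxCount`).

```python
def avro_type_for(item_type, min_count=0, max_count=None, nullable_arrays=False):
    """Map a cardinality constraint onto an Avro type (as JSON-like Python data).

    max_count=None means "no upper bound".
    nullable_arrays picks between the two encodings of an empty multi-valued property.
    """
    if max_count == 1:
        # Single-valued property: optional becomes a union with null.
        return item_type if min_count >= 1 else ["null", item_type]

    # Multi-valued property. Avro arrays carry no length constraints,
    # so min_count >= 1 cannot be expressed in the schema itself.
    array = {"type": "array", "items": item_type}
    if min_count == 0 and nullable_arrays:
        return ["null", array]
    return array


if __name__ == "__main__":
    print(avro_type_for("double", min_count=0, max_count=None))
    print(avro_type_for("double", min_count=0, max_count=None, nullable_arrays=True))
    print(avro_type_for("string", min_count=0, max_count=1))
```

With `nullable_arrays=False`, the multi-valued case always yields a plain array, and zero values can only be written as `[]`.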
What I meant was basically 3 types of value: • a value, such as a number. • a null, to indicate that there is no value. • an “unknown” to indicate that the value is not known. We have a surprising number of these with medical data. For instance, if a thumb is broken, then the “laterality” property will be either left, or right, or unknown, but it can’t be null, since one of the 2 thumbs was broken. But the location of a neoplasm can have no laterality, if it is, for instance, in the center of the chest.
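As a rough sketch of those three kinds of value in Avro terms, one option is a nullable enum with an explicit unknown symbol. The field name `laterality`, the symbol names, and the `describe` helper are assumptions for illustration, not taken from any real medical schema.

```python
# An enum with an explicit UNKNOWN symbol, wrapped in a union with null.
laterality_enum = {
    "type": "enum",
    "name": "Laterality",
    "symbols": ["LEFT", "RIGHT", "UNKNOWN"],
}
laterality_field = {
    "name": "laterality",
    "type": ["null", laterality_enum],
    "default": None,
}


def describe(laterality):
    """Spell out what each of the three kinds of value asserts."""
    if laterality is None:
        return "no laterality applies"
    if laterality == "UNKNOWN":
        return "a laterality exists but is not known"
    return f"laterality is {laterality.lower()}"


if __name__ == "__main__":
    print(describe("LEFT"))     # broken thumb, side recorded
    print(describe("UNKNOWN"))  # broken thumb, side not recorded
    print(describe(None))       # neoplasm in the center of the chest
```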
If you have an array, then I wouldn’t make it nullable, but that’s a personal preference. Clojure is great for treating empty seqables as nil, but only if you wrap them in `seq`. But in the non-Clojure world, you often need to explicitly check for null before you’re allowed to look at a collection, which is annoying, since you need separate code for that.
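A minimal sketch of that extra code on the consumer side, using Python as a stand-in for the "non-Clojure world". The record layout (a dict with a `measurements` key) and both function names are assumptions for illustration.

```python
def count_measurements_nullable(sensor):
    """Consumer of the nullable-array encoding: needs a separate null branch."""
    measurements = sensor["measurements"]
    if measurements is None:  # extra check required before touching the collection
        return 0
    return len(measurements)


def count_measurements_array(sensor):
    """Consumer of the plain-array encoding: the empty case needs no special code."""
    return len(sensor["measurements"])


if __name__ == "__main__":
    print(count_measurements_nullable({"measurements": None}))
    print(count_measurements_array({"measurements": []}))
    print(count_measurements_array({"measurements": [1.5, 2.0]}))
```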
I love that example
Those kinds of practical consequences are the ones I'm looking for. That's a good argument for not making it nullable indeed
Well, some people will say that the model was incomplete and should have included “center” as an option. But in the real world ALL models are incomplete 🙂
I’ve always marveled at how when I try to describe modeling to people via a simple example from the real world, someone will ALWAYS find exceptions to the model
Haha yes that sounds like a familiar wall to bump into
I let it sink in and yes, [] indeed seems the more practical choice here :). An extra reason I came up with is: there's only one value to express a cardinality of zero then. I like having fewer choices, keeps things simple ;). Thanks for the input