#clojure
2021-12-07
wombawomba09:12:53

I need to generate a Java class that holds some Clojure state, with annotated public fields and methods. How can I go about this?

🦗 1
Colin P. Hill12:12:06

There is probably some reason you're not doing this, but may as well ask for the sake of better understanding your needs: if it needs to be a Java class and have a lot of Java stuff going on, why not write it in Java?

wombawomba13:12:54

I'm generating it from Clojure data

wombawomba13:12:51

basically I'm setting up bindings so I can call my code from other languages via GraalVM's polyglot features, and GraalVM requires annotations

wombawomba13:12:13

@UCHFND0T0 yeah I've actually set this up using gen-class now, and it works for methods, but AFAICT gen-class won't let me create fields

emccue15:12:31

maybe look into the ASM ecosystem?

emccue15:12:23

you can also dynamically make gen-class calls and have them work in AOT

wombawomba15:12:39

Hmm yeah, that seems like it might be a bit overkill for this use-case though? I don't really need control at the bytecode level.

ryan19:12:33

Haven't used it, but https://github.com/jgpc42/insn looks like it might do what you need. It supports annotations on fields: https://github.com/jgpc42/insn/wiki/Annotations

👀 1
Pepijn de Vos12:12:27

Hm, would zippers be useful to represent an undo/redo structure? Like, if you undo/redo you could just change the location in the tree, but the tricky bit is if you do a new action, it's wrong to redo actions on top of that. The "simple" way would be to lop off the redo part, which doesn't seem to be an operation zippers have. The fancy way would be to actually have an undo tree where a new action on top of a state just creates a new branch. In this case up/down is undo/redo but... a state would be a leaf node and you can't add a state to a leaf node. So... it seems like it would be a useful tool but I can't quite figure out how to best apply it, or if it'd be easier to just keep track of stuff yourself.

Pepijn de Vos13:12:30

Figured it out I think

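(require '[clojure.zip :as zip])

;; Each node is a list: child branches (newest first) followed by that node's state.
;; The root holds only branches. States must not be seqs, since seq-zip treats seqs as branches.
(defonce undotree (atom (zip/seq-zip ())))

;; Add the new state as a fresh branch under the current node and move into it.
(defn newdo [ut state]
  (swap! ut #(-> %
                (zip/insert-child (list state))
                zip/down)))

;; Move up to the parent state; stay put at the root.
(defn undo [ut]
  (swap! ut #(or (zip/up %) %)))

;; Move down into the newest branch; stay put if there is none.
(defn redo [ut]
  (swap! ut #(let [d (zip/down %)]
               (if (and d (zip/branch? d)) d %))))

;; The state is the rightmost element of the current node.
(defn undo-state [ut]
  (-> @ut zip/down zip/rightmost zip/node))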

Colin P. Hill13:12:35

Trying to wrap my head around a tricky case of the Clojure/Hickey philosophy on optionality in specs. I'm watching "Maybe Not", and just got to https://youtu.be/YR5WdGrpoug?t=1684 where Rich is talking about how (e.g.) a name is never a "Maybe String" – rather, it's simply a String and you either know it or you don't. But it seems to me there are cases where you need to distinguish between accidental absence (I don't have it) and essential absence (it doesn't exist), and it is in these latter cases I usually reach for optionals in statically typed languages. I'd appreciate help understanding the idiomatic way to grapple with that in Clojure. Example scenario in thread.

Colin P. Hill13:12:49

Suppose we work at a company with in-house customer relationship management software written in a full Clojure stack. When a customer speaks with a CSR, one of the first things that happens is that the software prompts the CSR to ask the customer for any pieces of information we're missing – mailing address, phone number, etc. One day, a customer calls in and the software tells the CSR we don't know their last name. The CSR asks, and the customer replies, "Oh, I don't have a last name. My name is just Cher, first name only." The CSR leaves that field blank and submits the form. The system sees the blank field and says, "I guess we still don't know it," and simply omits the ::last-name key in the map that gets passed down the stack to our database. A week later, Cher calls in again with another problem. Once again, the CSR is prompted to ask her for her last name. Now Cher is annoyed: "I told you last week, I don't have a last name. Didn't you put that into your system?" This CSR is savvy and recognizes that this is a shortcoming in the software. You receive a feature request: we need to distinguish between "I don't know this person's last name" and "I know for certain that this person does not have a last name". How do you represent this distinction in your maps? How do you spec it? Is this not well and truly a "Maybe String"?

Pepijn de Vos13:12:19

Isn't that the same distinction as a key not being there and the value being nil?

Pepijn de Vos13:12:41

That's what I'd do anyway: set the last name to nil if it is explicitly not there. But yeah, a nullable thing is kind of a poor man's Maybe

Colin P. Hill14:12:12

I would think so – and I've specced it that way – but in the talk I linked, Rich seems to be opposed to that. At the timestamp I linked it at, the slide has this line of code:

(spec/def ::model (spec/nilable string?))
and it's crossed out! Afaict, he thinks this is an anti-pattern, and he explicitly says that nothing is a "Maybe String". Idk what the alternative is, though.
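For reference, a minimal sketch of the two modelings being contrasted here, assuming `clojure.spec.alpha` (Clojure 1.9+). The `::customer` and `::nilable-last-name` specs and the sample maps are illustrative assumptions, not taken from the talk. With a plain `string?` attribute, optionality lives in `s/keys` and an explicit nil is rejected; with `s/nilable`, nil is itself a valid value.

```clojure
(require '[clojure.spec.alpha :as s])

;; Modeling 1: the attribute is just a string; optionality lives in s/keys.
(s/def ::first-name string?)
(s/def ::last-name string?)
(s/def ::customer (s/keys :req [::first-name] :opt [::last-name]))

(s/valid? ::customer {::first-name "Cher"})                  ; => true  (key omitted)
(s/valid? ::customer {::first-name "Cher" ::last-name nil})  ; => false (nil is not a string)

;; Modeling 2, in the style of the crossed-out slide: the attribute itself is nilable.
(s/def ::nilable-last-name (s/nilable string?))

(s/valid? ::nilable-last-name nil)         ; => true
(s/valid? ::nilable-last-name "Lovelace")  ; => true
```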

Pepijn de Vos14:12:13

IIRC spec has a special thing for nullable fields by the way

delaguardo14:12:29

if this system defines a domain where the customer model might not have ::last-name, then it's unlikely you will get validation exceptions when the key is missing. The model simply defines a set of facts about every customer, like {::first-name "Foo"} or {::last-name "Bar"}. Both are valid examples from my point of view, as long as those keys are not used as primary indices

Colin P. Hill14:12:56

@U04V4KLKC I'm not sure you quite understood the scenario. There is no validation exception. The problem is that we need to distinguish between accidental absence and essential absence, and I'm struggling to figure out how to fit that into the philosophy of specs that Rich is articulating.

Jelle Licht14:12:12

Isn't the “we don’t have a last name, please ask the customer”-part the problem? There seems to be an implicit understanding that you should have a last name

delaguardo14:12:14

i got "Once again, the CSR is prompted to ask her for her last name." as "validation" exception.

Colin P. Hill14:12:00

@U0190EXLXU1 Yes, exactly. There's a problem in our data model. How do we resolve it while staying within the bounds of the philosophy that Rich endorses?

Jelle Licht14:12:11

Just.. don’t prompt the CSR to ask for last names in the first place?

Jelle Licht14:12:31

Perhaps I’m fixating on the (simplified) example 😅

Colin P. Hill14:12:11

You suggest this and the product owner shoots you down: "No, we often have missing information, and we need to try to fill it in when possible. Can't you just make it so that the software understands the difference between 'I don't know this information' and 'this information doesn't exist'?"

Pepijn de Vos14:12:48

Well you know... maybe really smart people who design systems from a hammock might miss a few real world use cases, so I say if in your system the distinction between types of missing information is important, encode that information. That it's an antipattern in most cases doesn't mean nilable is illegal; it exists to be used.

Colin P. Hill14:12:29

It would be easy enough for me to simply reject what Rich is trying to explain and embrace s/nilable or Maybe or whatever the equivalent is in any given language, but then I'd miss out on understanding this unfamiliar perspective.

Pepijn de Vos14:12:38

IMO the antipattern is setting a field to nil when it could just as well be absent. But nilable is right there for you if nil is a useful value in your domain.

Colin P. Hill14:12:18

"nothing is a maybe string" – this is something he said after a long time working in real-world systems where surely he's encountered problems shaped like this

delaguardo14:12:55

or simply use empty string for last name 😉

Jelle Licht14:12:29

I would still not use nil for that though, but some random ::tickle-me-no-last-name . I don’t think there’s anything wrong with following advice only when it works for you :-)

Colin P. Hill14:12:07

@U04V4KLKC Ha, suppose we could do that. That feels Go-y in a bad way, imo. I guess it could work, but that makes it feel like her name is "Cher [moment of silence]", you know?

Pepijn de Vos14:12:16

Hehe, in Julia nil is called nothing, making his statement ambiguous: "a nothing value is a maybe string" or "there exist no maybe strings"

Colin P. Hill14:12:29

@U0190EXLXU1 For sure, and until I change my mind I'll continue rejecting this advice and doing things as I've always done. But I know that there's something here I fundamentally don't understand. I'm watching the talk to learn a new way to think about software, not to evaluate it against my established ways of thinking.

💯 1
delaguardo14:12:35

@U029J729MUP true, that might feel like a wrong idea, but at the same time you are not mixing different data types in the same field. For me string and nil never go together because they can't represent each other (compare with empty sequence and nil as an example).

Jelle Licht14:12:43

@U029J729MUP how do you distinguish between not-there and not-existing values right now? For map structures, this seems to make them kind-of-useless if using nil

Colin P. Hill14:12:13

> For me string and nil never go together because they can't represent each other
@U04V4KLKC Isn't that the merit of using an explicit nil? It does not represent any string, and instead represents exactly the case of there being no string whatsoever.

Colin P. Hill14:12:09

@U0190EXLXU1 In cases where I need to model a distinction between "I don't have it" and "it doesn't exist", usually I represent the former as an omitted field and the latter as a field with a value of nil
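A minimal sketch of that convention, assuming the omitted-key / nil-value encoding described above; `last-name-status` and the returned keywords are hypothetical names chosen for illustration. `contains?` is what separates the two kinds of absence, since a plain lookup returns nil for both.

```clojure
(defn last-name-status
  "Classifies the ::last-name entry of a customer map (hypothetical helper)."
  [customer]
  (cond
    (not (contains? customer ::last-name)) :unknown         ; key omitted: we never learned it
    (nil? (::last-name customer))          :does-not-exist  ; key present with nil: known absence
    :else                                  :known))

(last-name-status {::first-name "Cher"})                       ; => :unknown
(last-name-status {::first-name "Cher" ::last-name nil})       ; => :does-not-exist
(last-name-status {::first-name "Ada" ::last-name "Lovelace"}) ; => :known
```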

delaguardo14:12:01

and then you will be forced to respect the nil-ness of the data when sending the data to the DB or printing.

Colin P. Hill14:12:42

Could use a special sentinel value like a keyword, but that feels philosophically the same – it's a nil by another name, and still violates the stricture of "nothing is a maybe string"

Colin P. Hill14:12:40

Maybe Rich would not see it that way, but then I'd struggle to understand where he sees the distinction

delaguardo14:12:39

my personal takeaway from the talk is "data should not lie". A field ::last-name should be associated with something that the application recognises as a last name and nothing else, nil included.

Colin P. Hill14:12:58

So then how would you solve the problem in the scenario?

delaguardo14:12:21

(s/def ::first-name string?)
(s/def ::last-name string?)
(s/def ::customer (s/keys :req [::first-name ::last-name]))

Colin P. Hill14:12:19

Okay, but we have to figure out a concrete map to pass to the backend where we're somehow informing it that the ::last-name is not simply missing, but is absent by design. We don't want Cher to be asked for it every time she calls in just because our system doesn't understand the difference between "the customer never told us this" and "the customer told us that it's nil"

delaguardo14:12:07

sorry, I corrected the spec. it should have :req instead of :opt

delaguardo14:12:47

both fields are required and must be strings. Absence of either of them is represented by an empty string.

Colin P. Hill14:12:28

Hrm. That does strictly speaking satisfy the principles Rich is articulating, but it feels like it violates other ones. If that's really The Hickeyan Way to do it, then maybe I simply disagree with it. But I would be really surprised by that. Surely an empty string is a specific concrete value, not a placeholder representing the absence of a value?

delaguardo14:12:35

> "the customer never told us this" and "the customer told us that it's nil"
I don't think the system should know the difference.

Colin P. Hill14:12:49

The system must know the difference. The customers are complaining because it doesn't.

delaguardo14:12:29

not trying to claim "this is Rich Hickey way" 🙂 that is exclusively my understanding

Colin P. Hill14:12:10

Oh sure. Rich isn't here, so other folks' interpretations are all I have. Yours seems compatible with what he's saying, so it's the best candidate I have so far for how to understand the talk.

delaguardo14:12:16

> The system must know the difference. The customers are complaining because it doesn't.
may I ask how they could discover that system doesn't know the difference?

ghadi14:12:32

haven't read the whole scrollback, but if Cher doesn't have a last name, and the login system requires a first & last name, then Cher can't login

ghadi14:12:15

making a Maybe is like making an empty cell in the database. the info is either there or not

ghadi14:12:50

the solution isn't 'better' types, it's making the login system robust to Cher, Madonna, Brazilian futbol players

delaguardo14:12:24

> haven't read the whole scrollback, but if Cher doesn't have a last name, and the login system requires a first & last name, then Cher can't login
depends how the system is implemented. I don't see any problem with sending the form like that {"first-name":"Cher","last-name":""}

Colin P. Hill14:12:57

> may I ask how they could discover that system doesn't know the difference?
The scenario revealed that problem. At the outset, we have sparse data – very common in customer relationship stuff. Cher told the CSR that she doesn't have a last name. The field was left empty, and that was submitted to the backend. That was her answer! It should be final! But because the system is modeling sparse data – "I don't know her last name" – and explicitly non-existent data – "I have no last name" – in the same way, the system will ask over and over again what her last name is.

Colin P. Hill14:12:26

> the solution isn't 'better' types, it's making the login system robust to Cher, Madonna, Brazilian futbol players
Okay, but how do we make it robust? The context is a talk about spec, but really spec is just an effort to formally describe whatever data model we might have had anyway.

ghadi14:12:10

ew to last-name=""

ghadi14:12:27

make it more robust = by only requiring first name

ghadi14:12:51

her last name is not Null, nor "", nor None

Colin P. Hill14:12:53

Okay, so we relax the backend's requirements. It will accept {"first-name":"Cher"} as a valid payload and update only that field. The next time Cher calls in, the system still doesn't know her last name, so it once again prompts the CSR to ask her. It is still failing to distinguish between "it doesn't exist" and "I don't know what it is". The original problem has still not been solved.

ghadi14:12:13

sounds like you didn't update the system cohesively

Jelle Licht14:12:29

Can’t you turn it around? Use a sentinel value for ‘still-need-to-ask’ of some-sorts

Colin P. Hill14:12:10

@U0190EXLXU1 I think that would work fine, but afaict it's just another way of saying "a last name is a Maybe String", and I think it's still doing exactly what Rich is saying not to do

Jelle Licht14:12:45

You can key it into a :please-still-ask-for

Cora (she/her)14:12:56

what we really need is another nil-like value! how about undefined? troll

😂 1
ghadi14:12:15

the key piece here is whether the system knows the entity is complete or not

Jelle Licht14:12:16

Or store a :already-asked-for somewhere: but it quickly gets really complicated if you don’t know all the valid transitions for the data (now, and in the future with new requirements/fields). I would probably never overload nil for this though, because it makes your map useless in e.g. destructuring
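The destructuring concern can be sketched as follows, assuming unqualified keys for brevity; both function names are hypothetical. Plain `:keys` destructuring binds nil whether the key is absent or explicitly nil, so the distinction is lost at the call site. An `:or` default only applies when the key is absent.

```clojure
(defn plain-last-name
  "Plain destructuring: absent and explicit nil both bind nil."
  [{:keys [last-name]}]
  last-name)

(defn defaulted-last-name
  "With :or, the default is used only when the key is absent."
  [{:keys [last-name] :or {last-name :not-provided}}]
  last-name)

(plain-last-name {})                    ; => nil
(plain-last-name {:last-name nil})      ; => nil
(defaulted-last-name {})                ; => :not-provided
(defaulted-last-name {:last-name nil})  ; => nil
```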

Cora (she/her)14:12:14

sentinel values the user can't possibly enter is the only real solution here, imo

Colin P. Hill14:12:51

I think using a sentinel for "idk, we need to ask" and omission for "it doesn't exist" is fraught. It would imply that a map of {} is an assertion to the effect that no information about anything exists, rather than the usual idea that a map might simply have sparse data

lassemaatta14:12:14

This might be a dumb idea, but how about you keep track separately of which fields need input from the user? Not just checking if a particular field is nil/missing/""? If the user tells you she/he has no last name you clear the appropriate flag etc

Colin P. Hill14:12:22

@U0178V2SLAY So our map is something like {::first-name "Cher", ::first-name-finalized true, ::last-name-finalized true}?

lassemaatta14:12:41

Something like that. Or keep the names of the missing fields in a set or whatever.
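A rough sketch of the set-based variant, assuming a hypothetical `:needs-input` key that holds the fields still to be asked about; `record-answer` is likewise an illustrative name. Answering with nil closes the question without storing a value, so the data map stays sparse.

```clojure
(def customer
  {:first-name  "Cher"
   :needs-input #{:last-name :phone}})  ; hypothetical bookkeeping key

(defn record-answer
  "Closes the question for `field`. A non-nil answer is stored;
  a nil answer means 'no such value', so nothing is stored."
  [customer field answer]
  (cond-> (update customer :needs-input (fnil disj #{}) field)
    (some? answer) (assoc field answer)))

(record-answer customer :last-name nil)
;; => {:first-name "Cher", :needs-input #{:phone}}

(record-answer customer :phone "555-0100")
;; => {:first-name "Cher", :needs-input #{:last-name}, :phone "555-0100"}
```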

Jelle Licht14:12:58

It has the advantage of not complecting “no value” vs “value missing”, and can be interpreted on its own

ghadi14:12:03

just accept the absence of information without having to put a box around it

ghadi14:12:08

it's liberating 😂

Colin P. Hill14:12:19

@U050ECB92 Okay, but we still need to distinguish between two kinds of absence!

ghadi14:12:29

then mark the record as complete

ghadi14:12:44

ostensibly you wouldn't make it through a registration flow without a complete record

Colin P. Hill14:12:51

That's dangerous. What if a new field is added later? What if the front end is out of date?

ghadi14:12:51

for whatever your system deems "complete"

delaguardo14:12:59

so yay to flags and ew to empty string ? ))

ghadi14:12:13

haha -- or no flag at all

Colin P. Hill14:12:43

To say "this record is complete" is not to say "this field here is absent by design", it is to say "all fields we don't have are absent by design" – which is dangerous when you might not know about every possible field. It makes the addition of a new field a breaking change that requires lockstep upgrades of all applications that understand this entity.

Cora (she/her)14:12:39

yeah, it's not a good design, imo

Colin P. Hill14:12:39

I don't really love the idea of having separate flags, but it does at least keep completion of each field isolated from other fields

Jelle Licht15:12:43

Making this all ‘self-contained’ in a single key (or lack thereof) means that any choice you make, you are stuck with it going forward; in that sense having it in a separate key makes future choice (“Oh let’s not bother random people for random personal info ever, unless it is a hard requirement for legal purposes”) also possible through a simple mechanical data migration

Colin P. Hill15:12:57

That's a good point. I guess we can make a case that bundling all of these absences into the semantics of a single key complects ontology with epistemology.

🤯 2
lassemaatta15:12:32

Instead apply vexillology ;)

Colin P. Hill15:12:54

Although I'm not sure whether the standard Clojurian idea of a sparse map that only contains the things we know really escapes that complecting

Jelle Licht15:12:35

You added a requirement about having to ask things: somehow noting that you have asked something (or should not ask something) seems like a proper thing to store in the data

Colin P. Hill15:12:08

I said it's a good point but I think I'm talking myself out of believing it now. Presence is epistemology – we have a fact, we know a thing – while value is ontology – here is what that fact is. If the fact is "there is no thing", shouldn't we represent that directly?

Jelle Licht15:12:25

But that is not what you are doing: “I will not give you my last name, please don’t ask again” is also an answer you can expect

Colin P. Hill15:12:40

Right, and {::last-name nil} asserts a fact (because the key is present), and the fact being asserted is that no last name exists. It's a statement about non-being, which feels a bit vacuous, but it's still a meaningful thing to say. Meanwhile {} asserts nothing at all.

Jelle Licht15:12:59

Either way, I think the “should-not-ask-this-question” can not be directly derived from the data, unless you store it specifically.

Colin P. Hill15:12:30

> “I will not give you my last name, please don’t ask again” is also an answer you can expect
This feels like it's shifting the problem from what exists to what we know. When Cher tells us she has no last name, that is a full and complete answer to the question. The question does not remain open, and the field does not remain an unknown. We have our answer: there is no last name.

Jelle Licht15:12:38

Whichever choice you make today might not be the right one tomorrow. Just as you might add (or remove) fields, so should your “can/should/may I ask for this information?”-requirements be able to change over time. Gender seems like a fine candidate of a piece of data that will provide work for the foreseeable future, in adjusting processes and rules where deemed necessary

Colin P. Hill15:12:23

I think if we try to model this in a way that also captures refusal to answer, then we're conflating two different scenarios: a question whose answer is permanently unknown versus a question whose answer is known and happens to entail absence.

Jelle Licht15:12:46

Data is just data 😄

Colin P. Hill15:12:33

Okay, I guess we can make the case that we don't need to actually care about that distinction

emccue15:12:38

> Data is just data
Feels like we bottomed out

😄 1
Colin P. Hill15:12:55

And rather than modeling whether we know it, we can just set that aside and do exactly what you're suggesting: model whether to ask it. Maybe we ask because we suspect it's out of date (a business rule might periodically invalidate mailing addresses), maybe we ask because it's a new field, maybe we ask because we never got an answer, and maybe we don't ask, because we've been told that the empty value (however we model that) is correct.

💯 1
Colin P. Hill15:12:05

I think this satisfies the scenario I gave just fine, and I appreciate your perspective. Something still doesn't feel quite right – I feel like our systems will, at least sometimes, need to model, "I know this fact, and the fact I know is that X doesn't exist" – but I'll have to do more thinking on whether I can actually identify a case for that.

emccue15:12:05

i’ve never met rich. as far as i know he eats his own boogers and chews on coffee cup lids

emccue15:12:28

i think what is valid is to say that if you access the info :last-name and get this

emccue15:12:59

dont know | known to not exist | exists
  nil     |       nil          | "abc"

emccue15:12:08

then you have overlapping domain things

Colin P. Hill15:12:32

Yes, for sure. Whatever we do, we can't simply conflate those two things.

emccue15:12:37

if clojure maps were implemented in a way where an accidental nil were not something you would run into then maybe it would be a different question

emccue15:12:01

like if a Map gave an Option<T> and you stored Option<T>, you can get an Option<Option<T>>

emccue15:12:06

no overlapping domain

Colin P. Hill15:12:07

Well, Clojure maps do let you distinguish between a key which is absent and a key which is set to nil

Colin P. Hill15:12:10

we just usually don't
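As a small illustration of that point: `get` collapses the two cases, while `find` returns either nil (no entry) or a map entry whose value may itself be nil, which is roughly the nested-option shape mentioned above. The sample maps and the `::absent` not-found marker are assumptions for the sketch.

```clojure
(def unknown-last-name {:first-name "Cher"})
(def no-last-name      {:first-name "Cher" :last-name nil})

;; get cannot tell the two apart:
(get unknown-last-name :last-name)  ; => nil
(get no-last-name :last-name)       ; => nil

;; find returns nil for a missing key and a map entry for a present one:
(find unknown-last-name :last-name) ; => nil
(find no-last-name :last-name)      ; => [:last-name nil]

;; get with a not-found value also exposes the difference:
(get unknown-last-name :last-name ::absent) ; => ::absent
(get no-last-name :last-name ::absent)      ; => nil
```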

emccue15:12:24

it technically can, but it's in the etherspace of data representation

emccue15:12:44

i think what you want is a known sentinel

emccue15:12:58

dont know | known to not exist | exists
  nil     |      :none         | "abc"

emccue15:12:15

no overlapping domain, no contains? check, fits better with clojure’s apis

emccue15:12:46

and then you can have nil values always mean “dont know”

Colin P. Hill15:12:47

Yes, Jelle was suggesting that earlier. I think that amounts to just an ersatz Option, and is really just the same thing as the `{}`/`{::last-name nil}` distinction, but just happens to have better ergonomics in Clojure. In principle it's not different afaict.

emccue15:12:48

an Option is Some(T) + None which is == to T + A where A is not in the domain of T

emccue15:12:47

again, this is a stranger’s philosophy that is embodied most strongly in a library that isn’t finished

Colin P. Hill15:12:31

Yes, true. It could be that if he had full and explicable answers to these questions, he would have implemented it by now!

lassemaatta15:12:36

correct me if I'm wrong, but when using a sentinel(s) to indicate something (in the absence of the actual data), the consumers must now separately deal with it (instead of just using the :last-name field as a string or whatever)

Colin P. Hill15:12:31

@U0178V2SLAY Yes. But this is just a way of encoding a distinction which the problem domain has already said is important – however we choose to represent it, we're going to have to deal with it either way.

emccue15:12:20

you can encode it so that the knowledge of the distinction is higher up by enforcing an “object contract”

(defn last-name [person if-not-known]
  (get person :last-name if-not-known))

dvingo15:12:29

Reasoning from first principles, there is no reason to accept someone else's opinion about how to do something just because of their status. FWIW other people with lots of experience building real world systems explicitly support Option/Maybe types in their systems: https://github.com/metosin/malli#maybe-schemas

💯 1
emccue15:12:37

but again now we are in a fuggin more than 100 item thread

emccue15:12:00

this is a waste of human potential

Colin P. Hill15:12:06

@U051V5LLP That's not what I'm doing though. I'm not accepting his opinion, I'm working to understand what it is. Then I will decide what I think of it.

dvingo15:12:43

Well, he's not on here much, so good luck getting a reply.

Colin P. Hill15:12:28

I'm also not trying to ask him. I'm asking others for their interpretations. The same way you might discuss a book and ask, "What do you suppose the author meant by this? What are its implications?"

dvingo15:12:27

Well it seems like the discussion listed the alternatives - you somehow have to track the flavor of the unknown data.

emccue15:12:15

i think we just invented javascript

emccue15:12:19

undefined and null

Colin P. Hill15:12:28

I'm sorry to say someone beat you to that joke

emccue15:12:34

oh goshamn it

😞 1
Cora (she/her)16:12:07

it was me

😁 1
Cora (she/her)16:12:10

I made the bad joke

Max17:12:57

Just gonna throw my 2c in, because who doesn’t like throwing oil on a fire.
I think that to understand Rich’s argument you have to understand what he’s arguing against. Most languages/databases force you to represent the absence of a value, and most of the time, people use null to represent that absence. If we approach this from first principles in Haskell for example, I’d argue that the type of each of these fields is:

data Field a = Present a | Unknown | IntentionallyOmitted
This is a completely different type from Maybe. You wouldn’t model it as a Maybe, you wouldn’t use a nullable field in a database, etc. It’s a sum type with three possible states. Now, we can ask the question of how to model this in Clojure, a language without ADTs. Present is just the value, and Rich makes a strong argument for modeling unknown values with the absence of a name, so that takes care of Unknown. How about IntentionallyOmitted? We could use nil, but that risks confusing other engineers who are used to nil meaning “absence”. In this case, I’d probably use a sentinel keyword. But that’s not an ersatz Maybe, it’s a representation of one of the three possible states for that field in the system.
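One way this three-state reading could look in Clojure, as a sketch only: the sentinel keyword `::intentionally-omitted` and the helper `field-state` are hypothetical names, and the tagged vectors simply mirror the three Haskell constructors. A missing key (or nil) is read as Unknown, the sentinel as IntentionallyOmitted, and anything else as Present.

```clojure
;; Hypothetical sentinel: a namespaced keyword that no user could type into a form.
(def intentionally-omitted ::intentionally-omitted)

(defn field-state
  "Returns [:present v], [:unknown], or [:intentionally-omitted] for key k in m."
  [m k]
  (let [v (get m k)]
    (cond
      (nil? v)                    [:unknown]
      (= v intentionally-omitted) [:intentionally-omitted]
      :else                       [:present v])))

(def cher {:first-name "Cher" :last-name intentionally-omitted})

(field-state cher :first-name)  ; => [:present "Cher"]
(field-state cher :last-name)   ; => [:intentionally-omitted]
(field-state cher :phone)       ; => [:unknown]
```

Consumers that only want a real value can match on `:present` and treat the other two tags however the feature at hand requires.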

🔥 1
☝️ 1
emccue17:12:37

can people please have long arguments about my opinions? I have a lot of them

🧞 1
respatialized17:12:30

I have a different solution to the problem: a :full-name / :canonical-name field. In the case of individuals with first, middle, and last names, it would be a concatenation of all three. In Cher's case, it would be the mononym. At the cost of some redundancy and needing to do some lookups, the system can distinguish between "known to not have a last name" and "missing last name" by comparing :full-name with the other name fields. In fact, you could replace top-level fields like :first-name and :last-name with a :name field that's a disjoint type (I'm using malli syntax for convenience here). This has the benefit of being extensible to naming customs that prevail in other languages, such as https://en.wikipedia.org/wiki/Spanish_naming_customs:

[:orn
 [:mononym :string]
 [:first-last [:map [:first :string] [:last :string]]]
 [:conventional-spanish [:map [:given :string] [:patronym :string] [:matronym :string]]]]
I think the original problem may be complecting the questions of "what is this person's name?" and "what are the parts of a name?"

Cora (she/her)17:12:10

the problem isn't really the name issue, though, it's how do you represent IntentionallyOmitted vs Unknown for fields. the name issue was an example just to explain that in some cases this matters

respatialized17:12:33

I can't make a conclusive case for this, but as some others in the thread may have suggested before, I have a very strong suspicion that other cases where we think we have "intentional omissions" are where there are unstated assumptions in the data model. so I think other cases where we need to distinguish between "intentionally left out" and "missing" end up being like my name example upon further inspection.

Cora (she/her)17:12:25

there may be other reasons that there is no field and it matters that we indicate it

Max18:12:12

I mean, another way to approach this is to decomplect 😉 the features. :last-name is present if present, absent if not, and then you have a separate :last-name-confirmed. When you’re looking at the data for its own sake, :last-name is just data. When you want to power the “should we ask” feature, you have the data for that too, and an absent :last-name plus a true :last-name-confirmed means they’re Cher
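A minimal sketch of that two-key approach, using the `:last-name-confirmed` key named above; the predicate `ask-for-last-name?` is a hypothetical helper. The name stays plain data, and the prompt decision reads the separate flag.

```clojure
(defn ask-for-last-name?
  "True when we have no last name and nobody has confirmed that this is correct."
  [customer]
  (and (not (contains? customer :last-name))
       (not (:last-name-confirmed customer))))

(ask-for-last-name? {:first-name "Cher"})                            ; => true
(ask-for-last-name? {:first-name "Cher" :last-name-confirmed true})  ; => false
(ask-for-last-name? {:first-name "Ada" :last-name "Lovelace"})       ; => false
```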

emccue18:12:59

I think there is an option we aren’t considering here

emccue18:12:58

(def first-name 0)
(def last-name 1)

(defn create-person [first-name last-name]
  [first-name last-name])

(get (create-person "Abc" "Def") last-name)

😱 1
😂 1
Colin P. Hill18:12:11

> We could use nil, but that risks confusing other engineers who are used to nil meaning “absence”. In this case, I’d probably use a sentinel keyword. But that’s not an ersatz Maybe, it’s a representation of one of the three possible states for that field in the system.
Eh, I still think it's an ersatz Maybe. Perhaps more precisely an ersatz Maybe Maybe. We've given it more precise semantics than "it might not be there!" by directly encoding why it isn't there, but this still runs up against one of Rich's central theses: a name is just a string, not a maybe string. I don't think giving a more precise name to that optionality addresses his essential point in saying this.

Max18:12:01

Seems like you’d prefer the :last-name-confirmed approach?

Colin P. Hill18:12:16

I mean, my preference is the sentinel value, but I'm trying to figure out what kind of approach falls out of Rich Hickey's domain modeling philosophy

Colin P. Hill18:12:13

Tracking "have we figured out this value?" as a separate datum has come up before in this thread and seems to fit, but feels weird to me in ways I don't quite yet know how to articulate

Colin P. Hill18:12:50

Like, "I know that this value doesn't exist" is just as much a knowable fact as "I know that this value is X", and it seems wrong to me that we would express these in different places. But maybe this really is the implication of Rich's philosophy and I just disagree with it. I've been hoping for an aha moment where I see something that looks tidy (even if I ultimately prefer something else).

Jelle Licht18:12:26

Don't you think you might want this way of expressing yourself, because it is the easiest thing in other languages? Not saying the approach is wrong in all cases, but perhaps you need to reconsider how you decompose problem statements. To me it's not a given that I often need to differentiate 'don't know’ vs ‘does not exist’; an issue with most of the proposed alternatives is that they are not generic, but perhaps we don't want generalised solutions for problems that fundamentally only occur in highly specific situations

Max18:12:02

For what it's worth, if you generalize the problem from Boolean confidence to percentage confidence, it starts to look more like the 2-key approach:

{:sample-value 12
 :sample-confidence 0.89}
If you don't like those hanging out at the top level you can always nest them in their own map, but in Clojure at least flat is preferred to nested

Cora (she/her)18:12:09

{:last-name {:value nil, :state :confirmed}}

👍 1
Cora (she/her)18:12:56

not a fan of that, but that's what we're talking about, essentially. we need states of fields represented somehow

lassemaatta18:12:38

it sounds a bit like you need to model the individual attributes (of some hypothetical entity) as more than just a single key-value pair. because we could then start asking all sorts of questions about the attribute (do we know the value? why don't we know it? when did we last try to retrieve the value? how did we ..) edit: cora was faster 🙂

🙌 1
Max18:12:57

In a more general sense, this is an instance of the metadata problem: programming languages and databases aren't great at representing information about information. The RDF-inspired model that's underneath Datomic and a lot of clojure’s design aesthetic famously is terrible at this. On the other hand, it’s kind of a niche need.

Colin P. Hill18:12:27

@U0190EXLXU1 I've got several competing thoughts on that. From my own perspective, it's a universal fact that Cher has no last name – this isn't context specific, this is simply the shape of the domain and is a fact we should be able to assert for anyone interested in knowing it. From the perspective of trying to interpret Rich's philosophy, the goal here is context-free domain modeling, so while I'm sympathetic to sticking to narrow solutions when generalizations don't make themselves apparent, I don't think it helps me make sense of what is being endorsed in that talk.

Colin P. Hill18:12:07

re: everyone else Yeah, "whether and the way in which we have this value" is starting to feel like more than just a boolean proposition...

Colin P. Hill18:12:59

> The RDF-inspired model that's underneath Datomic and a lot of clojure’s design aesthetic famously is terrible at this
I am unfamiliar with this fame. Have any resources you can point me to on breakdowns of where it breaks down?

Cora (she/her)18:12:24

also interested

Colin P. Hill18:12:28

Because it's not as if it's an esoteric problem – most PATCH services will run into it

Cora (she/her)18:12:38

just serialize all values to include their clojure metadata and stick any relevant info in there! problem... solved? troll

Colin P. Hill18:12:26

(contains? my-map ::last-name)
;; => :its-complicated

😆 2
Max18:12:49

Uh, we may want to start a new thread for that 😆 and I may have been overstating its famousness. It’s famous among people who have tried to model metadata with RDF, which I can’t imagine is a particularly large club. I’ve run into it because of a knowledge modeling side project, and the lack of that capability drove me to explore more general approaches, like those offered by knowledge engines/knowledge bases like https://vaticle.com/. For a concrete example of this problem, consider a knowledge base like Wikipedia that contains facts like United States founded-in 1776. You’d want to back that fact up with a citation, but how would you model the citation? You really want to use the triple itself as the subject of another triple, but RDF doesn’t support that. Instead they offer reification, which no one likes:

r isa reified-node
r subject United States
r verb founded-in
r object 1776
r citation ...
RDF* (https://w3c.github.io/rdf-star/cg-spec/editors_draft.html) was offered as a possible solution, but it’s pretty much been shouted out of the room. A solution that some RDF-based knowledge bases have come up with is called “automatic reification”, which is a non-standards-based thing where they let you view any triple as reified or non-reified based on the context via godawful magic under the hood.
So actually @U02N27RK69K is kind of right, the most “accurate” way to model this is to put it in metadata. It is literally data about data. The fact that that’s impractical is what I meant when I said that programming languages and databases aren’t great at working with metadata
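
The reification listing above can be sketched with plain Clojure vectors. This is only an illustration of the shape of the data, not an RDF library: `reify-triple`, the `:r` node id and the citation string are all hypothetical names made up for the example.

```clojure
;; A plain triple: subject, verb, object.
(def founding [:united-states :founded-in 1776])

(defn reify-triple
  "Expands one triple into the four triples that describe it as a node.
  node-id is a hypothetical identifier chosen by the caller."
  [node-id [s v o]]
  [[node-id :isa :reified-node]
   [node-id :subject s]
   [node-id :verb v]
   [node-id :object o]])

;; The citation is then one more triple whose subject is the reified node.
;; "placeholder-source" stands in for a real citation.
(def with-citation
  (conj (reify-triple :r founding)
        [:r :citation "placeholder-source"]))
```

One fact has turned into five triples, which is roughly why reification is unpopular.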

👍 1
emccue18:12:12

in datomic i think you would use the transaction id as the entity of another triple

emccue18:12:33

so “these facts” came in via transaction “t” and then you can attach the citation to “t”

Max19:12:25

Not quite, Datomic’s transaction ID refers to 1…many assertions. When you use it as a subject you’re asserting about the transaction, not about a triple. It’s the right place for metadata like “who added this” and “why’d they do it”, not for triple-specific metadata

Max19:12:39

an orthogonal but also useful concept

pithyless19:12:22

I agree that what we are trying to model here is the domain of a sum-type [Value | Unknown | NoValue]. By definition, you would need a way to represent 3 different states. Datomic is not a good example, because attribute values need a specific type (e.g. string), cannot be nil, and cannot be a sum-type (e.g. string OR sentinel-keyword). But, I would argue you are not breaking some unforgivable rule of idiomatic Clojure, if you decide you need more nuance in your domain and be more strict about what it means for map values to be nil or sentinels. (And you can use runtime validators like spec and malli to keep everyone in check). A practical use-case of this in the wild is the Pathom library that deals exactly with this problem: a resolver must differentiate to the caller between "I have no value, go ask someone else" and "there is no value, don't bother asking anyone else". In the internals it uses a sentinel keyword to represent this third-state, but the public API actually expects the developer to be careful in what they return: a map without a key is "Unknown - keep searching", and a map with the key and a value of nil is "NoValue - end of story".
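
The "missing key vs. nil value" convention described above can be sketched in a few lines. `field-state` is a hypothetical helper written for this example; it is not part of Pathom or any other library.

```clojure
(defn field-state
  "Classifies key k in map m as :unknown (key absent), :no-value (key present
  with a nil value) or :value (key present with a non-nil value)."
  [m k]
  (cond
    (not (contains? m k)) :unknown
    (nil? (get m k))      :no-value
    :else                 :value))

(field-state {} :last-name)                     ; => :unknown
(field-state {:last-name nil} :last-name)       ; => :no-value
(field-state {:last-name "Hickey"} :last-name)  ; => :value
```

The three states live in the map itself, so no sentinel keyword is needed, but every caller has to use `contains?` rather than `get` to tell the first two apart.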

Max19:12:02

To sum up, there’s a difference between “data modeling” and “knowledge modeling”. Data modeling is (usually) about storing data to serve a specific need, whereas knowledge modeling (usually) is about modeling everything that is known to allow you to answer arbitrary questions in the future unknown to you at present. Knowledge modeling is a lot harder and the tooling support is worse. Once in a while, data modeling problems acquire a knowledge modeling flavor, and when that happens no matter what it’s gonna be a little messy (the above is all my opinions, I haven’t heard this stance articulated by anyone else in the industry)

Colin P. Hill20:12:11

Thanks Max, that's all really useful perspective. I guess maybe the ultimate answer to my question is: Rich's model doesn't have an answer to this, but that's just because it's a largely unsolved problem in modeling.

Colin P. Hill20:12:19

And familiar solutions – like a JSON PATCH endpoint distinguishing between an omitted value and a null one – might work fine in their specific contexts but tend not to be generalizable.

Colin P. Hill20:12:54

Although I think I'd dispute that data modeling and knowledge modeling are separate – but I might admit that we can often pretend they are. I think at heart this might be a manifestation in software engineering of a general problem in philosophy: in some ways the object of our knowledge seems to only ever be an idea and never a thing in itself, so whenever we try to write down what we know we find it impossible to completely disentangle from the subjective details of the manner in which we know it.

Max20:12:51

Right, I think this is a good example of the difference between inherent complexity and accidental complexity. Modeling knowledge in a general fashion is inherently complex, whereas modeling enough data to drive a particular application is usually not (even though we often make it accidentally complex). YAGNI applies, until of course, it doesn’t.

Max20:12:22

Another thing you might be interested in looking into is open world/closed world assumption in knowledge modeling. In a closed world, the fact is true if it exists and false if it doesn’t. In an open world, the fact could be asserted true, asserted false, or not in the system at all meaning it’s unknown. Turns out it’s really hard to build applications on top of an open world assumption
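
A rough sketch of the two assumptions, using plain Clojure collections. The fact vectors and the function names `closed-world?` and `open-world` are invented for illustration.

```clojure
;; Closed world: a set of facts. Anything not in the set counts as false.
(def known-facts #{[:rich :has-last-name]})

(defn closed-world? [facts fact]
  (contains? facts fact))

;; Open world: each fact maps to an explicit truth value.
;; A fact that is not in the map is unknown, not false.
(def asserted-facts {[:rich :has-last-name] true
                     [:cher :has-last-name] false})

(defn open-world [facts fact]
  (get facts fact :unknown))

(closed-world? known-facts [:cher :has-last-name])      ; => false
(open-world asserted-facts [:cher :has-last-name])      ; => false
(open-world asserted-facts [:prince :has-last-name])    ; => :unknown
```

Every consumer of `open-world` has to handle three answers instead of two, which hints at why applications built on an open world are harder to write.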

Colin P. Hill20:12:00

Yeah, that makes sense as a generalization of what we've looked at here

Max20:12:25

An example: OWL, an ontology language for RDF, says it uses an open world assumption, but as far as I can tell, most people use it as if it uses a closed world assumption

Colin P. Hill20:12:05

The purist in me is frustrated by this because I know that doing something that you know is "wrong" in the domain model means shipping a fast MVP at the cost of a high likelihood of expensive changes later. But I guess if philosophy hasn't figured out the relationship between knowledge and truth after ~2500 years, I can't expect computer science to have done so after less than a century.

😆 2
wombawomba15:12:07

Let's say I have a macro (defmacro foo [x] ...) and a call (foo :bar), how should I rewrite the macro so I can break out the argument into a var (i.e. (def bar :bar) (foo bar))?

wombawomba15:12:55

I came up with (defmacro foo [x] (let [x (if (symbol? x) @(resolve x) x)] ...)), but I'm curious to hear if there's a better way
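
A minimal, runnable version of that idea, assuming the symbol names a var that is already defined when the macro is expanded. `kw-name` is a made-up macro standing in for `foo`; it just expands to the keyword's name as a string.

```clojure
(defmacro kw-name
  "Expands to the name of a keyword that is given either literally or
  through a symbol naming an already-defined var."
  [x]
  (let [v (if (symbol? x)
            @(resolve x)   ; look the var up at expansion time and deref it
            x)]
    (name v)))

(def bar :bar)

(kw-name :bar) ; literal argument
(kw-name bar)  ; symbol, resolved to #'bar at expansion time
```

Both calls expand to the same string. This only works for vars: a let-bound local has no value at expansion time, and `resolve` returns nil for an unknown symbol, so the deref would throw.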

lilactown15:12:36

so i'm playing with today's Advent of Code and solved it, but slowly. I'm profiling my code and it looks like 65% of the time is spent in clojure.lang.Numbers.minus

lilactown15:12:25

the offending line seems to be (🧵 for potential spoilers but it's v generic code)

lilactown15:12:36

(->> positions
     (map #(sum (abs (- % pos))))
     (reduce +))

lilactown15:12:23

Math/abs is also v slow so I wrote my own, but that doesn't seem to be the slowest by far. just the (- % pos)

lilactown15:12:00

I'm guessing this is because of doing a bunch of typechecking/casting?

lilactown15:12:30

there's no unchecked-minus

lilactown15:12:53

ah unchecked-subtract

Alex Miller (Clojure team)16:12:46

it's not checked/unchecked, it's boxing

Alex Miller (Clojure team)16:12:10

boxed math (which you're doing) is about 100x slower than primitive math

lilactown16:12:38

any tips on how to unbox things?

Alex Miller (Clojure team)16:12:43

all sequence functions will box (they only take Objects) so this almost always ends up as a loop/recur

Alex Miller (Clojure team)16:12:08

what's positions? coll of ??

lilactown16:12:39

coll of ints

lilactown16:12:04

(defn sum
  [n]
  (reduce + (range 1 (inc n))))


(defn abs
  [n]
  (if (neg? n) (- 0 n) n))


(defn part2
  [input]
  (let [positions (->> (string/split input #",")
                       (map #(Integer/parseInt %))
                       (sort))
        max-pos (last positions)]
    (loop [pos 0
           least-fuel (reduce + (map sum positions))]
      (if (= max-pos pos)
        least-fuel
        (let [fuel (->> positions
                        (map #(sum (abs (- % pos))))
                        (reduce +))]
          (recur
           (inc pos)
           (if (< fuel least-fuel)
             fuel
             least-fuel)))))))

Alex Miller (Clojure team)16:12:47

would be best to parse into an int[] (or long[]), and then refactor anything that's seq ops into either areduce etc or loop/recur

lilactown16:12:40

i'll try that

Alex Miller (Clojure team)16:12:53

and then in the loop/recur, you want to ensure that things are unambiguously prims in the loop decl and in the recur, but really you want primitive longs (Clojure doesn't really fully support primitive ints)
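
As a sketch of what that can look like, here is a primitive-only loop over a long array. `total-distance` is a hypothetical function for this example, not the code from the thread; the literal `0` initializers make `i` and `acc` primitive longs, and every `recur` argument is a primitive long as well.

```clojure
(defn total-distance
  "Sum of |x - pos| over a long array, using only primitive longs."
  ^long [^longs xs ^long pos]
  (let [n (alength xs)]
    (loop [i 0      ; primitive long index
           acc 0]   ; primitive long accumulator
      (if (< i n)
        (recur (unchecked-inc i)
               (+ acc (Math/abs (- (aget xs i) pos))))
        acc))))

(total-distance (long-array [16 1 2 0 4 2 7 1 2 14]) 2) ; => 37
```

The `^longs` hint lets `aget` and `alength` compile to direct array access instead of reflection.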

Alex Miller (Clojure team)16:12:24

(set! *unchecked-math* :warn-on-boxed) can also help find things

Alex Miller (Clojure team)16:12:49

sum and abs are also not very good for perf. Polymorphic fast abs is coming in 1.11 too ;)

💪 1
lilactown16:12:20

i def started from "good enough to solve the problem" and am now working my way to something speedy 😛 profiling led me to the boxed math first

lilactown16:12:07

naively swapping Math/abs for my own abs impl cut the run time in half

Alex Miller (Clojure team)16:12:23

Math/abs is a hotspot intrinsic so should definitely be part of any use - that's going straight to hand optimized assembly code

Alex Miller (Clojure team)16:12:48

but should also ensure it's getting a primitive input, not doing unboxing

pavlosmelissinos16:12:20

FWIW @U4YGF4NGM, in this particular case (advent of code 7b) I found that just adding type hints improved performance by an order of magnitude (10s to 1s) for me, e.g.:

(defn distance [^long a ^long b]
  (Math/abs (- a b)))
that's with a naive, brute-force approach, no clever algorithmic stuff or anything

lilactown16:12:58

didn't change anything for me AFAICT but I think it's because of what alex alluded to, the seq ops automatically box values going into my distance fn

lilactown16:12:02

is that right?

Alex Miller (Clojure team)16:12:58

the above at least is getting unboxed (- a b) :)

Alex Miller (Clojure team)16:12:23

but really, you want to arrange for the whole thing to be unboxed primitives

lilactown16:12:52

a very mechanical replacement of the above seq ops with arrays/areduce:

(defn sum-fast ^long
  [^long n]
  (/ (* n (+ 1 n)) 2))


(defn part2-fast
  [input]
  (let [^"[J" positions (->> (string/split input #",")
                             ;; according to alexmiller long support is better
                             (map #(Long/parseLong %))
                             (into-array Long/TYPE))]
    (loop [^long pos (reduce max positions)
           ^long least-fuel (areduce positions
                                     i ret (long 0)
                                     (+ ret ^long (sum-fast (aget positions i))))]
      (if (zero? pos)
        least-fuel
        (recur
         (dec pos)
         (let [^long fuel (areduce
                           positions i ret (long 0)
                           (+ ret
                              ^long (sum-fast (Math/abs (- (aget positions i)
                                                      pos)))))]
           (min fuel least-fuel)))))))
plus a much better sum impl

lilactown16:12:23

strangely, even after converting all of the seq ops in part2 to areduce, the run time stayed roughly the same until I also factored all seq ops out of the sum

lilactown16:12:01

the first impl was running about ~6-7s on my laptop. part2-fast runs in about 35ms

👏 2
Alex Miller (Clojure team)17:12:40

It's usually about 2 orders of magnitude different

Alex Miller (Clojure team)17:12:41

The next level of this is to start checking the bytecode

Ben Sless17:12:44

Speaking of boxed math, it doesn't warn on Util.equiv when it probably could, I can produce a simple repro

Antonio Bibiano17:12:44

are there any libraries that help with these kinds of operations?

Antonio Bibiano17:12:19

looks like a subset of what numpy does for python

Ben Sless17:12:44

(defn string-char-predicate
  [p]
  (fn charset-pred ^Boolean [^String s]
    (let [n (.length s)]
      (loop [i 0]
        (if (= i n)
          true
          (if (p (.charAt s (unchecked-int i)))
            (recur (unchecked-inc i))
            false))))))

(defn string-char-predicate*
  [p]
  (fn charset-pred ^Boolean [^String s]
    (when-let [n (some-> s .length)]
      (loop [i 0]
        (if (= i n)
          true
          (if (p (.charAt s (unchecked-int i)))
            (recur (unchecked-inc i))
            false))))))
Should the second emit boxed math warnings? i is long but n is Object

seancorfield17:12:01

Please use the #adventofcode channel for AoC discussions.

lilactown17:12:23

@U04V70XH6 while the impetus of the question was AoC, i figured because i was trying to figure out the root cause of my performance problems w/ some pretty generic Clojure code that #clojure made more sense

lilactown17:12:54

i'll avoid posting full solutions in the future

seancorfield18:12:49

Yeah, I figured this was borderline -- after we'd posted in #admin-announcements about it -- but wanted a reminder for everyone in the thread.

👍 1
Alex Miller (Clojure team)19:12:37

the boxed math warnings will not catch everything - I don't think that's actually possible without full runtime info

Alex Miller (Clojure team)19:12:02

if you look up the original ticket there are more details in there (and at least one open spin off for ret types in particular)

Alex Miller (Clojure team)19:12:59

I don't have the time to figure out if the case above is detectable by what the boxed math warnings cover

GGfpc17:12:45

How do you usually document data structures? The one thing I miss about typed languages is being able to create a struct for a return type of a method and know exactly what that method is supposed to return.

Ben Sless17:12:57

This is a great use case for spec

GGfpc17:12:03

Do you just assert that the data matches the spec and use that assertion as documentation?

Ben Sless17:12:18

You could, you could also turn this assertion on at run time

Ben Sless17:12:07

i.e. (s/fdef foo :args ,,, :ret (s/keys ,,,)) You could write it right below your function, it's both documentation and a validation which can be turned on at run time or dev time
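
A small sketch of that pattern with clojure.spec.alpha. The `find-user` function and the `::id`, `::name` and `::user` specs are hypothetical names invented for the example.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Describe the shape of the returned map.
(s/def ::id pos-int?)
(s/def ::name string?)
(s/def ::user (s/keys :req-un [::id ::name]))

(defn find-user
  "Hypothetical lookup that always returns the same user."
  [id]
  {:id id :name "Ada"})

;; The function spec sits next to the function and documents args and return.
(s/fdef find-user
  :args (s/cat :id ::id)
  :ret ::user)

;; Check a value against the spec explicitly.
(s/valid? ::user (find-user 1)) ; => true

;; Turn on checking of calls, typically at dev or test time.
(stest/instrument `find-user)
;; (find-user "x") now throws, because the :args spec fails.
```

Note that `instrument` checks only the `:args` part of the spec; the `:ret` spec is exercised by `clojure.spec.test.alpha/check` or by an explicit `s/valid?` or `s/assert`.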

emccue17:12:16

or you can use malli - same deal

emccue17:12:43

it only asserts when you instrument your functions which most people only do at dev time

emccue17:12:54

but yes, the spec serves as documentation

Joshua Suskalo18:12:52

The doc macro at the repl will also include spec information

qqq19:12:11

Is there any library that provides in-process-memory (not sqlite, not mysql, not postgresql) relational tables for Clojure, using Clojure instead of SQL as the "query language"?

ghadi19:12:28

Datomic dev-local

wotbrew20:12:51

Obviously #relic is going to be awesome when done 😉 , but you could use h2 + honeysql https://www.h2database.com/html/main.html

potetm02:12:57

datascript

qqq02:12:23

@U064UGEUQ: Quoting the relic README:
> Did you try meander (https://github.com/noprompt/meander), core.logic (https://github.com/clojure/core.logic), datascript (https://github.com/tonsky/datascript) and every graph-map-database under the sun but still do not feel out of the tar pit (http://curtclifton.net/papers/MoseleyMarks06a.pdf)?
Yep, describes how I feel.