
i found myself in the middle of a fun (:man-shrugging:) debate today, the topic being:

> can a function still be pure if it calls d/pull?

some people say yes, d/pull operates on db as a value. other people say no, d/pull is a side effect that fetches data over the network. who's right, and who has to buy the next round at the pub?


example function taken from the Cloud documentation:

(defn inc-attr
  "Transaction function that increments the value of entity's card-1
attr by amount, treating a missing value as 0."
  [db entity attr amount]
  (let [m (d/pull db {:eid entity :selector [:db/id attr]})]
    [[:db/add (:db/id m) attr (+ (or (attr m) 0) amount)]]))


I think it is more important to worry about what your function returns rather than what it actually does. Does your "pure" function always return the same result for the same input?


The difference between a memory load of an immutable value and a network load is mostly going to be performance and the possibility of an exception. Well, I can write a 'pure' function that takes a long time to compute, and I can write a 'pure' function that might throw, e.g. an out-of-memory error.


Excision may well remove the value-ness of your as-of db though.


yes, i agree with both of your points. Datomic being immutable, you will always be returned the same output for a given input, except for the case of excision (although we don't worry about that in Cloud).


You also can’t just look at the function to decide purity, you have to look at its arguments and return values


d/pull is absolutely pure if db is an in-memory db, for example


but it’s absolutely not pure if it’s an in-memory db that randomly generates results when read
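A minimal sketch of the point that purity depends on the arguments, not only on the function body. The names `lookup`, `value-db`, `object-db` and `read-counter` are hypothetical, and a counter stands in for the "randomly generated" results so the behaviour is repeatable:

```clojure
(defn lookup
  "Reads k from db. Whether this is pure depends on what db is."
  [db k]
  (get db k))

;; An immutable map: the same arguments always give the same result.
(def value-db {:a 1})

;; Hypothetical "db" object that changes its answer on every read.
(def read-counter (atom 0))

(def object-db
  (reify clojure.lang.ILookup
    (valAt [_ k] (swap! read-counter inc))
    (valAt [_ k not-found] (swap! read-counter inc))))

(def value-reads  [(lookup value-db :a) (lookup value-db :a)])
(def object-reads [(lookup object-db :a) (lookup object-db :a)])
```

`value-reads` holds two equal results, while `object-reads` holds two different ones, even though `lookup` itself never changed.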


I think more than purity I usually want to know things like: does this function take/return values or objects (laziness being an important grey area--look at d/entity return values, or lazy-seqs that perform IO); does it return the same thing on repeated calls with the same arguments (“same” depending on whether they are values or objects); could it ever possibly perform IO or idle-blocking
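To illustrate the laziness grey area, here is a small sketch in plain Clojure. `fetch!` is a hypothetical stand-in for an IO call (it only bumps a counter), and `load-all` is an invented name:

```clojure
(def fetch-count (atom 0))

(defn fetch!
  "Stand-in for an IO call; only increments a counter."
  [id]
  (swap! fetch-count inc)
  {:id id})

(defn load-all
  "Looks like it returns a value, but the result is an unrealized lazy seq."
  [ids]
  (map fetch! ids))

(def result (load-all [1 2 3]))
(def fetches-before @fetch-count) ; nothing has been fetched yet
(def n (count result))            ; counting realizes the seq and runs fetch!
(def fetches-after @fetch-count)  ; the "IO" happened during count, not in load-all
```

The "IO" runs when the caller consumes `result`, not when `load-all` returns, which is why it matters whether a function hands back a value or something that still has work to do.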


Yes, same is not an obvious concept 🙂 But I think the important part to get across here is that the notion of pure function matters for the user of the function and not its implementation. You can have pure functions that manage state internally (e.g. any memoized function).
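A rough sketch of the memoization case, using `clojure.core/memoize`. The names `slow-square`, `fast-square` and `body-runs` are hypothetical, and the counter exists only to make the hidden cache observable:

```clojure
;; Counter used purely as instrumentation for this illustration.
(def body-runs (atom 0))

(defn slow-square
  "Squares n and records that the body actually ran."
  [n]
  (swap! body-runs inc)
  (* n n))

;; memoize keeps an internal cache, i.e. state the caller never sees.
(def fast-square (memoize slow-square))

(def first-call  (fast-square 4)) ; body runs, result is cached
(def second-call (fast-square 4)) ; served from the cache
```

From the caller's side `fast-square` returns the same result for the same argument every time, while internally the cache changed after the first call and the body ran only once.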


“pure” can mean same return for same arguments, or no side effects (I think this excludes memoization and IO), or both. Because it can mean any of these things, I think it’s better to be more specific. We know the properties of d/pull, so I think the original question is really an argument about what “pure” should mean.


which I guess is a fine argument for the pub 🙂


(just keep sharp objects away)


anything that needs to leave the process is impure; networks failing is common


the example function inc-attr at the top of this thread is actually a transactor function taken from the Datomic documentation. and according to the same documentation, they must be pure. so perhaps inc-attr is pure (in this context) because transactor functions run "inside" of Datomic?


I think “pure” is used loosely here to mean “I may execute this function multiple times while holding a db lock at the head of a queue of other transactions, and you are ok with the consequences of whatever you do in there”


thanks for sharing your thoughts, it's always insightful to pick other people's brains.

Lennart Buit 19:09:29

So you can specify that a rule requires bindings by enclosing the argument in brackets, but from experimenting I noticed that that doesn’t take clause ordering into account, e.g. this is fine, even though the rule is invoked in a clause preceding the binding of ?e:

(d/q '{:find [?v]
       :in   [$ %]
       :where [(my-rule? ?e)
               [?e :attr ?v]]}
     '[[(my-rule? [?e])
        [?e :attr 12]]])
Why is that? And how should I go about making sure that consumers of this rule don’t accidentally use it in ways that bind large parts of the database?


That looks like a bug to me


does that only happen if it’s the very first clause?

Lennart Buit 20:09:21

It also doesn’t complain like this:

(d/q '{:find [?v]
       :in   [$ %]
       :where [[?other-e :attr ?v]
               (my-rule? ?e)
               [?e :attr ?v]]}
      '[[(my-rule? [?e])
         [?e :attr 12]]])

Lennart Buit 20:09:40

Let's see what it does on a more contemporary version of Datomic

Lennart Buit 20:09:12

(same behaviour on 1.0.6202, neither query complains about missing bindings)


yeah, I tried a bunch of things. I can only get this to fail:

(d/q '{:find  [?v]
       :in    [$ %]
       :where [(my-rule ?e ?v)]}
     [[1 :attr 2]
      [2 :attr 12]]
     '[[(my-rule [?e] ?v)
        [?e :attr ?v]
        [(ground 12) ?v]]])
Execution error (Exceptions$IllegalArgumentExceptionInfo) at datomic.error/arg (error.clj:79).
:db.error/insufficient-binding [?e] not bound in clause: (my-rule ?e ?v)


maybe, there is some clause reordering? I wouldn’t know how to predict the performance of this


if it isn’t somehow delaying rule evaluation until after everything’s bound that can be, I would expect your examples to throw

Lennart Buit 20:09:29

Hmm interesting. Yeah, if it is just performant, you will not hear me complain. I just got pretty fixated on getting my clause orders right, so I was surprised this was allowed


Rules will be pushed down until required bindings are satisfied


You'll get an error if it can't ever satisfy them


That is documented for or clauses and not clauses, but not for rules. I will double check and also add documentation

Lennart Buit 21:09:02

Ah thank you! Then I learned something today and it was worth asking