datomic 2016-10-20 | Slack Archive

Matt Butler08:10:33

Question about the not clause and cardinality-many refs. If an entity has 2 favourites, one where the type is :like and another where it is not (let's say :thumb-up), will this query return the entity or not? In my testing it appears to return the entity. In English, I think what I’m asking for is a "never satisfies" clause.

[:find ?e
 :where [?e :favourites ?f]
        (not [?f :type :like])]

tengstrand09:10:14

@teng I think I have the answer. The retrieved attributes are those that can change. The :db/id never changes, it’s the actual timeline for the entity!

rauh09:10:00

@teng It's also because .keySet doesn't return :db/id, which is because :db/id is technically not an attribute of the entity.

tengstrand09:10:33

@rauh ok!

val_waeselynck10:10:27

@mbutler not sure what you're trying to achieve, could you reformulate?

dominicm10:10:51

I think the question is, given that :favourites could have many ?f results, and the desire is that if any of them match a clause, then it's not a matching ?e. Does that make sense?

val_waeselynck10:10:53

@dominicm yeah totally, just wanted to be sure

Matt Butler10:10:19

Seems to work, not entirely sure why though 😄

Matt Butler10:10:59

@val_waeselynck Thanks so much 🙂, I thought not-join was about specifying which variables unified. Why in this case does it work for what I need?

val_waeselynck10:10:29

@mbutler basically, your previous not clause did not work because it filtered out [?e ?f] pairs, not [?e] singletons

val_waeselynck10:10:19

I'm not sure not-join is entirely necessary, maybe the following would work:

val_waeselynck10:10:37

[:find ?e :where [?e :favourites ?f] (not [?e :favourites ?f1] [?f1 :type :like])]

val_waeselynck10:10:55

I just prefer not-join because I find it easier to understand what it does at a glance

Matt Butler10:10:19

@val_waeselynck I think I understand, could you help explain why it works on all of the favourites rather than match on any favourite that isn’t the one specified in the not block?

Matt Butler10:10:23

Because there is a favourite that belongs to ?e that doesn’t have a :type :like

Matt Butler10:10:15

So why does datomic think that satisfies the query, I’m happy it doesn’t but am a little unclear on the logic.

val_waeselynck10:10:05

I'm sorry I don't get what you wrote 🙂 probably too used to Datalog now. Maybe you could show the results with example data and show me what surprises you ?

Matt Butler10:10:08

?e has 2 favourites, {:type :like} and {:type :dislike}. The reading of that datalog query seems to say: match where ?e has a favourite whose :type isn't :like. That is correct for one of the favourites, both of which belong to ?e, so why doesn’t that return ?e?

Matt Butler10:10:36

one of ?e's favourites is :type :dislike, which isn’t :like, so why does the not work across all the favourites?

Matt Butler10:10:54

I agree that it does, and it's good that it does, just surprising.

val_waeselynck11:10:55

No, the query says to match ?e where ?e has a favourite, and ?e does not have a favourite whose type is :like

val_waeselynck11:10:43

the key difference is that the two mentioned favourites don't have to be the same
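To make the pairs-versus-singletons point concrete, here is a minimal sketch in plain Clojure. It is only a set-based model of the two query shapes, not Datomic's implementation, and the entity ids and the names `favourites`, `fav-type`, `pair-filtered` and `entity-filtered` are made up for illustration.

```clojure
(require '[clojure.set :as set])

;; Hypothetical data: entity 1 has favourites 10 (:like) and 11 (:dislike);
;; entity 2 has a single favourite 20 (:dislike).
(def favourites #{[1 10] [1 11] [2 20]})   ; [?e ?f] pairs
(def fav-type {10 :like, 11 :dislike, 20 :dislike})

;; Models: [?e :favourites ?f] (not [?f :type :like])
;; The not removes [?e ?f] pairs, so the pair [1 11] survives
;; and entity 1 is still returned.
(def pair-filtered
  (set (for [[e f] favourites
             :when (not= :like (fav-type f))]
         e)))

;; Models: [?e :favourites ?f]
;;         (not-join [?e] [?e :favourites ?f1] [?f1 :type :like])
;; The inner clauses are evaluated on their own and yield a set of ?e
;; values, which is subtracted from the entities that have a favourite.
(def with-like
  (set (for [[e f1] favourites
             :when (= :like (fav-type f1))]
         e)))

(def entity-filtered
  (set/difference (set (map first favourites)) with-like))

(println pair-filtered)    ; contains both 1 and 2
(println entity-filtered)  ; contains only 2
```

In this model, the first form keeps entity 1 because one of its pairs passes the filter, while the second form drops it because the subtraction happens at the level of ?e.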

Matt Butler11:10:13

Okay, so not works in the way I had hoped I just needed to structure my query differently 🙂 When applied across cardinality many refs it works like saying none? Is this correct?

Matt Butler11:10:55

(not 
          [?e :favourites ?f1]
          [?f1 :type :like])

Is saying that ?e does not have a favourite that has type :like

val_waeselynck11:10:18

@mbutler yes and no, it depends if ?f1 is externally bound

val_waeselynck11:10:34

(not-join [?e]
  [?e :favourites ?f1]
  [?f1 :type :like])

val_waeselynck11:10:42

^ this query definitely says so

Matt Butler11:10:02

I think I’m now on the same level. One final thing, though: that leads me to wonder why this doesn’t work

(not 
          [?e :favourites ?f]
          [?f :type :like])

Why does this give db.error/insufficient-binding [?i ?f]? And is the [?e :favourites ?f] that you put outside the not just to prevent this error?

Matt Butler11:10:46

The [?e :favourites ?f] is because we want to say that the [?e :favourites ?f] is a relation we want?

val_waeselynck11:10:14

I don't know exactly if this is a theoretical limitation or a practical one. The Datomic team will be of more help than me for that

Matt Butler11:10:19

I’m going to say that [?e :favourites ?f] gets you an entity with favourites, then the

(not-join [?e]
          [?e :favourites ?f]
          [?f :type :like])

Says only return this entity if it doesn’t satisfy this “internal query/clause”. Treat them as separate like you were saying. @val_waeselynck Thanks so much for your help :thumbsup: !!

casperc12:10:02

I am wondering, are connections cached by datomic itself, or does it make sense to cache the connections in an atom in the application?

val_waeselynck12:10:22

> Connections are cached such that calling datomic.api/connect multiple times with the same database value will return the same connection object.

val_waeselynck12:10:42

from the doc of datomic.api/connect

casperc12:10:30

Ah thanks 🙂

marshall13:10:35

@mbutler It helps me to think of not clauses along the lines of: consider everything in the not in ‘isolation’ - it will match a set of datoms; remove that set of datoms from the set matched by everything else in the query. in other words, the stuff matched by the ‘not’ is “subtracted” from the result set

val_waeselynck13:10:28

Datalog question: are namespaced symbols officially supported for rule names ?

val_waeselynck13:10:37

(already asked above but not sure people saw it)

Matt Butler13:10:16

@marshall Absolutely, that's how I began to think of it, like 2 queries where the return of one was removed from the other 🙂

Matt Butler15:10:15

(d/q '[:find ?e
           :in $ ?a
           :where
           [?e :favourite ?i]
           (not-join
             [?e]
             [?e :favourite ?i]
             [?i :size ?x]
             [(> ?x ?a)])]
         db 10)

This code returns an insufficient-binding error for ?a. However, in a similar query where you pass in ?a and use it as the value in a data clause (as below) rather than inside an expression clause, there is no error.

(d/q '[:find ?e
           :in $ ?a
           :where
           [?e :favourite ?i]
           (not-join
             [?e]
             [?e :favourite ?i]
             [?i :size ?a])]
         db 10)

To get the query in the first snippet to work, you need to specify ?a as one of the variables to unify, despite not wanting to use it outside the not-join.

(d/q '[:find ?e
           :in $ ?a
           :where
           [?e :favourite ?i]
           (not-join
             [?e ?a]
             [?e :favourite ?i]
             [?i :size ?x]
             [(> ?x ?a)])]
         db 10)

Any explanation for this behaviour? Thanks again 🙂

marshall15:10:15

The comparator (>) requires both arguments to be bound before it can execute. The second example will match all possible values to ?a

marshall15:10:42

if you don’t have ?a in the list of bound variables it is effectively a different variable inside the not clause

marshall15:10:06

so to bind it to 10, you must include it in the not-join join list
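The scoping rule can be illustrated with a small model in plain Clojure. This is a sketch of the semantics described above, not of how Datomic evaluates the query; the data and the names `favourite`, `size`, `joined-on-a` and `fresh-a` are hypothetical.

```clojure
(require '[clojure.set :as set])

;; Hypothetical data: [?e ?i] favourite pairs and a :size per favourite.
(def favourite #{[1 100] [2 200] [3 300]})
(def size {100 5, 200 50})            ; favourite 300 has no :size

(def a 10)                            ; the value passed in as ?a

(def entities (set (map first favourite)))

;; Entities with at least one favourite ?i for which (pred ?i) holds.
(defn excluded [pred]
  (set (for [[e i] favourite :when (pred i)] e)))

;; Models (not-join [?e ?a] [?e :favourite ?i] [?i :size ?x] [(> ?x ?a)]):
;; ?a is in the join list, so inside the not-join it is still 10.
(def joined-on-a
  (set/difference entities
                  (excluded (fn [i] (when-let [x (size i)] (> x a))))))

;; Models (not-join [?e] [?e :favourite ?i] [?i :size ?a]):
;; ?a is not in the join list, so the inner ?a is a fresh variable that
;; matches any size; every entity with a sized favourite is excluded.
(def fresh-a
  (set/difference entities
                  (excluded (fn [i] (contains? size i)))))

(println joined-on-a)  ; entities 1 and 3
(println fresh-a)      ; entity 3 only
```

With this data the two forms disagree on entity 1, whose favourite has a size that is not greater than 10. On data where they happen to agree, the second form can look as if it were using the passed-in value.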

Matt Butler15:10:32

@marshall Aha yes, thanks, turns out i was just matching on any val of ?a in the second example it was just giving the expected behaviour anyway. That makes much more sense and is consistent 🙂 Thanks again 🙂

marshall15:10:45

no problem :thumbsup:

Matt Butler15:10:05

@marshall Is a query such as the 3rd example capable of being expressed as a rule before I try to do so?

marshall15:10:15

sure. pretty much any set of datalog clauses can be in a rule. from the docs: "rules can contain any type of clause: data, expression, or even other rule invocations."

Matt Butler15:10:08

Yep I read that and had concluded it was possible but thanks for the reassurance 🙂

marshall17:10:50

@hunter Yes, if you’re restarting the transactor the peer will report a lost connection (what you’re seeing)

marshall17:10:27

you should see retry and reconnection once your transactor is replaced or becomes available again

hunter17:10:50

@marshall All peers but one retried/reconnected. That one did not, however its transaction queue was still operational

hunter17:10:53

If it helps, the peer that is not reconnecting is using an asOf database filter, however the t in the log is increasing with each message from the transaction queue

marshall17:10:52

if it’s getting new novelty from the replacement transactor, it has reconnected. that transition may have occurred seamlessly without reporting an error

marshall18:10:14

but if the original transactor is gone and a peer is getting new transactions, it must be connected to the newly active transactor

potetm18:10:15

Here's a question: Is there a way to make something like this work?

(d/q '[:find ?out
       :in $
       :where
       (or-join [?e ?out]
                (and [?e :a1]
                     [(unify ?e ?out)])
                (and [?e :a2]
                     [?out :subcomponent ?e]))]
     [[123 :a1 "foo3"]
      [456 :a1 "foo4"]
      [456 :subcomponent 789]
      [789 :a2 "bar"]])

potetm18:10:06

unify is obviously not a thing. What I'm kind of wanting to do there is say (== ?e ?out)

marshall18:10:50

(ground)

potetm18:10:53

Point being: I've found some entity that may be a top-level type, or it may be a subcomponent. But I want to return the top-level type.

marshall18:10:00

erm. ground requires a const tho

potetm18:10:11

yeah.... 😕

marshall18:10:40

[(identity ?e) ?out]

marshall18:10:50

i think that might do it, although i’m not crazy about it

potetm18:10:59

I guess the obvious thing to do is just run separate queries. Just wondering if there's some functionality I'm missing.

potetm18:10:27

Hmmm, if you're not crazy about it, I'm gonna avoid that then.

marshall18:10:28

i.e. use a built in clojure function that returns the value right back to you and assign that value to a variable

marshall18:10:44

well, it’s not the use, it’s more the putting conditional logic in a query

potetm18:10:51

gotcha

marshall18:10:57

i’d probably tend toward separate queries

marshall18:10:05

with conditional logic in your application code
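For comparison, a minimal sketch of what "conditional logic in application code" could look like, written in plain Clojure over the tuples from potetm's example rather than against a real database. The helpers `has-attr?` and `top-level` are hypothetical names; in a real peer each branch would presumably be its own query or index lookup.

```clojure
;; The datoms from the example above, as [e a v] tuples.
(def datoms [[123 :a1 "foo3"]
             [456 :a1 "foo4"]
             [456 :subcomponent 789]
             [789 :a2 "bar"]])

;; True when entity e has any value for attribute a.
(defn has-attr? [e a]
  (boolean (some (fn [[e' a' _]] (and (= e e') (= a a'))) datoms)))

;; Returns the set of top-level entities for e.
(defn top-level [e]
  (set (concat
         ;; branch 1: e carries :a1, so it is already top-level
         ;; (the role played by [(identity ?e) ?out] in the query)
         (when (has-attr? e :a1) [e])
         ;; branch 2: e carries :a2, so look up the entity that has
         ;; it as a :subcomponent
         (when (has-attr? e :a2)
           (for [[out a v] datoms
                 :when (and (= a :subcomponent) (= v e))]
             out)))))

(println (top-level 789))  ; the parent, 456
(println (top-level 123))  ; 123 itself
```

The branching is explicit here, which is the trade-off being discussed: the query stays a plain set operation and the "which kind of entity is this" decision lives in ordinary code.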

Ben Kamphaus18:10:55

@potetm I’m not sure I follow what you’re doing in that query.

Ben Kamphaus18:10:01

is:

(d/q '[:find ?out
       :in $
       :where
       (or-join [?out]
                [?out :a1]
                (and [?e :a2]
                     [?out :subcomponent ?e]))]
     [[123 :a1 "foo3"]
      [456 :a1 "foo4"]
      [456 :subcomponent 789]
      [789 :a2 "bar"]])

not equivalent?

potetm18:10:18

@bkamphaus Yeah that is. I apparently oversimplified the example.

potetm18:10:32

Something like this:

(d/q '[:find ?out
       :in $ [?e ?a]
       :where
       [?e ?a]
       (or-join [?e ?out]
                (and [?e :a1]
                     [(identity ?e) ?out])
                (and [?e :a2]
                     [?out :subcomponent ?e]))]
     [[123 :a1 "foo3"]
      [456 :a1 "foo4"]
      [456 :subcomponent 789]
      [789 :a2 "bar"]]
     [789 :a2])

potetm18:10:47

(Actual use case is we found some tx-data in the log by attr, and want to find the top-level entity.)

Ben Kamphaus19:10:17

hmm, something somewhere feels off to me, but not entirely sure. I guess it’s just the fact that as Marshall indicated it feels more like branching logic than a set union, which is why identity seems like a hack there. But is the contrived example a match, where you only care about :a1 and :a2 on ?e — and ?e ?a are passed from tx data? Why provide ?a — aren’t you only passing in ?e values you know you care about without additionally limiting by some other attribute which occurs with the datom?

potetm19:10:34

So, in the real case, I'm passing in ?a to scan the logs for changes to certain attributes. In the end, I want all ?e that have ?a, or that have children that have ?a, where ?a is bound to a collection of attrs.

potetm19:10:54

So yeah, having typed that out, that sounds a lot more like branching logic than set unions.

Ben Kamphaus19:10:25

that description seems a bit of a mismatch for the queries you showed, maybe I’m just missing something. I.e. the last description sounds like:

(d/q '[:find ?out
       :in $ [?a ...]
       :where
       (or-join [?out ?a]
                [?out ?a]
                (and [?e ?a]
                     [?out :subcomponent ?e]))]
     [[123 :a1 "foo3"]
      [456 :a1 "foo4"]
      [456 :subcomponent 789]
      [512 :subcomponent 332]
      [332 :idontcare "snafu"]
      [500 :subcomponent 123]
      [789 :a2 "bar"]]
     [:a2 :a1])

But it sounds as though you’re passing in the log and binding tx-data to values. Are you then actually hard-coding the attributes that identify a parent versus a child like in the example or-join? The interactions between are where I’m lost.

Ben Kamphaus19:10:19

if you just want to drop the subcomponents with that kind of exhaustive logic, something like:

(d/q '[:find ?out
       :in $ [?a ...]
       :where
       (or-join [?out ?a]
                [?out ?a]
                (and [?e ?a]
                     [?out :subcomponent ?e]))
       (not [_ :subcomponent ?out])]
     [[123 :a1 "foo3"]
      [456 :a1 "foo4"]
      [456 :subcomponent 789]
      [512 :subcomponent 332]
      [332 :idontcare "snafu"]
      [500 :subcomponent 123]
      [789 :a2 "bar"]]
     [:a2 :a1])

potetm19:10:19

Yeah dropping attrs is probably too much. Not worth it just to avoid doing separate queries. I'm only interested in a small number (4) attrs in the tree.

potetm19:10:21

Hmm... I'm doing very poorly at this. How about I start with the top-level use case?

potetm19:10:47

So I'm looking through the log for 4 attrs that have changed. Two of those attrs are on sub-entities, two are on the top-level entity.

potetm19:10:55

But I'm only interested getting the top-level entity.

potetm19:10:19

This example might be more realistic? I have 2 data sources, one that represents the db, one that represents the log.

potetm19:10:35

(d/q '[:find [?out ...]
       :in $d1 $log
       :where
       ($log or
             [?e :parent/attr]
             [?e :child/attr])
       ($d1 or-join [?e ?out]
            [?out :parent/attr]
            (and [?e :child/attr]
                 [?out :parent/child ?e]))]
     [[123 :parent/attr "foo3"]
      [456 :parent/attr "foo4"]
      [999 :parent/attr "shouldn't show"]
      [456 :parent/child 789]
      [789 :child/attr "bar"]]
     [[123 :parent/attr "foo3"]
      [123 :parent/attr "foo2"]
      [789 :child/attr "bar0"]
      [789 :child/attr "bar"]
      [000 :dont-care "blerg"]])

potetm19:10:26

=> [123 456 999]

potetm19:10:22

I'm probably missing something, but it seems like I need to say (== ?out ?e) in that first or-join.
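One way to see where 999 comes from is to model the query in plain Clojure. This is a sketch of the set logic only, under the assumption that the [?out :parent/attr] branch places no constraint on ?e; it says nothing about what Datomic accepts across multiple data sources. The names `changed`, `as-written` and `tied` are hypothetical.

```clojure
(def db  [[123 :parent/attr "foo3"]
          [456 :parent/attr "foo4"]
          [999 :parent/attr "shouldn't show"]
          [456 :parent/child 789]
          [789 :child/attr "bar"]])

(def log [[123 :parent/attr "foo3"]
          [123 :parent/attr "foo2"]
          [789 :child/attr "bar0"]
          [789 :child/attr "bar"]
          [0 :dont-care "blerg"]])

;; ?e values bound by the ($log or ...) clause.
(def changed
  (set (for [[e a _] log
             :when (#{:parent/attr :child/attr} a)]
         e)))

(defn has? [e attr]
  (some (fn [[e' a' _]] (and (= e e') (= a' attr))) db))

(defn parents-of [e]
  (for [[out a v] db :when (and (= a :parent/child) (= v e))] out))

;; As written: the first branch never mentions ?e, so every entity
;; with :parent/attr is paired with every changed ?e.
(def as-written
  (set (for [e changed
             out (concat
                   (for [[p a _] db :when (= a :parent/attr)] p)
                   (when (has? e :child/attr) (parents-of e)))]
         out)))

;; With ?out tied to ?e in the first branch, i.e. the (== ?out ?e) idea.
(def tied
  (set (for [e changed
             out (concat
                   (when (has? e :parent/attr) [e])
                   (when (has? e :child/attr) (parents-of e)))]
         out)))

(println as-written)  ; 123, 456 and 999
(println tied)        ; 123 and 456
```

In this model the extra result disappears once the first branch relates ?out back to ?e, which matches the intuition in the message above.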

Ben Kamphaus19:10:39

it’s because you don’t bind the :parent/attr and :child/attr distinctly.

potetm20:10:03

Ah, so ?p ?c instead of both as ?e

Ben Kamphaus20:10:05

still thinking through query structure.

Ben Kamphaus20:10:48

sorry, don’t think that fixes the issue but it seems incorrect to treat them the same initially then try to split them later. In that case it seems to come down to not being able to use different data sources within one or.

potetm20:10:38

I agree that it seems incorrect. Yeah, multiple sources in an or might fix it, but I agree that it's kind of a ridiculous ask.
