#xtdb
2023-05-10
cjsauer15:05:35

I'm seeing some weird behavior with numeric comparisons when querying XTDB via HTTP. I have a query that is effectively something like this (simplified):

{:find [e]
 :in [thresh]
 :where [[e :price price]
         [(<= price thresh)]]}
When I pass thresh in as an argument, if it "appears" to be a long, e.g. 123, XT suddenly will only return entities whose :price also happens to be "long-like" (i.e. no decimal digits). For example, if there's an entity with price 100.34, it will not be returned even though it is indeed less than 123. But 100 is returned. I've noticed that if I force thresh to be passed in as a double by adding a tiny decimal to it, e.g. 0.0001, now I have the exact opposite problem: 100.34 will be returned in the result set, but 100 is not returned! This strikes me as very odd, because Clojure's own <= operator will convert up to Double just fine, regardless of some args being Longs.

cjsauer15:05:30

I suppose there might be some way to cast the arguments to <= to double using double, but this really seems more like a bug to me. Shouldn't XT be able to coerce just like core's <=?
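For reference, the clojure.core behaviour being described can be checked at a plain REPL. This is a minimal sketch with no XTDB involved; it only shows how core's numeric comparisons treat mixed longs and doubles, not how XT's index comparison works.

```clojure
;; Plain Clojure, no XTDB involved.
;; <= and == compare numerically across Long and Double.
(<= 100.34 123)   ;; => true
(<= 100 123.0001) ;; => true
(== 100 100.0)    ;; => true

;; = is type-sensitive for numbers: a Long never equals a Double.
(= 100 100.0)     ;; => false
```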

jarohen15:05:20

hey @U6GFE9HS7 - this is unfortunately https://github.com/xtdb/xtdb/issues/1298 in 1.x with a rather disruptive (requiring users to migrate their golden stores) fix.

jarohen15:05:30

best thing to do, if you know that some of your values will be doubles, is to cast all of your numbers to doubles, both when you ingest and when you pass in a value to be compared - e.g. 100.0 or 123.0
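A rough sketch of that advice for anyone who does have a Clojure layer in front of the node: coerce every number in a document (or query argument) to a double before it goes in. The helper name numbers->doubles is made up for illustration and is not part of XTDB.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical helper, not an XTDB function: walks a nested document
;; and turns every number into a double.
;; Caveat: this also converts numeric :xt/id values and numeric map keys.
(defn numbers->doubles
  [doc]
  (walk/postwalk (fn [x] (if (number? x) (double x) x)) doc))

(numbers->doubles {:xt/id :item-1 :price 100 :tags ["a"]})
;; => {:xt/id :item-1, :price 100.0, :tags ["a"]}
```

The same function can be applied to a threshold argument before passing it to the query, so both sides of the comparison are doubles.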

cjsauer15:05:46

Ah okay. How would I convince XT that something is a double if I'm hitting it over HTTP from JS? Any tips? I realize this is a bit of an edge case consumer-wise.

cjsauer15:05:18

In other words, my transactions are encoded in JSON

jarohen15:05:38

ah, of course, there's no way to explicitly specify 100.0 in JSON. :thinking_face: I'm guessing you're hitting the XTDB API directly; i.e. you don't have a layer on the server-side that could make the translation?

cjsauer15:05:57

Right. No intermediate layer.

cjsauer15:05:38

Can EDN encode doubles properly? I might be able to serialize my transactions as EDN.

jarohen15:05:25

it can, yeah - or our API also accepts Transit, if you could use the https://github.com/cognitect/transit-js library (admittedly not particularly palatable)

jarohen15:05:53

(I've never used it from vanilla JS)
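To illustrate the EDN point: EDN keeps 100 and 100.0 distinct as text, whereas JavaScript's JSON.stringify(100.0) produces 100. A small sketch using only clojure.edn, showing the round trip on the reading side:

```clojure
(require '[clojure.edn :as edn])

;; Printing a double keeps the decimal point.
(pr-str {:price 100.0})
;; => "{:price 100.0}"

;; Reading distinguishes the two forms.
(class (:price (edn/read-string "{:price 100.0}")))
;; => java.lang.Double
(class (:price (edn/read-string "{:price 100}")))
;; => java.lang.Long
```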

cjsauer15:05:55

Oh interesting. Is the Transit content type undocumented? I didn't see that format in the docs.

jarohen15:05:37

ah, I didn't realise it was undocumented, thanks for letting me know 🙂

cjsauer15:05:44

Sure thing. I saw responses can be transit encoded, but I didn't realize I could also send transit. That might work.

cjsauer15:05:57

As an aside, it's a bit worrying that a bug of this caliber is 3+ years old in the issue tracker 😬 What are your plans for these golden store breaking changes? It's super early days for this project so a re-ingest is still quite palatable.

jarohen15:05:46

our current plan is a golden store migration as part of the 2.x release, which recently entered early access: https://www.xtdb.com/blog/2x-early-access

cjsauer15:05:51

Ah, nice! Well I might be able to help kick the tires on this given the experimental nature of this application.

jarohen15:05:52

that's a fair way off being production-ready as yet, but we'll certainly be keeping folks updated on its progress

jarohen15:05:04

we're generally quite conservative about imposing fixes of this nature, though, given how many people have XTDB in well-established production deployments

jarohen15:05:19

> Well I might be able to help kick the tires on this given the experimental nature of this application.

that'd be great, thanks - always good to get feedback on these things 🙂

cjsauer15:05:58

Looking closer, this might be too bleeding-edge even for us at the moment 😅 I'm going to try transit serialization first to see if I can re-encode those numbers as doubles. I appreciate your time. I'll sign up for the 2.0 newsletter.

jarohen16:05:32

that's fair enough, it's still very bleeding-edge 🙂

cjsauer17:05:47

Hm...transit encoding is definitely not ideal. Is there any way to customize the decoding of the HTTP server? Or, even easier, is there any way to manually coerce values within the datalog query?

cjsauer17:05:33

Seems I'm able to make it work by using double. This seems like the lowest friction solution for now.
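The final query isn't shown in the thread, so the following is only a guess at its shape: bind the result of double to new logic variables and compare those. The names price-d and thresh-d are made up for illustration, and this has not been checked against a running node.

```clojure
;; Assumed shape of the workaround, not the query from the thread.
;; price-d and thresh-d are hypothetical variable names.
(def coerced-query
  '{:find [e]
    :in [thresh]
    :where [[e :price price]
            [(double price) price-d]    ; coerce the stored value
            [(double thresh) thresh-d]  ; coerce the argument
            [(<= price-d thresh-d)]]})
```

With both sides coerced inside the query, the JSON client can keep sending 123 without needing a way to spell 123.0.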

💡 2
jarohen09:05:50

hey @U6GFE9HS7, glad to hear you found a solution, and apologies again that this isn't as straightforward as we'd all like 🙂

Sagar Vrajalal16:05:53

What is the correct way to write a predicate to compare two intermediate bindings? (In this case, they are both maps) My query looks something like :

'{:find [?x]
  :in [map-1]
  :where [[?x :entity/map2 ?map-2]
          [(keys ?map-2) ?map-2-keys]
          [(select-keys map-1 ?map-2-keys) ?map-1]
          [(= ?map-1 ?map-2)]]}
I'm encountering some really weird behavior. When I run the above query using the REPL, it gives me the expected results. But during my program's runtime, when the same arguments are passed, nothing is returned. I tried (== ?map-1 ?map-2) and that doesn't work either. But strangely, if I use a custom function like
(defn maps-equal
  [map-1 map-2]
  (= map-1 map-2))
or simply
(clojure.core/= ?map-1 ?map-2)
it works. Is this a bug?

jarohen09:05:35

hey @U01B1CZQ9PF, I'd be interested to understand what's happening here - have you got a self-contained repro that we could try?

refset10:05:48

Hi 🙂 Can you confirm whether there is metadata on either of the maps? It might be hitting the same underlying hashing issue as https://github.com/xtdb/xtdb/issues/1799

Sagar Vrajalal11:05:17

Alas, I am not able to reproduce this in an isolated environment using an in-memory node. The metadata of the maps I am trying to compare is nil. The map that I'm passing as an argument to the query is a Kafka record consumed using the jackdaw library. Not sure if this helps, but my no-op function fails to print anything if I use = instead of clojure.core/= ! Given,

(ns core)

(defn no-op
  [& objs]
  (prn (map meta objs))
  (prn objs)
  true)
If I try
...
[(select-keys map-2 ?map-1-keys) ?map-2]
[(core/no-op ?map-1 ?map-2)]
[(= ?map-1 ?map-2)]
...
it prints nothing! But if I try
...
[(select-keys map-2 ?map-1-keys) ?map-2]
[(core/no-op ?map-1 ?map-2)]
[(clojure.core/= ?map-1 ?map-2)]
...
It prints the expected results.

refset13:05:45

> [(core/no-op ?map-1 ?map-2)]
> [(clojure.core/= ?map-1 ?map-2)]

due to query planning there's no guarantee about the order in which these clauses run, so it is probably just short-circuiting

refset13:05:24

please could you try explicit unification and see if that changes anything, i.e. [(== ?map-1 ?map-2)]

Sagar Vrajalal15:05:09

No luck with == either, unfortunately! I tried that before. The no-op function does work though, if I use ==. I can even see (just visually at least) that there are matches.

...
(nil nil)
({:abc 100} {:abc 100})
...
The query still returns nothing though.

thinking-face 1
jussi09:05:41

I think that select-keys does not return the keys in any predictable order since maps are unsorted. You could try to use (.containsAll ?map-1 ?map-2)?

💡 2
➕ 2
Sagar Vrajalal06:05:33

Not sure if I'm missing something, but this won't work with Clojure types right? It just throws

No matching method containsAll found taking 1 args for class clojure.lang.PersistentArrayMap

jussi06:05:31

Ah, ok. so you are looking for a way to compare the equality of map keys and values? Just having the same keys is not enough?

Sagar Vrajalal06:05:01

Yes 🙂 In essence, I'm trying to check if a map is a "submap" of another map.

jussi07:05:21

Yup, not sure if there is anything in the core for you. Here is a very naive approach

(defn is-submap?
  [m submap]
  (every? true?
          (map (fn [k]
                 (= (get m k) (get submap k)))
               (keys submap))))

(deftest test-submap-equality-test
  (testing "Are submaps equal?"
    (let [map-1 {:a 1 :b 2 :c 3}
          map-2 {:a 1 :b 2}
          map-3 {:a 1 :b 3}]
      (is (is-submap? map-1 map-2))
      (is (not (is-submap? map-1 map-3))))))

jussi07:05:37

You could build a predicate function to do what you want with something similar.

jussi07:05:15

Just golfing around 😅

(defn is-submap?
  [m submap]
  (.containsAll (vals (select-keys m (keys submap))) (vals submap)))

(deftest test-submap-equality-test
  (testing "Are submaps equal?"
    (let [map-1 {:a 1 :b 2 :c 3}
          map-2 {:a 1 :b 2}
          map-3 {:a 1 :b 3}
          map-4 {:a 1 :z 3}]
      (is (is-submap? map-1 map-2))
      (is (not (is-submap? map-1 map-3)))
      (is (not (is-submap? map-1 map-4))))))
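One caveat on the golfed version: it compares the two value collections without regard to which key each value belongs to, so {:a 2 :b 1} would count as a submap of {:a 1 :b 2}. A possible alternative, sketched here under the hypothetical name submap?, compares the selected entries as a map instead:

```clojure
;; Hypothetical alternative, not from the thread: true when every
;; key/value pair of submap is also present in m.
(defn submap?
  [m submap]
  (= submap (select-keys m (keys submap))))

(submap? {:a 1 :b 2 :c 3} {:a 1 :b 2}) ;; => true
(submap? {:a 1 :b 2 :c 3} {:a 1 :b 3}) ;; => false
(submap? {:a 1 :b 2 :c 3} {:a 1 :z 3}) ;; => false
(submap? {:a 1 :b 2} {:a 2 :b 1})      ;; => false
```

Because it relies on Clojure's = for values, a Long 1 and a Double 1.0 are still treated as different here.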

Sagar Vrajalal07:05:37

Thanks for your help! Writing my own predicate function is trivial here, I was able to achieve that. The problem is entirely different though :) right now I'm able to work around this using clojure.core/= instead of =

💪 2
refset11:05:55

> No matching method containsAll found taking 1 args for class clojure.lang.PersistentArrayMap

Ah, so the problem with this is that XT only supports Clojure functions inside Datalog, i.e. you can't use Java interop syntax directly - you have to write Clojure functions that wrap it
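As a sketch of what such a wrapper could look like: a named Clojure function around the interop call, which a query would then reference by its namespace-qualified symbol, in the same way core/no-op is used earlier in the thread. The name contains-all? is made up for illustration. Note that containsAll is a java.util.Collection method, so it applies to vectors, sets and seqs but not to maps themselves.

```clojure
;; Hypothetical wrapper around Java interop. The type hints avoid
;; reflection; both arguments must be java.util.Collection instances
;; (vectors, sets, seqs), not maps.
(defn contains-all?
  [coll items]
  (.containsAll ^java.util.Collection coll ^java.util.Collection items))

(contains-all? [1 2 3] [1 2])   ;; => true
(contains-all? #{:a :b} [:a :z]) ;; => false
```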

Sagar Vrajalal11:05:24

I didn't write it inside Datalog directly. I wrote a custom predicate function wrapping it, but my usage was just wrong. I was using Clojure maps.

(.containsAll {...} {...})
that's what caused the exception.

refset12:05:47

Gotcha, that makes sense 🙂

chaos18:05:00

Hi, what is the best way to do a left join on entity values on the right? for example, I'd like to join :txt/str with :var/name via :var/txt-id and return a null :var/name on the right when there is no match:

(with-open [node (xt/start-node {})]

  (xt/submit-tx node
                [[::xt/put {:xt/id -1
                            :var/name "abc"
                            :var/value 5
                            :var/txt-id -10}]

                 [::xt/put {:xt/id -10 :txt/str "$abc"}]
                 [::xt/put {:xt/id -11 :txt/str "xyz"}]])

  (xt/sync node)

  (xt/q (xt/db node)
        '{:find [?str ?var-name]
          :where [[?txt-id :txt/str ?str]
                  [?var-id :var/txt-id ?txt-id]
                  [?var-id :var/name ?var-name]]})
  ;; => #{["$abc" "abc"]}
  )
I found a way to do so using or but it significantly underperforms in large datasets, while converting it to an explicit rule doesn't help either:
(xt/q (xt/db node)
      '{:find [?str ?var-name]
        :where [[?txt-id :txt/str ?str]
                (or [?var-id :var/txt-id ?txt-id]
                    (and (not-join [?txt-id]
                                   [_ :var/txt-id ?txt-id])
                         [(identity nil) ?var-id]))
                [(get-attr ?var-id :var/name nil) [?var-name ...]]]})
;; => #{["xyz" nil] ["$abc" "abc"]}
thanks
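The reply to this question isn't captured in this log. One fallback, offered only as an assumption and not as the maintainers' recommendation, is to run two plain queries (one for every [?txt-id ?str] row, one for every [?txt-id ?var-name] row) and do the left join in Clojure. The function name left-join and the row shapes are hypothetical.

```clojure
;; Hypothetical helper: pairs every [txt-id str] row with its matching
;; var names, or with nil when there is no match.
(defn left-join
  [txt-rows var-rows]
  (let [names-by-txt (reduce (fn [acc [txt-id var-name]]
                               (update acc txt-id (fnil conj []) var-name))
                             {}
                             var-rows)]
    (set (for [[txt-id s] txt-rows
               var-name   (get names-by-txt txt-id [nil])]
           [s var-name]))))

(left-join #{[-10 "$abc"] [-11 "xyz"]}
           #{[-10 "abc"]})
;; => #{["$abc" "abc"] ["xyz" nil]}
```

Whether this beats the or-based query on a large dataset would need measuring; it trades one complex query for two simple ones plus an in-memory merge.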

chaos17:05:08

Hi, is there anyone who could provide some assistance with this question please? thanks

refset18:05:08

Hey @U012BL6D9GE - sincere apologies this one slipped through my net - I saw you post it on the Discuss forum and I have that open to respond to still, but it totally passed me by that you had already posted here beforehand 😑 I'll write you up a response right now!

richiardiandrea20:05:46

Hi XTDB folks, we are trying to get a child entity at a different valid time compared to its parent. We were wondering if get-start-valid-time works within a subquery? As a subquestion, is there a way to pass a hardcoded #inst just for testing as param to it?