#xtdb
2021-12-22
xlfe 09:12:22

Any plans to add compression/archiving to checkpoints? i.e. rather than uploading a directory of index files, upload a tar.bz2 of the directory?

refset 12:12:54

Not as yet, but Rocks is already doing various kinds of compression, so there may not be a huge amount of benefit in that case (have you tried?). LMDB doesn't have built-in compression though, so there may be more benefit there. Archiving may be a good suggestion for the general case, although it probably also removes any possibility of parallel file upload/download :thinking_face: What is the main motivation for you?

xlfe 03:12:28

Yeah there's probably not a really convincing reason - the compression on snapshots is a slight benefit, but aside from that 🤷

xlfe 10:01:46

Just to add some data, here are a couple of RocksDB stores and a Lucene store, with and without compression

xlfe 10:01:26

$ du -h
4.2G    ./docs-txes
2.3G    ./lucene
11G     ./idxs

xlfe 10:01:33

-rw-r--r-- 1 user user 3.0G Jan 19 20:24 docs-txes.tar.bz2
-rw-r--r-- 1 user user 8.4G Jan 19 20:37 idxs.tar.bz2
-rw-r--r-- 1 user user 1.8G Jan 19 20:21 lucene.tar.bz2

xlfe 10:01:48

So over those three stores, compression saves approx. 25% (17.5G down to 13.2G)
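
The "approx 25%" figure can be checked from the sizes posted above. This is a minimal sketch that only redoes the arithmetic; the names `sizes`, `saving-pct`, `per-store` and `overall` are made up for illustration and are not part of XTDB.

```clojure
;; Sizes in G, copied from the `du -h` and tar.bz2 listings in the messages above.
(def sizes
  {:docs-txes {:raw 4.2  :bz2 3.0}
   :lucene    {:raw 2.3  :bz2 1.8}
   :idxs      {:raw 11.0 :bz2 8.4}})

(defn saving-pct
  "Space saved going from raw to compressed, as a rounded percentage."
  [raw compressed]
  (Math/round (* 100.0 (/ (- raw compressed) raw))))

;; Saving for each store on its own.
(def per-store
  (into {}
        (map (fn [[store {:keys [raw bz2]}]]
               [store (saving-pct raw bz2)]))
        sizes))

;; Saving over all three stores combined.
(def overall
  (saving-pct (reduce + (map :raw (vals sizes)))
              (reduce + (map :bz2 (vals sizes)))))

(println per-store) ; per-store savings in percent
(println overall)   ; combined saving in percent
```

With these numbers the per-store savings come out at roughly 29% (docs-txes), 22% (lucene) and 24% (idxs), and the combined saving rounds to 25%.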

lepistane 11:12:59

Is it possible for a query to return a map instead of a set when using pull?

(xt/q (xt/db node)
      '{:find [(pull e [:team/name :team/image-hash])]
        :in [?name]
        :where [...clauses...]}
      "rush")
i get
#{[#:team{:name "rush",
          :image-hash
          "57e73736e582775cddf0fe40e5ea2a9afa30e5ed6cca0721950b80da09393e74"}]}
but i kinda expect
{:team/name "rush" :team/image-hash "1321"}
It works with :keys, but i don't wanna put additional clauses/rules in :where. Also, i couldn't get :keys to work with pull: https://docs.xtdb.com/language-reference/1.20.0/datalog-queries/#return-maps

refset 12:12:38

I think you are looking for "find specs" here, see https://github.com/xtdb/xtdb/issues/1449 - we don't support that in XT's Datalog today though, so you will have to do ffirst in your code

👍 1
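
To illustrate the `ffirst` suggestion: the query returns a set of result tuples, so `first` takes the only tuple and a second `first` takes the pulled map out of it. The sketch below uses plain Clojure data copied from the result shown earlier, not a live XTDB node; the name `result` is illustrative.

```clojure
;; The result set as printed in the message above: a set holding one
;; tuple (a vector), which in turn holds the pulled map.
(def result
  #{[#:team{:name "rush"
            :image-hash "57e73736e582775cddf0fe40e5ea2a9afa30e5ed6cca0721950b80da09393e74"}]})

;; ffirst is (first (first coll)): set -> tuple -> map.
(def team (ffirst result))

(println team)
(println (:team/name team)) ; prints rush
```

Note that `ffirst` on an empty result set returns nil, so callers may want to handle the "no match" case explicitly.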
refset 12:12:34

> Also i couldn't get :keys to work with pull

I expect this should work the same as order-by, as per "To use :order-by with an aggregate, simply restate the aggregate element exactly as it is written in your :find vector." (https://docs.xtdb.com/language-reference/datalog-queries/#ordering-and-pagination) ...but if it doesn't then that's probably a bug :thinking_face: do you have a small example?

lepistane 12:12:17

Oh i think 1. is the answer. I saw a lot of ffirst and thought there must be a better way. Regarding 2., so

(xt/q (xt/db node)
      '{:find [(pull e [:team/name :team/image-hash])]
        :keys [team image-hash]
        :in [?name]
        :where [..clauses..]}
      "rush")
this is the error
Query didn't match expected structure
   {:xtdb.error/error-type :illegal-argument,
    :xtdb.error/error-key :query-spec-failed,
    :xtdb.error/message "Query didn't match expected structure",
    :explain
    #:clojure.spec.alpha{:problems
                         [{:path [],
                           :pred
                           (clojure.core/fn
                            [{:keys [clojure.core/find], :as q}]
                            (clojure.core/->>
                             (clojure.core/keep q [:keys :syms :strs])
                             (clojure.core/every?
                              (clojure.core/fn
                               [ks]
                               (clojure.core/=
                                (clojure.core/count ks)
                                (clojure.core/count
                                 clojure.core/find)))))),
                           :val
                           {:find
                            [[:pull
                              {:pull pull,
                               :logic-var e,
                               :pull-spec
                               [[:prop :team/name]
                                [:prop :team/image-hash]]}]],
                            :keys [team image-hash],
                            :in {:bindings [[:scalar ?name]]},
                            :where
                            [[:triple
                              {:e e,
                               :a :com.hudstats/provider,
                               :v "com.hudstats"}]
                             [:triple
                              {:e e,
                               :a :com.hudstats/entity,
                               :v :com.hudstats/team}]
                             [:triple
                              {:e e, :a :team/alias, :v ?name}]]},
                           :via [:xtdb.query/query],
                           :in []}],
                         :spec :xtdb.query/query,
                         :value
                         {:find
                          [(pull e [:team/name :team/image-hash])],
                          :keys [team image-hash],
                          :in [?name],
                          :where
                          [[e :com.hudstats/provider "com.hudstats"]
                           [e :com.hudstats/entity :com.hudstats/team]
                           [e :team/alias ?name]]}}}
Judging by your message this is to be expected. I thought i could mix pull with :keys. Btw if you think this is still a bug i can create an example, no problem

refset 16:12:43

ahh sorry I was confused earlier, ignore my comments about order-by entirely 🙂 the real issue you're looking at there in that error is that this check is failing:

(clojure.core/=
  (clojure.core/count ks)
  (clojure.core/count clojure.core/find))

refset 16:12:17

and that's simply because you have two elements in your :keys vector, and only one in your :find
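
The failing check can be reproduced outside XTDB. The function below is a simplified restatement of the predicate printed in the spec error above (not a copy of the XTDB source), and `keys-match-find?` is a made-up name; the elided :where clauses are left out because the check never looks at them.

```clojure
;; For each of :keys, :syms and :strs that is present in the query map,
;; its element count must equal the number of elements in :find.
(defn keys-match-find?
  [{:keys [find] :as q}]
  (->> (keep q [:keys :syms :strs])
       (every? (fn [ks] (= (count ks) (count find))))))

;; One :find element (the whole pull expression) but two :keys -> rejected.
(println (keys-match-find?
          '{:find [(pull e [:team/name :team/image-hash])]
            :keys [team image-hash]}))   ; prints false

;; One :find element and one key -> accepted.
(println (keys-match-find?
          '{:find [(pull e [:team/image-hash])]
            :keys [image-hash]}))        ; prints true
```

The point is that a `pull` expression counts as a single :find element no matter how many attributes it pulls, so :keys names the whole pulled map rather than the attributes inside it.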

lepistane 13:12:07

Alright, so is there a way to make this work?

(xt/q (xt/db node)
      '{:find [(pull e [:team/image-hash])]
        :keys [image-hash]
        :in [?name]
        :where [...[clauses]...
                [e :team/alias ?name]]}
      "rush")
gives me
#{{:image-hash
   #:team{:image-hash
          "57e73736e582775cddf0fe40e5ea2a9afa30e5ed6cca0721950b80da09393e74"}}}
but i expected just
{:image-hash
          "57e73736e582775cddf0fe40e5ea2a9afa30e5ed6cca0721950b80da09393e74"}
Maybe i am trying to use pull and :keys for rename-keys and that's not their purpose?

refset 13:12:58

I think you want to use :as here (see https://docs.xtdb.com/language-reference/datalog-queries/#_attribute_parameters)

(xt/q (xt/db node)
      '{:find [(pull e [(:team/image-hash {:as :image-hash})])]
        :in [?name]
        :where [...[clauses]...
                [e :team/alias ?name]]}
      "rush")

;; should give 
#{[{:image-hash "57e73736e582775cddf0fe40e5ea2a9afa30e5ed6cca0721950b80da09393e74"}]}

👍 1
refset 13:12:21

(and then using ffirst again 🙂 )

🙇 1
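
Putting the last two messages together, a small sketch on plain Clojure data (no XTDB node involved). `as-result` is the set refset says the `:as` query should give; `plain-result` is an assumption about what the same pull would return without `:as`, included only to compare against renaming the key client-side with `clojure.set/rename-keys`.

```clojure
(require '[clojure.set :as set])

;; Expected result of the :as query, copied from the message above.
(def as-result
  #{[{:image-hash "57e73736e582775cddf0fe40e5ea2a9afa30e5ed6cca0721950b80da09393e74"}]})

;; Assumed result of the same pull without :as (namespaced key).
(def plain-result
  #{[#:team{:image-hash "57e73736e582775cddf0fe40e5ea2a9afa30e5ed6cca0721950b80da09393e74"}]})

;; With :as, ffirst alone yields the map lepistane was after.
(def via-as (ffirst as-result))

;; Without :as, the key can be renamed after the query instead.
(def via-rename
  (set/rename-keys (ffirst plain-result) {:team/image-hash :image-hash}))

(println via-as)
(println (= via-as via-rename)) ; prints true
```

Both routes end in the same map; `:as` keeps the renaming inside the query, while `rename-keys` does it in application code.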