This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-05-31
Channels
- # aleph (3)
- # aws (5)
- # beginners (65)
- # boot (17)
- # cljs-dev (112)
- # cljsrn (5)
- # clojure (146)
- # clojure-austin (3)
- # clojure-dusseldorf (3)
- # clojure-italy (18)
- # clojure-norway (13)
- # clojure-russia (84)
- # clojure-serbia (5)
- # clojure-spec (24)
- # clojure-uk (84)
- # clojurescript (204)
- # css (1)
- # cursive (21)
- # data-science (3)
- # datascript (21)
- # datomic (26)
- # emacs (5)
- # euroclojure (1)
- # hoplon (8)
- # jobs (7)
- # jobs-discuss (2)
- # keechma (35)
- # lumo (92)
- # mount (1)
- # nrepl (2)
- # numerical-computing (16)
- # off-topic (10)
- # om (58)
- # re-frame (13)
- # reagent (90)
- # remote-jobs (2)
- # ring-swagger (1)
- # spacemacs (9)
- # specter (6)
- # unrepl (17)
- # untangled (56)
- # yada (2)
@tonsky is there any particular design decision behind the tx-id integer range? I just noticed that ~5-10 tx ids make up 70% of my transit-string db. This is insane!
Transactions are also entities. I think it's to keep an eid range reserved for attribute ids?
It mirrors Datomic, which uses different eid ranges for different partitions.
thought of that too, @danielstockton
This is my recollection anyway, don't have a good understanding stored in my head atm.
Yes, I don't think transit caches integers, but I'm trying to find the reasoning.
Specifically, all ~#tag, keyword and symbol values are cached when they are more than 3 characters long (including the tag). Strings more than 3 characters long are also cached when they are used as keys in maps whose keys are all "stringable".
Yep, was looking in the same place.
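The caching rule quoted above can be sketched as a toy write cache (an illustration in Python, not the real transit implementation; the decimal cache codes and class names are stand-ins, since real transit uses base-44 codes like "^K"): cacheable values longer than 3 characters are written in full once, then replaced by a short "^<code>" on every repeat.

```python
# Minimal sketch of a transit-style write cache, assuming the rule
# quoted above: values longer than 3 chars are cached; repeats are
# replaced by a short "^<code>" token.

MIN_CACHEABLE = 4  # only cache strings longer than 3 characters


def cache_code(index):
    # real transit uses compact base-44 codes; a decimal stand-in here
    return "^" + str(index)


class WriteCache:
    def __init__(self):
        self.codes = {}

    def encode(self, token):
        if len(token) < MIN_CACHEABLE:
            return token          # too short to be worth caching
        if token in self.codes:
            return self.codes[token]  # repeat: emit the short code
        self.codes[token] = cache_code(len(self.codes))
        return token              # first occurrence: emit in full

cache = WriteCache()
print([cache.encode(t) for t in ["~:tx/536870915", "~:db/id", "~:tx/536870915"]])
# → ['~:tx/536870915', '~:db/id', '^0']
```

The first occurrence of each value is written in full; only the repeat of `~:tx/536870915` collapses to its cache code.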
stringifying txs is questionable: most transactions are just <10 datoms long, so it might not yield much benefit, and might be quite a bit slower
Yes, i'd assume so. Depends if you're optimizing for really tight bandwidth or speed.
is anyone already working on full-text search support? (thought about leveraging the lunr.js impl)
[:datoms 497]
[:txs 4]
1 time: "Elapsed time: 4.710000 msecs" ;; i7 macbook pro 2012, google chrome tab
1000: "Elapsed time: 4127.350000 msecs"
[:fast-count 16332]
1 time: "Elapsed time: 6.100000 msecs"
1000: "Elapsed time: 4978.610000 msecs"
[:short-count 13883]
short — 1 and 1000 writes, then 1 and 1000 reads:
"Elapsed time: 12.200000 msecs"
"Elapsed time: 4892.865000 msecs"
"Elapsed time: 18.630000 msecs"
"Elapsed time: 5594.200000 msecs"
fast — 1 and 1000 writes, then 1 and 1000 reads:
"Elapsed time: 14.220000 msecs"
"Elapsed time: 3826.140000 msecs"
"Elapsed time: 20.645000 msecs"
"Elapsed time: 5157.845000 msecs"
15% shorter transit string. I don't think it's worth it, given that there are only 4 transactions in this data set
N × 536870915 (9 chars each) becomes: 1 × "~:536870915" and (N-1) × "^K" (13 and 4 chars respectively)
but as the number of transactions grows, the width of the cache codes ("^K") grows too, and since you cannot control which value gets which code, you might end up encoding a tx that appears in only 2 datoms as "^K", and an attribute that appears in 50% of the datoms as "^ZZZ", and suddenly it is slower and takes longer to read/write
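The byte-count claim a few messages up can be checked with trivial arithmetic (the function names are just for illustration): N copies of the 9-char tx id "536870915" versus one full cached form "~:536870915" (13 chars, quotes included) plus N-1 short "^K" codes (4 chars each, quotes included).

```python
# Back-of-the-envelope check of the caching math from the discussion:
# uncached: every occurrence costs 9 chars;
# cached:   first occurrence costs 13 chars, each repeat costs 4.

def uncached_bytes(n):
    return 9 * n

def cached_bytes(n):
    return 13 + 4 * (n - 1)

for n in (1, 2, 100):
    print(n, uncached_bytes(n), cached_bytes(n))
# → 1 9 13
# → 2 18 17
# → 100 900 409
```

So caching an id loses 4 chars when it appears once, but pays off from the second occurrence on (each repeat saves 5 chars), which is why a handful of heavily repeated tx ids dominate the serialized size.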