2022-08-25
Datalevin looks like a good fit for an idea I had, but I need to store unsigned 256-bit integers. What would it take to add bigint support? I need to be able to compare these integers inside queries.
Also, in Datomic you can invoke arbitrary methods inside d/q. Is that possible in Datalevin? I didn’t see it in the docs or limitations section.
Because if arbitrary methods are allowed then I could just deserialize the big int as EDN and do my comparison operations on it. But that would have extra overhead.
What’s needed to store bigint is to find a way to serialize that into bytes, such that the bytes ordering is the same as the bigint ordering.
BigIntegers have a toByteArray instance method. Would that work?
the constructor can also take in a byte array, for the reverse situation https://docs.oracle.com/javase/8/docs/api/java/math/BigInteger.html#BigInteger-byte:A-
ah but this byte array will be two’s-complement representation…
But that won’t be an issue for positive ints, not sure how negative numbers should be handled
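For the unsigned 256-bit case specifically, here is a minimal Java sketch of one way to get bytes whose ordering matches the integer ordering. It is not Datalevin's encoding; the class name U256Codec and the fixed 32-byte width are assumptions for illustration. The idea is to left-pad the big-endian magnitude to a fixed width, since variable-length toByteArray output does not sort correctly on its own.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper, not part of Datalevin: a fixed-width,
// order-preserving encoding for unsigned 256-bit integers.
final class U256Codec {
    static final int WIDTH = 32; // 256 bits

    // Encode 0 <= n < 2^256 as exactly 32 big-endian bytes.
    static byte[] encode(BigInteger n) {
        if (n.signum() < 0 || n.bitLength() > 256) {
            throw new IllegalArgumentException("out of range for u256: " + n);
        }
        // toByteArray is minimal-length two's complement, so a value with
        // bit 255 set comes back as 33 bytes with a leading 0x00 sign byte.
        byte[] raw = n.toByteArray();
        byte[] out = new byte[WIDTH];
        int srcPos = Math.max(0, raw.length - WIDTH); // skip the sign byte if present
        int len = raw.length - srcPos;
        System.arraycopy(raw, srcPos, out, WIDTH - len, len); // right-align, zero-pad on the left
        return out;
    }

    // Signum 1 tells BigInteger to read the bytes as a non-negative magnitude.
    static BigInteger decode(byte[] bytes) {
        return new BigInteger(1, bytes);
    }

    // Lexicographic comparison treating each byte as 0..255 (Java 9+).
    static int compareEncoded(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```

Negative numbers would need extra handling (for example a fixed width plus a flipped sign bit), which this sketch deliberately leaves out.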
If you wanna be lazy about it you can print bigint to string and then parse that
Not optimal in terms of storage
Don’t the internals of the engine have to have a comparison operation for putting things into the index? BigInteger already has a compareTo method, so it seems like toByteArray gets you the bytes to store, and the internals should pull the value to be compared, make it a BigInteger, and just use compareTo. That gets the AVE index in the right order, which is the primary concern, isn’t it?
To have good performance, we would like to use the native bytes comparison. It’s not a big problem to encode BigInt correctly. We are doing the same for long, integer, double and float already. It’s trickier for BigDecimal. I am thinking of encoding it as a double and using that as a prefix, so the range scan would be roughly in the correct order, and then disambiguating with the exact representation of the big decimal.
so big decimal will be represented as 3 parts: a 64 bit double prefix, a 32 bit integer scale, followed by the big integer part
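A rough Java sketch of that three-part layout, for illustration only. The class and method names are hypothetical, and the double-to-bytes transform shown (flip the sign bit for non-negative values, flip all bits for negative ones) is a common order-preserving trick that I am assuming here, not something confirmed as the actual implementation.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Hypothetical sketch of the layout described above:
// [8-byte order-preserving double prefix][4-byte scale][unscaled BigInteger bytes]
final class BigDecimalKeySketch {

    // Map a double to a long whose unsigned order matches numeric order.
    static long orderedBits(double d) {
        long bits = Double.doubleToLongBits(d);
        // Negative doubles: flip every bit so larger magnitudes sort first.
        // Non-negative doubles: flip only the sign bit so they sort after negatives.
        return bits < 0 ? ~bits : bits ^ Long.MIN_VALUE;
    }

    static byte[] encode(BigDecimal x) {
        byte[] unscaled = x.unscaledValue().toByteArray();
        // ByteBuffer is big-endian by default, which is what byte-wise ordering needs.
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 + unscaled.length);
        buf.putLong(orderedBits(x.doubleValue())); // approximate prefix for range scans
        buf.putInt(x.scale());                     // exact scale
        buf.put(unscaled);                         // exact unscaled value
        return buf.array();
    }

    static BigDecimal decode(byte[] key) {
        ByteBuffer buf = ByteBuffer.wrap(key);
        buf.getLong();                             // the prefix is only for ordering
        int scale = buf.getInt();
        byte[] unscaled = new byte[buf.remaining()];
        buf.get(unscaled);
        return new BigDecimal(new BigInteger(unscaled), scale);
    }
}
```

Values that share the same double prefix fall back to comparing the scale and unscaled bytes, which is not guaranteed to be numeric order. That matches the "roughly in the correct order" caveat: the prefix narrows the scan and the exact part identifies the value.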
once we have bigint, it’s natural to have big decimal, the goal is to have feature parity with datomic
the next release will have bigint and bigdecimal. I will get to it after I have some time
Seems like you could just use the same approach as I described for bigint, but also store the scale, etc:
BigDecimal(BigInteger intVal, long val, int scale, int prec)
then you have compareTo again.
sure, but I don’t assume I always have a Java method to access the bytes, so I am encoding things into the right bytes such that a native C function can compare very quickly.
e.g. our comparator for the graalvm version is a dead simple C function, only a few lines.
Because bytes comparison is pretty much most of the real work. Others are just pointer chasing.
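To give a feel for how small such a comparator can be, here is the same idea sketched in Java rather than C. This is not the actual Datalevin comparator, which is not shown in this thread; the class name LexBytes is made up, and breaking ties by length is my assumption.

```java
// Hypothetical sketch of a memcmp-style comparator: walk both keys
// byte by byte, treat each byte as unsigned, stop at the first difference.
final class LexBytes {
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF; // Java bytes are signed, so mask to 0..255
            int y = b[i] & 0xFF;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        // One key is a prefix of the other: the shorter key sorts first.
        return Integer.compare(a.length, b.length);
    }
}
```

A comparator like this knows nothing about types, which is why the encoding has to carry all of the ordering information.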
These range compares only apply when using operations on key-value datastore mode, right? Not the datalog store?
Oh I wasn’t aware. I thought that each database was either datalog type or key-value type.
Ah I see, you have kv databases like "datalevin/ave" and such. So if I have a datalog attribute :price, how do I find all the entities that have it between 1000 and 10000 with a kv range query?
Attributes are stored as int ids. Say :price is 5; then the range query is something like [:closed 5-1000-0 5-10000-maxe], where - is concatenation. Of course, the integers are encoded such that the byte order is the same as the value order.
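A small Java sketch of that kind of closed range scan, using an in-memory sorted map as a stand-in for the kv store. Everything here is an assumption for illustration: the class name, the field widths (4-byte attribute id, 8-byte value, 8-byte entity id), and the sign-bit flip on the value are not taken from Datalevin's actual key format.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical AVE-style index: key = attribute id, then value, then entity id.
final class AveRangeSketch {

    // Concatenate the three parts into one big-endian key.
    static byte[] key(int attr, long value, long entity) {
        return ByteBuffer.allocate(20)
                .putInt(attr)
                .putLong(value ^ Long.MIN_VALUE) // flip sign bit so byte order matches signed order
                .putLong(entity)                 // entity ids assumed non-negative
                .array();
    }

    // Sorted map ordered by unsigned byte comparison (Java 9+).
    static TreeMap<byte[], Long> newIndex() {
        Comparator<byte[]> cmp = Arrays::compareUnsigned;
        return new TreeMap<>(cmp);
    }

    static void add(TreeMap<byte[], Long> index, int attr, long value, long entity) {
        index.put(key(attr, value, entity), entity);
    }

    // Closed range [attr-lo-0, attr-hi-maxe], mirroring the query shape above.
    static List<Long> entitiesInRange(TreeMap<byte[], Long> index, int attr, long lo, long hi) {
        byte[] from = key(attr, lo, 0L);
        byte[] to = key(attr, hi, Long.MAX_VALUE);
        return new ArrayList<>(index.subMap(from, true, to, true).values());
    }
}
```

With :price mapped to attribute id 5, calling entitiesInRange(index, 5, 1000, 10000) returns the entities whose price lies in that closed range, in value order, without touching keys for other attributes.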
The above is the current implementation. The next datalog store and query engine is more sophisticated. So entity ids will most likely come from a bitmap.
Thank you for the explanation