
I’m working on adding a list of “tagged values” to an entity and am looking for advice on how to structure it most “datomically”. Essentially, each tagged-value entity has :tagged-value/key and :tagged-value/value, both strings.


I hypothetically want to be able to search by key or value, but they’re otherwise essentially freeform. They’re not logically components of the parent object.


So far I’ve thought of either a) making them all unique — “shared” only by key equality, or b) building them as full-fledged entities, possibly via a tx-fn that looks for an existing one.
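Option (b) might look something like this in schema form (attribute names here are hypothetical, matching the ones mentioned above; no :db/isComponent, since the pairs aren’t logically components of the parent):

```clojure
;; Sketch of option (b): tagged values as full-fledged entities,
;; indexed so you can search by key or by value.
[{:db/ident       :tagged-value/key
  :db/valueType   :db.type/string
  :db/cardinality :db.cardinality/one
  :db/index       true}                 ; searchable by key
 {:db/ident       :tagged-value/value
  :db/valueType   :db.type/string
  :db/cardinality :db.cardinality/one
  :db/index       true}                 ; searchable by value
 {:db/ident       :parent/tagged-values ; parent holds a many-ref to the pairs
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/many}]
```

A tx-fn for deduplication would then query for an existing entity with the same key/value pair before minting a new one.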


Anyone solved something similar, or have suggestions for what will bite me least downstream?

Mark Addleman 20:07:55

Hi - We have a similar requirement. We have a “key” attribute, six “value” attributes (string-value, keyword-value, boolean-value, etc.), and a value-type attribute which tells us which value type was written.
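A hypothetical rendering of that shape (these attribute names are assumptions, not Mark’s actual schema):

```clojure
;; One entity per key-value pair; value-type says which typed slot is populated.
[{:db/ident :kv/key           :db/valueType :db.type/string  :db/cardinality :db.cardinality/one :db/index true}
 {:db/ident :kv/value-type    :db/valueType :db.type/keyword :db/cardinality :db.cardinality/one}
 {:db/ident :kv/string-value  :db/valueType :db.type/string  :db/cardinality :db.cardinality/one}
 {:db/ident :kv/keyword-value :db/valueType :db.type/keyword :db/cardinality :db.cardinality/one}
 {:db/ident :kv/boolean-value :db/valueType :db.type/boolean :db/cardinality :db.cardinality/one}
 ;; ...plus long-value, double-value, instant-value, etc., one per supported type.
 ]
```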

Mark Addleman 20:07:46

we then wrote a rule named kv-value which makes querying against that data structure not too painful
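One way such a kv-value rule might look (a sketch against the hypothetical :kv/* attributes, not Mark’s actual code) — each rule branch binds ?v from whichever typed slot matches the pair’s value-type:

```clojure
;; Rules: (kv-value ?kv ?v) unifies ?v with the populated typed value of pair ?kv.
(def kv-rules
  '[[(kv-value ?kv ?v)
     [?kv :kv/value-type :string]
     [?kv :kv/string-value ?v]]
    [(kv-value ?kv ?v)
     [?kv :kv/value-type :keyword]
     [?kv :kv/keyword-value ?v]]
    [(kv-value ?kv ?v)
     [?kv :kv/value-type :boolean]
     [?kv :kv/boolean-value ?v]]])
```

Queries then pass the rules via the `%` input and call `(kv-value ?kv ?v)` without caring which type was written.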


ah, that makes sense


do you create each instance as a distinct entity and then link by value in your kv-value?

Mark Addleman 21:07:40

yes. we have a parent entity which has a cardinality/many ref to the individual key - value pairs
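Putting the pieces together, a query over that structure might look like this (entity and attribute names are illustrative, assuming a kv-value rule as described):

```clojure
;; Find every key-value pair attached to each parent, regardless of value type.
(d/q '[:find ?parent ?k ?v
       :in $ %
       :where
       [?parent :parent/kvs ?kv]   ; cardinality/many ref to the pairs
       [?kv :kv/key ?k]
       (kv-value ?kv ?v)]          ; rule resolves the typed value slot
     db kv-rules)
```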


Super, that’s exactly how I have it modeled. Thanks!

👍 4

It probably doesn't apply here, but just in case: first consider whether your keys are really attributes. Attributes are themselves entities, so you can add a :mything/is-tagged? true attribute to tag them.
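Concretely, that suggestion could look like this (:mything/is-tagged? and the sample attribute are hypothetical):

```clojure
;; Because attributes are entities, you can assert extra facts on the
;; attribute itself — here, marking it as a user-defined "tag" attribute.
[{:db/ident       :mything/is-tagged?
  :db/valueType   :db.type/boolean
  :db/cardinality :db.cardinality/one}
 {:db/ident       :user/favorite-color   ; a "key" promoted to a real attribute
  :db/valueType   :db.type/string
  :db/cardinality :db.cardinality/one
  :mything/is-tagged? true}]
```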

Mark Addleman 20:07:07

I have a performance monitoring question: In our workload, we transact 200k datoms every 30 minutes. We expect this to increase over time. I wouldn’t be surprised if we are in a position to transact 2 million datoms every 30 minutes in a month or two. My understanding of the Datomic Cloud transact pipeline is that clients transact data through a single node. The transacting node must perform some CPU work on the tx-data and then it writes to EFS, S3, and DynamoDB simultaneously. The transact operation completes when the transacting node has finished writing to all three storage surfaces. Is my understanding correct?


@eraserhd yeah, that would be the easy solution, but the idea here is specifically to capture user-defined “stuff” that’s outside the formal schema attributes.


@mark340 curious, at ~110 writes per second (your current load), how is Datomic handling it? That already seems like a lot for a single db.

Mark Addleman 22:07:05

After a little back and forth with Datomic support (great dealing with them, BTW!), it’s handling it just fine. The trick was to raise our DynamoDB provisioning to 250 write units.

Mark Addleman 22:07:35

I’m still not sure I understand the performance relationship between the Datomic log, EFS, S3 and DynamoDB. I have a question into Support about it and I expect an answer soon.

Mark Addleman 22:07:53

I was hoping for an answer from the community as well 🙂


Ok, nice, I haven't tried the cloud stuff, but isn't it supposed to scale automatically for you for stuff like write units?


also, is this for a single db?

Mark Addleman 22:07:00

it will scale automatically within limits that you get to set. the default limits are pretty good to get started and can be easily changed

Mark Addleman 22:07:11

yes, it is for a single db.

Mark Addleman 22:07:29

fyi - in Datomic Cloud, you are still limited to a single transactor per database, but you can have multiple databases and thus multiple transactor nodes. we might end up scaling that way but, as it stands right now, the single transactor seems to be holding up

👍 4

will you be able to run a single query against multiple databases if you scale up that way? or do you not need that?

Mark Addleman 23:07:30

yes, datalog allows you to query multiple databases without much difficulty. i don’t know the performance implications of it, though
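For context, multi-database queries in datalog work by giving each database its own data source symbol in the `:in` clause; the entity and attribute names below are illustrative:

```clojure
;; Join across two databases: data sources must be symbols starting with $.
;; Each :where clause is prefixed with the source it should run against.
(d/q '[:find ?email
       :in $users $orders ?order-id
       :where
       [$orders ?o :order/id ?order-id]
       [$orders ?o :order/user-email ?email]
       [$users  ?u :user/email ?email]]
     users-db orders-db 42)
```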


but Datomic Cloud doesn't support multiple dbs in a query yet, does it?

Mark Addleman 13:07:22

I don’t know if Cloud supports it yet. I don’t remember reading about the limitation in the docs


last time I tried it didn't work

Mark Addleman 23:07:08

Oh, interesting. I’ll add it to my list of things to investigate 🙂


Does anyone know if the recommended "10 billion datom" limit has increased, or if Datomic Cloud changes this? Trying to figure out if Datomic will scale to our problem.

Mark Addleman 23:07:59

I had a similar question a couple of weeks ago. My memory of the answer: Cognitect tests up to 10 billion datoms. Performance implications vary widely based on structure of your data.

Mark Addleman 23:07:41

The 10 billion is not a limit


Yeah, +1. We have reached 40 billion or so with on-prem. We do see increasing issues with transactor timeouts at that level, and we haven't tried to fully understand whether we can circumvent them. Part of the reason might be that we also have extra indexes on the biggest part of these datoms. For us it's not a big issue, since we solve the problem by starting with a fresh db when the problems get too bad (or the DynamoDB costs too high). I would like to spend some more time figuring out how to scale properly to that amount some day, though.

👍 4

💪 Thanks guys! This helps greatly with our capacity planning.