
@currentoor: there’s also the :with clause in Datalog queries


btw, the first datalog pattern in your timestamps query [?eid _ _ ?tx _] is made redundant by the second
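For context, a minimal sketch of a timestamps query using :with (the attribute and entity names here are hypothetical, not the original query):

```clojure
;; Find the transaction instants of every datom about ?eid.
;; :with ?tx keeps duplicate ?inst rows that the set-semantics
;; :find result would otherwise collapse.
'[:find ?inst
  :with ?tx
  :in $ ?eid
  :where
  [?eid _ _ ?tx]              ;; any datom about ?eid, bound to its tx
  [?tx :db/txInstant ?inst]]
```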


@bkamphaus: what is the maximum size a Datomic database can reach? i vaguely remember Stu either talking about or writing about this somewhere but i can’t find it. i know 1 billion datoms is possible. what’s the total ‘address space’?


@robert-stuttaford: ~10 billion datoms is the problem point. Not an address space thing, but problematic


@robert-stuttaford: also note that you can have at most ~20k idents in the db, because every ident is in memory in every peer/transactor
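For reference, idents are typically introduced as enum entities like the sketch below (names hypothetical); each such entity is what stays memory-resident on every peer and the transactor:

```clojure
;; Each map creates an entity whose only assertion is a :db/ident,
;; a common Datomic pattern for enum values.
[{:db/ident :order.status/pending}
 {:db/ident :order.status/shipped}
 {:db/ident :order.status/delivered}]
```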


thanks @tcrayford ! what makes 10b datoms a problem? can you direct me to something to read or watch?

Ben Kamphaus 14:02:40

@robert-stuttaford: Stu's answer on this thread (!topic/datomic/iZHvQfamirI) elaborates a little more -- it's a practical limit and the value is a rough rule of thumb. The database still functions, but probably not with acceptable performance characteristics, especially if the transaction volume would reach that size limit quickly for a given use case.


super valuable info


What is an ident of which there can be at most 20k? I'd like to understand this limit.

Ben Kamphaus 17:02:11

the in-memory aspect of idents is documented here:


@bkamphaus: Thank you for that link.


So is it fair to say that the ident limitation is primarily felt with more complex schemas?


If so, what is the impact of schema evolution?

Ben Kamphaus 17:02:43

@meow: I’m not familiar with anyone running up against practical limits with ident count, though I imagine it would have an impact if you had e.g. generated or flexible tagging that users provided (if you anticipated thousands and thousands of that sort of tag, I would say switch to a unique/identity keyword or string attribute of your own).
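A sketch of what that alternative could look like (attribute name hypothetical; map-form schema shown, older peer versions also needed :db/id and :db.install/_attribute):

```clojure
;; One :tag/name entity per tag instead of one ident per tag --
;; only the segments you touch get pulled into cache.
[{:db/ident       :tag/name
  :db/valueType   :db.valueType/string
  :db/cardinality :db.cardinality/one
  :db/unique      :db.unique/identity}]

;; Upserting a user-provided tag (peer API style):
;; @(d/transact conn [{:tag/name "clojure"}])
```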

Ben Kamphaus 17:02:59

there’s also a limit on schema elements, but it’s pretty high: 2^20


Braid has open-ended tagging of conversations.


We will hit those limits.


Is there a performance penalty to the unique/identity keyword or string attribute of our own?


And can you address the impact of schema evolution?

Ben Kamphaus 17:02:50

an ident is more performant but carries more memory overhead (it’s pre-loaded). With your own unique attr on a ref’d entity vs. an ident, you pay the cost of retrieving segments and need a warm cache, etc. (roughly three orders of magnitude apart: getting a segment from storage, from memcached, from the object cache).


That is unfortunate.

Ben Kamphaus 17:02:03

if by schema evolution you mean how to make the change, you can find every one of those enums and give each one an attr/val keyword identical to what the ident was, leaving the entity intact.
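A sketch of that migration under the assumptions above (all attribute and enum names hypothetical):

```clojure
;; First install your own unique keyword attribute...
@(d/transact conn
  [{:db/ident       :status/name
    :db/valueType   :db.valueType/keyword
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity}])

;; ...then, for each enum entity, assert the old ident's name as the
;; value of the new attribute, leaving the entity itself intact.
@(d/transact conn
  [[:db/add :order.status/pending :status/name :order.status/pending]])
```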

Ben Kamphaus 17:02:24

but obviously pull, query, etc. and the automagic around ident/eid translation is lost, and it requires a more verbose lookup ref.
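Concretely, the difference looks something like this (names hypothetical):

```clojure
;; With an ident, the keyword itself identifies the entity:
[:db/add order-id :order/status :order.status/pending]

;; With your own unique attr, you need a lookup ref instead:
[:db/add order-id :order/status [:status/name :order.status/pending]]
```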


By schema evolution I mean the addition and/or removal of entity attributes over time as the database design changes in a production environment, along with the issues of migrating existing entities and how that works in Datomic given that it is immutable.

Ben Kamphaus 17:02:08

I want to double-check on that 20k limit; I'm not sure if it was calculated or is a rule of thumb Stu or someone provided, i.e. in a video. I do know that we caution people against too many idents, but I’m not familiar with that specific boundary. @tcrayford, if you don’t mind a quick follow-up question, can you refer me to the source for the 20k ident limit?

Ben Kamphaus 17:02:52

@meow: not immutable over time, i.e. you can retract idents, assert them on other attributes, etc. But for testing, staging, etc., a lot of times you’re using the database itself as a test, then migrating the portion of the schema/data you prefer to keep.


We always migrate the production instance of Braid.


We have the full history.

Ben Kamphaus 17:02:23

the “present” database t/snapshot is the efficient one I mean, as in:

Ben Kamphaus 17:02:41

“queries about "now" are as efficient as possible – they do not consider history and pay no penalty for history, no matter how much history is stored in the system.”
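In code terms, history is opt-in via a separate database value; a sketch (assuming a hypothetical :tag/name attribute):

```clojure
;; The current database value ignores history entirely:
(def db (d/db conn))
(d/q '[:find ?name :where [?e :tag/name ?name]] db)

;; Historical datoms only appear against an explicit history db,
;; where the fifth pattern slot binds the added/retracted flag:
(d/q '[:find ?name ?added
       :where [?e :tag/name ?name _ ?added]]
     (d/history db))
```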


What schema is used when I query for something that happened yesterday? Is it yesterday's schema or today's schema, assuming the schema was changed?


Braid is an online group chat application with groups and tags, and no limits on either.


And the schema is evolving daily.


And we have a production instance running since day 1.


I use it every day.

Ben Kamphaus 18:02:13

@meow: answers to many of your questions are covered here: — however, an ident is not intrinsically a schema element (i.e. your own enums are not in :db.part/db), and an entity having an ident now or in the past doesn’t introduce the kind of complications you get from e.g. relaxing and then trying to re-assert a unique constraint.


I understand that aspect.


"Thus traveling back in time does not take the working schema back in time, as the infrastructure to support it may no longer exist. Many alterations are backwards compatible - any nuances are detailed separately below."


That was the answer I was looking for.


I wrote Schevo in Python. Schevo was for "schema evolution". It was similar to Datomic but OO.

Ben Kamphaus 18:02:33

I have to step away for a while, I’ll check in on the 20k limit re: idents Monday AM with the dev team. I’ll let you know how precise that limit is or if there are tradeoffs you can make (i.e. if you can keep running it up if it’s an important enough aspect of the architecture and you can accommodate via schema provisioning, cache settings, etc.).


Thank you for all your help.

Ben Kamphaus 18:02:13

s/schema provisioning/machine provisioning


We could also take a federated approach to scaling.


@jamesnvc: @rafd @crocket See above for details on Datomic limitations. ^


If I understand correctly, the ident limit is with regards to :db/ident things?


tags in braid are just strings that we do look-up on, so the schema shouldn’t actually be growing


(this would be relevant for another project @rafd and I have worked on though)

Ben Kamphaus 18:02:08

@jamesnvc: yes, this is only about the count of entities that have :db/ident and its impact on memory. I’m trying to source the practical limit that was quoted here, as I’m not familiar with it, but the softer principle of limiting the total number of things with idents (because you always pay their memory overhead) should be a modeling consideration.


yeah, that makes sense


I would apply the same guideline for Datomic Idents that I use for Keywords in Clojure applications: do not use Keywords for anything user-generated.


@bkamphaus: pretty sure I was wrong and the limit is just 2^20