2016-09-19
I heard some rumors about people having issues with using Datomic entity ids in JavaScript, since JS's highest safe integer is 2^53-1, while Datomic entity IDs are 64-bit longs. But I see that all my entity IDs are around 175e+11, meaning I would run out of my allotted 10B datoms way before I encounter that issue. What am I missing?
I think you'll only have problems if you have a large number of partitions
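A rough way to check from a peer (a sketch only, assuming a connected peer `conn`; the exact bit layout of entity ids is an implementation detail):
```clojure
;; The high bits of an entity id carry its partition, so ids minted in
;; many (or high-numbered) partitions grow far faster than the datom
;; count alone would suggest.
(require '[datomic.api :as d])

(def js-max-safe-integer 9007199254740991) ; 2^53 - 1

(defn js-safe-eid? [eid]
  (<= eid js-max-safe-integer))

(let [eid 17592186045418]            ; an example id from :db.part/user
  {:partition (d/part eid)           ; entity id of the id's partition
   :js-safe?  (js-safe-eid? eid)})
```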
As far as I understand, it uses conditional put for the root refs (index-root-id etc.) and then subsequent segments are either there or not there yet
Can't remember the exact behaviour if a query needs a segment that isn't there yet, I guess it throws an error and you just wait until it's available
But might be completely wrong
potentially i think - it depends on when the last one was created, how much you change it, and in what direction
i think you're ok as long as you don't change it backwards past the time that the last one was created
that might cause other problems with transaction instants anyway
@yonatanel If you change the system clock to a time earlier than the last recorded transaction, Datomic will reject transactions until the clock "catches up" again
@magnars One note about EIDs in your external client - It's generally not recommended to use the Datomic-generated entity IDs as external identifiers. Whenever possible you should model identity yourself with a domain or generated identifier.
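For example, a minimal sketch of a generated external identifier (the attribute name :order/public-id and the connection `conn` are hypothetical):
```clojure
(require '[datomic.api :as d])

;; a :db.unique/identity attribute holding a generated UUID, used as the
;; externally visible id instead of the Datomic eid
(def order-id-schema
  [{:db/id                 (d/tempid :db.part/db)
    :db/ident              :order/public-id
    :db/valueType          :db.type/uuid
    :db/cardinality        :db.cardinality/one
    :db/unique             :db.unique/identity
    :db/doc                "Stable external identifier for an order."
    :db.install/_attribute :db.part/db}])

@(d/transact conn order-id-schema)

;; assign a squuid at creation time and hand *that* to clients;
;; [:order/public-id <uuid>] then works as a lookup ref internally
@(d/transact conn [{:db/id           (d/tempid :db.part/user)
                    :order/public-id (d/squuid)}])
```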
@yonatanel @danielstockton is mostly correct about the way consistency is implemented. The conditional put of the root ONLY occurs once all the segments below that root are written. This means that you can never have an inconsistent database. If the root is present, all nodes below it are as well.
@marshall: thanks for the heads up. I'm using them for only short-lived transient IDs in the client tho. They're never stored anywhere.
We have a Datomic database where the "core entities" are categorised by country, with a lot of other entities around these (indirectly also categorised by country). We also have users corresponding to one of these roles: ADMIN, COUNTRY_ADMIN and REVIEWER. Unless you are an ADMIN, you can't read information from countries other than the ones you belong to. ADMIN always has "write" rights (to add facts) to all entities. All other roles can only write to the entities belonging to countries they are a member of. A REVIEWER can read ADMIN-related information, but is not allowed to write ADMIN information (the same for COUNTRY_ADMIN -> ADMIN). We keep track of the currently logged-in user, and we store the countries he belongs to and which role he has. How should we best implement this?
1. By adding extra parameters to every function that does a Datomic query, plus extra criteria in the query. Maybe by using Datomic rules.
2. By having a central function that returns a filtered database, based on the current user's countries and role level, and using that to query the database.
3. Any other ideas to solve these cross-cutting concerns?
Without getting into your specific use case, a filtered database is a convenient general solution, but comes with a performance cost (examining every Datom your query touches). Adding extra selection criteria to every query might be more efficient, but comes at the cost of added complexity on every query. Another possible approach is to do all your normal queries without considering authorization, then trim the results based on what the current user is allowed to see.
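A minimal sketch of the filtered-database option (the attribute and role names here are assumptions about your schema, not anything Datomic provides beyond d/filter):
```clojure
(require '[datomic.api :as d])

(defn readable-db
  "Returns a view of db restricted to the user's countries.
  Assumes entities carry an :entity/country ref and the user entity has
  :user/role and :user/countries (hypothetical attributes)."
  [db {:keys [user/role user/countries]}]
  (if (= role :role/admin)
    db
    (let [allowed (set (map :db/id countries))]
      (d/filter db
                (fn [db datom]
                  (let [country-id (get-in (d/entity db (:e datom))
                                           [:entity/country :db/id])]
                    ;; keep schema/tx datoms and anything without a country
                    (or (nil? country-id)
                        (contains? allowed country-id))))))))

;; every read then goes through the filtered value:
;; (d/q '[:find ?e :where [?e :entity/name]]
;;      (readable-db (d/db conn) current-user))
```
Note the predicate runs for every datom a query touches, which is exactly the performance cost mentioned above.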
@marshall and @jaret, i've encountered this issue in my new infrastructure codebase https://groups.google.com/forum/#!msg/datomic/IXsSUqMkgGo/hMVLcUeqmNEJ
root cause is transactor isn't receiving a public ip (doesn't need one), and so setting alt-host is failing. any recommendations for options, or should i just set an ip?
I've encountered that issue as well. Adding a public IP is the easiest thing to do. Some users have reported success removing or editing the alt-host line in the CloudFormation template; I don't know if that works.
public ip it is!
@stuartsierra: a recent cognicast mentioned your predilection for 'decanting Datomic databases'. is this something you've done a lot?
i'm busy preparing to do this for a pretty large database. any… uh, tips?
by decant, i mean, rebuild in transaction order. and per tx, either discarding, or altering in flight
i have to use a streaming approach because it's tens of millions of transactions. i was wondering if there are any gotchas you may be able to warn me about
@robert-stuttaford: The main challenge is translating entity IDs from the "old" DB to the "new." If every entity in your database has a :db.unique/identity attribute, then just use those.
Without that, you have to maintain a mapping from old EIDs to new EIDs. I used a key/value store like LevelDB.
If you're relying on that EID mapping in an external store, then you cannot stream the transactions, because you have to get the resolved tempids from the previous transaction before you can translate the subsequent transaction.
Also make sure the process you're building is resumable: During a long import job, the Transactor will pause occasionally, causing transaction errors. Your Peer process has to be able to continue where it left off without skipping any transactions. Ideally, you want it to persist its state (i.e., last transaction copied) on disk.
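A minimal sketch of such a replay loop, assuming peer connections `src-conn` and `dst-conn`, an `eid-map` atom, and a `persist-checkpoint!` function you supply (it glosses over ref-valued datoms, original partitions, and schema handling):
```clojure
(require '[datomic.api :as d])

(defn copy-tx!
  "Copies one source transaction into dst-conn, translating old entity
  ids through eid-map (an atom of old-eid -> new-eid). Attribute ids are
  translated via their idents; ref-valued v positions are left untranslated
  here for brevity."
  [src-db dst-conn eid-map {:keys [t data]}]
  (let [tx-eid    (d/t->tx t)
        datoms    (remove #(= (:e %) tx-eid) data) ; skip the tx entity's own datoms
        tempids   (into {}
                        (for [{:keys [e]} datoms
                              :when (not (contains? @eid-map e))]
                          [e (d/tempid :db.part/user)]))
        resolve-e (fn [e] (or (@eid-map e) (tempids e)))
        tx-data   (for [{:keys [e a v added]} datoms]
                    [(if added :db/add :db/retract)
                     (resolve-e e)
                     (d/ident src-db a)
                     v])
        result    @(d/transact dst-conn tx-data)]
    ;; remember the old->new resolutions for later transactions
    (doseq [[old-e tid] tempids]
      (swap! eid-map assoc old-e
             (d/resolve-tempid (:db-after result) (:tempids result) tid)))))

(defn decant!
  "Replays the source log from start-t onward, one tx at a time,
  persisting a checkpoint after each so the job can resume where it
  left off after a transactor pause."
  [src-conn dst-conn start-t eid-map persist-checkpoint!]
  (let [src-db (d/db src-conn)]
    (doseq [tx (d/tx-range (d/log src-conn) start-t nil)]
      (copy-tx! src-db dst-conn eid-map tx)
      (persist-checkpoint! (:t tx)))))
```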
thank you, @stuartsierra -- i'm definitely planning a pause capable approach
happily, i think i will be able to avoid the external ID mapping, because i can just add unique ids to everything in the source database first
and use the source database as the mapping, because i don't care about its cleanliness in the long run
i may have a question or two, but what you've shared so far is great. thank you
You're welcome.
what's the largest database you've decanted, @stuartsierra?
I've deleted all Datomic dbs, ran gc-deleted-dbs, ran a full/freeze VACUUM in Postgres, restarted all processes, and somehow the "datomic_kvs" table still uses ~2.5GB of disk space?
@robert-stuttaford about 9 billion datoms.
wow. that's awesome!
That took days.
yeah i was just about to ask
i haven't counted datoms yet, but we're looking at 50mil+ txes
i'm going to be interleaving two databases into one
going to be quite a lot of fun, and it's going to feel really good to expunge all the newbie mistakes we made over the last 4 years
some real facepalm moments in there
That's a common motivation for doing it.
any idea how big that database was in storage, @stuartsierra ?
Another trick: consider "decanting" into a dev database and then use backup/restore to move into distributed storage.
the longer term motivation for building this out now is that it becomes possible to rebuild far more quickly in future, e.g. to shard
oh, yes. definitely
is the diagnostic tool mentioned in the announcement of version 0.9.5302 easily available?
huh. think we'll go quite quickly; we're only at 84mil datoms
@ckarlsen So if you deleted the DB as mentioned earlier, I am not sure you can run the diagnostic. Is there a reason you cannot just delete the table and assume the 2.5 GB is garbage that can no longer be collected?
@jaret no reason, just curious. I've been doing lots of retractions and additions lately on a local dev db during testing, and often the transaction throughput is horribly slow, from ~5ms to 2-3sec, for no apparent reason. This is originally a database that's been through a lot of software upgrades.