This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-01-11
@favila, may i suggest you scan through the Datomic changelog?
http://my.datomic.com/downloads/free > click Changes in the first row of the table
I've found the following Datomic db maintenance helper update-all
useful: https://gist.github.com/pesterhazy/479303224559cf7fa372c5af3c992768#file-datomic_maintenance-clj-L28
what does everyone else use in terms of quick 'n dirty db manipulation/transformation helpers?
@pesterhazy every-val
is (into #{} (map :v (seq (d/datoms db :aevt attr))))
🙂
indeed it is
can do the same with (map :e)
for every-eid
and you can switch every-entity
to use a transducer (sequence (map #(d/entity db %)) (every-eid db attr))
oh, you want a set; (into #{} (map #(d/entity db %)) (every-eid db attr))
you could do the transducer thing in every-val
too of course
good thing about datoms is that it's lazy
so maybe don't do the set thing...
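Putting the pieces from this thread together, the helpers might look like this (a sketch following the gist's naming; the transducer variants avoid building intermediate lazy seqs):

```clojure
(require '[datomic.api :as d])

;; Every value of attr in db, read straight from the AEVT index.
(defn every-val [db attr]
  (into #{} (map :v) (d/datoms db :aevt attr)))

;; Every entity id that has attr.
(defn every-eid [db attr]
  (into #{} (map :e) (d/datoms db :aevt attr)))

;; Lazily map the ids to entity maps via a transducer.
(defn every-entity [db attr]
  (sequence (map #(d/entity db %)) (every-eid db attr)))
```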
In previous versions of datomic, one would use :db/id #db/id[:db.part/user -X]
to add entity references in a transaction. In the newer versions, one can use :db/id "someid"
, which is really cool. However, what if the entity is in another part of the database? (If one has an entity reference like :db/id #db/id[:db.part/custom -X]
, how to use the newer version of datomic to reference this? :db/id "someid"
will write it in the db.part/user
, but I want it in another part.)
@kurt-yagram the v1 syntax is still supported
yeah, I know, question is, can we use the 2nd syntax together with the 'named' references somehow?
ok, thanks!
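For reference, the two tempid styles being compared look roughly like this (`:item/name` is a hypothetical attribute; note the string style has no slot for naming a partition):

```clojure
(require '[datomic.api :as d])

;; Older style: d/tempid (or the #db/id reader literal) names a partition.
(def tx-old
  [{:db/id (d/tempid :db.part/custom -1)
    :item/name "widget"}])

;; Newer style: any string works as a tempid, but the entity lands
;; in the default user partition.
(def tx-new
  [{:db/id "widget"
    :item/name "widget"}])
```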
Hi all. I've been trying to understand the performance implication of datomic's partitions. Specifically, the datomic docs state that "entities you'll often query across... should be in the same partition to increase query performance". It just so happened that I ended up with a group of entities with a similar ":entity/type" attribute across two partitions that were being returned on a query result, and when I fixed it, there was no effect on performance. The database is relatively small. My thinking was that maybe it had to do with how the db is being cached in memory, but I'm really not sure. Can anyone shed more detailed light on how poor use of partitions might decrease performance? Specifically, a simple test I could run where I would see the performance effect of partitions? Thanks!
@zalky fewer index segments need to be read into peer memory from storage if all the datoms necessary for a query are partitioned together
this means less read-side pressure — on storage itself, in the peer cache (when it fills up) and in the 2nd-tier cache (memcached). for small databases (or really any database that fits in peer ram) this doesn’t really matter at all. large databases can suffer read performance if they have to read a lot of segments in for normal queries and so cause the cache to churn
basically, unless you think you’re going to have a large database, i wouldn’t worry about it 🙂
the fact that the newest release gives us a simpler partitioning scheme (essentially just sticking to one user partition) shows that it’s not such a worry for most users
Thanks @robert-stuttaford for the response. So to test out the performance effects, do you think it would be sufficient to generate a db of sufficient size, then run a query across partitions? A second question: are there any performance drawbacks of just using a single partition? Or is it just the flip side of the coin: not much to worry about for most use cases.
you’d have to generate a substantial database, and be watching your peer and storage metrics pretty carefully to notice a difference
unless you’re planning a big database and you have strict read-side SLAs to conform to, i wouldn’t worry for now
What's the preferred way of keeping track of an ordered sequence in a datomic store? Say I have lots of elements that users of my site vote on, and from these votes I derive a ranking... how can I keep track of a ranking? :item/rank ? seems a bit funky to try and keep uniqueness.
this is a great question, @sova.
you could use a linked list - 2 links to 1, 3 links to 2, which makes it so small changes don’t require renumbering everything
but this does make determining position costly to do, due to having to traverse the list
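A sketch of the linked-list idea, assuming a hypothetical :item/prev ref attribute (2 links to 1, 3 links to 2, and so on):

```clojure
(require '[datomic.api :as d])

;; Position is not stored anywhere; determining it means walking
;; the chain of :item/prev links back to the head, which is the
;; traversal cost mentioned above.
(defn position [db item-eid]
  (loop [e (d/entity db item-eid), n 0]
    (if-let [prev (:item/prev e)]
      (recur prev (inc n))
      n)))
```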
question, datomic/datalog doesn’t seem to have a concept of paging, offset, etc? How do folks usually implement something like this ? Just walk back and forth through the list of keys?
or you could store a vector of entity ids in a string as a pr-str edn blob
Robert, if you modified your suggestion to also have :item/rank
, reindexing would be a matter of swapping :item/rank
, no?
but that means managing a large string for large collections
if you do rank, then (as with linked list) items can only participate in a single ranking list
if you do rank, you’d have to re-calc all the items between the lower and upper rank for any given change
e.g. something moving from 72 to 45 means altering all of 45 through 72
which may be totally ok
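A sketch of that re-ranking transaction, assuming a hypothetical :item/rank long attribute and a move to a lower (better) rank, e.g. 72 to 45:

```clojure
(require '[datomic.api :as d])

(defn move-item-tx
  "Tx-data that moves item-eid to rank `to`, bumping every rank in
  [to, from) by one. Assumes to < from."
  [db item-eid from to]
  (let [shifted (d/q '[:find ?e ?r
                       :in $ ?lo ?hi
                       :where
                       [?e :item/rank ?r]
                       [(>= ?r ?lo)]
                       [(< ?r ?hi)]]
                     db to from)]
    (conj (mapv (fn [[e r]] [:db/add e :item/rank (inc r)]) shifted)
          [:db/add item-eid :item/rank to])))
```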
do you know of anyone actually doing either at large scale, @marshall ? e.g. 1000s or 10,000s or gulp 1,000,000s?
cool 🙂
Well the rankings will be changing rather rapidly...
As people vote on things
Maybe it's better to calculate them on the fly. But I like the idea of persisting :item/rank in storage. Then I can do some excellent time-travel and see how the entity went down or up in rank over time.
As you said, Robert, "or you could store a vector of entity ids in a string as a pr-str edn blob" .. this is an interesting suggestion. Just keeping a file of all the entity-ids in their rank-order.... Hmm.. I'll have to consider my options a bit.
@eoliphant i'm curious about pagination as well. Based on some googling... http://docs.datomic.com/clojure/#datomic.api/seek-datoms seems very useful
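A rough sketch of cursor-style paging with seek-datoms (the attribute and page size are illustrative; note this pages over raw index datoms, not query results):

```clojure
(require '[datomic.api :as d])

;; seek-datoms starts at a given index position instead of the
;; beginning, so the :e of the last datom on one page becomes the
;; cursor passed in for the next page.
(defn page [db attr after-eid page-size]
  (into [] (take page-size) (d/seek-datoms db :aevt attr (inc after-eid))))
```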
In my own use case I think I'm set on every item having its own unique :item/rank ... so I'll have to make sure they are unique and that they get iteratively updated when the rankings change... I'm curious how this will function on the order of thousands of elements. will let y'all know eventually 🙂
If I wanted to import data into a local DB and then push it to prod, can I be using datomic:free locally and restore to datomic:pro ?
Yeah i’d looked at that @sova , but i’m needing to do it with query results sometimes
@jdkealy Yes you can backup a free and restore into a pro. The only restriction is that backup to S3 is only available in Datomic Pro
ah cool thanks... so when you back up... it makes many directories. would you then tar it, scp onto your transactor server and then restore from local?
You can do that. Alternatively, if you're using pro or starter locally you can backup directly to s3 and restore in production from s3
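The backup / transfer / restore dance described above might look like this (URIs, paths, and hosts are examples, not from the thread):

```shell
# back up the local free db to a directory of segment files
bin/datomic backup-db datomic:free://localhost:4334/mydb file:/tmp/mydb-backup

# bundle and copy to the transactor host
tar czf mydb-backup.tar.gz -C /tmp mydb-backup
scp mydb-backup.tar.gz transactor-host:/tmp/

# on the transactor host: unpack and restore into pro storage
tar xzf /tmp/mydb-backup.tar.gz -C /tmp
bin/datomic restore-db file:/tmp/mydb-backup datomic:dev://localhost:4334/mydb
```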
FWIW, in a past project I just kept an :item/rank
integer and recalculated it on move. It worked fine for me. I was always using pretty small lists (on the order of 10-30 elements or so), and reading the lists was much more common than reordering the lists
not sure what i'm doing wrong but i actually just tried on a fresh install after running ensure transactor
Say, I have a general question. I want to produce a filtered database value representing the universe of data that a specific user can access (based on security rules). I'm thinking about doing this by using (d/with)
to nuke a bunch of entities out of the database value before passing it down to code that will run client-provided pull patterns against it. Is this likely to be performant, or is there a better way to do it?
Another option I've considered is using (d/filter)
, but my rules are too general to filter at the datom level (eg, Bob can see Alice if and only if Bob's project has the same company as Alice's project), so it doesn't seem applicable
@timgilbert surely you can express that as a query?
Yeah, and to a limited extent I've been doing it with rules
But my dream system does this at the top level and then I don't need to add the same boilerplate logic into every single one of my queries
@timgilbert It is difficult to do this kind of access filtering while still supporting the full syntax of datalog queries or pull expressions. It is usually easier to define your own query syntax, perhaps as a subset of datalog or pull expressions, and enforce the access rules in your query evaluator.
Retracting a large number (thousands?) of entities with d/with
is unlikely to perform well. d/filter
with complex queries will also not be "fast."
Hmm, ok, I guess I'll reevaluate my approach
@timgilbert d/filter
may be fast enough. It doesn't sound like you have tried it yet?
The problem we're trying to address is that if our clients send us straight-up pull patterns, they can traverse from entities that they should be able to access to entities they shouldn't, like Alice is an admin for two different companies, and suddenly Bob from BobCo can see all the data in CarlCo by back-navigating through :company/_admin
or whatnot
@favila, I'll think about it some more but since the argument to the filter predicate is a single datom I don't think it will work
Like I have a semantic context that I'm trying to enforce
@timgilbert it's a db and also a single datom
Oh, hmm, didn't realize that, thanks!
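Given that two-argument predicate, a d/filter sketch for the scenario above (the :entity/company attribute and the simple company-matching rule are hypothetical simplifications of the real access rules):

```clojure
(require '[datomic.api :as d])

(defn visible-db
  "A filtered db value hiding datoms about entities whose company
  differs from the viewer's. The predicate receives the unfiltered
  db plus each datom, so it can consult the db, not just the datom."
  [db viewer-company]
  (d/filter db
            (fn [db' datom]
              (let [company (:entity/company (d/entity db' (:e datom)))]
                (or (nil? company) (= company viewer-company))))))
```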
@timgilbert that said, sounds like you could also either whitelist pull attrs (preprocess the pull expr) or completely cover over the search/retrieval "ops" the users are allowed, so that you know they are safe
If you can't trust the client, don't accept raw d/pull
patterns. It will be hard to ensure you've covered all the cases for restricting access.
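One shape for the whitelisting idea: preprocess the client's pull pattern against an allowed set before running it (attribute names are illustrative). Reverse-navigation keys like :company/_admin are simply not in the set, so they get dropped:

```clojure
(def allowed-attrs #{:item/name :item/rank :item/parent})

(defn sanitize-pattern
  "Keep only whitelisted attributes in a pull pattern, recursing
  into map specs; anything else (including '* and reverse refs)
  is dropped."
  [pattern]
  (vec (keep (fn [x]
               (cond
                 (keyword? x)
                 (when (allowed-attrs x) x)

                 (map? x)
                 (let [kept (filter (comp allowed-attrs key) x)]
                   (when (seq kept)
                     (into {} (map (fn [[k v]] [k (sanitize-pattern v)]) kept))))

                 :else nil))
             pattern)))
```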
Yeah, that's what I've been learning. 😉
Think of it like SQL: you wouldn't let your clients send in raw SQL queries.
I mean, I wouldn't accept raw SQL for a postgres-backed service... haha jinx
Ok, well I'll think about this some more. Thanks for the advice @stuartsierra and @favila