This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2015-08-26
Datomic is really neat technically, but how do you tackle the question of keeping your data in a closed system?
@jonas i can talk to you about pagination
@robert-stuttaford: curious to hear your take. I have a lot of thoughts about it
@sdegutis: a new connect is also downloading the live index from the transactor
I'd expect that to typically be very small in the dev transactor, but maybe that's just my use case 😉
we’re actually struggling with pagination at the moment
but i think our issue is conceptual more than it is a fault of Datomic's
you have 100k entities. you want to see the 100 most recently active ones. you have to get all 100k, sort by activity date descending, then grab the first 100
how to break this work up?
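One partial answer to the question above, as a rough sketch rather than anything Datomic-specific: picking the top 100 does not need a full sort. A bounded heap keeps only the best `k` candidates while scanning, so the cost is roughly O(n log k) instead of O(n log n). The scan still touches every entity. The `(entity_id, last_active)` row shape and the `most_recent` name are assumptions made for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

Row = Tuple[int, int]  # (entity_id, last_active) -- hypothetical row shape


def most_recent(rows: Iterable[Row], k: int) -> List[Row]:
    """Return the k rows with the largest last_active value, newest first.

    heapq.nlargest keeps at most k candidates in memory while it scans,
    so the full input is never sorted.
    """
    return heapq.nlargest(k, rows, key=lambda row: row[1])


# 100k synthetic entities with distinct activity values.
rows = [(eid, (eid * 7919) % 100_003) for eid in range(100_000)]
top = most_recent(rows, 100)
```

This helps with the first page only; later pages still need either a larger `k` or a stored ordering.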
we’re doing things like memoizing (on a redis backend) with the db value so that once you’ve generated the set, pages 2..n are super fast
but making that initial set is still super slow
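The memoization idea can be sketched as below, assuming a plain dict stands in for Redis and that a database value is identified by an integer basis point (`basis_t` here is a hypothetical parameter name). Because a db value never changes, a cached ordering for that key never goes stale. The first call pays the full cost, and later pages are list slices.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for the Redis backend: (basis_t, sort name) -> ordered entity ids.
_cache: Dict[Tuple[int, str], List[int]] = {}


def sorted_ids(basis_t: int, sort_name: str,
               compute: Callable[[], List[int]]) -> List[int]:
    """Return the ordered id list for this db value, computing it at most once."""
    key = (basis_t, sort_name)
    if key not in _cache:
        _cache[key] = compute()  # the slow part: only runs on a cache miss
    return _cache[key]


def page(ids: List[int], page_no: int, size: int) -> List[int]:
    """Slice out a 1-based page from an already ordered id list."""
    start = (page_no - 1) * size
    return ids[start:start + size]
```

A new transaction yields a new basis point and therefore a cache miss, which matches the complaint that the initial set is the slow part.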
we’re pre-calculating as much as we can, too, so that queries only need look at a single attr per entity
@robert-stuttaford: in that case, raw index walking isn't too hard, right?
but that’s not always possible
it is if you want descending order
storing ever descending values doesn’t work. it’s not really a solution if you have lots of existing data
uh, wha?
if it were easy to enable some sort of ‘clutch’ and allow edits in the past, then it’s easy to fix with ETL
but it’s not. to do that we’d have to retransact our entire db in time order and alter txes on the fly
we’re at over 40mil txes
oh, he stores them in ascending order by inverting them from Long/MAX_VALUE. Now I understand 😉
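The inversion trick mentioned here can be illustrated with a small sketch. It assumes timestamps are non-negative integers and uses a sorted list as a stand-in for an ascending index. Storing `Long/MAX_VALUE - ts` means the smallest stored value belongs to the newest item, so an ascending walk yields descending time order. The function names are made up for the example.

```python
from typing import Dict, List

LONG_MAX = 2**63 - 1  # same value as Java's Long/MAX_VALUE


def invert(ts: int) -> int:
    """Map a timestamp so that newer timestamps become smaller numbers."""
    return LONG_MAX - ts


def newest_first(activity: Dict[str, int], limit: int) -> List[str]:
    """Walk an ascending 'index' of inverted values and take the first few."""
    index = sorted((invert(ts), eid) for eid, ts in activity.items())
    return [eid for _, eid in index[:limit]]
```

As noted in the conversation, this only helps if the inverted value was stored from the start; existing data would have to be backfilled.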
conceptually all you'd need is access to an inverted index (which seems… relatively doable?)
right now, we take the sort dimension you want to use, realise the full set for just that one ‘attr’ (might be computed, might be direct lookup), sort, paginate, then realise the rest of the data for each ‘row’
we’ve cut a lot of processing time like this, and i’ve got it all using datalog and transducers as much as possible
but it still takes long for big sets
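The two-phase approach described above might look roughly like this. It is a sketch under assumptions, not the actual implementation: `sort_value` and `load_row` are hypothetical callbacks standing in for the single-attribute lookup and the full row fetch. Only the sort key is realised for every entity, and the expensive row load runs just for the requested page.

```python
from typing import Any, Callable, Iterable, List


def paginate(entity_ids: Iterable[int],
             sort_value: Callable[[int], Any],
             load_row: Callable[[int], Any],
             page_no: int,
             size: int) -> List[Any]:
    """Sort on one cheap value per entity, then load full rows for one page."""
    # Phase 1: one cheap lookup per entity, then sort descending.
    keyed = sorted(((sort_value(e), e) for e in entity_ids), reverse=True)
    # Phase 2: expensive row realisation, limited to the requested page.
    start = (page_no - 1) * size
    return [load_row(e) for _, e in keyed[start:start + size]]
```

The sort in phase 1 is still linear in the number of entities, which is consistent with the remark that big sets remain slow.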
(many folk have asked for inverted indexes though, so I assume they have good reasons for not doing it yet)
we have to find a better way. if we didn’t need to sort, then you can paginate very easily. unfortunately, unsorted data is fairly useless in a reporting context. sorting’s the real perf pain.
yeah 😞 And conceptually, the indexes have the already sorted data, just there's no way to ask datomic for it 😞
i would actually dig to have a 1 or 2 hour hangout with you tom, and whoever else has tried their hand at this to talk about novel options
i can talk through what we’ve done so far
what’s worked, how well, etc
bastard 😁
@robert-stuttaford: That would be great! I will need to read through and respond later. I have a few ideas myself as well
(actually it's bigger than that now, thinking about it. Still, like the number of entities is below 1k)
yeah no we are WAY beyond that
3 years of user data
@robert-stuttaford: yeah, understood 😉
@robert-stuttaford: aren't y'all paid users? Lean on dat support contract
it’s not a datomic support issue. datomic isn’t doing anything wrong
it’d be a consulting gig
no, but a "how do I use your product to do $COMMON_TASK" thing imo (and I think it is a datomic issue, because there's no inverted index access)
for descending sorts, yes
i’m looking into pre-processing with Onyx and creating Sorted Sets in Redis now
I would like (the possibility) to get sorted sets out of the datalog queries where you can specify the sort-order. Then you could also specify offset/limit
as Onyx is processing our data tx by tx, it can update many sets pretty quickly
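The pre-computed sorted set idea can be sketched in memory as follows. This is only an illustration: the `SortedSet` class is a hypothetical stand-in for a Redis sorted set, and no Onyx or Redis API is used. Each processed transaction updates one member's score, and a page read becomes a slice instead of a sort.

```python
import bisect
from typing import Dict, List, Tuple


class SortedSet:
    """Tiny in-memory stand-in for a score-ordered set (member -> score)."""

    def __init__(self) -> None:
        self._scores: Dict[str, int] = {}
        self._entries: List[Tuple[int, str]] = []  # kept sorted ascending

    def add(self, member: str, score: int) -> None:
        """Insert a member or move it to its new score."""
        old = self._scores.get(member)
        if old is not None:
            # Remove the previous (score, member) entry before re-inserting.
            i = bisect.bisect_left(self._entries, (old, member))
            del self._entries[i]
        self._scores[member] = score
        bisect.insort(self._entries, (score, member))

    def top(self, offset: int, limit: int) -> List[str]:
        """Return members by descending score, skipping the first `offset`."""
        n = len(self._entries)
        end = max(0, n - offset)
        start = max(0, n - offset - limit)
        return [m for _, m in reversed(self._entries[start:end])]
```

With this shape, offset/limit reads never trigger a sort; the sorting cost is spread across the incremental updates.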
@jonas: yep, although it’s all still going to happen app-side
if Datomic provides this, it’s going to be a layer around d/q, not a new internal part of it
and we can pretty much do that ourselves
that’s a big fat assumption on my part, of course
i don’t have any sort of insider knowledge or anything 😁
anyway. i have to be off. i’d love to show you guys what we’re doing at the mo, as it might help you, but also it’ll probably help me because you’ll likely poke holes in all of it
perhaps a hangout sometime in September?
So from the docs http://docs.datomic.com/clojure/index.html#datomic.api/q it looks like I can (and perhaps should prefer?) to write queries as maps rather than vectors, but whenever I try it, I get "java.lang.IllegalArgumentException Don't know how to create ISeq from: clojure.lang.Symbol". Am I doing it wrong, or maybe the docs are describing an as yet unreleased API? (I'm running 0.9.5206 which I think is the latest)
I don’t think the map form is preferred (except for when you’re generating queries programmatically). The IllegalArgumentException is probably unrelated. Note that when using the map form you need to wrap the “arguments” in an extra vector (or list): {:find [?a ?b ?c] …}
instead of [:find ?a ?b ?c …]
What are the advantages or disadvantages of using maps to describe transactions vs using vectors?
@sdegutis: maps usually refer to a single entity and they are easier to generate since you can assoc attributes with values in.
@sdegutis: I'm not 100% confident on this next point: vectors might be the only way to leverage user defined or built in functions like :db.fn/retractEntity
I guess I don't understand why transact returns a future when it's not async and waits for it to complete anyway.
so the result is not built if it's not needed
(would be my guess - I'm not on the datomic team)
http://docs.datomic.com/best-practices.html#t-instead-of-txInstant seems to say so
@alexmiller: those are ts, though, not txids?
ah, right
I think it is logical that txids would be as well (for ordering in the index, plus they are serialized at creation time), but I don't know that that is guaranteed