#datascript
2022-05-18
zeitstein 15:05:32

I want to write a 'flat pull', i.e. produce a seq of normalized maps instead of nested maps. My first try was taking out the recursion from the pull pattern I was using, writing a loop and using the now recursion-free pull at each iteration. Simple and quick. It turned out to be 2-3 times slower than pull v3. Don't know if there is significant overhead to calling pull on each iteration? Perhaps this result alone points to my code being the main culprit. Next, I stopped using pull altogether, using a combination of datoms and entity to build up the results. That performs about the same. I mean, I can write exactly what I need, surely it should be able to perform at least the same? So, I decided it was time to solicit some advice, while I'm trying to improve my own code. As someone digging into Datascript internals for the first time, I tried going through pull_api but there's a lot going on there 🙂 I guess this points to what a fantastic job @tonsky did with pull v3 🎉

zeitstein 15:05:03

One thing I'm not handling well is that at each level I need to do a join to the next level in order to sort. I.e. something like:

{:id 1 :children [{:id 2 :order 1} {:id 3 :order 0}]}
;; ->
{:id 1 :children [3 2]} ;; + {:id 2 :order 1} {:id 3 :order 0}
But then I discard that and read the same entities again on the next level.
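
A minimal sketch of that transformation, working on plain nested maps as a stand-in for a pull result. The name `flatten-tree` is hypothetical and this is not zeitstein's actual code; the keys `:id`, `:children` and `:order` are taken from the example above. Each node's children are sorted once and then carried forward in the queue, so they do not have to be read a second time on the next level.

```clojure
;; Hypothetical helper, assuming every node is a map with :id and
;; optional :children / :order keys, as in the example above.
(defn flatten-tree
  "Returns a vector of normalized maps. Each node's :children becomes a
  vector of child ids sorted by :order, and every node appears once."
  [root]
  (loop [queue [root]
         out   []]
    (if-let [node (first queue)]
      (let [kids (sort-by :order (:children node)) ; the 'level join': sort once
            flat (cond-> node
                   (seq kids) (assoc :children (mapv :id kids)))]
        ;; reuse the already-sorted children instead of re-reading them
        (recur (into (vec (rest queue)) kids)
               (conj out flat)))
      out)))

(flatten-tree {:id 1 :children [{:id 2 :order 1} {:id 3 :order 0}]})
;; => [{:id 1, :children [3 2]} {:id 3, :order 0} {:id 2, :order 1}]
```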

dvingo 16:05:49

is this using pull or just entity and datoms?

dvingo 16:05:16

one thought i've had recently is to combine datascript (really any datalog db with entity) and Loom's fly-graph https://github.com/aysylu/loom/blob/d458f0c0dee9021983c64381b90a470f0178cc8e/src/loom/graph.cljc#L539

dvingo 16:05:00

maybe that could work here

dvingo 16:05:16

1. describe your pull as a fly-graph
2. issue dfs on it

dvingo 16:05:14

there was a discussion in the #datalog channel about this which led me to look into graph algos instead of query langs for this sort of problem. in the thread in that channel I used recursive rules to do something like what you want; you'll get back normalized data

zeitstein 16:05:19

I'm not familiar with Loom. I'd delay looking into it until I've given this a shot. At any rate, 2 times slower than pull v3 is not that bad.
> is this using pull or just entity and datoms?
Both, though with pull I'm using :xform to do the sorting. I should be able to come up with a better strategy to deal with that 'level join' thing. I think I have it.

Josh 17:05:09

have you tried using a combo of pull and tree-seq? You should also be able to use only entity and a recursive function to get similar performance to pull, at least that is what I do when I want a flat list of all of the children of an entity
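
A rough sketch of the pull + tree-seq idea, assuming the nested result of a recursive pull is already in hand (the literal `nested` map below stands in for it, and `flat-nodes` is a hypothetical name, not Josh's code). `tree-seq` walks the nested maps depth-first, and each visited node has its `:children` replaced by the child ids sorted by `:order`.

```clojure
;; Stand-in for what a recursive pull might return; the shape is assumed.
(def nested
  {:id 1
   :children [{:id 2 :order 1 :children [{:id 4 :order 0}]}
              {:id 3 :order 0}]})

(defn flat-nodes
  "Flattens a nested pull-like result into a seq of normalized maps."
  [root]
  (->> (tree-seq :children :children root) ; depth-first, pre-order walk
       (map (fn [node]
              (if-let [kids (seq (sort-by :order (:children node)))]
                (assoc node :children (mapv :id kids))
                node)))))

(flat-nodes nested)
;; => ({:id 1, :children [3 2]}
;;     {:id 2, :order 1, :children [4]}
;;     {:id 4, :order 0}
;;     {:id 3, :order 0})
```

Note that the walk still materializes the full nested result first, so it adds work on top of pull instead of replacing it.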

zeitstein 19:05:25

Have not tried tree-seq. Has to be slower than just using pull, though. Happy to report just solving that 'join-level' issue brings my custom 'flat pull' on par with just pull v3. To be clear, this is tailored to my data model, not a general solution.

zeitstein 19:05:41

@U051V5LLP I'm interested in what led you down that path. Can you link to that thread or summarise?

dvingo 19:05:28

it’s the last thread in the #datalog channel

dvingo 19:05:41

I was trying to have arbitrary recursive depth rules - but you can’t do that. the journey I went down: https://forum.datomic.com/t/how-to-do-graph-traversal-with-rules/132/5 led to => https://hashrocket.com/blog/posts/using-datomic-as-a-graph-database led to => https://github.com/Datomic/mbrainz-sample/blob/master/src/clj/datomic/samples/mbrainz/rules.clj and => https://github.com/aysylu/loom/ then on to https://www.youtube.com/watch?v=wEEutxTYQQU and that was where I learned about FLY GRAPHS lol greatest name - making a graph on the fly - this is where I stopped. I want to explore adding arbitrary logic to a graph walk. The end goal is being able to have pull-like behavior where at each step of the recursion you can perform any computation you want to determine your results (continue walking, stop walking, transform the results, etc)

dvingo 19:05:49

also happened upon https://tonsky.me/blog/datascript-2/ (see section “No queries” and “Recursive walking”) which nudged me to think about this as accessing the data itself and not using the query engine (i was thinking just the entity api)

zeitstein 20:05:51

Sounds very interesting and you've given me a lot to peruse 🙂 Thanks!
