pathom

2024-07-03T20:35:20.560309Z

here's another high-level question about Pathom: does it have a good story for deduplicating in the middle of a graph?
• I have a set of Container IDs I care about.
• Containers can be nested; find the transitive closure of relevant Containers and fetch their details.
  ◦ (Batch those requests to the extent possible, ie. 1 query per "level" of the dependency spanning tree, not 1 per transitive Container.)
• Each Container has the IDs of 1 or more Things it cares about.
• Fetch the details of all the relevant Things in one request

the batched resolvers for [container-id ...] -> [container-details ...] and [thing-id ...] -> [thing-details ...] are easy to write. the tricky parts are the recursive containers and the de-duplicated/normalized Things. I want to return {:containers [container-details-without-things ...], :things [...]} in a normalized way, and I'm not sure how to achieve that aside from wrangling several Pathom queries and manually factoring out the Things in the middle. (duplicating the Things on the wire (even if the requests are deduplicated internally) is not great, they're pretty big.)
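
for reference, a minimal plain-Clojure sketch of the normalized shape described above, i.e. the manual "factor out the Things" step. it does not use Pathom; the names containers, fetch-things and normalize and the sample ids are all hypothetical stand-ins, not anything from a real codebase.

```clojure
;; Hypothetical container details, already fetched; their :thing-ids overlap.
(def containers
  [{:id :c1 :thing-ids [:t1 :t2]}
   {:id :c2 :thing-ids [:t2 :t3]}])

(defn fetch-things
  "Stand-in for the batched [thing-id ...] -> [thing-details ...] call."
  [thing-ids]
  (mapv (fn [id] {:id id}) thing-ids))

(defn normalize
  "Returns containers and things side by side, requesting each thing once."
  [containers]
  (let [thing-ids (into [] (comp (mapcat :thing-ids) (distinct)) containers)]
    {:containers (vec containers)
     :things     (fetch-things thing-ids)}))

(normalize containers)
;; => {:containers [{:id :c1, :thing-ids [:t1 :t2]}
;;                  {:id :c2, :thing-ids [:t2 :t3]}]
;;     :things     [{:id :t1} {:id :t2} {:id :t3}]}
```

:t2 is referenced by both containers but appears once in :things, which is the property the question is after.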

wilkerlucio 2024-07-04T23:24:06.109009Z

not sure if I fully understand this, but it sounds like you should be able to leverage the default caching: if the same id is spread across different parts of the query but results in the same resolver call, Pathom should run that call only once for each unique id

caleb.macdonaldblack 2024-07-05T02:28:52.825379Z

(ns io.erical.sandbox.pathom.bshep6
  (:require
   [clojure.set :as set]
   [clojure.test :refer :all]
   [com.rpl.specter :as sp]
   [com.wsscode.pathom3.connect.indexes :as pci]
   [com.wsscode.pathom3.connect.operation :as pco]
   [com.wsscode.pathom3.interface.eql :as p.eql]
   [datascript.core :as d]))

(deftest example1
  (let [schema            {:things     {:db/valueType   :db.type/ref
                                        :db/cardinality :db.cardinality/many}
                           :containers {:db/valueType   :db.type/ref
                                        :db/cardinality :db.cardinality/many}}
        tx-data           [

                           [:db/add "container0" :title "Container 0"]
                           [:db/add "container1" :title "Container 1"]
                           [:db/add "container2" :title "Container 2"]
                           [:db/add "container3" :title "Container 3"]
                           [:db/add "container4" :title "Container 4"]
                           [:db/add "container5" :title "Container 5"]

                           [:db/add "thing1" :title "Thing 1"]
                           [:db/add "thing2" :title "Thing 2"]

                           [:db/add "container0" :containers "container1"]
                           [:db/add "container1" :containers "container2"]
                           [:db/add "container2" :containers "container3"]
                           [:db/add "container2" :containers "container4"]
                           [:db/add "container2" :containers "container5"]

                           [:db/add "container1" :things "thing1"]
                           [:db/add "container2" :things "thing2"]
                           [:db/add "container3" :things "thing2"]]

        {:keys [db-after tempids]} (-> (d/empty-db schema)
                                       (d/with tx-data))

        root-container    {:db/id (get tempids "container0")}

        debug-eid->tempid (comp (set/map-invert tempids) :db/id)

        resolvers         [
                           (pco/resolver 'id>containers
                             {::pco/input  [:db/id]
                              ::pco/output [{:containers [:db/id]}]
                              ::pco/batch? true}
                             (fn [{:keys [db]} inputs]
                               (prn 'id>containers
                                 (mapv debug-eid->tempid inputs))
                               (let [outputs
                                     (vec
                                       (sp/select
                                         [sp/ALL
                                          :db/id
                                          (sp/view (partial d/pull db [{:containers [:db/id]}]))
                                          (sp/if-path nil?
                                            (sp/view (constantly {:containers ::pco/unknown-value}))
                                            sp/STAY)]
                                         inputs))]
                                 outputs)))

                           (pco/resolver 'id>things
                             {::pco/input  [:db/id]
                              ::pco/output [{:things [:db/id]}]
                              ::pco/batch? true}
                             (fn [{:keys [db]} inputs]
                               (prn 'id>things (mapv debug-eid->tempid inputs))
                               (let [outputs
                                     (vec
                                       (sp/select
                                         [sp/ALL
                                          :db/id
                                          (sp/view (partial d/pull db [{:things [:db/id]}]))
                                          (sp/if-path nil?
                                            (sp/view (constantly {:things ::pco/unknown-value}))
                                            sp/STAY)]
                                         inputs))]
                                 outputs)))

                           (pco/resolver 'id>details
                             {::pco/input  [:db/id]
                              ::pco/output [:title]
                              ::pco/batch? true}
                             (fn [{:keys [db]} items]
                               (prn 'id>details (mapv debug-eid->tempid items))
                               (into []
                                 (comp
                                   (map :db/id)
                                   (map (partial d/pull db [:title])))
                                 items)))

                           (pco/resolver '>flat-containers2
                             {::pco/input  [:db/id (pco/? {:containers [:db/id]})]
                              ::pco/output [{:flat-containers [:db/id]}]
                              ::pco/batch? true}
                             (fn [_env inputs]
                               (prn '>flat-containers2
                                 (mapv debug-eid->tempid inputs))
                               (mapv (fn [{:keys [db/id containers]}]
                                       {:flat-containers
                                        (into [{:db/id id}] containers)})
                                 inputs)))

                           (pco/resolver '>flat-containers3
                             {::pco/input  [:db/id {:containers [{:flat-containers [:db/id]}]}]
                              ::pco/output [{:flat-containers [:db/id]}]
                              ::pco/batch? true}
                             (fn [_env inputs]
                               (prn '>flat-containers3
                                 (mapv debug-eid->tempid inputs))
                               (mapv (fn [{:keys [db/id containers]}]
                                       {:flat-containers
                                        (into [{:db/id id}] (mapcat :flat-containers containers))})
                                 inputs)))

                           (pco/resolver '>flat-things
                             {::pco/input  [:db/id {:flat-containers [(pco/? {:things [:db/id]})]}]
                              ::pco/output [{:flat-things [:db/id]}]
                              ::pco/batch? true}
                             (fn [_env inputs]
                               (prn '>flat-things
                                 (mapv debug-eid->tempid inputs))
                               (map
                                 (fn [{:keys [flat-containers]}]
                                   {:flat-things
                                    (into []
                                      (comp
                                        (mapcat :things)
                                        ;; distinct, not dedupe: dedupe only drops consecutive duplicates
                                        (distinct))
                                      flat-containers)})
                                 inputs)))]

        env               (-> {:db db-after}
                              (pci/register resolvers))]

    (is (= {:flat-containers
            [{:title "Container 0"}
             {:title "Container 1"}
             {:title "Container 2"}
             {:title "Container 3"}
             {:title "Container 4"}
             {:title "Container 5"}]
            :flat-things
            [{:title "Thing 1"}
             {:title "Thing 2"}]}
           (p.eql/process
             env
             root-container
             [{:flat-containers [:title]}
              {:flat-things     [:title]}])))))

caleb.macdonaldblack 2024-07-05T02:29:14.742129Z

Logs:

id>containers ["container0"]
id>containers ["container1"]
id>containers ["container2"]
id>containers ["container3" "container4" "container5"]
id>containers ["container3" "container4" "container5"]
>flat-containers2 ["container3" "container4" "container5"]
>flat-containers3 ["container2"]
>flat-containers3 ["container1"]
>flat-containers3 ["container0"]
id>details ["container0" "container1" "container2" "container3" "container4" "container5"]
id>things ["container0" "container1" "container2" "container3" "container4" "container5"]
>flat-things ["container0"]
id>details ["thing1" "thing2"]

caleb.macdonaldblack 2024-07-05T02:30:00.077169Z

The first three containers (0, 1 and 2) are at different depths, so there's no batching for them. Containers 3, 4 & 5 are at the same depth, so they batch.

caleb.macdonaldblack 2024-07-05T02:31:55.659099Z

>flat-containers2 & >flat-containers3 seem a bit jank, but it works. They flatten the containers. I experimented with a bunch of approaches. What I have there is the only one that worked.

caleb.macdonaldblack 2024-07-05T02:33:25.116159Z

id>details returns :title. This is where you can pull all the details you want.

caleb.macdonaldblack 2024-07-05T02:34:34.044089Z

id>things queries things for a container. This batches in one go, which is neat

caleb.macdonaldblack 2024-07-05T02:39:12.852939Z

>flat-things batches once too. It takes :flat-containers at the root, mapcats over :things and dedupes.
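
One caveat when deduplicating with transducers, shown as a small plain-Clojure sketch: `dedupe` only drops consecutive duplicates, while `distinct` drops every repeat. The sample data and the helper `things-with` are hypothetical and exist only for illustration.

```clojure
;; Hypothetical flattened containers; thing 1 is shared by the first and last.
(def flat-containers
  [{:things [{:db/id 1}]}
   {:things [{:db/id 2}]}
   {:things [{:db/id 1}]}])

(defn things-with
  "Collects all :things, applying the given deduplicating transducer."
  [xf]
  (into [] (comp (mapcat :things) xf) flat-containers))

(things-with (dedupe))   ;; => [{:db/id 1} {:db/id 2} {:db/id 1}]
(things-with (distinct)) ;; => [{:db/id 1} {:db/id 2}]
```

So `dedupe` is only safe when equal things are guaranteed to arrive next to each other.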

caleb.macdonaldblack 2024-07-05T02:41:57.856199Z

I’ve made a handful of assumptions, however.

caleb.macdonaldblack 2024-07-05T02:44:46.108979Z

For example:
1. A container has exactly one parent (except the root one). So containers only show up in the tree once.
2. A container & a thing have a many-to-many relationship.
3. I assumed that your DB cannot query nested containers recursively. So this example leverages Pathom to walk the tree, fetching children at every level.
   a. If you can do a recursive query with your DB, things get way simpler:

(pco/resolver 'id>flat-containers4
  {::pco/input  [:db/id]
   ::pco/output [{:flat-containers [:db/id]}]}
  (fn [{:keys [db]} input]
    (prn 'id>flat-containers
      (debug-eid->tempid input))

    {:flat-containers
     (sp/select
       [(sp/recursive-path [] p
          (sp/stay-then-continue :containers sp/ALL p))
        (sp/submap [:db/id])]
       (d/pull
         db
         [:db/id {:containers '...}]
         (get input :db/id)))}))

;; logs
id>flat-containers "container0"
id>details ["container0" "container1" "container2" "container3" "container4" "container5"]
id>things ["container0" "container1" "container2" "container3" "container4" "container5"]
>flat-things ["container0"]
id>details ["thing1" "thing2"]

caleb.macdonaldblack 2024-07-05T03:04:59.566479Z

If you're curious, I uploaded all the files I worked on here: https://gist.github.com/CalebMacdonaldBlack/16eaac6a0133b0ee16a5643e9995a83a

2024-07-05T14:20:25.703819Z

this is fantastic, thank you so much!

2024-07-05T14:24:39.654639Z

your assumptions don't track perfectly, though. it's not a "parent" containment relationship so much as a DAG of dependencies. cycles aren't allowed but a container can depend on multiple other containers. (and the reverse - any container might be depended on by several other containers.)
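
to make the DAG case concrete, here is a rough plain-Clojure sketch (no Pathom) of walking the dependencies level by level with a seen-set, so a container reachable by several paths is fetched only once. the deps map, fetch-deps and transitive-closure are hypothetical names; fetch-deps stands in for whatever batched backend call exists.

```clojure
(require '[clojure.set :as set])

;; Hypothetical dependency DAG: container id -> ids it depends on.
;; :c3 is reachable through both :c1 and :c2.
(def deps
  {:c0 [:c1 :c2]
   :c1 [:c3]
   :c2 [:c3 :c4]
   :c3 []
   :c4 []})

;; Records every batch, to show there is one "request" per level.
(def batches (atom []))

(defn fetch-deps
  "Stand-in for a batched backend call: ids -> {id [dep-id ...]}."
  [ids]
  (swap! batches conj (vec (sort ids)))
  (select-keys deps ids))

(defn transitive-closure
  "Walks the DAG one level at a time, issuing one batched fetch per level
  and skipping ids that were already seen."
  [root-ids]
  (loop [seen     (set root-ids)
         frontier (set root-ids)]
    (if (empty? frontier)
      seen
      (let [fetched (fetch-deps frontier)
            unseen  (set/difference (set (mapcat val fetched)) seen)]
        (recur (into seen unseen) unseen)))))

(comment
  (transitive-closure [:c0]) ; => #{:c0 :c1 :c2 :c3 :c4}
  @batches                   ; => [[:c0] [:c1 :c2] [:c3 :c4]]
  )
```

:c3 shows up under two parents but lands in exactly one batch, so the number of fetches tracks the depth of the DAG rather than the number of paths through it.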

caleb.macdonaldblack 2024-07-06T08:52:11.454969Z

No worries. Was there anything you’re still unsure about?

2024-07-06T18:00:11.235769Z

I think I follow. I'll have to play around with it. thanks again!