pathom

tony.kay 2024-07-29T22:30:19.532609Z

Hey, I’m using Pathom 2.4. When I have a resolver that claims to output 3 things, and I ask for the three things, but some of them are not there, pathom seems to re-call that same resolver N times where N is the number of things it didn’t output. How do I prevent it from doing this? Or is this some kind of misconfiguration on my part?

caleb.macdonaldblack 2024-07-30T10:14:57.352489Z

I get what you mean; I’m also curious on a conceptual level. I’ve run into these same issues time and time again, specifically when deriving data through joins/relationships. It’s easy enough when the data is just a flat map, never any issues. It works through relationships too most of the time, but when it doesn’t, the solution is always hacks

caleb.macdonaldblack 2024-07-30T11:47:45.453689Z

@tony.kay also fyi: using :com.wsscode.pathom.core/continue as the value seems to fix the repeated calls for missing attributes from your initial repro

(ns io.erical.scratches.scratch33
  (:require
    [com.wsscode.pathom.connect :as pc]
    [com.wsscode.pathom.core :as p]))

(pc/defresolver x-resolver [env input]
  {::pc/input  #{:x/id}
   ::pc/batch? true
   ::pc/cache? false
   ::pc/output [:x/a :x/b :x/c]}
  (do
    (prn "x")
    {:x/a ::p/continue :x/b ::p/continue :x/c ::p/continue}))

(def parser
  (p/parser
    {::p/env     {::p/reader [p/map-reader
                              pc/reader2
                              pc/open-ident-reader
                              pc/index-reader]}
     ::p/plugins [(pc/connect-plugin {::pc/register [x-resolver]})
                  p/error-handler-plugin]}))

(parser {} [{[:x/id 1] [:x/a :x/b :x/c]}])

wilkerlucio 2024-07-30T13:43:23.301129Z

hi @tony.kay, getting late to the party here, but yes, I see the issue with the lack of caching + partial reads. As pointed out by @caleb.macdonaldblack, you can set some value there. At Nubank we do that quite a lot, but we just use nil as the value to fulfill the attribute so Pathom doesn't try to resolve it again. In your partial case, you can nullify just the fields from that part if it's pertinent, so you can still have extra requests for fields that weren't fetched before
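
A minimal sketch of the nil-filling idea, in plain Clojure with no Pathom dependency. `fill-missing` and `declared-output` are hypothetical names for illustration, and the sketch assumes the declared output is a flat vector of keywords (no joins):

```clojure
;; Hypothetical helper (not part of Pathom): make sure every listed output key
;; is present in the resolver result, using nil for anything the data source
;; did not return.
(defn fill-missing
  [declared-output result]
  (merge (zipmap declared-output (repeat nil)) result))

(def declared-output [:x/a :x/b :x/c])

;; Full nil-fill: the data source only had :x/a for this entity.
(fill-missing declared-output {:x/a "1"})
;; => {:x/a "1", :x/b nil, :x/c nil}

;; Partial nil-fill: only mark the attributes that were actually requested,
;; leaving :x/c absent so it can still be requested later.
(fill-missing [:x/a :x/b] {:x/a "1"})
;; => {:x/a "1", :x/b nil}
```

A resolver body would wrap its return value in something like this, so that every attribute it tried to fetch shows up in the result map.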

caleb.macdonaldblack 2024-07-29T22:30:55.057399Z

Do you have an example?

tony.kay 2024-07-29T22:31:51.898649Z

(pc/defresolver x-resolver [env input]
  {::pc/input  #{:x/id}
   ::pc/output [:x/a :x/b :x/c]}
  ;; ... code returns empty map because those don't exist for some ID ...
  {})
resolver gets called 3 times

caleb.macdonaldblack 2024-07-29T22:32:11.982459Z

Just noticed your question is about Pathom 2. I haven’t touched it, only 3, but I can still help as best I can

tony.kay 2024-07-29T22:32:56.303179Z

so, technically it is “right”, but it is non-ideal from a performance perspective.

tony.kay 2024-07-29T22:33:26.402429Z

if the resolver returns a value for all 3, then it isn’t called again. Perhaps I should be putting an explicit not-found marker on the missing things.

tony.kay 2024-07-29T22:33:38.777609Z

I have some vague memory of this, now that I say it out loud 😛

tony.kay 2024-07-29T22:41:48.121959Z

I’m going to turn on pathom tracing and see what is actually going on

caleb.macdonaldblack 2024-07-29T22:55:40.311519Z

(ns io.erical.scratches.scratch12
  (:require
    [com.wsscode.pathom.core :as p]
    [com.wsscode.pathom.connect :as pc]))

(pc/defresolver x-resolver [env input]
  {::pc/input  #{:x/id}
   ::pc/output [:x/a :x/b :x/c]}
  (do
    (prn "hmm")
    {:x/a "1"
     :x/c "3"}))

(def parser
  (p/parser
    {::p/env     {::p/reader               [p/map-reader
                                            pc/reader2
                                            pc/open-ident-reader]}
     ::p/plugins [(pc/connect-plugin {::pc/register [x-resolver]})
                  p/error-handler-plugin
                  p/trace-plugin]}))

(parser {} [{[:x/id 1] [:x/a :x/b :x/c]}])
I wasn’t able to replicate it, but I’ve likely misunderstood. What do I need to change here to get your behaviour?

caleb.macdonaldblack 2024-07-29T23:00:34.248349Z

I’m not sure if it helps, but in Pathom 3 there is a special keyword/value you can return to indicate an output wasn’t found for that resolver: https://pathom3.wsscode.com/docs/resolvers/#:~:text=cached%20resolver%20returns-,%3A%3Apco/unknown%2Dvalue,-.%20This%20is%20equivalent

caleb.macdonaldblack 2024-07-29T23:07:29.055619Z

which, according to the docs (and my experience), isn’t a requirement for Pathom 3, just a nice-to-have. You can omit the “unknown” keyword from the resulting map for the same effect. But then Pathom 3 can’t infer the output for planning and will need you to explicitly define the output in the config. (I actually prefer this, although it’s more verbose and there is some repetition.) A reason for using it that I can get behind is being more explicit and clear. It’s the old “nulls are bad” situation. Pathom 3 does consider nil to be an output though (makes sense to me), but it can be easy to trip up on if you’re not aware of it. Thank you, unit tests.

tony.kay 2024-07-29T23:14:28.544319Z

return an empty map

caleb.macdonaldblack 2024-07-29T23:15:01.103779Z

(ns io.erical.scratches.scratch12
  (:require
    [com.wsscode.pathom.core :as p]
    [com.wsscode.pathom.connect :as pc]))

(pc/defresolver x-resolver [env input]
  {::pc/input  #{:x/id}
   ::pc/output [:x/a :x/b :x/c]}
  (do
    (prn "hmm")
    {}))

(def parser
  (p/parser
    {::p/env     {::p/reader               [p/map-reader
                                            pc/reader2
                                            pc/open-ident-reader]}
     ::p/plugins [(pc/connect-plugin {::pc/register [x-resolver]})
                  p/error-handler-plugin
                  p/trace-plugin]}))

(parser {} [{[:x/id 1] [:x/a :x/b :x/c]}])

caleb.macdonaldblack 2024-07-29T23:15:20.161599Z

Still one “hmm” for me, unless the ‘do’ and/or ‘prn’ is screwing with it

tony.kay 2024-07-29T23:15:41.596669Z

I agree with you that it is only called once like this

caleb.macdonaldblack 2024-07-29T23:15:59.053309Z

There will be something subtle for sure

tony.kay 2024-07-29T23:16:11.988099Z

My simplified example clearly isn’t exploiting the same thing…my production query is huge, and has many joins

caleb.macdonaldblack 2024-07-29T23:16:23.730859Z

yeah, i’ve been there haha

tony.kay 2024-07-29T23:16:26.760459Z

yeah

tony.kay 2024-07-29T23:16:35.760259Z

thanks for the input though

caleb.macdonaldblack 2024-07-29T23:16:38.746469Z

chop it up I reckon, binary search or something

tony.kay 2024-07-29T23:18:01.711169Z

yeah, 300k lines of code with resolvers written by people that didn’t read the docs originally 😛

tony.kay 2024-07-29T23:18:08.911289Z

so, yeah, fun

caleb.macdonaldblack 2024-07-30T00:12:37.285239Z

@tony.kay Double-check your parser (com.wsscode.pathom.core/parser), your reader (com.wsscode.pathom.connect/reader2), and what your resolvers are returning

tony.kay 2024-07-30T00:12:43.398619Z

I think these resolvers have caching set to false, which I think might be the problem

caleb.macdonaldblack 2024-07-30T00:13:18.793689Z

From my testing, different combinations of sync & async configuration can produce behaviour similar to what you described

caleb.macdonaldblack 2024-07-30T00:14:06.500529Z

(ns io.erical.scratches.scratch18
  (:require
    [com.wsscode.common.async-clj :refer [go-catch <?]]
    [clojure.tools.logging :as log]
    [com.wsscode.pathom.core :as p]
    [clojure.core.async :as async :refer [go]]
    [com.wsscode.pathom.connect :as pc]))


(pc/defresolver abc-resolver [env input]
  {::pc/input  #{:x/id}
   ::pc/output [:x/a :x/b :x/c]}
  (go-catch
    (log/spyf :info "abc"
      {:x/a "a"
       :x/b "b"
       :x/c "c"})))

(def parser
  (p/parser
    {::p/env     {::p/reader [p/map-reader
                              pc/reader2
                              ;pc/parallel-reader
                              pc/open-ident-reader]}
     ::p/plugins [(pc/connect-plugin {::pc/register [abc-resolver]})
                  p/error-handler-plugin]}))
                  ;p/trace-plugin]}))


(comment
  (do
    (log/info "START")
    (async/<!!
      (parser {} [{[:x/id 1] [:x/a :x/b :x/c]}]))))

tony.kay 2024-07-30T00:15:05.033419Z

yeah, not using async parsers

caleb.macdonaldblack 2024-07-30T00:16:20.523219Z

(do
    (log/info "START")
    (async/<!!
      (parser {} [{[:x/id 1] [:x/a :x/b :x/c]}])))
INFO: START
Execution error (IllegalArgumentException) at clojure.core.async.impl.protocols/eval6085$fn$G (protocols.clj:15).
No implementation of method: :take! of protocol: #'clojure.core.async.impl.protocols/ReadPort found for class: clojure.lang.PersistentArrayMap
INFO: abc
INFO: abc
INFO: abc

caleb.macdonaldblack 2024-07-30T00:17:03.633719Z

It happens with a sync parser & reader, but an async response from a resolver

caleb.macdonaldblack 2024-07-30T00:17:25.414499Z

I doubt this is your issue though, the errors are pretty obvious

tony.kay 2024-07-30T00:20:46.129499Z

caching was it

tony.kay 2024-07-30T00:20:55.568649Z

I have ::pc/cache? false set on those resolvers

tony.kay 2024-07-30T00:21:00.676999Z

I have no idea why

tony.kay 2024-07-30T00:21:16.839379Z

but changing it to true fixes the multiple calls

tony.kay 2024-07-30T00:34:41.645229Z

Just FYI:

(pc/defresolver x-resolver [env input]
    {::pc/input  #{:x/id}
     ::pc/batch? true
     ::pc/cache? false
     ::pc/output [:x/a :x/b :x/c]}
    (do
      (prn "x")
      {}))

  (def parser
    (p/parser
      {::p/env     {::p/reader [p/map-reader
                                pc/reader2
                                pc/open-ident-reader
                                pc/index-reader]}
       ::p/plugins [(pc/connect-plugin {::pc/register [x-resolver]})
                    p/error-handler-plugin]}))

  (parser {} [{[:x/id 1] [:x/a :x/b :x/c]}])
reproduces it

caleb.macdonaldblack 2024-07-30T00:39:38.332489Z

yeah that does it for me too

tony.kay 2024-07-30T00:43:19.329209Z

and this demonstrates why we added it to the code:

(do

    (pc/defresolver foo-resolver [env input]
      {::pc/output [{:foo/xs [:x/id]}]}
      {:foo/xs [{:x/id 1}]})

    (pc/defresolver foo2-resolver [env input]
      {::pc/output [{:foo/ys [:x/id]}]}
      {:foo/ys [{:x/id 1}]})

    (pc/defresolver x-resolver [env input]
      {::pc/input  #{:x/id}
       ::pc/output [:x/a :x/b :x/c]}
      (let [query (::p/parent-query env)]
        (println query)
        (select-keys {:x/a 1 :x/b 2 :x/c 3} query)))

    (def parser
      (p/parser
        {::p/env     {::p/reader [p/map-reader
                                  pc/reader2
                                  pc/open-ident-reader
                                  pc/index-reader]}
         ::p/plugins [(pc/connect-plugin {::pc/register [foo-resolver x-resolver foo2-resolver]})
                      p/error-handler-plugin]}))

    (parser {} [{:foo/xs [:x/a]}
                {:foo/ys [:x/b]}]))

tony.kay 2024-07-30T00:44:05.027479Z

So, basically, when we run the real database query, if we use the incoming client query against the database (simulated by select-keys), then some parallel path of the request that asks for different data from the same entity ends up with a bad cached result

tony.kay 2024-07-30T00:44:31.350189Z

So, EITHER we have to over-query the database, OR we have to turn off the caching
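
The trade-off above can be simulated in plain Clojure. This is only a sketch, not Pathom code: `db`, `pull`, and `cached-by-id` are hypothetical names, and the cache is assumed to key on the entity id alone:

```clojure
;; Plain-Clojure simulation, not Pathom code. All names are hypothetical.
(def db {1 {:x/a 1 :x/b 2 :x/c 3}})

(defn pull
  "Stand-in for a minimal database query: fetch only the requested attributes."
  [id attrs]
  (select-keys (get db id) attrs))

(defn cached-by-id
  "Looks up `id` in `cache` (an atom). On a miss, pulls `attrs` and stores the
  result under the id alone, ignoring which attributes were requested."
  [cache id attrs]
  (if-let [hit (get @cache id)]
    hit
    (let [result (pull id attrs)]
      (swap! cache assoc id result)
      result)))

(def cache (atom {}))

(cached-by-id cache 1 [:x/a]) ;; => {:x/a 1}
(cached-by-id cache 1 [:x/b]) ;; => {:x/a 1}, the cached entry has no :x/b
```

The second call is the bad cached result: the entry stored for the first path is too narrow for the second one.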

tony.kay 2024-07-30T00:44:55.301289Z

I actually added this code back in October and forgot why 😛

tony.kay 2024-07-30T00:45:09.051919Z

I had to look through git history, and see my name attached 🤦

caleb.macdonaldblack 2024-07-30T00:45:29.551949Z

lol. every time

tony.kay 2024-07-30T00:45:45.531539Z

it was an attempt to reduce db traffic, but in this case it’s actually making db traffic 5x worse (in terms of overall query count)

tony.kay 2024-07-30T00:46:54.916249Z

looks like I need to write a more advanced caching plugin for pathom that considers BOTH the id of the entity AND the query being applied

caleb.macdonaldblack 2024-07-30T00:51:50.083539Z

that’s very interesting

caleb.macdonaldblack 2024-07-30T01:06:36.909099Z

can you clarify the problem you’re having? Is this right: you are querying for :foo/xs and :foo/ys in parallel, but sometimes(?) they both relate to the same entity. But since the queries are happening at different times, they sometimes return different data; maybe data was written after the first query and before the second. And the solution was to use the batch functionality and have Pathom fetch the common entity in a single query/tx?

caleb.macdonaldblack 2024-07-30T01:35:10.357379Z

You probably already considered this, but you can query for all the attributes the resolver is configured for, and Pathom will select the keys for you:

(ns io.erical.scratches.scratch22
  (:require
    [clojure.test :as t]
    [com.wsscode.pathom.connect :as pc]
    [com.wsscode.pathom.core :as p]))

(t/deftest query-for-all-outputs-of-the-resolver
  (let [resolvers [(pc/resolver 'xs-resolver
                     {::pc/output [{:foo/xs [:x/id]}]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'xs-resolver)
                         {:foo/xs [{:x/id 1}]})))

                   (pc/resolver 'ys-resolver
                     {::pc/output [{:foo/ys [:x/id]}]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'ys-resolver)
                         {:foo/ys [{:x/id 1}]})))

                   (pc/resolver 'x-resolver
                     {::pc/input  #{:x/id}
                      ::pc/output [:x/a :x/b :x/c]}
                     (fn [{:keys [io.erical.pathom2/trace! com.wsscode.pathom.core/parent-query]} input]
                       (do
                         (trace! 'x-resolver)
                         ;; (select-keys {:x/a 1 :x/b 2 :x/c 3} parent-query) ; narrowed version, replaced by the full map below
                         {:x/a 1 :x/b 2 :x/c 3})))]
        *trace (atom [])
        parser (p/parser
                 {::p/env     {:io.erical.pathom2/trace! (partial swap! *trace conj)
                               ::p/reader                [p/map-reader pc/reader2 pc/open-ident-reader pc/index-reader]}
                  ::p/plugins [(pc/connect-plugin {::pc/register resolvers})
                               p/error-handler-plugin]})

        result (parser {}
                 [{:foo/xs [:x/a]}
                  {:foo/ys [:x/b]}])]
      (t/is (= {:io.erical/result {:foo/xs [{:x/a 1}]
                                   :foo/ys [{:x/b 2}]}
                :io.erical/trace  '[xs-resolver x-resolver ys-resolver]}
               {:io.erical/trace  @*trace
                :io.erical/result result}))))

caleb.macdonaldblack 2024-07-30T01:48:02.800099Z

if this results in querying more attributes than needed more often than not, then it’s probably no good

caleb.macdonaldblack 2024-07-30T02:01:22.500889Z

just playing around with another idea, I don’t think it’s any good though:

(ns io.erical.scratches.scratch25
  (:require
    [clojure.test :as t]
    [com.wsscode.pathom.connect :as pc]
    [com.wsscode.pathom.core :as p]))

(t/deftest query-for-all-outputs-of-the-resolver
  (let [resolvers [
                   (pc/resolver 'xs-resolver
                     {::pc/output [{:foo/xs [:x/id]}]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'xs-resolver)
                         {:foo/xs [{:x/id 1}]})))

                   (pc/resolver 'ys-resolver
                     {::pc/output [{:foo/ys [:x/id]}]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'ys-resolver)
                         {:foo/ys [{:x/id 1}]})))

                   (pc/resolver 'xa-resolver
                     {::pc/input  #{:x/id}
                      ::pc/output [:x/a]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'xa-resolver)
                         {:x/a 1})))

                   (pc/resolver 'xb-resolver
                     {::pc/input  #{:x/id}
                      ::pc/output [:x/b]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'xb-resolver)
                         {:x/b 2})))

                   (pc/resolver 'xc-resolver
                     {::pc/input  #{:x/id}
                      ::pc/output [:x/c]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'xc-resolver)
                         {:x/c 3})))

                   (pc/resolver 'xabc-resolver
                     {::pc/input  #{:x/id}
                      ::pc/output [:x/a :x/b :x/c]}
                     (fn [{:keys [io.erical.pathom2/trace!]} input]
                       (do
                         (trace! 'xabc-resolver)
                         {:x/a 1 :x/b 2 :x/c 3})))]

        *trace (atom [])
        parser (p/parser
                 {::p/env     {:io.erical.pathom2/trace! (partial swap! *trace conj)
                               ::p/reader                [p/map-reader pc/reader2 pc/open-ident-reader pc/index-reader]}
                  ::p/plugins [(pc/connect-plugin {::pc/register resolvers})
                               p/error-handler-plugin]})

        result (parser {}
                 '[{:foo/xs [:x/a :x/c]}
                   {:foo/ys [:x/b]}])]
      (t/is (= {:io.erical/result {:foo/xs [{:x/a 1}]
                                   :foo/ys [{:x/b 2}]}
                :io.erical/trace  '[xs-resolver x-resolver ys-resolver]}
               {:io.erical/trace  @*trace
                :io.erical/result result}))))

caleb.macdonaldblack 2024-07-30T02:03:49.936459Z

But I’m curious about splitting that single abc resolver into multiple a, b & c resolvers. Downsides are multiple queries, then a combination problem going deeper

tony.kay 2024-07-30T03:19:26.637929Z

I’m specifically talking about a pathological case, which my code clearly demonstrates. A resolver can resolve many things, and when it is needed that is what I want (one db query), but I also want that resolver to be able to minimally query the db (select-keys simulates this) so that I don’t over-query. But, since the EQL can possibly try to resolve the same entity down different paths, with different sub-queries, the caching (which keys JUST by the ID of the input entity) can potentially cache a version that is insufficient for some other path in the same EQL. Thus, the fix (which I mentioned) is to fix the caching (e.g. write a reader3) so that it keys the cache by input + subquery. Don’t really need more suggestions. That is the answer.

tony.kay 2024-07-30T03:20:07.711849Z

making more resolvers just gets me a requirement to run multiple db queries. On average, I want one, and with this fix, that’s what I will get the vast majority of the time.

tony.kay 2024-07-30T03:33:19.196389Z

i.e. one resolver = one datomic pull. The resolver indicates what is avail, but honors client request to minimize pull time. But, this caching problem means it can result in weird pathological cases (as shown) because the caching isn’t smart enough.
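
The "key the cache by input + subquery" idea from the messages above can be sketched in plain Clojure. This is not a Pathom reader and does not use any Pathom API. `pull-entity`, `db-calls`, and `cached-by-id+query` are hypothetical names, and the sub-query is assumed to be a flat list of attributes:

```clojure
;; Plain-Clojure sketch, not a Pathom reader. All names are hypothetical.
(def db-calls (atom 0))

(defn pull-entity
  "Stand-in for one minimal database pull; counts how often it runs."
  [id attrs]
  (swap! db-calls inc)
  (select-keys {:x/id id :x/a 1 :x/b 2 :x/c 3} attrs))

(defn cached-by-id+query
  "Caches under [id attribute-set], so different sub-queries for the same
  entity get separate entries and identical ones share a single pull."
  [cache id attrs]
  (let [k [id (set attrs)]]
    (if (contains? @cache k)
      (get @cache k)
      (let [result (pull-entity id attrs)]
        (swap! cache assoc k result)
        result))))

(def cache (atom {}))

(cached-by-id+query cache 1 [:x/a]) ;; => {:x/a 1}
(cached-by-id+query cache 1 [:x/b]) ;; => {:x/b 2}
(cached-by-id+query cache 1 [:x/a]) ;; => {:x/a 1}, served from the cache
@db-calls                           ;; => 2
```

One limitation of this sketch: overlapping sub-queries such as [:x/a] and [:x/a :x/b] still produce two pulls, since the key only matches identical attribute sets.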