Fork me on GitHub
#pathom
<
2023-08-14
>
thheller10:08:39

before I dig into trying to make a repro for a problem we found in pathom3, maybe someone can point me to the place in the code in pathom that might be responsible for this. knowing what the code actually does might make it easier to create a repro. the problem is as follows: there are two resolvers, one that takes no input and one that takes an input, i.e. look thing up by id. both resolvers happen to share one (or more) namespaced keyword in their output. so both have ::pco/output [: ...] basically. now we run a query with a ident to look up such as

[{[: 1]
  [:
   :
   :
   :
   :]}]
which correctly hits the only resolver it can for this. but for some reason depending on the placement of the shared : attribute in that query vector it also sometimes hits the second resolver (which requires no inputs) and returns its : value instead. The first resolver already returned that value though. why does pathom not use it and run a second resolver?

thheller10:08:09

my assumption was that it can only take one resolver given the input provided first

thheller10:08:32

but that doesn't seem to be true though. so my guess is that it find two paths to the : attribute and for some reason plans to use both?

thheller10:08:02

what is weird is that changing the order of the vector can affect which result you get

thheller10:08:14

[{[: 1]
  [:
   :
   :
   :
   :]}]

thheller10:08:24

so this might give the desired result, while the above doesnt?

thheller10:08:43

I know the sin here is sharing the attribute but I'd like to understand how this happens. Given the earlier advice of "keep it flat" the potential for this to happen will only grow

thheller10:08:09

lets say the ident resolver has

{::pco/input [:]
 ::pco/output [: : : : :]}

caleb.macdonaldblack10:08:12

You can configure a priority so one resolver is preferred over another. https://pathom3.wsscode.com/docs/resolvers#prioritization

thheller10:08:15

I'm not looking for ways to work around the problem. I'm trying to understand why it takes a second resolver when the first already provided all data

caleb.macdonaldblack10:08:53

The visual tool shows exactly what happens during planning and execution. https://pathom3.wsscode.com/docs/debugging#debug-with-pathom-viz

caleb.macdonaldblack10:08:05

You can step through the planning as well

caleb.macdonaldblack10:08:24

Also it’s probably how pathom indexes the data as described in these docs, that is the reason behind the behavior you’re experiencing. https://pathom3.wsscode.com/docs/indexes#index-oir It would seem that pathom indexes the output attribute. So when figuring out what data as you’re requesting, it looks up each attribute in the oir index, first by output on a per-attribute basis. I’m not certain however, just speculating based on a quick glance at the index-oir docs.

caleb.macdonaldblack10:08:54

And I doubt these indexes are sorted maps so it’s probably arbitrary with regards to what resolver is selected first, based on matching output-input lookup

nivekuil10:08:36

it shouldn't call any extra resolvers once it has the data. I've hit a similar bug and iirc it was not trivial to track down https://github.com/wilkerlucio/pathom3/pull/154/files pathom-viz helps a lot to see what's really going on. It should always plan to traverse both resolvers, but I think the runner's execution order is undefined with equal priority nodes, and if you're using the parallel runner it might end up doing extra work in those cases

wilkerlucio12:08:08

hello @U05224H0W! to clarify, what @U3XCG2GBZ and @U797MAJ8M said is correct, the way pathom figures paths is from the output first, so if you have two ways to provide a given attribute (one with a required input, and one with no input required), from Pathom perspective you have 2 paths for that, and unless you define priorities, both paths will have equal probability to be called (in case you have the input data available, otherwise only the one without input will be considered)

Björn Ebbinghaus15:08:16

That made me think… wouldn’t implicit priorities based on specificity be a good feature? More Attributes = More specific?

wilkerlucio15:08:47

not necessarily, this is up to the modeler to decide

wilkerlucio15:08:03

that said, you can implement a priority algorithm to work in that fashion

Björn Ebbinghaus15:08:31

A manual priority would still be "most specific". But adding that implicit behaviour would remove the randomness from that part of pathoms behaviour, which should be a win for everyone, as it makes the behaviour repeatable, thus helping with debugging. (It would only be a breaking change if someone would depend on that randomness. But I guess that's not a case. 🙂) Something like: 1. Select the resolver that outputs the most missing attributes. 2. Select the resolver with the highest priority. 3. Select the resolver with the most specific input. 4. Undecided / choose one at random / (throw error at index time? Could be useful to prevent elusive bugs.)

wilkerlucio15:08:57

but I can consider adding the most attributes as well, but needs some thinking, can you please add an issue for it?

wilkerlucio15:08:46

I wouldn't do 4, because it is an expected characteristic of the system that you have many options for the same thing, and its also expected the values are consistent, not matter what path (at least semantically)

wilkerlucio15:08:21

this is specially a thing when I think about the usage of distributed graphs, where you pull other graphs in

thheller15:08:04

FWIW in this context I absolutely hate the "one namespace per entity" for keywords pattern. thus I has one namespace for my apps, e.g. shadow.cljs.model and in there I'll have my attributes which are then still somewhat qualified. so :shadow.cljs.model/build-id and shadow.cljs.model/resource-id, vs shadow.build.model.build/id or just shadow.cljs.model/id. I'm prefer this since it doesn't have the "alias noise" you'd otherwise have in every ns. one alias is better than 5 😛

thheller15:08:25

and it is sometimes convenient to "share" certain attributes between entities, but I guess that breaks pathom then if this is expected behaviour

wilkerlucio16:08:59

its ok to share attributes, thats kinda the point of them :)

wilkerlucio16:08:29

if you use one or multiple ns, also fine, as long as each attribute has its distinct semantic purpose

wilkerlucio16:08:08

in this world the attributes are the primary means of the data modeling, which is different from the norm of types, but aligns well with clojure premises (spec, datomic) and also with RDF concepts (which is the inspiration for the clojure primitives, Rich said so in some talk)

thheller16:08:25

yes, the problem only arises since there is one path to the shared attribute that doesn't require any inputs

thheller16:08:11

I guess since there is one resolver that pulls it out of the env (i.e. current user)

thheller16:08:44

what surprised me is that it picks the correct resolver to get the data, and then picks another one although it already has everything it needs

wilkerlucio16:08:12

if the data is fulfilled it shouldn’t try to get it from anywhere else

wilkerlucio16:08:22

it never overrides data it already loaded

thheller16:08:59

thats what surprised me 🙂 although I'm only remotely debugging this via screen sharing, so could totally be that it doesn't actually call the resolvers in the unexpected order

thheller16:08:41

just in the undesired order 😉

wilkerlucio16:08:13

for current user case, I suggest instead of making it at global level, make a resolver that outputs something like: :system/current-user, with the data inside of it

wilkerlucio16:08:34

this way you can avoid the clash with other user related resolvers to load data

wilkerlucio16:08:35

because otherwise it becomes ambiguous if you are trying to load current user vs some other form of loading (like via id)

jacekschae16:08:10

we tried this and this works well, with nesting, but then you have to unpack on the other side ... i think the most confusing thing was that the output was based on the order of properties

thheller16:08:55

@U8A5NMMGD we didn't do that exactly. only one case, not all of them 😛

jacekschae16:08:19

i did wrap it for this exact case and it solved the issue -- not sure what you refer to when you say all of them

thheller16:08:45

we are not talking about wrapping in general

thheller16:08:56

we are talking about wrapping in this specific case for the resolver that has no inputs

thheller16:08:10

I have not seen the current-user I also suggested yet 😉

wilkerlucio16:08:26

one other way to say it, its that the User is a kinda of a node in the system, when you add the notion of a "global user" (meaning current user in the case we are talking), that's a big and dangerous inference, instead its better to give a name for that context, in the future you could have somethibng like "master admin", which would be a different pointer. so, the rule of thumb here is: avoid global paths for specific entities, use global (no input required) when you really mean global (where no matter the context, the values should be the same)

thheller16:08:17

but this ambiguity is still troublesome in some cases, even for cases that require input. but I guess that can be avoided with some discipline.

wilkerlucio16:08:08

yeah, for me the trickiest one is when we flatten multiple entities in the same context (same map), this is a nice feature that makes things very convenient (and Ive suggested doing so for many cases where there is a 1-1 relationship)

wilkerlucio16:08:29

but it hurts when you try to converge their data to some common interface

wilkerlucio16:08:33

for example, if you try to make a generic :ui/title, that should work for many entities, them, when you have multiple entities at the same map, becomes hard to understand from which entity the title will be (will be it :user/title or :company/title for eg?)

wilkerlucio16:08:48

that's where the design of the structure should be though to suite the needs of the app

thheller16:08:02

yeah, thats what I meant with discipline. better to join [{:company [:ui/title]}] over just [:ui/title] and leaving it up to pathom to resolve what that means

nivekuil16:08:33

what about always returning data with the most specific attribute, and do the interface part in pathom? so you return :company/title and you have an alias resolver from that to :ui/title

nivekuil16:08:58

performance overhead sure but we're already using pathom 🙂

wilkerlucio16:08:30

thats kinda the problem right, if you have a map containing: {:user/title "foo", :company/title "bar"} (which happens when you flatten the relationship, considering the user can only have one company) and have two alias: :user/title -> :ui/title :company/title -> :ui/title

wilkerlucio16:08:41

now, is pathom gonna pick the user or the company title?

nivekuil16:08:56

it's the client's fault for not specifying

nivekuil16:08:05

pathom can return whatever, contract fulfilled

nivekuil16:08:22

but this is what placeholders are for no?

wilkerlucio16:08:45

not really, the solution is like @U05224H0W said, ensure the distinct entities (avoid flattening, if you desire such shared attributes)

wilkerlucio16:08:56

placeholders just give you the same context in a different part of the structure

nivekuil16:08:28

I still think this is on the client to properly specify, but I was thinking if you only had a :ui/title attribute does this not work? {:>/user [:user/id :ui/title] :>/company [:company/id :ui/title]}

wilkerlucio16:08:19

these two different joins make no difference for Pathom, given what matters for the Pathom context is the data available (which remains the same when using placeholders), not the user requeqst

Ben Grabow21:08:26

I ran into a funky bug today that made me question reality for a while:

(ns pathom-bug-test
  (:require [com.wsscode.pathom3.connect.operation :as pco]))
=> nil
(pco/defresolver foo
  []
  {::foo 1})
=> #'pathom-bug-test/foo
(foo)
=> #:pathom-bug-test{:foo 1}
(def foo-result (foo))
Execution error (AbstractMethodError) at com.wsscode.pathom3.connect.operation.Resolver/applyTo (operation.cljc:-1).
Method com/wsscode/pathom3/connect/operation/Resolver.applyTo(Lclojure/lang/ISeq;)Ljava/lang/Object; is abstract

Ben Grabow21:08:11

It turns out that this is a general problem with top-level def on a form that calls an implementation of IFn that is lacking an applyTo definition. https://clojure.atlassian.net/browse/CLJ-1715

Ben Grabow21:08:40

Maybe it would be nice if Pathom's Resolver got an applyTo definition to smooth out the experience of calling resolvers directly.

wilkerlucio22:08:15

wow, that's a new one for me, sure, I think adding applyTo should be easy enough

wilkerlucio22:08:51

can you open an issue please?

nivekuil00:08:15

do you want to make some noise about that ignored 8 year old bug that cost X amount of your sanity? Stuff like this is the worst part about clojure

thheller07:08:57

I ran into the same thing a while ago and asked in #clojure-dev. this is not a ignored bug, this is a missing implementation on the code part. all Ifn impls technically should have applyTo implemented since they otherwise don't work with apply. if you never call them through apply you never notice though.

Ben Grabow14:08:28

I'll see about making some noise on the CLJ issue too.