#pathom
2024-04-26
Braden Shepherdson00:04:21

I'm struggling with nesting and context. I've got a chain of resolvers that open a binary file, decode it, produce {::big-map {id-number {:foo/bar 7, ...} ...}} with a map of records. that part works fine, the map is present and correct. the part that's not working is one resolver which returns a list like {:matches [{:foo/id 12} {:foo/id 13}]}. I'm using a (pbir/attribute-table-resolver ::big-map :foo/id [:foo/bar ...]), which I was hoping would allow for a query like (p.eql/process index {:file/path "..."} [{:matches [:foo/bar ...]}]) to populate that list with all the :foo/* attributes.
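For reference, a minimal sketch of the setup described in this message. The table contents and the `matches` resolver are hypothetical stand-ins (the real `::big-map` comes from decoding a binary file), and `pbir/constantly-resolver` is used here only to keep the example self-contained.

```clojure
(require '[com.wsscode.pathom3.connect.built-in.resolvers :as pbir]
         '[com.wsscode.pathom3.connect.indexes :as pci]
         '[com.wsscode.pathom3.connect.operation :as pco]
         '[com.wsscode.pathom3.interface.eql :as p.eql])

;; Hypothetical stand-in for the decoded file: a map of id -> record.
(def big-map-resolver
  (pbir/constantly-resolver ::big-map
    {12 {:foo/bar 7}
     13 {:foo/bar 8}}))

;; Returns only ids; the nested shape is declared with join syntax.
(pco/defresolver matches []
  {::pco/output [{:matches [:foo/id]}]}
  {:matches [{:foo/id 12} {:foo/id 13}]})

(def env
  (pci/register
    [big-map-resolver
     matches
     ;; Looks up :foo/id in the ::big-map attribute to provide :foo/bar.
     (pbir/attribute-table-resolver ::big-map :foo/id [:foo/bar])]))

(def result
  (p.eql/process env [{:matches [:foo/bar]}]))
```

In this sketch no resolver depends on root entity data, so each entry under `:matches` should be able to reach `::big-map` on its own.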

Braden Shepherdson13:04:47

still stuck on this. I dug into the ex-data and see that ::big-map is in the :unreachable-paths, along with all the other global stuff. is a nested chunk like [:matches 0] not allowed to see global things? is it because there's a cycle I'm not seeing?

Braden Shepherdson13:04:38

is there a barrier between separately registered indexes? I've got a few groups of resolvers in different files and they're getting combined like (pci/register [abc/index xyz/index local-resolver]) where abc/index is itself a (def index (pci/register [several local resolvers])).

wilkerlucio14:04:37

hello Braden, can you make a small repro so I can run and check on my side?

wilkerlucio14:04:04

about the last question, no, it doesn't matter when you registered the resolvers (or how you group their registering), it's always like adding one at a time
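A small sketch of what that means in practice, using two hypothetical resolvers `a` and `b`: registering a pre-built index alongside a resolver should behave the same as registering both resolvers in one flat vector.

```clojure
(require '[com.wsscode.pathom3.connect.indexes :as pci]
         '[com.wsscode.pathom3.connect.operation :as pco]
         '[com.wsscode.pathom3.interface.eql :as p.eql])

;; Two hypothetical resolvers; b depends on a.
(pco/defresolver a []
  {::pco/output [::a]}
  {::a 1})

(pco/defresolver b [{a-value ::a}]
  {::pco/input  [::a]
   ::pco/output [::b]}
  {::b (inc a-value)})

;; A group registered on its own, as one file might do.
(def group-index (pci/register [a]))

;; Combining the pre-built index with another resolver...
(def combined (pci/register [group-index b]))

;; ...versus registering everything in one flat vector.
(def flat (pci/register [a b]))

(def combined-result (p.eql/process combined [::b]))
(def flat-result (p.eql/process flat [::b]))
```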

Braden Shepherdson14:04:38

I've been trying to create a small repro... but they're all working. I still haven't figured out what's breaking it. I'm betting on a bad output shape somewhere, but I can't find it. I'll send a repro if I get one going.

Braden Shepherdson14:04:46

should I be concerned about potential cycles? the chain is a little involved, but it's something like this:
• ::big-map distills the binary file into the map of :foo/id to several properties.
  ◦ its output is [{::big-map [:foo/bar ...]}] giving the complete shape
• ::big-list is [{:foo/id}] for all of the vals of ::big-map
• a resolver considers a few :foo/bar props and provides :foo/match? which is a bool
• :list/match? filters ::big-list to only those with :foo/match? true
all of that seems to be working, I get a correct result for :list/match? [{:foo/id 123} {:foo/id 456}] with only a few entries (::big-list has a few thousand). I expected that I could query for [{:list/match? [:foo/bar :foo/baz]}] and it would populate those matches with values from ::big-map - but that's the part that chokes.

Braden Shepherdson14:04:32

if that plan is sound in theory then I'm not sure where it's going wrong. alternatively, I'm happy to be told that that's a bad way to model all this!

Braden Shepherdson14:04:42

but I'm concerned about cycles since :list/match? and its entries already depend on ::big-map, so perhaps they can't reference it again?

wilkerlucio16:04:25

about cycles, Pathom 3 does cycle detection and stops those pathways, laterally and nested as well. but for your case, this is more complex than I can hold in my mind at once and infer from a description. the issue in these cases is that little nuances are really important to understand what's happening, so a working repro is ideal because there I don't have any missing information (I can see exactly what you are doing). let's continue once you get some repro that we can discuss around

Braden Shepherdson13:04:02

I haven't disappeared. I tried to build a simplified repro of this on the weekend and it worked fine. then I tried more debugging and some simplification of the real thing, and it's still broken. I ran out of time but when I get back to it I want to try to find where there seems to be an explosion in the size of the graph. trying to print the (ex-data *e) after a query error is crashing my Conjure and locking up my REPL; trying to send it to Pathom Viz either locks it up or it shows empty everything. my current guess is that the huge map that results from parsing the ~4MB binary file is getting shape-descriptor'd even though I'd be better to treat it as opaque. but that's just untested speculation. it might even have cycles in it, I'm not sure.
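One way to look at very large ex-data without flooding the REPL is to inspect only its keys, or to print it with bounded length and depth. This is a plain-Clojure sketch; the helper names and the example exception are hypothetical and not part of Pathom.

```clojure
;; List only the top-level keys of an exception's ex-data.
(defn ex-data-keys [e]
  (some-> e ex-data keys vec))

;; Print a value with bounded collection length and nesting depth.
(defn bounded-pr-str [x]
  (binding [*print-length* 5
            *print-level*  2]
    (pr-str x)))

;; Hypothetical exception carrying a large payload.
(def example
  (ex-info "boom" {:small 1
                   :huge  (vec (range 100000))}))

(def example-keys (ex-data-keys example))
(def example-summary (bounded-pr-str (ex-data example)))
```

At the REPL the same helpers could be applied to `*e` after a failed query, for example `(ex-data-keys *e)`.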

wilkerlucio13:04:25

thanks for the effort to repro Braden 🙏

Braden Shepherdson13:04:49

I'm betting 90%+ that it'll be my bug at the bottom of this, but probably there's a documentation improvement to be wrung out of this at least.

Braden Shepherdson19:05:41

I'm still wrestling with this, trying to narrow down what's wrong with my data or resolvers. what's the right way to express returning a map in ::pco/output? like I have a resolver that returns a map of :job/id to maps like

{:job/id 2721863168
 :job/origin {:location/id "string_slug", :company/id "string_slug"}
 :job/destination {:location/id "b", :company/id "c"}
 ...}
I want to stick this map somewhere and then reference it in a pbir/attribute-table-resolver. so is that a resolver with {::pco/output [::big-map]} returning {::big-map {123456 {:job/origin ...}}} plus (pbir/attribute-table-resolver ::big-map :job/id [{:job/origin [:location/id :company/id]} ...])? I can't find any examples with maps like that except in the built-in resolvers docs.

Braden Shepherdson19:05:11

I'm wondering if there's some EQL syntax I'm missing that expresses a map of data as opposed to fixed keywords as keys.

wilkerlucio22:05:49

nested outputs can be expressed with the join syntax, for the example you sent above:

[{:job/origin [:location/id :company/id]}
 {:job/destination [:location/id :company/id]}]
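A minimal sketch of a resolver that declares those nested outputs. The `jobs` table and the `job-details` resolver name are hypothetical; the data reuses the sample values from the question above.

```clojure
(require '[com.wsscode.pathom3.connect.indexes :as pci]
         '[com.wsscode.pathom3.connect.operation :as pco]
         '[com.wsscode.pathom3.interface.eql :as p.eql])

;; Hypothetical lookup table standing in for the real job data.
(def jobs
  {2721863168 {:job/origin      {:location/id "string_slug"
                                 :company/id  "string_slug"}
               :job/destination {:location/id "b"
                                 :company/id  "c"}}})

;; The joins in ::pco/output describe what is available one level down.
(pco/defresolver job-details [{id :job/id}]
  {::pco/input  [:job/id]
   ::pco/output [{:job/origin [:location/id :company/id]}
                 {:job/destination [:location/id :company/id]}]}
  (get jobs id))

(def env (pci/register [job-details]))

(def result
  (p.eql/process env {:job/id 2721863168}
    [{:job/destination [:location/id]}]))
```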

Braden Shepherdson01:05:17

okay! I finally nailed down the issue. there was an important but subtle difference remaining between the attempted repro and my broken code. resolvers with inputs are not available for nested data, and resolvers without inputs are available everywhere(?)

Braden Shepherdson01:05:57

(perhaps in all cases, or perhaps only if there's a graph relationship between the nesting and the inputs?)

Braden Shepherdson01:05:08

I'm turning it into a distilled example now.

Braden Shepherdson02:05:58

https://gist.github.com/bshepherdson/b5667894a37220a25627c18811cb1e63 hopefully that makes the issue clear. the test is a little convoluted, but it's trying to show the difference, in the small edit to add an input.

Braden Shepherdson03:05:07

I kept accidentally making it work trying to shrink the repro. it does seem like the key is that there's a diamond dependency.

        ::problem-input
         /           \
    ::list        ::customers
         \           /
       [::customers 0]

Braden Shepherdson03:05:04

though that doesn't explain why the pbir/constantly-resolver works.

Braden Shepherdson03:05:26

I should say, none of that is necessarily a bug report. This may well be working as intended. I'd love to understand the nuances of diamond paths. It seems to me that there's nothing fundamentally broken about the diamond dep, it's not a tree but it's still a DAG. But perhaps I'm missing something. Perhaps there is a good way to make this work within one query. I think I can restructure my resolvers to keep this in one pass, and if I can't then I can make it two passes with the big map as an input or placeholder to the second one.

wilkerlucio12:05:51

thanks @Braden! today I'm packed, I expect to have a proper look at it tomorrow 🙏

Braden Shepherdson12:05:46

Sure thing, no rush.

wilkerlucio19:05:04

hi @Braden, sorry for the delay, but having a look now, it looks like a bug to me, especially considering it works when we provide the data with a resolver. it might be missing the check for entity data at some point, I'm debugging it now

wilkerlucio19:05:58

ah, actually, I think I just understood why the difference in providing it vs creating a resolver

wilkerlucio19:05:36

when you send the data as entity, that data is only gonna be available for the root entity, but when you provide it as a resolver, it will be available everywhere

wilkerlucio19:05:42

and that's what causes the difference you see, because when you do: [{::customers [:data/name]}], the place where ::big-map is going to be evaluated is at the path [::customers 0]. so, when you send it as entity, it's like having this:

{::problem-input 1
 ::customers [{... looking for ::big-map here ...}]}
as we can see here, at [::customers 0] we don't have ::problem-input available, and that causes the planning failure

wilkerlucio19:05:00

but when you provide ::problem-input as a resolver, that makes it available at any path, so it works

wilkerlucio19:05:03

does it make sense?

Braden Shepherdson19:05:07

that makes sense, though it seems like you might have the logic inverted, since adding ::problem-input is what makes it fail to plan.

wilkerlucio19:05:23

I'm bringing up the comparison between your two last cases:

(testing "fails to plan with an input"
  (is (thrown-with-msg?
        Exception #"Pathom can't find a path for the following elements"
        (p.eql/process input {::problem-input 1} [{::customers [:data/name]}]))))
(testing "weirdly it works with constantly-resolver and I don't know why"
  (is (= {::customers [{:data/name "Alice"} {:data/name "Carol"}]}
         (p.eql/process magic [{::customers [:data/name]}]))))

wilkerlucio19:05:49

the weirdly... part is what I'm trying to explain here

wilkerlucio19:05:38

it's because when you have the ::problem-input required for ::big-map, and ::problem-input is provided as entity, it's only available at root. but to do {::customers [:data/name]}, you need it to be available at each customer entity, which providing it as entity doesn't satisfy

wilkerlucio19:05:09

but when ::problem-input is provided as a resolver, it makes it available for every entity, no matter where it is (what path inside the output it is)

wilkerlucio19:05:47

the confusion here might be thinking that when you provide entity data, it's going to be available everywhere, but it isn't. the entity data is like merging at the root; when navigating you're not going to see things at parent levels

Braden Shepherdson19:05:59

okay yeah, that makes sense.

Braden Shepherdson19:05:35

is there a better way to accomplish that kind of thing, when there is a "problem input"? because the only path I can see is that the HTTP request comes in, and then I run two p.eql/process or similar queries. first to compute ::big-map and return it, second to compute the actual response structures, passing the ::big-map data such that it's globally available.

wilkerlucio19:05:40

here is a way to illustrate this, how it could work with your setup:

(testing "illustrating putting the ::problem-input where it needs to be"
  (is (= {::customers [{:data/name "Alice"} {:data/name "Carol"}]}
         (p.eql/process input
           {::customers [{:data/id 123 ::problem-input 1}
                         {:data/id 789 ::problem-input 1}]}
           [{::customers [:data/name]}]))))

wilkerlucio19:05:53

is that input being provided directly by you? or is it part of some dependency chain?

wilkerlucio19:05:16

because one way to make it available is to add a resolver for it (as discussed, that makes it available everywhere)

wilkerlucio19:05:52

or, you can pull the input from env, and send it in env. this is just a different approach, but it also makes it globally available

wilkerlucio19:05:02

(talking about the input point you provide to run the query)

Braden Shepherdson19:05:32

• the problem input is effectively a log file name
• ::big-map is the result of digesting that big file into a map from ID to maps with a bunch of attributes
• that should all be globally available for the below:
• the response is a bunch of lists of "matching" entries from ::big-map, with only some of the attributes included.

wilkerlucio19:05:19

there is an important thing to account for here: is that input for ::big-map consistent through the whole query? if not, if it should just affect that sub-tree (because you might have different inputs at different places), then that won't work

wilkerlucio19:05:41

if the case is the second, then you need to forward that dependency down when you create the collection (the ::customers in your example)

wilkerlucio19:05:01

making it look like this:

customers (pco/resolver
            `customers
            {::pco/input  [::problem-input ::big-map]
             ::pco/output [{::customers [::problem-input :data/id]}]}
            (fn [_env {m ::big-map i ::problem-input}]
              {::customers (for [id (keys m)
                                 :when (odd? id)]
                             {::problem-input i :data/id id})}))

wilkerlucio19:05:20

changing the customers resolver this way fixes the input case that was throwing before

wilkerlucio19:05:26

also note I added the nested output in the ::pco/output section. this is the recommended thing to do, and without it some nested input queries that should succeed might fail (because Pathom checks nested paths, and without the nested description it's unable to know what is available at that level)

Braden Shepherdson19:05:23

the input and ::big-map are consistent across all the rest of the query, hence why I said I could fall back to treating them as two separate queries.

wilkerlucio19:05:59

well, in that case you don't need to, just make an env-resolver and provide it as env. that is a good way to provide any dependency that should be globally available

Braden Shepherdson19:05:40

noted, that makes sense.

wilkerlucio19:05:02

a simple way to create a resolver to pull data from env: (pbir/constantly-fn-resolver ::problem-input ::problem-input)

wilkerlucio19:05:27

this will read ::problem-input from env and provide it as ::problem-input (for any input requirement)
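Putting the env approach together, a minimal sketch under some assumptions: the table contents, the file name "example.log", and the resolver bodies are hypothetical stand-ins for the gist's code, and a real `big-map` resolver would parse the file named by `::problem-input`.

```clojure
(require '[com.wsscode.pathom3.connect.built-in.resolvers :as pbir]
         '[com.wsscode.pathom3.connect.indexes :as pci]
         '[com.wsscode.pathom3.connect.operation :as pco]
         '[com.wsscode.pathom3.interface.eql :as p.eql])

;; Hypothetical: returns a fixed table instead of parsing a file.
(pco/defresolver big-map [_input]
  {::pco/input  [::problem-input]
   ::pco/output [::big-map]}
  {::big-map {123 {:data/name "Alice"}
              456 {:data/name "Bob"}
              789 {:data/name "Carol"}}})

;; Nested output declared so Pathom knows :data/id exists per customer.
(pco/defresolver customers [{m ::big-map}]
  {::pco/input  [::big-map]
   ::pco/output [{::customers [:data/id]}]}
  {::customers (for [id (sort (keys m))
                     :when (odd? id)]
                 {:data/id id})})

(def env
  (pci/register
    [;; Reads ::problem-input from env, so it is reachable at any path.
     (pbir/constantly-fn-resolver ::problem-input ::problem-input)
     big-map
     customers
     (pbir/attribute-table-resolver ::big-map :data/id [:data/name])]))

;; The input goes into env rather than into the root entity.
(def result
  (p.eql/process (assoc env ::problem-input "example.log")
    [{::customers [:data/name]}]))
```

Because `::problem-input` now comes from a resolver with no inputs, the nested customer entities do not need it forwarded into them.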

Braden Shepherdson19:04:20

what's the status of Pathom in the browser, for 2 or 3? is it production ready?

wilkerlucio21:04:46

same as the clj versions