Hi! I'm following the Pathom tutorial for building a Hacker News scraper ([tutorial link](https://pathom3.wsscode.com/docs/tutorials/hacker-news-scraper#traversing-pagination)) and encountering an error when running the following code:
(->> (p.eql/process env
       [{:hacker-news.page/news-all-pages
         [:hacker-news.item/title]}]))
The resolver contains a recursive part: {:hacker-news.page/news-next-page '...}, and I get the following error:
Execution error (ExceptionInfo) at com.wsscode.pathom3.connect.runner/processor-exception (runner.cljc:933).
Graph execution failed: Required attributes missing: [:hn.page/news-all-pages] at path [:hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page]
I understand this is due to the fact that the last page doesn’t have a link to the next page, and this attribute should be optional. How can I fix this? Is the tutorial outdated, or is there a different way to handle missing attributes in recursive resolvers?
(same behaviour for [this code example](https://github.com/wilkerlucio/pathom3-docs/blob/master/src/main/com/wsscode/pathom3/docs/demos/tutorials/hacker_news_scrapper.clj))
(Also, I noticed a large error output: error in process filter: progn: Sync nREPL request timed out (op analyze-last-stacktrace nrepl.middleware.print/stream? 1 nrepl.middleware.print/print cider.nrepl.pprint/pprint nrepl.middleware.print/quota 1048576 nrepl.middleware.print/buffer-size 4096 nrepl.middleware.print/options (dict right-margin 70)) after 10 secs (emacs)
Is there anything I can do to resolve this, or reduce the output size?)
Thanks in advance for any help!
To improve the docs, we started working on a PR (https://github.com/wilkerlucio/pathom3-docs/pull/49) for this tutorial; you can access a preview of the update if you're having problems and want to use hickory
hi, sorry for the delay, I've been meaning to have a look at this but the end of the year has been a bit crazy. I'd like to let you know it's on my radar and I'll have a look asap
Thx for the answer, no worries 😃
@lanjoni Thx for updating guide! Different selectors/classes is not a problem for me. I'm struggling with stopping recursive queries.
@rende11 have you tried using bounded recursive queries? for example, instead of [:query {:recursive ...}] do [:query {:recursive 5}], this way it should limit the recursion to 5 steps, instead of being unbounded
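For reference, the suggestion above amounts to swapping the `...` marker for a number in the recursive join. A minimal sketch, assuming the join key from the original question; the depth of 5 is just the example value from the message above, and the var names are hypothetical:

```clojure
;; EQL expresses recursion as a join whose value is either the symbol `...`
;; (unbounded) or a number (maximum recursion depth).

;; Unbounded: keeps following the next-page join for as long as it can.
(def unbounded-join
  {:hacker-news.page/news-next-page '...})

;; Bounded: follows the next-page join at most 5 levels deep.
(def bounded-join
  {:hacker-news.page/news-next-page 5})
```

Either map would take the place of the recursive part of the resolver input; the rest of the resolver stays unchanged.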
@wilkerlucio bounded recursive queries work well. I'd expect to be able to end the recursion when the pages run out.
it's been a while since I've done recursive queries. if I remember correctly, the proper way would be to return nil at the recursion point when you are done, but I need to double check
(There's a chance this is related to the above question but I'm not positive, I hit this independently today)
It seems that when there are multiple paths to an attribute, one being directly on the entity and another requiring traversing to a nested attribute, pathom will always choose the nested option, even when there are required attributes on the nested path that aren't present on the input. Is this supposed to happen? Is this where ::pcr/choose-path is meant to be used?
(ns demo
  (:require
   [com.wsscode.pathom3.connect.operation :as pco]
   [com.wsscode.pathom3.connect.indexes :as pci]
   [com.wsscode.pathom3.interface.eql :as p.eql]))

(def resolvers
  [(pco/resolver
    {::pco/op-name `temp-from-sensors
     ::pco/input   [{:room/sensors [:sensor/high :sensor/low]}]
     ::pco/output  [::temperature]
     ::pco/resolve (fn [_ input]
                     (let [{:sensor/keys [high low]} (get input :room/sensors)]
                       {::temperature (/ (+ high low) 2)}))})
   (pco/resolver
    {::pco/op-name  `temp-from-direct
     ::pco/priority 100 ;; adding priority doesn't help
     ::pco/input    [:room/temperature]
     ::pco/output   [::temperature]
     ::pco/resolve  (fn [_ input]
                      {::temperature (:room/temperature input)})})])

(def env (pci/register resolvers))

(comment
  ;; Fails - tries sensors path even though keys missing
  (p.eql/process env
    {:room/temperature 72
     :room/sensors {}}
    [::temperature])

  ;; Works - only uses direct path
  (p.eql/process env
    {:room/temperature 72}
    [::temperature]))
thanks for this, and sorry for the long delay in responding. it does look like a bug indeed, I would expect the first option to work, but it doesn't
I need to find time to properly debug this, probably will have some next week
I was looking into this too. Seems like a bug for sure. I used git bisect to figure out when it was introduced, but it looks like it has been like this since the lenient mode commits
without lenient mode it works (obviously) so it doesn’t seem to have cropped up due to recent changes or anything
also a resolver with input shaped the same way also breaks
what do you mean by "also a resolver with input shaped the same way also breaks"?
@caleb.macdonaldblack in my tests it fails without lenient mode too
for the short evaluation I did, it seems a bug in the planner
for some reason it's not considering the path from just temperature in this context, but I don't know yet why
It's working for me
with lenient mode. as you would expect because it ignores missing attributes
which one are you running? there are two there. let's forget lenient mode for now: does the first example run without error there?
forgetting lenient mode. I can replicate the same
so "without lenient mode it works (obviously)" isn't valid, right? because in strict mode it does fail
yeah sorry. that’s what i meant
i recalled it as “strict mode” so i messed up that comment when i corrected
This is my test
no worries, I'm just trying to understand the points you brought
this is what i meant about the resolver with the same shape
instead of providing {:room/sensors {}} as input/entity/available-data, a resolver provides it. And it fails the same
gotcha, thanks for the details
okay, so it seems like the planning stage is running for the nested inputs first.
The planner determines that it cannot resolve inputs for the respective resolver and then verify-plan! is throwing an exception right after
And this is preventing all the remaining plan work from finishing.
But if given the chance to finish, it will eventually work out that those errors aren’t needed.
So a hack, for demonstration only, is changing it so verify-plan! only throws once, at the very end.
This also seems to still catch any genuine errors that were skipped earlier
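To make the idea above concrete, here is a toy model of eager versus deferred verification. It is only a sketch and not Pathom's planner: `check-path`, `plan-eager`, `plan-deferred`, and the `:path`/`:requires` data shape are all hypothetical names invented for illustration, assuming each alternative path simply lists the attributes it needs.

```clojure
;; Toy model only: none of these functions exist in Pathom.

(defn check-path
  "Returns the path description, with :missing attached when some
  required attributes are not in the `available` set."
  [available {:keys [path requires]}]
  (let [missing (vec (remove available requires))]
    (cond-> {:path path}
      (seq missing) (assoc :missing missing))))

(defn plan-eager
  "Throws as soon as any single path has missing attributes,
  even if a later path would have worked."
  [available paths]
  (mapv (fn [p]
          (let [result (check-path available p)]
            (if (:missing result)
              (throw (ex-info "Required attributes missing" result))
              result)))
        paths))

(defn plan-deferred
  "Checks every path first and throws only at the end,
  and only when no path at all is usable."
  [available paths]
  (let [results (mapv #(check-path available %) paths)
        usable  (vec (remove :missing results))]
    (if (seq usable)
      usable
      (throw (ex-info "Required attributes missing" {:failures results})))))

;; Two alternative ways to reach the same attribute, as in the demo above.
(def paths
  [{:path :via-sensors :requires [:sensor/high :sensor/low]}
   {:path :direct      :requires [:room/temperature]}])

(def available #{:room/temperature})
```

In this model, `(plan-eager available paths)` throws on the `:via-sensors` path before `:direct` is ever considered, while `(plan-deferred available paths)` keeps the `:direct` path and still throws when nothing is usable, for example with an empty `available` set.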
I’m still very inexperienced with the internals of this project. I’ll leave it with you to decide how we move forward from here
also worth mentioning that this PR is just to help debug. I just threw an atom in there to hack something together.
thanks, as I said, I can't have a deeper look at it now, but will next week 🙂