pathom

rende11 2024-12-03T15:05:45.582749Z

Hi! I'm following the Pathom tutorial for building a Hacker News scraper ([tutorial link](https://pathom3.wsscode.com/docs/tutorials/hacker-news-scraper#traversing-pagination)) and encountering an error when running the following code:

(->> (p.eql/process env
       [{:hacker-news.page/news-all-pages
         [:hacker-news.item/title]}]))
The resolver contains a recursive part: {:hacker-news.page/news-next-page '...}. And, I get the following error:
Execution error (ExceptionInfo) at com.wsscode.pathom3.connect.runner/processor-exception (runner.cljc:933).
Graph execution failed: Required attributes missing: [:hn.page/news-all-pages] at path [:hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page :hn.page/news-next-page]
I understand this is because the last page doesn't have a link to the next page, so this attribute should be optional. How can I fix this? Is the tutorial outdated, or is there a different way to handle missing attributes in recursive resolvers? (I see the same behaviour with [this code example](https://github.com/wilkerlucio/pathom3-docs/blob/master/src/main/com/wsscode/pathom3/docs/demos/tutorials/hacker_news_scrapper.clj).)
Also, I noticed a large error output (emacs):
error in process filter: progn: Sync nREPL request timed out (op analyze-last-stacktrace nrepl.middleware.print/stream? 1 nrepl.middleware.print/print cider.nrepl.pprint/pprint nrepl.middleware.print/quota 1048576 nrepl.middleware.print/buffer-size 4096 nrepl.middleware.print/options (dict right-margin 70)) after 10 secs
Is there anything I can do to resolve this, or reduce the output size?
Thanks in advance for any help!

wilkerlucio 2024-12-16T18:02:02.334599Z

@rende11 HN has changed its HTML since this tutorial was written. @lanjoni is working on a new implementation, and we will update the tutorial once that lands.

👍 1
guto 2024-12-16T19:46:51.019699Z

To improve the docs we started working on a PR (https://github.com/wilkerlucio/pathom3-docs/pull/49) for this tutorial. If you're having problems and want to use hickory, you can already access a preview of the update.

👀 1
wilkerlucio 2024-12-05T16:37:52.299129Z

Hi, sorry for the delay. I've been meaning to have a look at this, but the end of the year has been a bit crazy. I'd like to let you know it's on my radar and I'll have a look ASAP.

rende11 2024-12-05T18:56:25.218879Z

Thx for the answer, no worries 😃

rende11 2024-12-20T11:41:58.744659Z

@lanjoni Thx for updating the guide! The different selectors/classes are not a problem for me. I'm struggling with stopping recursive queries.

wilkerlucio 2024-12-20T12:53:19.059759Z

@rende11 Have you tried using bounded recursive queries? For example, instead of [:query {:recursive ...}] do [:query {:recursive 5}]. This way it should limit the recursion to 5 steps instead of being unbounded.
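A minimal sketch of a bounded recursive query, assuming plain nested data rather than the tutorial's scraper. The :node/id, :node/next, and :node/label keys and the node-label resolver are hypothetical names used only for illustration:

```clojure
(ns demo.bounded-recursion
  (:require
    [com.wsscode.pathom3.connect.operation :as pco]
    [com.wsscode.pathom3.connect.indexes :as pci]
    [com.wsscode.pathom3.interface.eql :as p.eql]))

;; Hypothetical data: a chain of six nodes linked through :node/next.
(def chain
  {:node/id 1
   :node/next
   {:node/id 2
    :node/next
    {:node/id 3
     :node/next
     {:node/id 4
      :node/next
      {:node/id 5
       :node/next {:node/id 6}}}}}})

;; Hypothetical resolver, so there is something to compute at each level.
(pco/defresolver node-label [{:node/keys [id]}]
  {:node/label (str "node-" id)})

(def env (pci/register [node-label]))

;; The number 2 bounds the recursion on :node/next,
;; where '... would leave it unbounded.
(def result
  (p.eql/process env chain
    [:node/label {:node/next 2}]))
```

The chain is deliberately longer than the bound, so the recursion stops because of the limit and never reaches a node without :node/next.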

rende11 2024-12-20T13:41:25.775469Z

@wilkerlucio Bounded recursive queries work well. I expected that I could end the recursion when the pages run out.

wilkerlucio 2024-12-20T13:42:24.650069Z

It's been a while since I've done recursive queries. If I remember correctly, the proper way would be to return nil at the recursion point when you are done, but I need to double check.

👀 1
2024-12-03T16:56:56.819869Z

(There's a chance this is related to the above question, but I'm not positive; I hit this independently today.) It seems that when there are multiple paths to an attribute, one directly on the entity and another requiring traversal to a nested attribute, Pathom will always choose the nested option, even when required attributes on the nested path aren't present in the input. Is this supposed to happen? Is this where ::pcr/choose-path is meant to be used?

(ns demo
 (:require
   [com.wsscode.pathom3.connect.operation :as pco]
   [com.wsscode.pathom3.connect.indexes :as pci]
   [com.wsscode.pathom3.interface.eql :as p.eql]))


(def resolvers
 [(pco/resolver
    {::pco/op-name `temp-from-sensors
     ::pco/input [{:room/sensors [:sensor/high :sensor/low]}]
     ::pco/output [::temperature]
     ::pco/resolve (fn [_ input]
                     (let [{:sensor/keys [high low]} (get input :room/sensors)]
                       {::temperature (/ (+ high low) 2)}))})

  (pco/resolver
    {::pco/op-name `temp-from-direct
     ::pco/priority 100 ;;adding priority doesn't help
     ::pco/input [:room/temperature]
     ::pco/output [::temperature]
     ::pco/resolve (fn [_ input]
                     {::temperature (:room/temperature input)})})])

(def env (pci/register resolvers))

(comment
 ;; Fails - tries sensors path even though keys missing
 (p.eql/process env
   {:room/temperature 72
    :room/sensors {}}
   [::temperature])

 ;; Works - only uses direct path
 (p.eql/process env
   {:room/temperature 72}
   [::temperature]))

wilkerlucio 2024-12-16T18:06:05.008619Z

Thanks for this, and sorry for the long delay in responding. It does look like a bug indeed: I would expect the first option to work, but it doesn't.

wilkerlucio 2024-12-16T18:06:22.019119Z

I need to find time to properly debug this; I will probably have some next week.

🙏 1
caleb.macdonaldblack 2024-12-17T13:24:35.496489Z

I was looking into this too. Seems like a bug for sure. I used git bisect to figure out when it was introduced, but it looks like it has been like this since the lenient mode commits.

caleb.macdonaldblack 2024-12-17T13:26:00.216369Z

without lenient mode it works (obviously) so it doesn’t seem to have cropped up due to recent changes or anything

caleb.macdonaldblack 2024-12-17T13:27:49.089909Z

also a resolver with input shaped the same way also breaks

wilkerlucio 2024-12-17T13:53:12.522519Z

What do you mean by "also a resolver with input shaped the same way also breaks"?

wilkerlucio 2024-12-17T13:53:40.244219Z

@caleb.macdonaldblack in my tests it fails without lenient mode too

wilkerlucio 2024-12-17T13:54:06.015039Z

From the short evaluation I did, it seems to be a bug in the planner.

wilkerlucio 2024-12-17T13:54:19.444499Z

for some reason it's not considering the path from just temperature in this context, but I don't know yet why

caleb.macdonaldblack 2024-12-17T14:10:04.189469Z

It's working for me

caleb.macdonaldblack 2024-12-17T14:10:52.839059Z

with lenient mode. as you would expect because it ignores missing attributes

wilkerlucio 2024-12-17T14:11:42.097539Z

Which one are you running? There are two there. Let's forget lenient mode for now: does the first example run without error there?

caleb.macdonaldblack 2024-12-17T14:12:05.255159Z

forgetting lenient mode. I can replicate the same

wilkerlucio 2024-12-17T14:12:31.080289Z

So "without lenient mode it works (obviously)" isn't valid, right? Because in strict mode it does fail.

caleb.macdonaldblack 2024-12-17T14:13:21.879739Z

yeah sorry. that’s what i meant

caleb.macdonaldblack 2024-12-17T14:13:46.949819Z

I recalled it as “strict mode”, so I messed up that comment when I corrected it

caleb.macdonaldblack 2024-12-17T14:14:28.661359Z

This is my test

wilkerlucio 2024-12-17T14:14:32.877349Z

no worries, I'm just trying to understand the points you brought

caleb.macdonaldblack 2024-12-17T14:21:11.226069Z

caleb.macdonaldblack 2024-12-17T14:21:36.984219Z

this is what i meant about the resolver with the same shape

caleb.macdonaldblack 2024-12-17T14:22:57.455699Z

Instead of providing {:room/sensors {}} as input/entity/available-data, a resolver provides it, and it fails the same way.
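A rough sketch of what that variant might look like, assuming the same two temperature resolvers from the original report. The sensors-from-room resolver and the :room/id key are hypothetical names; this is an illustration of the shape, not the test from the thread:

```clojure
(ns demo.nested
  (:require
    [com.wsscode.pathom3.connect.operation :as pco]
    [com.wsscode.pathom3.connect.indexes :as pci]
    [com.wsscode.pathom3.interface.eql :as p.eql]))

(def resolvers
  [(pco/resolver
     {::pco/op-name `temp-from-sensors
      ::pco/input   [{:room/sensors [:sensor/high :sensor/low]}]
      ::pco/output  [::temperature]
      ::pco/resolve (fn [_ input]
                      (let [{:sensor/keys [high low]} (get input :room/sensors)]
                        {::temperature (/ (+ high low) 2)}))})

   (pco/resolver
     {::pco/op-name `temp-from-direct
      ::pco/input   [:room/temperature]
      ::pco/output  [::temperature]
      ::pco/resolve (fn [_ input]
                      {::temperature (:room/temperature input)})})

   ;; Hypothetical: the empty sensors map now comes from a resolver
   ;; instead of being part of the initial entity.
   (pco/resolver
     {::pco/op-name `sensors-from-room
      ::pco/input   [:room/id]
      ::pco/output  [{:room/sensors [:sensor/high :sensor/low]}]
      ::pco/resolve (fn [_ _]
                      {:room/sensors {}})})])

(def env (pci/register resolvers))

(comment
  ;; Per the report above, this is expected to fail the same way as
  ;; passing {:room/sensors {}} directly.
  (p.eql/process env
    {:room/id 1
     :room/temperature 72}
    [::temperature]))
```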

wilkerlucio 2024-12-17T14:23:20.561419Z

gotcha, thanks for the details

caleb.macdonaldblack 2024-12-17T18:30:20.272729Z

https://github.com/wilkerlucio/pathom3/pull/223

caleb.macdonaldblack 2024-12-17T18:32:08.433099Z

okay, so it seems like the planning stage is running for the nested inputs first. The planner determines that it cannot resolve inputs for the respective resolver and then verify-plan! is throwing an exception right after

caleb.macdonaldblack 2024-12-17T18:34:32.777439Z

And this is preventing all the remaining plan work from finishing.

caleb.macdonaldblack 2024-12-17T18:35:11.498779Z

But if given the chance to finish, it will eventually work out that those errors aren’t needed.

caleb.macdonaldblack 2024-12-17T18:36:16.889249Z

So a hack, for demonstration only, is changing it so verify-plan! only throws once, at the very end.
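The difference between throwing early and throwing once at the end can be sketched with a toy model in plain Clojure. This is not Pathom's planner: viable?, verify-eagerly, verify-at-end, and the path maps are hypothetical names that only illustrate the idea described above:

```clojure
(ns demo.deferred-verify)

;; Hypothetical stand-ins for the two candidate paths to ::temperature.
(def paths
  [{:name :temp-from-sensors :requires [:sensor/high :sensor/low]}
   {:name :temp-from-direct  :requires [:room/temperature]}])

;; Attributes assumed to be available on the entity.
(def available #{:room/temperature :room/sensors})

(defn viable?
  "True when every attribute the path requires is available."
  [available path]
  (every? #(contains? available %) (:requires path)))

(defn verify-eagerly
  "Throws as soon as one path is unreachable, even if another would work."
  [available paths]
  (doseq [path paths]
    (when-not (viable? available path)
      (throw (ex-info "Required attributes missing"
                      {:path (:name path)}))))
  paths)

(defn verify-at-end
  "Checks all paths first and throws only if none of them is reachable."
  [available paths]
  (let [reachable (filterv #(viable? available %) paths)]
    (when (empty? reachable)
      (throw (ex-info "Required attributes missing"
                      {:paths (mapv :name paths)})))
    reachable))
```

In this model (verify-eagerly available paths) throws on :temp-from-sensors, while (verify-at-end available paths) keeps :temp-from-direct. With an empty set of available attributes, verify-at-end still throws, so genuine errors are not lost.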

caleb.macdonaldblack 2024-12-17T18:36:58.921969Z

This also seems to still catch any genuine errors that were skipped earlier

caleb.macdonaldblack 2024-12-17T18:47:01.723029Z

I’m still very inexperienced with the internals of this project. I’ll leave it with you to decide how we move forward from here

caleb.macdonaldblack 2024-12-17T18:53:58.820099Z

also worth mentioning that this PR is just to help debug. I just threw an atom in there to hack something together.

wilkerlucio 2024-12-17T19:19:29.433819Z

thanks, as I said, I can't have a deeper look at it now, but will next week 🙂