This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-04-26
I'm looking at ways to manage my ETL pipeline. A common solution from libraries like https://domino-clj.github.io/ and https://github.com/commsor/titanoboa is to define the parts of the pipeline in a dependency graph and follow that graph to perform the steps of the pipeline.
I think there is some potential to use Pathom in this way. We already define a data dependency graph when using it. On the other hand, I know Pathom adds some overhead, and that might make it difficult to process the very large number of records that ETL often involves, but there is ::pco/final, which may be of some help.
Does anyone know if using Pathom for ETL like this has been attempted somewhere?
I've considered that. I believe it's possible to leverage the planner in Pathom to get the schematics of what to run, but the runner would have to do something very different, maybe generating Spark statements or something
I'm not familiar with spark statements but I'll read up about it. Indeed, maybe just using the planner could be a good step in the right direction.
I am updating from pathom3 version 2022.02.01-1-alpha to 2022.03.17-alpha, 2022.04.20-alpha, but it is causing some existing mutation unit tests to fail with the message
ERROR in (create-comment-test) (planner.cljc:474)
Uncaught exception, not in assertion.
expected: nil
actual: java.lang.AssertionError: Assert failed: Tried to remove node 24 that still contains references pointing to it. Move
the run-next references from the pointer nodes before removing it. Also check if
parent is branch and trying to merge.
(if node-parents (every? (fn* [p1__44092#] (not= node-id (get-node graph p1__44092# :com.wsscode.pathom3.connect.planner/run-next))) node-parents) true)
Is this a bug, or is it simply the new planner catching something odd we were doing from before that I could fix?
How could I start digging in deeper to debug this?
Ah. I should just use a more updated version, preferably non-alpha 😅
Ah. Everything is still alpha. This same error happens for me on version 2022.04.20-alpha
hello @U7Y7601B2, yep, all alpha still 😅 can you give me a repro? it's possibly a regression, but I need an example to check
(anyway, this error should never happen, its presence means there is something wrong in the planner algorithm)
Ok, I narrowed it down as much as I could.
(ns repro
  (:require [clojure.test :refer :all]
            [com.wsscode.pathom3.connect.indexes :as pci]
            [com.wsscode.pathom3.interface.eql :as p.eql]
            [com.wsscode.pathom3.connect.operation :as pco]
            [com.wsscode.pathom3.connect.built-in.resolvers :as pbir]))

(pco/defresolver get-comment []
  {:comment/author {:user/id "user-id"}})

(def aliases (pbir/equivalence-resolver :comment/author :user))

(pco/defresolver user-resolver
  [{id :user/id}]
  {:user/avatar-filename "avatar-filename"})

(pco/defresolver avatar
  [{user :user}]
  {::pco/input [{:user [:user/avatar-filename]}]}
  {:user/avatar user})

(pco/defresolver user-object-resolver
  []
  {::pco/output [{:user [:user/id]}]}
  {:user {:user/id "user-id"}})

(deftest repro-test
  (is
    (thrown?
      AssertionError
      (p.eql/process
        (pci/register [get-comment
                       aliases
                       user-resolver
                       user-object-resolver
                       avatar])
        {}
        [{:user
          [:user/avatar]}]))))
thanks, can you please open an issue in Pathom 3 repo (https://github.com/wilkerlucio/pathom3/issues)?
@U7Y7601B2 I think I already understand the bug: it's a situation where a node must be removed, but the algorithm wasn't expecting the node to have parents in this case, and your repro demonstrates a case where it does happen
Pathom is trying to remove the node author->user-alias because it notices that this path can't fulfill the nested requirements
but a node can't have parents when it's removed, which is correct, so I think the way to go here is to remove the whole ancestor chain as well. This would result in the user-object-resolver being the only valid path in this scenario
Ah. Well I finished submitting the case for you 😅 https://github.com/wilkerlucio/pathom3/issues/136
@U7Y7601B2 pushed a fix to main, can you try so I can confirm the fix?
I am unfamiliar with how to bring in dependencies from an active branch rather than a released :mvn/version. Could you point me to some docs or show me what a deps.edn import would look like?
sure, one sec
you can use this to import Pathom 3:
com.wsscode/pathom3 {:git/url ""
:sha "28956c7f5d6dd259effc09567829c096932714a7"}
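For context, a minimal sketch of how a git coordinate like the one above might sit inside a complete deps.edn. The repository URL here is an assumption inferred from the issue tracker link earlier in this thread (it is not shown in the message itself); the sha is the one quoted above.

```clojure
;; deps.edn (sketch)
{:deps
 {com.wsscode/pathom3
  {;; ASSUMPTION: URL inferred from the issues link posted in this thread
   :git/url "https://github.com/wilkerlucio/pathom3"
   ;; :sha is the older key name; newer Clojure CLI versions also accept :git/sha
   :sha     "28956c7f5d6dd259effc09567829c096932714a7"}}}
```

A git coordinate takes the place of the :mvn/version map for that library, so the two should not be combined on the same entry.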
Oof, I was so close to doing it right. Thanks!
@U066U8JQJ That fixed it. Unfortunately for me, there are 2 other unit tests of ours that fail on updating pathom3 that are apparently unrelated. Ones with less obvious errors thrown in my face. Looks like I need to make more repros
thanks for bringing those, happy to keep debugging and tackling those with you
Hi. I’m just getting started with pathom and have a pretty basic question. I have the following code:
(def env
  (pci/register
    [(pbir/constantly-resolver :products
       [{:product/id 1 :product/slug "test1"}
        {:product/id 2 :product/slug "test2"}])]))

(p.eql/process env [{:products [:product/slug]}])
;; => {:products [#:product{:slug "test1"} #:product{:slug "test2"}]}

(p.eql/process env [{:products [:product/id]}])
;; => {:products [#:product{:id 1} #:product{:id 2}]}

(p.eql/process env [{[:product/id 1] [:product/slug]}])
;; => {[:product/id 1] {}}
The first two queries above work as expected, but the last query doesn’t.
In order to process
(p.eql/process env [{[:product/id 1] [:product/slug]}])
do I need to create another resolver to match the product/id?
Yes, you need a resolver that explicitly knows how to look up a product/slug from a product/id. In your example this could be a resolver that takes :product/id and :products, then filters the products for one that matches.
The resolver you have provides :products, but for all Pathom knows there is no relation between :product/id and :product/slug (other than them both being present in :products). They could be completely random data, or magic labels, or whatever.
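A minimal sketch of what such a lookup resolver might look like. As a simplification, it reads from a plain var instead of taking :products as an input; the names product-db and product-by-id are hypothetical and only here for illustration.

```clojure
(ns example.products
  (:require [com.wsscode.pathom3.connect.built-in.resolvers :as pbir]
            [com.wsscode.pathom3.connect.indexes :as pci]
            [com.wsscode.pathom3.connect.operation :as pco]
            [com.wsscode.pathom3.interface.eql :as p.eql]))

;; Hypothetical in-memory "table"; stands in for a real data source.
(def product-db
  [{:product/id 1 :product/slug "test1"}
   {:product/id 2 :product/slug "test2"}])

;; Declares the relation Pathom was missing:
;; given :product/id, this resolver can provide :product/slug.
(pco/defresolver product-by-id [{:product/keys [id]}]
  {::pco/input  [:product/id]
   ::pco/output [:product/slug]}
  (or (some (fn [product]
              (when (= id (:product/id product))
                (select-keys product [:product/slug])))
            product-db)
      ;; No match: return an empty map so the resolver still returns a map.
      {}))

(def env
  (pci/register
    [(pbir/constantly-resolver :products product-db)
     product-by-id]))

;; The ident query from the question, now with a resolver that can serve it.
(p.eql/process env [{[:product/id 1] [:product/slug]}])
```

With product-by-id registered, the ident query should return the slug for product 1 instead of an empty map, while the :products queries keep working as before.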