2024-05-21
Love this new series! Something about how to decipher the code itself would be great too, especially code that wasn't necessarily written with best practices in mind. Like: where is it called from, what are the call stacks, what are the potential parameters, how does it behave, and (probably most straightforward) what are the return values? In Java, I'd probably use a debugger to help. In Clojure, I can sprinkle prints around using the REPL, but otherwise I feel lost in a sea of possibilities. :man-rowing-boat:
@U02PB3ZMAHH That comic is painfully brilliant! I love that topic. I made a note of it. There are a lot of little ways to help. At a high level, some techniques that come to mind are: starting to add schemas/spec, creating fiddle files to work with parts in isolation, using Portal + tap>, static analysis like clj-kondo and clojure-lsp, and using a Java profiler for thread and performance issues.
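For the Portal + tap> part, a minimal sketch of the wiring (assuming the Portal library and its portal.api namespace are on the classpath) might look like:
; Minimal sketch of the Portal + tap> wiring.
(require '[portal.api :as p])
(def portal (p/open))   ; opens the Portal UI and returns the portal instance
(add-tap #'p/submit)    ; every tap> value is now sent to Portal
(tap> {:customers [{:name "John Doe"}]}) ; inspect any value from anywhere in the code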
As per ep 116, my questions are maybe closer to the desktop app. It seems to me that Clojure web apps have a well-known control flow.
IO on the edges is often discussed in the podcast as a means to better testability and purer, more deterministic logic. I'm wondering, however, about the implications for a system that is organized around components. It appears there are at least two approaches to designing such an IO component:
; Approach 1: IO on the edges
(defprotocol CustomersRepo-IO-on-edge
  (execute! [_ op-map])
  (get-all-customers-op-map [_]) ; returns an operation map with all the instructions needed to execute it
  (add-customers-op-map [_ customers]))

(comment
  ; IO on the edge
  (execute! customer-repo (add-customers-op-map customer-repo [{:name "John Doe"}]))
  (execute! customer-repo (get-all-customers-op-map customer-repo)))
; Approach 2: IO embedded
(defprotocol CustomersRepo-IO-embedded
  (get-all-customers! [_])
  (add-customers! [_ customers]))

(comment
  (add-customers! customer-repo [{:name "John Doe"}])
  (get-all-customers! customer-repo))
Some preliminary thoughts:
• verbosity: CustomersRepo-IO-on-edge is considerably more verbose to use than CustomersRepo-IO-embedded.
• orchestration: CustomersRepo-IO-on-edge allows both interactive usage and more involved usage, such as the decide-work-assimilate technique discussed in the podcast; with IO-embedded, the op-map gets lost.
• maybe there's a middle ground between the two: let the IO-embedded version enrich its result with the op-map (see the sketch below). However, such enrichment feels like a better fit for an orchestration function, and it also closes the door to the decide-work-assimilate technique.
• I was curious whether Cognitect's aws-api, which seems to have partly inspired the IO-on-the-edges approach, returns the op-map in its response. It doesn't.
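To make that middle-ground bullet concrete, here's a purely hypothetical sketch that reuses the Approach 1 protocol functions: the call still executes, but returns the op-map alongside the result.
; Hypothetical sketch of the middle ground: execute, but also report the op-map used.
(defn add-customers-enriched!
  [customer-repo customers]
  (let [op-map (add-customers-op-map customer-repo customers)
        result (execute! customer-repo op-map)]
    {:op-map op-map
     :result result}))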
It seems this question is relevant to all IO components, so I'm wondering if there's some guiding principle that could be broadly applied here.
Any thoughts? Thanks for reading.
@U0609QE83SS Some thoughts that come to mind.... Firstly, you're in luck because our next several episodes will be about pure data models. Stay tuned. I avoid protocols unless I need Java interop. (Probably worth a longer discussion.) I/O on the edges is more clear when looking at an application as a whole. You build up more complex computations out of pure data models in the "middle" of the application. As more data models start to intersect, the pure approach gets more and more useful. For example:
- A few different "fetching" functions that gather data from different systems
- A pure function for each result to transform it into a "working model"
- Pure schemas and other pure validators for that data
- Pure functions that reason and transform using those working models
- Pure functions that turn the result of the "reasoning" into operation instructions
- I/O functions that execute those instructions in other systems
Look how much pure is in the "middle". You could wrap that whole process into a single function that does all the steps. That function won't be pure, but as long as it only connects the other functions (it doesn't reason at all), you'll still have all the benefits of unit testing practically everything with pure data. When developing, you can definitely start with imperative decomposition and factor out pure functions as the code grows. We took that approach in the Spotify series. If it's useful for REPLing and fiddling, you can make helper functions that fetch, transform, and update all at once. Hopefully it's clear from the above example that the benefits of pure data models are felt more strongly the larger the application grows.
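A minimal sketch of that shape, with made-up names (fetch-customers!, ->working-model, decide-ops, and execute-op! are all hypothetical, not from the episodes), might look like:
; Hypothetical sketch: pure "middle", I/O only at the edges.
(defn ->working-model [raw-customers]          ; pure: raw fetched data -> working model
  {:customers (map #(select-keys % [:name :email]) raw-customers)})

(defn decide-ops [working-model]               ; pure: reasoning -> operation instructions
  (for [c (:customers working-model)
        :when (nil? (:email c))]
    {:op :request-email :customer c}))

(defn sync-customers!                          ; impure, but it only connects the other functions
  [fetch-customers! execute-op!]
  (let [raw   (fetch-customers!)               ; I/O at the edge (in)
        model (->working-model raw)
        ops   (decide-ops model)]
    (run! execute-op! ops)                     ; I/O at the edge (out)
    {:model model :ops ops}))
In tests, ->working-model and decide-ops can be exercised with plain data, and sync-customers! can be handed fake fetch/execute functions.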
Hi @U5FV4MJHG, thank you for responding to this.
> our next several episodes will be about pure data models.
This is great news, I cannot wait.
> I avoid protocols unless I need Java interop.
Interesting, thanks for mentioning that. How would you then build fakes for a group of related operations? The episode https://clojuredesign.club/episode/025-fake-results-real-speed/ with the fake Twitter handler and this post https://www.juxt.pro/blog/abstract-clojure/#protocols suggest that protocols are suitable for such dispatching. Alternatively, one could go for multimethods, but then the related operations lose that relatedness aspect.
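One protocol-free way to keep related operations grouped (purely a hypothetical sketch, not something from the episode) is to pass the component around as a plain map of functions, so a fake is just another map:
; Hypothetical sketch: related operations grouped as a plain map of functions;
; a fake is just another map, here backed by an atom.
(defn make-fake-customer-repo []
  (let [store (atom [])]
    {:get-all-customers! (fn [] @store)
     :add-customers!     (fn [customers] (swap! store into customers) :ok)}))

(comment
  (let [repo (make-fake-customer-repo)]
    ((:add-customers! repo) [{:name "John Doe"}])
    ((:get-all-customers! repo)))) ; => [{:name "John Doe"}]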
> I/O on the edges is more clear when looking at an application as a whole.
Thank you for those examples and the explanation. From the examples you provided, the IO is on the very edges of the application, with a pure "middle" that can be tested. The point that is unclear to me is when we zoom into the last two steps of your examples:
- Pure functions that turn the result of the "reasoning" into operation instructions
- I/O functions that execute those instructions in other systems
And, particularly, zooming into the handover of the operation instruction: is that instruction decoupled from the IO component or not?
When we were discussing this earlier (https://clojurians.slack.com/archives/CKKPVDX53/p1706054084114799?thread_ts=1699637695.133179&cid=CKKPVDX53), you mentioned that
> For transient information, such as an API call, I will save the raw data for debugging purposes. I've used rolling logs for that kind of information. I can search it or load it when trying to figure out what went wrong. If I hit an especially strange edge case, I will have data that can reproduce it!
In one of the episodes on composition, you recommended trying to keep side effects as shallow on the stack as possible and seeing where that leads.
Let's say we'd like to grow a bag of data over a process with an IO component at the end, and also store transient information for debugging purposes in the bag, with the bag temporarily stored somewhere. (This approach appears to me to be such a radical improvement in system observability that it motivates me to explore it further.) We'd also like to keep side effects as shallow as possible.
Let's say the "reasoning" resulted in the instruction "add customer john-doe" to the db. Now, how would we implement that?
1. The reasoning layer could create the instruction by reaching out to the customer-repo, e.g. (repo/add-customers-op-map customer-repo [{:name "John Doe"}]), which returns pure data that the orchestrating function then hands over to a thin repo/execute!. The orchestrating function could store both the instruction and the result in the bag of data, and the side effect stays as shallow as possible.
2. Another approach would be for the "reasoning" to be decoupled from the customer-repo, with the instruction being {:op "add-customer" :customer {:name "John Doe"}} or something like that. The orchestrating function figures out that it should call (repo/add-customer! (:customer instruction)). If we'd like to get the db query into the bag of data for debugging, we'd need to return it from repo/add-customer!. The side effect is not as shallow as in the previous example; but inside the component, it could still be very shallow if we decouple db-query composition from execution.
3. Another approach could be like the previous one, but dispatching on the operation, for example (repo/execute! {:op :add-customer :customer {:name ...}}), where execute! is a multimethod with :op as the dispatch function (see the sketch below). If we'd like to add a fake repo and omit protocols, perhaps we can dispatch on repo type and action, such as (repo/execute! {:op :add-customer :type :fake-db :customer {:name ...}}). The execute! returns both the db query used and the result. As I'm writing this, it seems that this last approach hits the sweet spot, with the trade-off of the side effect being one level lower. Instructions are data-driven (I also loved https://www.youtube.com/watch?v=j382BLptxCc) and semantically decoupled from the repo. The downside is having to spec the return values.
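A hypothetical sketch of that third approach (the :fake-db type, the query shape, and the atom-backed store are all made up for illustration), dispatching on repo type plus operation and returning both the query and the result:
; Hypothetical sketch of approach 3: a multimethod dispatching on [repo type, op],
; returning both the query it composed and the result of executing it.
(defmulti execute! (fn [instruction] [(:type instruction) (:op instruction)]))

(defmethod execute! [:fake-db :add-customer]
  [{:keys [db customer]}]
  (let [query {:insert-into :customers :values [customer]}] ; query composed as pure data
    (swap! db conj customer)                                ; the only side effect
    {:query query :result {:added customer}}))

(defmethod execute! [:fake-db :get-all-customers]
  [{:keys [db]}]
  {:query {:select :* :from :customers}
   :result @db})

(comment
  (def db (atom []))
  (execute! {:type :fake-db :op :add-customer :db db :customer {:name "John Doe"}})
  (execute! {:type :fake-db :op :get-all-customers :db db}))
Because the query and the result come back together, the orchestrating function can drop both straight into the bag of data for debugging.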
I see now that I'm complecting multiple topics together in this post. I'm sorry for that; some more clarity emerged only after I had written these lines.