#clojure
2024-01-25
joakimen15:01:35

Hey there. Apologies for the long post. I am looking for some guidance on how to improve the robustness of my programs, particularly with regard to the nature of dynamic typing. Disclaimer: this is not about dynamic vs. static typing; I am simply looking for some general guidelines on how to improve the Clojure code that I write (which happens to be dynamic 🙂).
I know dynamic and static typing is a worn-out topic, but from lightly browsing the threads here https://clojureverse.org/t/dynamic-types-where-is-the-discussion/8968, I haven't been able to find concrete, hands-on advice on how to compensate (for lack of a better term) for the lack of types with regard to robustness. I am not referring to logic bugs, more that exceptions, missing data, bad data and good data are handled as expected.
I've used clojure.spec and malli in a few projects, and while they certainly do their job, I can't shake the feeling that I'm just "introducing types" to solve my dynamic typing challenges. There might be something more to this that I don't understand yet, but I am asking here since I get the feeling that a lot of the veteran Clojurians don't necessarily rely on these tools as much; that they solve their robustness concerns in other ways. But how, exactly?
Some concrete thoughts and questions:
1. Dynamic typing helps me cobble things together really fast, but I'm unsure how to go about "tightening" the code, making sure as many code paths as possible are accounted for, as my projects graduate from "fun little experiments".
2. Concrete scenario: I want to chain 5 functions. How do you handle absent/blank/invalid/valid values and exceptions in each piece of the chain? Do you instrument every single function in your function chain with spec/malli and validate input/output using stuff like map/function specs?
   a. If yes: isn't this practically an ad-hoc type system?
   b. If no: where do you handle it, and how? Generative testing comes to mind, but I haven't explored that a whole lot.
3. Are threading macros perhaps just not suited for function chains that might behave in a number of different ways?
   a. If yes, what are alternative and concise ways to chain functions while making sure each part of the chain behaves as expected?
   b. nil-punning and nesting if-let/when-let may be convenient, but the nesting makes it feel... hairy.
If there are some good resources out there (books, articles) on the topic of how to make your Clojure projects as robust and "production grade" as possible, I'd appreciate any links. Thanks in advance!

👍 5
haris16:01:54

I enjoyed David Nolen's thoughts on this, they felt practical and served my own projects well: http://swannodette.github.io/2015/01/09/life-with-dynamic-typing/

👍 1
haris16:01:25

Design by Contract is the jumping off point I feel for his thoughts

daveliepmann16:01:15

1. I progressively define the contract each function offers for input & output. For me, docstrings with examples and simple stuff like (when-not <test> (throw (IllegalArgumentException.))) are a separate step that comes before spec/malli.
2+3. Throwing exceptions is one way. It's also not so hard to write a set of some->-style monadic threading macros that handle return values which indicate an error.
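A minimal sketch of both ideas from the message above, not anyone's actual code: a contract check that throws, and a some->-style macro that stops threading when a step returns an error value. The names `parse-age`, `parse-age*`, `check-adult`, `failure?` and `ok->` are hypothetical, as is the convention that a failure is a map containing an `:error` key.

```clojure
;; Variant 1: a cheap contract check at the top of a function.
(defn parse-age
  "Parses a string of digits into a long. Example: (parse-age \"42\") => 42"
  [s]
  (when-not (and (string? s) (re-matches #"\d+" s))
    (throw (IllegalArgumentException.
            (str "expected a string of digits, got " (pr-str s)))))
  (Long/parseLong s))

;; Variant 2: return error values and thread with a short-circuiting macro.
;; Assumption: a failure is any map that contains an :error key.
(defn failure? [x]
  (and (map? x) (contains? x :error)))

(defmacro ok->
  "Like some->, but stops at the first step that returns a failure map
  and returns that map instead of continuing."
  [expr & forms]
  (let [g     (gensym "v")
        steps (map (fn [step] `(if (failure? ~g) ~g (-> ~g ~step))) forms)]
    `(let [~g ~expr
           ~@(interleave (repeat g) (butlast steps))]
       ~(if (empty? steps) g (last steps)))))

(defn parse-age* [s]
  (if (and (string? s) (re-matches #"\d+" s))
    (Long/parseLong s)
    {:error :invalid-age :input s}))

(defn check-adult [age]
  (if (>= age 18)
    age
    {:error :underage :age age}))

(ok-> "42" parse-age* check-adult inc)   ;=> 43
(ok-> "abc" parse-age* check-adult inc)  ;=> {:error :invalid-age, :input "abc"}
(ok-> "12" parse-age* check-adult inc)   ;=> {:error :underage, :age 12}
```

The macro body follows the same shape as clojure.core's `some->`, with the nil check swapped for `failure?`, so later steps never see a failure value.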

👍 1
lukasz16:01:36

My solution to this is "strong typing at the edge of the system" - no idea if it has a better (or formal) name, but it boils down to ensuring that inputs and outputs of your application perform strict validation (e.g. for HTTP services validate request & response bodies against some sort of schema, same for background job producers/consumers and so on). "Internal" code then has a bit more freedom and needs less handholding, although it still needs unit tests to validate that your logic is working as expected. (I'm a bit short on time, so maybe I need to write more about this and publish it)
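A rough sketch of what validating at the edge could look like with clojure.spec, assuming a made-up order request. The spec names, `handle-order-request`, `total-price` and the fixed unit price are all hypothetical; the point is only that the boundary function validates strictly and the internal function does not re-check.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

;; Hypothetical schema for an incoming request body.
(s/def :order/id pos-int?)
(s/def :order/email (s/and string? #(str/includes? % "@")))
(s/def :order/quantity (s/int-in 1 100)) ; 1 inclusive to 100 exclusive
(s/def :shop/order-request
  (s/keys :req [:order/id :order/email :order/quantity]))

(defn total-price
  "Internal, pure logic: trusts that the order was validated at the edge."
  [order unit-price]
  (* unit-price (:order/quantity order)))

(defn handle-order-request
  "Boundary function: rejects bad input before it reaches internal code."
  [body]
  (if (s/valid? :shop/order-request body)
    {:status 200 :body {:total (total-price body 5)}}
    {:status 400 :body {:problems (s/explain-str :shop/order-request body)}}))

(handle-order-request {:order/id 1 :order/email "a@b.c" :order/quantity 3})
;=> {:status 200, :body {:total 15}}
```

Internal functions like `total-price` still deserve unit tests for their logic, as the message says; they just do not repeat the shape checks.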

Joshua Suskalo18:01:26

For myself I focus primarily on generative testing when I want to focus on robustness. Spec and malli are both good tools for this, but I generally feel like they're applied "in the wrong way" when they're used for this. Specifically, spec'ing every function and using stest/exercise is I think a heavy-handed approach that basically amounts to re-introducing types (albeit very expressive types) to your system.
This is also tied to what types of tests should be written. I am very strongly in favor of integration tests over unit tests. This is an established preference in the hardware industry, where robust testing and exact correctness are both hard requirements. If you combine a strong preference for integration tests with a well-designed domain data schema with spec, you can get wonderfully expressive and robust tests without too much development overhead as long as you design your system with this in mind.
To be specific: spec the boundaries of your system. Fully self-contained modules, IO boundaries, etc. Write tests which use those specs to generate data that you can feed in at those boundaries to test a large surface area of your code in a single test, with many properties built into individual tests. When something goes wrong, stacktraces and logging can lead you to the general area of the issue, and test.check's narrowing facilities can zero you in on the specific cases that cause the problem. All of this leads to a pleasant development experience where you can have a high level of confidence that your system is robust.

👍 1
Joshua Suskalo18:01:01

As for threading with error handling, my recommendation is to look into fmnoise/flow, which can allow making robust pipelines with proper error handling.

Joshua Suskalo18:01:08

(I also want to note the integration testing that I'm referring to is not end-to-end testing and shouldn't generally require an actual spun-up environment outside of maybe a db connection, and this is generally achievable even with microservices)

kennytilton19:01:19

Old Common Lisper here:
1. I loathe threading. 🙂
2. In any Lisp, partly thanks to dynamic typing, I try to write meta-code wherever possible. That makes for (a) much less code to validate that (b) gets exercised much more heavily;
3. yeah, maps make things harder;
4. I tend to slap simple assertions in at the top of functions when I am nervous about always getting usage right. They only take a second to bang in, and now I am checking more than just the type of an input value. Certainly if I track down a bug that an assertion could have caught, in it goes; and
5. I do not use spec or malli.
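For illustration, point 4 might look something like this. `apply-discount` is a hypothetical function, and the checks cover value ranges rather than only types.

```clojure
(defn apply-discount
  "Returns price reduced by percent (0 to 100)."
  [price percent]
  ;; These check more than the type: sign and range are part of the contract.
  (assert (and (number? price) (not (neg? price)))
          "price must be a non-negative number")
  (assert (and (number? percent) (<= 0 percent 100))
          "percent must be between 0 and 100")
  (/ (* price (- 100 percent)) 100))

(apply-discount 200 25) ;=> 150
;; (apply-discount 200 120) throws java.lang.AssertionError
```

Note that `assert` throws an `AssertionError`, which a `(catch Exception ...)` will not catch, and that assertions are skipped entirely in code compiled with `*assert*` bound to false.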

Noah Bogart19:01:53

what do you mean by "meta-code", kenny?

Joshua Suskalo19:01:33

I also question whether you're talking about parallelism threading or syntax threading. If you hate syntax threading I'd be curious about the reasoning.

kennytilton19:01:52

Threading syntax, @U5NCUG8NR. I cannot read threading expressions. Mentally splicing is too much like work. And worrying about -> vs ->> just adds insult to injury. And cond->? as-> helps a little, but I was raised on Paul Graham's On Lisp: let is taxed, and threading is precisely a refinement by let*. We are functional programmers; we write

(the-color-of
   (the-car
      (the-dog-chased)))
🙂

👍 1
kennytilton19:01:12

Meta-code, @UEENNMX0T, is what I call it when I am able to handle some repeated requirement generically, by writing code that takes sufficient parameters to handle all the cases. Once that code works, when adding a new case I have a lot of confidence that the generic code is heavily vetted by prior use. That is just a side-benefit, though. I just do not like repeating myself, the DRY thing, so I look to make functions generic. I guess the real win I am after is easy refactoring and, yes, debugging/fixing all in one place.

👍 2
p-himik20:01:19

Agree on the the-color-of example above, but IMO threading is really useful when you do a lot of similar operations on a data structure.

(-> a-map
    (assoc :x 1)
    (update :y inc)
    (add-some-stuff)
    (remove-some-other-stuff))

(->> a-coll
     (filter odd?)
     (map inc)
     (interpose 0))

joakimen12:01:15

Thank you all for the replies! What I collect from the replies:
• (haris) assertions using :pre and :post conditions can be useful (in the article referenced by haris)
  ◦ I had somewhat dismissed these after reading the gotchas here: https://clojureverse.org/t/why-are-pre-and-post-conditions-not-used-more-often/2238/3.
• (haris, kenny) funnelling data through well-tested, shared entrypoints (as called by the article referenced by haris) or meta-code (as mentioned by kenny) can help reduce the burden of asserting/specing everywhere
• (lukasz, joshua) strong typing at the edge of the system (lukasz), or "Functional Core, Imperative Shell". The main efforts of validation should happen at the edges, while the functional core may be better suited for a barrage of tests, being pure and easy to test and all that
• (joshua) favor a combination of integration tests and a well-defined domain model/aggregates; define specs for data at the boundaries using spec/malli, then use those specs to generate test data, which can be fed into a test that exercises a large surface area of the code
• (joshua, kenny) avoid threading for functional pipelines, alternatively consider https://github.com/fmnoise/flow for threading with added flexibility
• (kenny) adding assertions as needed to the top of functions is a fine and reliable approach
• (p-himik) threading remains useful for passing a data structure through a chain of operations (for operations that don't throw or require special handling of return values)
If I misunderstood anything, please correct me :)
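As a small illustration of the first bullet, here is what :pre and :post conditions look like on a hypothetical `mean` function. This is only a sketch of the language feature, not code from the referenced article.

```clojure
(defn mean
  "Arithmetic mean of a non-empty collection of numbers."
  [xs]
  {:pre  [(seq xs) (every? number? xs)] ; checked before the body runs
   :post [(number? %)]}                 ; % is bound to the return value
  (/ (reduce + xs) (count xs)))

(mean [1 2 3]) ;=> 2
;; (mean []) throws java.lang.AssertionError because (seq []) is nil
```

Two properties worth knowing before relying on them: a failed condition throws `AssertionError` (an `Error`, not an `Exception`), and the checks are left out of functions compiled while `*assert*` is false.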

👍 1
💯 2
haris15:01:27

Nice summary. More broadly speaking, I feel that by adopting a dynamic language you naturally trade one "right, correct" program for a broad spectrum of programs. And, having seen a fair share of terrible Python and JS codebases, that spectrum is really broad, to be fair. So the classic claim that half of dynamic codebases are worse than the average typed codebase generally holds, imo. But the upper half, the programs that are very robust and expressive, only belong in dynamic systems.

haris15:01:10

Eventually, your mindset shifts away from one-true-way to many principles and strategies, e.g. Postel's Law, well-placed assertions, default values, etc.
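One way to read "Postel's Law" and "default values" together, as a hedged sketch: accept sparse or sloppy input, but always return a fully populated, bounded result. `page-params`, its keys and its limits (default 20, maximum 100) are hypothetical.

```clojure
(defn page-params
  "Accepts a possibly sparse (or nil) map of paging options and returns
  a complete map with values clamped to sane bounds."
  [{:keys [page per-page]}]
  ;; `or` is used instead of destructuring :or so that explicit nil
  ;; values also fall back to the defaults.
  {:page     (max 1 (or page 1))
   :per-page (-> (or per-page 20) (max 1) (min 100))})

(page-params {})                      ;=> {:page 1, :per-page 20}
(page-params {:page 0 :per-page 500}) ;=> {:page 1, :per-page 100}
(page-params nil)                     ;=> {:page 1, :per-page 20}
```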