Matthew Twomey07:02:02

I’m experimenting with spec and I’m stuck trying to figure out how to modify this simple keyword args spec to also check that the parameters are strings (right now it validates their existence just fine):

(s/fdef copy-job-env
  :args (s/keys* :req-un [::source-name ::source-region ::target-name ::target-region]))
Any hints from anyone?


(s/def ::source-name string?)

Matthew Twomey07:02:15

But where does that go? Sorry I’m struggling with the syntax here a bit - how do I fit that into my fdef?


You need to define it separately, outside fdef

Matthew Twomey07:02:17

I'm still learning about this. If I define it outside, won’t it apply to any function I spec that uses that keyword argument then?

Matthew Twomey07:02:31

I was wanting to define it specifically for this particular function?

Matthew Twomey07:02:15

For a more general example, let’s say I have two functions I’m trying to spec, both taking keyword arguments. Both take :value, but for function 1 I want to make sure both that :value (as a keyword argument) exists and that it’s > 100. For function 2 I want to make sure both that :value (as a keyword argument) exists and that it’s > 500. How can I do that?


(s/def :copy-job-env/source-name string?)
Use a different namespace then.
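Applied to the two-function :value question above, a minimal sketch might look like the following. The names :fn1/value, :fn2/value, process-small and process-large are made up for illustration and do not come from the thread. Note that s/fdef only registers the spec; calls are checked only after instrumenting with clojure.spec.test.alpha/instrument.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec names: the same unqualified key :value,
;; registered under two different namespaces.
(s/def :fn1/value (s/and number? #(> % 100)))
(s/def :fn2/value (s/and number? #(> % 500)))

;; Hypothetical functions taking keyword arguments.
(defn process-small [& {:keys [value]}] value)
(defn process-large [& {:keys [value]}] value)

(s/fdef process-small :args (s/keys* :req-un [:fn1/value]))
(s/fdef process-large :args (s/keys* :req-un [:fn2/value]))

;; An :args spec sees the argument list as a sequence, e.g. [:value 200].
(s/valid? (s/keys* :req-un [:fn1/value]) [:value 200]) ; true
(s/valid? (s/keys* :req-un [:fn2/value]) [:value 200]) ; false, 200 is not > 500
(s/valid? (s/keys* :req-un [:fn1/value]) [])           ; false, :value is missing
```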

Matthew Twomey07:02:12

So put each function I want to spec in its own namespace?


> So put each function I want to spec in its own namespace?
No, you don't have to do that. ::source-name should be a unique identifier in your program. Needing multiple definitions under the same name is not how spec is meant to be used. spec uses a global registry, so you need to s/def everything your program needs to know about, regardless of where it is going to be used.


(s/fdef copy-job-env
  :args (s/keys* :req-un [::source-name :my.ns1/source-region ::target-name ::target-region]))

(s/fdef copy-job-env2
  :args (s/keys* :req-un [::source-name :my.ns2/source-region ::target-name ::target-region]))
This is perfectly valid

Matthew Twomey08:02:31

ok, I’m almost understanding

Matthew Twomey08:02:57

Right now I’m defining one of these functions like this:

(defn copy-job-env
  [& {:keys [source-name source-region target-name target-region]}]

Matthew Twomey08:02:20

Do I need to define it differently to make use of the :my.ns1/source-region part of what you showed?


No, it will work as-is. :req-un "collapses" the namespace for you, so you can have two different functions with the same signature but different specs for the arguments


A (convoluted) example:

(s/def :my.ns1/source-region string?)
(s/def :my.ns2/source-region keyword?)

(defn copy-job-env
  [& {:keys [source-name source-region target-name target-region]}]
  ;; function body not shown in the original message
  nil)

(s/fdef copy-job-env
  :args (s/keys* :req-un [::source-name :my.ns1/source-region ::target-name ::target-region]))

(defn copy-job-env-with-keyword-region
  [& {:keys [source-name source-region target-name target-region]}]
  ;; function body not shown in the original message
  nil)

(s/fdef copy-job-env-with-keyword-region
  :args (s/keys* :req-un [::source-name :my.ns2/source-region ::target-name ::target-region]))

Matthew Twomey08:02:37

Oooh excellent thank you very much. I am trying to wrap my head around spec. It’s just been a bit tricky. I’ve read the main docs but some of it still isn’t clicking. I think I need to re-read from the ground up especially regarding the global registry.

Matthew Twomey08:02:27

I also did not grasp properly about :req-un collapsing the namespace. That was key - thanks again.


Happy to help! In a few words, all the s/defs "build" the global registry, and then with s/fdef you tell spec which parts of the global registry a particular function needs (and how).
Don't hesitate to follow up with more questions 🙂 and check out #C1B1BB2Q3 too.
Also, the Rich Hickey talks about spec are great; make sure to watch/read those if you're interested in a deeper understanding.

Matthew Twomey08:02:58

Thanks - yeah, I want to understand this better. Will read these for sure. Really appreciate it.


> did not grasp properly about :req-un collapsing the namespace
Yup, the "un" in :req-un/:opt-un stands for unqualified, in contrast to :req/:opt, which keep the namespace.
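A small sketch of that difference, reusing the :my.ns1/source-region spec from the example earlier in the thread (the map values are just sample data):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def :my.ns1/source-region string?)

;; :req-un looks for the unqualified key :source-region in the map,
;; but validates its value with the :my.ns1/source-region spec.
(s/valid? (s/keys :req-un [:my.ns1/source-region])
          {:source-region "us-east1"})          ; true

;; :req looks for the fully qualified key instead.
(s/valid? (s/keys :req [:my.ns1/source-region])
          {:source-region "us-east1"})          ; false, qualified key is missing
(s/valid? (s/keys :req [:my.ns1/source-region])
          {:my.ns1/source-region "us-east1"})   ; true
```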


I typically create a single namespace in a project to hold all specs, e.g. practicalli.appname.spec.
Then I start defining simple specifications for each value I may wish to validate at some point, e.g. ::source-name. The values that come from outside Clojure code are the most likely candidates (database, API, UI, stream, etc.).
I compose simple specs into composite specs, e.g. spec/keys defining which keys are optional or required.
Then I create functional specs using the value specs, focusing on those functions that talk to the outside world (outside of the Clojure application).
There are some practical examples in this guide, along with some video of live coding spec.

👍 2
Matthew Twomey18:02:12

Awesome, thank you @U05254DQM! (and all on this thread). The video was excellent, and I found what you worked on in the video particularly good as reference material.

👍 2
Matthew Twomey23:02:51

Ok - now I’m really starting to get it. This might not be refined yet, but starting to get the picture:

(s/def :gcp/region
  #{"asia-east1" "asia-east2" "asia-northeast1" "asia-northeast2"
    "asia-northeast3" "asia-south1" "asia-south2" "asia-southeast1"
    "asia-southeast2" "australia-southeast1" "australia-southeast2"
    "europe-central2" "europe-north1" "europe-southwest1" "europe-west1"
    "europe-west2" "europe-west3" "europe-west4" "europe-west6" "europe-west8"
    "europe-west9" "me-west1" "northamerica-northeast1"
    "northamerica-northeast2" "southamerica-east1" "southamerica-west1"
    "us-central1" "us-east1" "us-east4" "us-east5" "us-south1" "us-west1"
    "us-west2" "us-west3" "us-west4"})

(s/def :copy-job-env/source-name string?)
(s/def :copy-job-env/source-region (s/and string? :gcp/region))
(s/def :copy-job-env/target-name string?)
(s/def :copy-job-env/target-region (s/and string? :gcp/region))

(s/fdef copy-job-env
  :args (s/keys* :req-un [:copy-job-env/source-name :copy-job-env/source-region
                          :copy-job-env/target-name :copy-job-env/target-region]))

👌 2
Matthew Twomey23:02:15

I just realized I can make the source / target more generic and get rid of two of them.

Matthew Twomey23:02:12

Oh… no I can’t.

Matthew Twomey23:02:02

I guess I don’t really need the string? with the (s/and) at all with the regex.
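A quick sketch of why the string? check is redundant: a set of strings used as a spec only accepts its own members, so any non-string is rejected anyway. The name :example/region and its shortened set are placeholders for illustration, not the real :gcp/region definition.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical, shortened stand-in for the :gcp/region set.
(s/def :example/region #{"us-east1" "us-west1"})

(s/valid? :example/region "us-east1") ; true
(s/valid? :example/region :us-east1)  ; false, the keyword is not in the set
(s/valid? :example/region "mars-1")   ; false
```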

👍 2

I feel like I'm probably overcomplicating this, is there a better way/core fn to partition using some accumulated value:

(defn make-batch
  [size-fn threshold]
  (let [batch-size (volatile! 0)
        batch-counter (volatile! 0)]
    (fn [element]
      ;; Keep returning the current batch number while the running size fits.
      (if (<= (vswap! batch-size + (size-fn element)) threshold)
        @batch-counter
        ;; Over the threshold: reset the running size and move to the next batch.
        (do
          (vreset! batch-size 0)
          (vswap! batch-counter inc))))))

(let [input-files [{:url "a"
                    :content-length 100}
                   {:url "b"
                    :content-length 200}
                   {:url "c"
                    :content-length 300}
                   {:url "d"
                    :content-length 1000}]]
  (partition-by (make-batch :content-length 500) input-files))


Not bad, for my money. Would loop be cleaner? Btw, when we overflow the threshold, should batch-size be initialized to the size of the element that triggered the overflow? [edit: Wow, surprised by how partition-by works! Don't mind me. :)]


loop would let me push batch-sized+ elements into their own batch and keep accumulating in the current batch, which I couldn't think of how to do with partition-by


I was getting the basic sort of results I expected with partition-by (didn't test too rigorously), but I probably do need to account for that residual batch size when looping


You could conceivably use ‘reductions’ in a situation like this, so that’s worth knowing, but your approach doesn’t seem unprecedented. Why not go all the way and make it a full stateful transducer and get rid of the partition-by bit?

Bob B22:02:52

my solution uses reductions to decide how many to take, and then lazy-seq to recur
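For readers following along, here is one way a reductions plus lazy-seq approach could look. This is a sketch written for illustration, not the code from the message above; the names batch-by-size and example-files are hypothetical, and it assumes an oversized item should simply get a batch to itself.

```clojure
(defn batch-by-size
  "Hypothetical helper: lazily splits coll into batches whose summed
   (size-fn item) stays at or below threshold. An item larger than
   threshold on its own still gets a batch to itself."
  [size-fn threshold coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [running (reductions + (map size-fn s))
            ;; number of leading items that fit; always take at least one
            n (max 1 (count (take-while #(<= % threshold) running)))]
        (cons (vec (take n s))
              (batch-by-size size-fn threshold (drop n s)))))))

(def example-files
  [{:url "a" :content-length 100}
   {:url "b" :content-length 200}
   {:url "c" :content-length 300}
   {:url "d" :content-length 1000}])

(map #(mapv :url %) (batch-by-size :content-length 500 example-files))
;; => (["a" "b"] ["c"] ["d"])
```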


> Why not go all the way and make it a full stateful transducer and get rid of the partition by bit?
Mostly because transducers are kind of in draw-the-rest-of-the-owl territory for me.


I read some stuff and cribbed partition-all and came up with:

(defn batch-by
  "Returns a transducer that accumulates items into a batch until the
   summed values returned by `size-fn` exceed `threshold`.
   Items whose `size-fn` exceeds `threshold` are batched immediately
   and the current batch is left to possibly accumulate more items."
  [size-fn threshold]
  (fn [rf]
    (let [batch (java.util.ArrayList.)
          batch-size (volatile! 0)]
      (fn
        ([] (rf))
        ;; Completion: flush any partially filled batch before finishing.
        ([result]
         (let [result (if (.isEmpty batch)
                        result
                        (let [v (vec (.toArray batch))]
                          (.clear batch)
                          (unreduced (rf result v))))]
           (rf result)))
        ([result element]
         (let [size (size-fn element)]
           (if (> size threshold)
             ;; Oversized element: emit it as its own batch right away.
             (rf result [element])
             (do
               (.add batch element)
               (if (> (vswap! batch-size + size) threshold)
                 (let [v (vec (.toArray batch))]
                   (vreset! batch-size 0)
                   (.clear batch)
                   (rf result v))
                 result)))))))))


My turn, with loop:

(defn partition-max [max-size input-files]
    (loop [files input-files
           size-so-far 0
           accum []
           outputs nil]
      (if (not (seq files))
        (reverse (concat (when (seq accum) [accum]) outputs))
        (let [{:keys [url content-length] :as file} (first files)]
          (if (>= (+ size-so-far content-length) max-size)
            (if (seq accum)
              (recur files 0 [] (conj outputs accum))
              (recur (rest files) 0 [] (conj outputs [file])))
            (recur (rest files)
              (+ size-so-far content-length)
              (conj accum file)
              outputs))))))

make-batch looks better/simpler. 🤷