
In Joy of Clojure 2nd ed. (pp. 253-255) they give the following example of making array mutations safe:

(defn make-safe-array [t sz]
  (let [a (make-array t sz)]
    (reify SafeArray
      (count [_] (clj/count a))
      (seq [_] (clj/seq a))
      ;; is locking really necessary for aget? what could happen?
      (aget [_ i] (locking a
                    (clj/aget a i)))
      (aset [this i f] (locking a
                         (clj/aset a i (f (aget this i))))))))
(full sample here: ) I'm wondering why they lock aget at all? Isn't it enough to lock aset? Why should I block readers while there's a write in progress?


Likely because java.lang.reflect.Array/get doesn't say anything about it being thread-safe.


Hmm, that might be it. But what would that mean? Like observing a half-set value? What would that even be?


I.e. same reason why we mark variables as volatile.


Exactly. You need something that orders reads and writes with respect to each other; otherwise the JVM can do things like read the array index once, cache it in a register, and just say your writes all happened after the read


@U06BE1L6T locking emits a memory fence instruction that prevents operation reordering and makes sure your CPU caches are synced. One thread might update a value in its L1 cache (which is per-core), then another thread on another core might read a stale copy of the same value from its own L1 cache. Typically a memory fence causes the changes to be pushed to the L3 cache, which isn't per-core. Writing a volatile does the same, so for scalar values (an int, a long, writing a reference) volatile is generally sufficient
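In Clojure the scalar case maps to volatile! (a sketch; the var name here is mine):

```clojure
;; A volatile box gives the same visibility guarantee as a volatile
;; Java field: a write in one thread is visible to subsequent reads
;; in other threads, with no lock taken.
(def latest-price (volatile! nil))   ; hypothetical name

;; writer thread:
(vreset! latest-price 42)

;; reader thread: a volatile read is guaranteed to observe the write
;; once the write happens-before it, unlike a plain mutable field.
@latest-price
```

Note volatile! gives visibility only, not atomicity: a read-modify-write like vswap! can still lose updates under contention, which is why the array example needs locking around aset.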


Ah right, so the read lock is there only to provide a fresh value - otherwise it could get cached. I think it's unlikely to happen here (I increment the array values in 100 concurrent threads, then read them all afterwards), maybe because the cache coherence protocol will actually fetch the proper value when it's modified by the aset operation (even when there's no lock in aget). I definitely couldn't find any consistency issue when removing the aget lock and testing it.


Such bugs are notoriously difficult to test for. Sometimes you may catch them with such tests, but there is no guarantee you will


Yeah, based on my understanding of the JMM and its memory consistency properties, they do the right thing in the book; in particular:
> Actions prior to "releasing" synchronizer methods such as Lock.unlock, Semaphore.release, and CountDownLatch.countDown happen-before actions subsequent to a successful "acquiring" method such as Lock.lock, Semaphore.acquire, Condition.await, and CountDownLatch.await on the same synchronizer object in another thread.
There are some good resources dealing with more details on the notion of volatile et al meaning "flush to main memory" (which was the impression I got from reading some Java book a decade ago, but I found much later that this is likely false when reading about the MESI cache coherence protocol).


@U06BE1L6T locks or not, there's a race condition here because the sequence can be constructed before the threads are done mutating. Look at the result of (-> (make-safe-array Integer/TYPE 8) (doto pummel) seq)


Oh yeah, you're right. I think they basically rely on the reader waiting until the threads are done (which is quick for a human experimenting in the REPL 🙂). ... in which case, I think, the read lock basically doesn't matter at all, but it would be the right thing to do for a read happening immediately after a previous aset, right?
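For reference, the race goes away once you keep the futures and deref them all before reading. A minimal sketch, with an atom standing in for the shared array:

```clojure
;; 100 concurrent writers; reading is only safe after every writer
;; future has been deref'd (deref blocks until the future is done).
(def counter (atom 0))

(let [writers (doall (for [_ (range 100)]
                       (future (swap! counter inc))))]
  (run! deref writers)   ; wait for all writers to finish
  @counter)
;=> 100
```

Without the run! deref step, reading @counter could observe any intermediate count, which is exactly the seq-before-pummel-finishes race above.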


there are many possible reasons why you could see the latest value without explicit synchronization, but in general physical time is not something you should rely on

👍 3

What's the default for the direct-linking and elide-meta JVM options when doing a lein jar or lein uberjar?

Alex Miller (Clojure team)15:12:36

by default those aren't used at all afaik

Alex Miller (Clojure team)15:12:47

so no direct linking, no elide-meta


@roklenarcic a build tool should not change these options unless the user asks for it


code in an uberjar might still rely on non-direct linking or metadata for example


Anyone here using vim with Conjure in a monorepo? My issue is that I typically open files in multiple projects and it becomes tedious to launch the REPL for every file. Is there a way to configure vim to find the project's root path and launch an nrepl-server in that dir?


I do but I don't start the REPL from nvim, I start a bunch of REPLs using a kinda custom docker-compose wrapper then I set up Conjure to connect to the right REPL depending on what dir I :cd into. Conjure allows you to work on multiple projects at a time by setting the :ConjureClientState [state-key]


At work, I set up a "cwd changed" autocmd that sets my ConjureClientState to the cwd path. So every time I :cd I get a fresh Conjure state with its own nREPL connection and config.


You could set up something similar + use something like if you really want to start your REPL from within nvim. I still recommend setting up your REPLs outside of nvim with your own script though, ensure you write your .nrepl-port files into each sub-repo directory, then :cd into each module as you work on them and Conjure will auto connect. Then you can set up the autocmd to set the state as you hop around to have multiple concurrent connections.

augroup conjure_set_state_key_on_dir_changed
  autocmd DirChanged * execute "ConjureClientState " . getcwd()
augroup END


I have a script that goes through my docker processes and maps the nREPL ports into .nrepl-port files in the correct directories of the mono repo. Making :cding into directories synonymous with connecting to them.


You can also discuss conjure over at if you so wish 🙂


I guess I can simply use a script to launch repls for all projects.. I guess it will eat some memory. Anyway, I joined #conjure so I'll ask future questions there.


I believe there is talk around adding that, you can check in the #conjure channel


spec generators rely on the Clojure property testing library test.check. However, this dependency is dynamically loaded and you can use the parts of spec other than gen, exercise, and testing without declaring test.check as a runtime dependency. 
The above is from the spec guide where it speaks of loading the test.check lib. What does it mean to dynamically load a lib? How does that work?


:test-deps {:extra-paths ["test"]
            :extra-deps {org.clojure/test.check {:mvn/version "1.0.0"}
                         peridot/peridot {:mvn/version "0.5.2"}}}
:run-tests {:extra-deps {com.cognitect/test-runner
                         {:git/url ""
                          :sha "209b64504cb3bd3b99ecfec7937b358a879f55c1"}}
            :main-opts ["-m" "cognitect.test-runner"
                        "-d" "test"]}
an example of adding test.check

Alex Miller (Clojure team)15:12:42

if you do generator stuff, it will load the test.check.generator namespace. if you don't, then it won't.

Alex Miller (Clojure team)15:12:07

so you can safely include test.check at test/repl time but exclude it at production time
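A sketch of the dynamic-loading pattern (the helper name is mine; spec's internal version differs): the namespace is require'd lazily at runtime, so the dependency only needs to be on the classpath if the var is actually used. clojure.string stands in for test.check here so the example is self-contained:

```clojure
(defn dynaload
  "Require sym's namespace at call time and return the var.
  Hypothetical helper illustrating dynamic loading, not spec's exact code."
  [sym]
  (require (symbol (namespace sym)))
  (or (resolve sym)
      (throw (ex-info (str sym " not found on classpath") {:sym sym}))))

;; nothing is loaded until you actually ask for it:
((dynaload 'clojure.string/upper-case) "spec gen")
;=> "SPEC GEN"
```

If the library isn't on the classpath, the failure only happens when (and if) you call the gen-related function, which is why production builds can omit test.check entirely.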


(map (fn [k v]
       (println " K " k)
       (println " v " v)
       (if-not (re-matches #"^[a-z]+\*$" (->str v))
         (->str v)))
     {:id "john"})


(fn [k v] …) is for 2 arguments. If you want to have key and value you need (fn [[k v]] …).


(defn foo [x1 x2 x3] ...) is a fn with 3 arguments; (defn foo [x1 [k v] x3] ...) is also a function with 3 arguments, but the second one is destructured into [k v]
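For example, map hands the function one argument per entry, and [[k v]] pulls that pair apart:

```clojure
;; each map entry arrives as a single [k v] pair;
;; [[k v]] destructures it into key and value
(map (fn [[k v]] (str k "=" v)) {:id "john" :x 1})
;=> (":id=john" ":x=1")
```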


if we use reduce-kv then the parameters will be [k v], right? why so?


so it takes x2 which is [:keyword-foo “value”] and place this under k and v


because it is a different fn which gets different parameters - in simple words 😉


it is designed to already get these parameters like that


while in the beginning it can look confusing, later it is very intuitive

👍 3

so it already destructures this value for you
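A small illustration of the difference: reduce-kv hands the function three separate positional arguments (accumulator, key, value) instead of one pair to destructure:

```clojure
;; map: one argument per entry, destructure it yourself
(map (fn [[k v]] [k v]) {:a 1 :b 2})

;; reduce-kv: accumulator, key and value already as separate arguments
(reduce-kv (fn [acc k v] (assoc acc v k)) {} {:a 1 :b 2})
;=> {1 :a, 2 :b}
```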


There was a website with challenging tasks to transform data, where you can try to solve them online. Afterwards you can compare your solutions to the best solutions made by other people. It is a really good place to start.


But I forgot the URL


Maybe someone else remembers the URL of the website where you can do online task challenges and compare your solution to other people's?


is that 4Clojure?


At least that is how I was learning many years ago


(map (fn [[k v]]
       (println "===1===k " k)
       (println "===1===v " v)
       (println "matches " (re-matches #"^[a-zA-Z0-9]+\.\*$" (->str v)))
       (if (re-matches #"^[a-zA-Z0-9]+\.\*$" (->str v)) true false))
     {:id "john"})


in this case it is returning (true) or (false) as a list


how can I convert that to get it as a boolean?


the issue is you are using map in the wrong context


{:id "John"} is a map, but you want to use map functions on collection like [{:id "John"} {:id "Popeye"}]


map returns a list


so it processes each element in the vector and returns the output of your function


if you want to process only one map {:id "John"}, then don't use the map function


what can we use if we have only 1 key and value?


just remove map from there


ok, this will not be enough 🙂


(map println {:id "john" :foo "bar"})
[:id john]
[:foo bar]
=> (nil nil)
(map println [{:id "john"} {:foo "bar"}])
{:id john}
{:foo bar}
=> (nil nil)


Do you see what I mean?


there is no side effect?


What do you mean by side effect?


not returning nil?


println returns nil


the function returns vice versa? like if you apply it on a map it returns a vector?


I don't understand the question. The logic is: map takes each element from the collection and runs the function with this element. The result is returned as a list.


so map gets {:id "john"} from the vector and runs (println {:id "john"}), which returns nil, etc.


yes, I got the functionality of map. In my logic I want to take a key and value, which will be a single map element, do pattern matching, and get true or false as the result


if you want to operate on single map, then you don’t need to use map as a function at all


unless you want to operate on each pair of key and value in map, then map is ok


yeah, I got an error while running this function


((fn [[k v]]
   (println "===1===k " k)
   (println "===1===v " v)
   (println "matches " (re-matches #"^[a-zA-Z0-9]+\.\*$" (->str v)))
   (if (re-matches #"^[a-zA-Z0-9]+\.\*$" (->str v)) true false))
 {:id "john"})


is that the way we call it? sorry, first time I am writing this


[[k v]] is not correct anymore


how can we achieve it then? I am passing a single key and value


((fn [m]
   (println m))
 {:foo "bar" :x "y"})
{:foo bar, :x y}
=> nil
(map (fn [m]
   (println m))
 {:foo "bar" :x "y"})
[:foo bar]
[:x y]
=> (nil nil)


((fn [m]
   (println (:foo m)))
 {:foo "bar" :x "y"})
=> nil
if you want to check :id (which is :foo here)


((fn [{:keys [foo] :as m}]
   (println foo))
 {:foo "bar" :x "y"})
or like above


but not everything at once 🙂


In the end you wouldn't write an anonymous function and call it right away like that


(let [f (fn [{:keys [foo] :as m}]
          (println foo))]
  (f {:foo "bar" :x "y"}))
this can be easier to understand


yeah will explore

👍 3

how about this


(fn [v] (println "===1===v " v) (println "matches " (re-matches #"^[a-zA-Z0-9]+\.\*$" (->str v))) (if (re-matches #"^[a-zA-Z0-9]+\.\*$" (->str v)) false true )(map val (:id "john")))


no, this is not how you want to do this 🙂


BTW if you want to get all values only from map use vals so (vals {:foo "bar" :x 1})


it is really hard to talk about how things should be done while we are learning by doing


you have to experiment and figure things out


Hello team, I am passing a map to an anonymous function and wanted to validate it. I tried with the code below, but it is not working. How can I pass {:id "john"} to an anonymous function?


From the docs: > As of Clojure 1.10, protocols can optionally elect to be extended via per-value metadata:

(defprotocol Component
  :extend-via-metadata true
  (start [component]))
Is there a resource that talks about how to decide if a protocol should opt in to extension via metadata?
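For reference, this is what opting in buys callers: any value that supports metadata can satisfy the protocol ad hoc, keyed by the fully qualified method name:

```clojure
(defprotocol Component
  :extend-via-metadata true
  (start [component]))

;; a plain map carries its implementation in metadata;
;; the syntax-quoted `start resolves to the fully qualified symbol
(def db (with-meta {:name "db"}
          {`start (fn [component] (str "starting " (:name component)))}))

(start db)
;=> "starting db"
```

This is handy when you don't own the value's type (or it has no type worth defining), at the cost of per-call metadata lookup and implementations that travel with individual values rather than types.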


Here's a fun little example of why Functional is better than OOP 😛

data = None

if data and "domain" in data:
  domain = data.get("domain").get("name", "foo")
else:
  domain = "bar"
Notice in this code, you need the condition to be if data and "domain" in data: we have to check that data is not None, because NoneType doesn't support the in operator and you would see: TypeError: argument of type 'NoneType' is not iterable


If you didn't use methods, and instead used a functional approach where in were a function, this would not be a problem, because you could easily implement a None check inside that function.


This is also a good example why nil isn't as bad in Clojure as it is in non null-safe OOP languages like Python or Java


cljs.user=> (key nil)
ERROR - No protocol method IMapEntry.-key defined for type null: 
you have to check nil and types in Clojure too 🙂


Yes, sometimes, but now it's just a design choice, not a limitation of the paradigm. Key is just a function implemented with:

(defn key
  "Returns the key of the map entry."
  [map-entry]
  (-key map-entry))
If it wanted, it could handle nil in any way.


I wouldn't say that's a fair comparison. you typically wouldn't want to accept data as either None or a dict. I think it would be appropriate to only expect a dict. additionally, idiomatic python follows "it's easier to ask for forgiveness than permission". I would expect to just see:

data.get("domain", {}).get("name", "bar") 


To complete the example :-): (data or {}).get("domain", {}).get("name", "bar") That being said these days I end up with a get-in function in python code.

👍 3

the above is a nice addition. I still prefer clojure to python by quite a bit, but python isn't so bad


same here! ie. python isn't bad but I prefer clojure


I wasn't specifically singling out Python, more OO vs Functional.


My point being, what if you wanted a .get that can handle None or any other type, maybe vector, etc.


In OO, all types would need to agree to share a .get interface, and provide an implementation for it


But also, in this particular case, ya I do find Python's handling of None on .get less than ideal. I think Clojure's handling is much nicer, specifically because the above is a common source of bugs.


And notwithstanding, I found this example because it came up in our case 😅


I think we understood and agreed with your point, but we didn't think that the comparison was fair. In practice (at least on python codebases I worked on) that python code would look like: get-in(data, ('domain', 'name'), 'bar') or get-in(data, '', 'bar') which doesn't compare that unfavourably to (get-in data ["domain", "name"], "bar") as your initial example.


It's possible, no one on our team is really a pro at Python, more like learned at university or picked it up here and there. This code is in a script file part of our infra, so it also doesn't get the same level of code review scrutiny and all. I can't seem to find get-in though? Is that from a popular library?


If so, I think it demonstrates my point pretty well, and I'd be curious to look at the implementation. My guess is get-in is a function that people created for this very problem: instead of adding a method to dictionaries and None, people found the need to change get from a method to a function, which would be a good example of what I'm talking about.

In Python, you could argue that you want a null error to be thrown; maybe you prefer to fail fast, and if you didn't explicitly handle null, you consider a null appearing a bug you'd want to know about. So that can be a design choice: what do you do with data being None? And while I like that Clojure's get handles nil by default, I don't want to say that throwing a null error when get encounters a null is necessarily worse or bad.

But in OOP you actually can't do anything about it if you did want to handle this case the way Clojure does. That's because of how methods work versus functions: if the type is wrong, the method won't exist. All you can do is add the method to more and more types, but even then there's always a chance a type shows up that doesn't have the method, and you get an error again. That's one of the functional advantages in my opinion. You could also do this in Python, since it has functions: make get a function and handle None inside it.
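Concretely, because Clojure's get and get-in are plain functions rather than methods, they already decide what nil means:

```clojure
;; nil is a valid argument, not a method-lookup failure
(get nil "domain")                                        ;=> nil
(get-in nil ["domain" "name"] "bar")                      ;=> "bar"
(get-in {"domain" {"name" "x"}} ["domain" "name"] "bar")  ;=> "x"
```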


I would say the biggest difference for me is that I can focus on moving from room A to B, instead of on the object door, which is not what I am interested in, because I want to move to B. But this is a very abstract description :)


I am going to sleep, good night


I'm working on an app where I'm making several API calls concurrently to fetch data. The number is variable, but let's say it's 50 on average. I'm currently using pmap to transform the urls into responses in parallel, but I was wondering if it could be faster, since pmap is limited to 2 + num_cpus threads and the time is mostly spent in I/O wait. Any tips?


When I had an app that heavily used APIs, the pattern that worked best was to have separate resource pooling per API service. This is because there's usually a per API limit (either imposed by the API, or their own resources being able to serve you)


that pooling could be a thread pool (eg. claypoole which lets you use futures with custom pools) or a queue per service, with a different number of workers dedicated to each queue


if you aren't hitting the limits of the APIs, you can just use future for each call, and skip pmap which is rarely the right answer
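A sketch of the future-per-call approach (fetch is a hypothetical stand-in for the real HTTP call): launch every request eagerly, then deref:

```clojure
(defn fetch
  "Hypothetical stand-in for an HTTP GET."
  [url]
  (Thread/sleep 50)                 ; simulate I/O wait
  (str "response for " url))

;; mapv is eager, so all futures start immediately; total wall time
;; is roughly one request's latency rather than (n / num-cpus) requests.
(let [urls  (map #(str "url-" %) (range 50))
      resps (mapv #(future (fetch %)) urls)]
  (mapv deref resps))
```

The trade-off versus a pool: this spawns one thread per call (fine for ~50 I/O-bound calls, not for thousands), and it applies no per-API rate limiting, which is where the pooling approach above comes in.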


if you need to do any coordination (eg. combining results from multiple calls before calling another endpoint) look into core.async (but make sure all the io is inside core.async/thread calls)


Also note that pmap will very likely run more than 2+cpus tasks at the same time due to chunking:


@U06BE1L6T I don't think you're correct here. The parallelization level is restricted by the thread pool it uses, chunking won't change that.


the parallelization is controlled by the lag between the launch of new futures and the deref; it uses future, which runs on an unbounded, expanding thread pool


chunking changes the behavior of (map #(future (f %)) coll) which is what actually creates the threads


so the answer is weird and complicated (another reason I don't like pmap) - chunking causes futures to be launched a chunk at a time, if the input is chunked, otherwise the number of futures in flight is controlled by the lag between future generation and future realization (which is done via the blocking deref)


(defn pmap
  "Like map, except f is applied in parallel. Semi-lazy in that the
  parallel computation stays ahead of the consumption, but doesn't
  realize the entire result unless required. Only useful for
  computationally intensive functions where the time of f dominates
  the coordination overhead."
  {:added "1.0"
   :static true}
  ([f coll]
   (let [n (+ 2 (.. Runtime getRuntime availableProcessors))
         rets (map #(future (f %)) coll)
         step (fn step [[x & xs :as vs] fs]
                (lazy-seq
                 (if-let [s (seq fs)]
                   (cons (deref x) (step xs (rest s)))
                   (map deref vs))))]
     (step rets (drop n rets))))
  ([f coll & colls]
   (let [step (fn step [cs]
                (lazy-seq
                 (let [ss (map seq cs)]
                   (when (every? identity ss)
                     (cons (map first ss) (step (map rest ss)))))))]
     (pmap #(apply f %) (step (cons coll colls))))))


the (drop n rets) creates the lag between creation of new futures and blocking deref to wait on them


breaking a common piece of advice: don't mix lazy evaluation with side effects


Oh ya, my bad, I was thinking of agent send


I actually never deep dived the impl of pmap, hum..


Doesn't the implementation of step here unchunk?


;; changes to this atom will be reported via println

(def snitch (atom 0))

(add-watch snitch :logging
           (fn [_ _ old-value new-value]
             (print (str "total goes from " old-value " to " new-value "\n"))))

(defn exercise [coll]
  (pmap (fn [x]
          (swap! snitch inc)
          (print (str "processing: " x "\n"))
          (swap! snitch dec))
        coll))
user=> (exercise (range 10))
total goes from 3 to 4
total goes from 4 to 5
total goes from 2 to 3
total goes from 1 to 2
total goes from 0 to 1
processing: 0
processing: 4
processing: 2
processing: 3
processing: 1
total goes from 5 to 4
total goes from 4 to 3
total goes from 1 to 0
total goes from 2 to 1
total goes from 3 to 2
total goes from 0 to 1
total goes from 1 to 2
processing: 6
processing: 7
total goes from 2 to 3
total goes from 3 to 4
total goes from 5 to 4
total goes from 4 to 5
processing: 8
total goes from 4 to 3
processing: 9
processing: 5
total goes from 3 to 2
total goes from 2 to 1
total goes from 1 to 0
(0 0 0 0 0 0 3 2 0 0)
max parallelism here is 5 - I'm going to try a version where I capture the max and exercise it more aggressively


@U0K064KQV I am not good enough with lazy-seqs to read the pmap code and know whether it unchunks, so I'm working empirically


Haha, no one is 😛


yeah, here's my version of exercise that captures the max parallelism:

(defn exercise [coll]
  (let [biggest (atom 0)]
    (dorun
     (pmap (fn [x]
             (swap! snitch inc)
             (swap! biggest max @snitch)
             (print (str "processing: " x "\n"))
             (swap! snitch dec))
           coll))
    @biggest))
(exercise (range 1000)) prints a lot more than I'm going to paste here, and returns 19


lmk if that's flawed, but to my eye that will accurately tell you the max futures spawned concurrently by pmap


(nb range is chunked, which is why I'm using it here)
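You can check chunkedness directly with chunked-seq?:

```clojure
;; range and vector seqs are chunked (32 elements at a time);
;; plain lists are not, which is why they wouldn't exercise this behavior
(chunked-seq? (seq (range 1000)))
(chunked-seq? (seq [1 2 3]))     ; true
(chunked-seq? (seq '(1 2 3)))    ; false
```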


Hum. Ya, looking at the code, its kind of hard to get a full picture. I think the branch of if-let that uses cons will unchunk, but the other branch would not. And the drop n will also trigger the first chunk.


all the retries on that poor little atom make the output with bigger inputs absurd


or maybe that's caused by the printing contention...


Might it be better to use a semaphore? I think a lock instead of the atom's retry would maybe make this clearer?


(the reason all the prints call str is because otherwise the parts of the prints overlap in the output)


Oh, no I don't think that's what I meant. Whatever the thing that is a locking counter is called


Then again, hum... What if you changed the impl of pmap so that inside the future it incremented and decremented the counter before and after running f ?


that would be the same behavior, with more work to achieve it


I rewrote to an agent (doesn't retry), the prints are now in intelligible order, the answer is still high (33, 37, 38, 39, 36 ...)


max value in theory is 42 (32 chunk size + 8 processors + 2)


Ya, so that matches my interpretation of the code


The first branch I think unchunks, but the drop is what triggers the first chunk


So instead of getting n-way parallelization, you get the size of the first chunk


(when you overlap the next chunk)


Oh boy, that's one confusing little function haha. It does seem like it was written pre-chunking though, so I guess chunking just wasn't taken into account. Hmm, I wonder if that explains why I see poor performance improvements from it in practice; like, with chunking, the thread overhead is way too high for parallelization


it launches chunk-size futures, but iterates by nproc+2 delay between reader of input and reader of future values, if your input is big enough to have multiple chunks you can have more than chunk size in flight


that could be - I consider it more like "an example of what you could do to parallelize a specific problem" that happened to make it into the codebase, and it doesn't match most people's problems


reducers are more general, but I haven't used them in anger and haven't seen much usage of them in the wild


Ya, I think having to require their namespace and the fact that only fold is still useful now that we have transducers makes them kind of DOA


Well, maybe this chunking behavior is actually a blessing in disguise? Now it means using this re-chunk function:

(defn re-chunk [n xs]
  (lazy-seq
   (when-let [s (seq (take n xs))]
     (let [cb (chunk-buffer n)]
       (doseq [x s] (chunk-append cb x))
       (chunk-cons (chunk cb) (re-chunk n (drop n xs)))))))
Taken from clojuredocs, you can actually control the concurrency level of pmap 😛


(dorun (pmap (fn[_] (Thread/sleep 100)) (re-chunk 1 (range 1000)))) Will give you ~2+cores (dorun (pmap (fn[_] (Thread/sleep 100)) (re-chunk 100 (range 1000)))) Will give you ~100


Not sure what to think about this. It would probably be nice if pmap were rewritten to unchunk and take cores+2, or an optional n.


I have the same feeling and that’s why I created map-throttled in the repo; but it’s for a very specific use case. In most cases it’s better to use Executors or claypoole