I am very new to core.async. I was wondering if somebody would mind providing any feedback on this short snippet that - if I understand correctly - writes files asynchronously. I would definitely appreciate any constructive feedback. I'll post the code snippet in the thread to not clutter up the channel.
(ns file-async.core
  (:require
   [clojure.core.async :as async]
   [clojure.java.io :as io]))

(def files
  [{:filepath "./data/output1.txt" :content "Hello, world 1!"}
   {:filepath "./data/output2.txt" :content "Hello, world 2!"}
   {:filepath "./data/output3.txt" :content "Hello, world 3!"}
   {:filepath "./data/output4.txt" :content "Hello, world 4!"}
   {:filepath "./data/output5.txt" :content "Hello, world 5!"}])

(def many-files
  (mapv
   (fn [n] {:filepath (str "./data/output" n ".txt") :content (str "Hello, world " n "!")})
   (range 1 101)))

(defn write-file
  [filename data]
  (try
    (with-open [w (io/writer filename)]
      (.write w data)
      {:ok filename})
    (catch Exception e
      {:error (ex-message e)})))

(defn file-writer-worker
  [in worker-id]
  (async/go-loop []
    (when-let [{:keys [filepath content]} (async/<! in)]
      (let [{:keys [ok error]} (write-file filepath content)]
        (if ok
          (println (str "Worker " worker-id " - Wrote " filepath))
          (println (str "Worker " worker-id " - Error writing " filepath ": " error))))
      (recur))))

(defn start-writer-worker
  [num-workers]
  (let [in (async/chan)]
    (dotimes [n num-workers]
      (file-writer-worker in n))
    in))

(defn write-files-async
  [files]
  (let [worker (start-writer-worker 4)]
    (doseq [file files] (async/>!! worker file))))

(defn -main
  [& args]
  (println "Writing files asynchronously...")
  (write-files-async many-files)
  (println "Done writing files asynchronously."))
one thing that sticks out right away is that you shouldn't do any io in a go / go-loop. You could just swap out your async/go-loop with (async/thread (loop [] ... and then change the ! calls in that block (here, <!) to their !! versions
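For reference, a minimal sketch of that swap, assuming the same write-file as in the snippet. The worker body runs on its own thread, so it takes from the channel with <!! instead of <!. The write-files helper at the end is hypothetical (it is not in the original snippet); it closes the channel and waits on each worker's channel, so the caller only returns once every file has been written.

```clojure
(ns file-async.thread-worker
  (:require
   [clojure.core.async :as async]
   [clojure.java.io :as io]))

(defn write-file
  "Writes data to filename; returns {:ok filename} or {:error message}."
  [filename data]
  (try
    (with-open [w (io/writer filename)]
      (.write w data)
      {:ok filename})
    (catch Exception e
      {:error (ex-message e)})))

(defn file-writer-worker
  "Runs on a dedicated thread, so blocking IO and <!! are fine here.
  Returns a channel that closes once `in` is closed and drained."
  [in worker-id]
  (async/thread
    (loop []
      (when-let [{:keys [filepath content]} (async/<!! in)]
        (let [{:keys [ok error]} (write-file filepath content)]
          (if ok
            (println (str "Worker " worker-id " - Wrote " filepath))
            (println (str "Worker " worker-id " - Error writing " filepath ": " error))))
        (recur)))))

(defn write-files
  "Hypothetical helper (not in the original snippet): feeds files to n
  workers, closes the channel, then blocks until every worker has exited."
  [files n]
  (let [in      (async/chan)
        workers (mapv #(file-writer-worker in %) (range n))]
    (doseq [file files] (async/>!! in file))
    (async/close! in)
    (doseq [w workers] (async/<!! w))))
```

Calling (write-files many-files 4) would then block until all writes are done, which also means a "Done" message printed afterwards is accurate.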
Okay, I can do that. Can you elaborate on why?
go blocks run on a fixed-size thread pool, and that pool can become deadlocked if its threads get blocked
from the go docstring:
> go blocks should not (either directly or indirectly) perform operations
> that may block indefinitely. Doing so risks depleting the fixed pool of
> go block threads, causing all go block processing to stop. This includes
> core.async blocking ops (those ending in !!) and other blocking IO.
(in practice the best advice seems to be to treat "indefinitely" there as meaning "for any meaningful length of time")
for long-running but finite io it is less about deadlock and more about fairness / starvation. go blocks run on a shared fixed-size thread pool, so if you have long-running work in there that doesn't yield (the parking channel ops are the yields), other go blocks may not get a chance to run until your long-running work is finished
the new java virtual thread stuff has a similar kind of idea (virtual threads are multiplexed over "platform" threads), but because it has jvm integration it can also turn blocking io into non-blocking io with yields
Okay, so thread is better for io? So if I used thread would this be a reasonable solution for writing files asynchronously?
I think it looks ok. for switching to thread all I would do is wrap the call to write-file like (async/<! (async/thread (write-file ...
the pattern is: control flow and logic in go blocks, and the little units that do io in one-off threads. That is just a rule of thumb; sometimes it is too much trouble, or the control flow and logic is so minimal that the back and forth is too much of a hassle, so you just use a thread for the whole thing.
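A rough sketch of that pattern applied to the worker from the snippet, assuming the same write-file: the go-loop keeps the control flow, and each write is handed to a one-off async/thread whose result channel the go block parks on with <!. Only the let binding changes compared with the original worker.

```clojure
(ns file-async.go-with-thread
  (:require
   [clojure.core.async :as async]
   [clojure.java.io :as io]))

(defn write-file
  "Writes data to filename; returns {:ok filename} or {:error message}."
  [filename data]
  (try
    (with-open [w (io/writer filename)]
      (.write w data)
      {:ok filename})
    (catch Exception e
      {:error (ex-message e)})))

(defn file-writer-worker
  "Control flow stays in the go-loop; the blocking write runs on a one-off
  thread. Returns a channel that closes once `in` is closed and drained."
  [in worker-id]
  (async/go-loop []
    (when-let [{:keys [filepath content]} (async/<! in)]
      ;; async/thread returns a channel that receives write-file's result.
      ;; Parking on it with <! frees the go pool thread while the write runs.
      (let [{:keys [ok error]} (async/<! (async/thread (write-file filepath content)))]
        (if ok
          (println (str "Worker " worker-id " - Wrote " filepath))
          (println (str "Worker " worker-id " - Error writing " filepath ": " error))))
      (recur))))
```

Each worker still handles one file at a time, so the number of workers continues to bound how many writes are in flight.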