This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-03-16
Pseudocode:
(->> jobs (pmap http/download) vec)
pmap works great for HTTP IO, but it's limited. It doesn't allow me to customize the number of concurrent jobs or what to do in case of an exception.
Is there a more customizable option? Bonus points if it works in babashka
looks like somebody just suggested using an Executor previously: https://clojurians-log.clojureverse.org/babashka/2022-11-25/1669366747.630009
Yeah, pretty much nobody recommends pmap for anything, except maybe one-off REPL evaluations. :)
Promesa supports babashka and has useful stuff for this https://funcool.github.io/promesa/latest/executors.html
raw executors are pretty much meant for this and have the virtual thread sweetness too in bb 😄
Good point about java.util.concurrent.Executor. Is there a nice Clojure example for lazy copy-n-pasters like me?
> id recommend against the use of pmap for side effects
Any particular reason other than "`map` should be free of side-effects"?
yeah pmap has lazy chunking and would have weird effects
laziness+side effects = generally bad idea
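To illustrate the chunking surprise (not from the thread): chunked lazy seqs realize a whole chunk at a time, so a side-effecting function mapped over one can run far more often than the elements you actually consume. A minimal REPL sketch:

```clojure
;; range produces a chunked seq (chunk size 32), and map preserves chunking
(def realized (atom []))

(let [s (map #(do (swap! realized conj %) %) (range 100))]
  (first s)          ; ask for just one element...
  (count @realized)) ; ...but a whole chunk of 32 was realized
```

With pmap the effect is similar but concurrent, which is why "10 at a time" is hard to guarantee with it.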
as long as the function executions are independent (e.g. fetching different urls) it's fine tho
yeah, i think it's fine for simple cases as well (but clearly not a perfect match)
the classic clojure quote: it's easy, not simple
pmap does a lot of fancy stuff to be able to process a list incrementally, but if you want all of the results at the same time you can just use ExecutorService.invokeAll like in that example above
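For the lazy copy-n-pasters, a minimal sketch of the invokeAll approach (`http/download` and `jobs` are placeholders from the pseudocode above, not a real API):

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

(defn download-all
  "Run http/download over jobs with at most n concurrent threads.
  Clojure fns implement Callable, so they can be handed to invokeAll."
  [jobs n]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n)]
    (try
      ;; invokeAll blocks until every task completes and returns Futures;
      ;; .get rethrows any task exception wrapped in ExecutionException
      (mapv #(.get ^Future %)
            (.invokeAll pool (mapv (fn [job] #(http/download job)) jobs)))
      (finally
        (.shutdown pool)))))
```

This gives you both knobs pmap lacks: the pool size bounds concurrency, and exceptions surface from `.get` where you can catch them.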
Can I say "download 10 at a time" with promesa?
> px/fixed-executor: creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.
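A sketch of the promesa route, assuming the `promesa.exec` executor helpers; the exact option names vary between promesa versions, so check the linked docs before copying:

```clojure
(require '[promesa.core :as p]
         '[promesa.exec :as px])

;; a fixed pool of 10 threads means at most 10 jobs run at a time
(def pool (px/fixed-executor :parallelism 10))

(->> jobs
     (mapv #(px/submit! pool (fn [] (http/download %)))) ; futures start immediately
     (p/all)   ; promise that resolves when all of them are done
     (deref))
```

Exceptions reject the individual promises, so you can attach `p/catch` handlers instead of losing the error inside pmap.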
similarly the vanilla executor has a fixed threads version of it with the virtual threads factory
Awesome thanks for the input
@U06F82LES you can use babashka.http-client with async requests
Processing async request results can be done like this (I want to make this easier using callbacks or so):
(-> (http/request (assoc request :async true))
    (.thenApply
     (reify java.util.function.Function
       (apply [_ resp]
         (do-something-with resp))))
    (.exceptionally
     (reify java.util.function.Function
       (apply [_ e]
         (throw? e)))))
You can give it a fixed thread executor with:
(http/client (assoc http/default-client-opts :executor (java.util.concurrent.Executors/newFixedThreadPool 10)))
That's pretty cool @U04V15CAJ
I really enjoy using claypoole, would recommend. There's a host of functions that are helpful: https://cljdoc.org/d/org.clj-commons/claypoole/1.2.2/api/com.climate.claypoole.lazy#pdoseq
Would you say that Claypoole's pmap can safely be used for I/O, like http downloads?
I'd say use the virtual promesa ones or the raw virtual-threaded executor. If you're on Java 19+, not using virtual threads for IO is quite the waste of resources IMO
also claypoole's pmap seems to be lazy as well: https://cljdoc.org/d/org.clj-commons/claypoole/1.2.2/api/com.climate.claypoole.lazy#pmap
but is there a way to limit the threads in a virtual executor? this seems important since pesterhazy asked for "10 at a time"
yeah, you create a fixed thread pool with the virtual thread factory
can whip up an example
can you adapt this example: https://clojurians.slack.com/archives/C03S1KBA2/p1678974657196359?thread_ts=1678968328.528409&cid=C03S1KBA2
(def vfactory (.. (Thread/ofVirtual)
(name "blazing-vthread-" 0)
(factory)))
(def executor (java.util.concurrent.Executors/newFixedThreadPool 10 vfactory))
(http/client (assoc http/default-client-opts :executor executor))
needs bb 1.3.176+. I just love the design the Loom people did to be forever backwards compatible 😄
What are the benefits and tradeoffs of virtual threads? Never used them
it's similar to goroutines or Erlang processes: the runtime detects when IO happens and offloads the work from the carrier thread
@U06F82LES for 10 threads it's ok to use OS threads, but virtual threads multiplex on real threads, so you can have a near-unbounded amount of them, similar to the go macro in clojure, but more efficient since it's on the JVM level
much more efficient waiting on IO
I rewrote the clojure.core.async go macro to take advantage of virtual threads in the next release of bb
as part of Project Loom, most of the lower-level Java SDK was rewritten to notify the JVM of IO calls, and the JVM manages the vthread->osthread mapping
biggest change to the jdk since java
tradeoffs are:
- you don't think of pools anymore, but that could mean the tiny db you were talking to could die under the unbounded load 😉
- vthreads are not meant for CPU-bound tasks; they're for concurrency, not parallelism. use the real threads for that
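One common way to address the unbounded-load tradeoff while keeping virtual threads is to gate the work with a Semaphore rather than a bounded pool. A sketch (assuming Java 21, or 19+ with preview features; `http/get` stands in for the real request fn):

```clojure
(import '(java.util.concurrent Executors Semaphore))

(def permits (Semaphore. 10)) ; at most 10 requests in flight

(defn limited-fetch
  "Acquire a permit before doing IO, release it afterwards,
  so an unbounded number of virtual threads can't stampede the server."
  [url]
  (.acquire permits)
  (try
    (http/get url)
    (finally (.release permits))))

;; one cheap virtual thread per task, concurrency bounded by the semaphore
(def vexec (Executors/newVirtualThreadPerTaskExecutor))
```

This keeps the "near-unbounded threads" benefit for waiting on IO while still honoring a "10 at a time" limit.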
Thanks, super interesting
another nice solution to the function-coloring problem (https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/), apart from Go, Erlang/Elixir, Lua, Zig
1.3.176 released which supports the above virtual thread stuff and also has async callbacks now when making async requests
Choice of concurrency-primitives question:
I'm building a "massively-single-player" game; ie, each player has their own independent game-state, but it's managed server-side. In addition to events coming in from the client, timers may fire, and 'authors' can do things with their own clients which result in events the player's game 'loop' needs to handle.
I currently have a function (`step!`) which is passed an event, grabs the player's current state, constructs a next-state and a sequence of (side-effecting) 'actions', then tries to execute the actions, and if they all succeed, calls reset! on the atom containing the player's current state.
There is an obvious race-condition in this, if that function is called in parallel. Using goroutines or agents are obvious options, but I worry about the scalability; should I? Does a mutex over that function make more sense? Or maybe an event-queue that doesn't rely on core.async or agents (but then, how do I pull events, especially timers, without a busy-loop)?
Maybe I'm overthinking it, since as long as it's abstracted enough I can try different things later, if I have the nice problem of having to scale; on the other hand, I'm curious what the consensus is on best-practice for this sort of thing.
What kind of side-effecting actions are you considering? What happens if they partially fail/succeed?
In theory, updating the current player's state is also a side-effecting action (and could be treated similarly to the other side-effects).
the actions are currently all messages to the client, to change its state. If any of them fail, it's probably because the client dropped offline, and the state before that will at least be consistent when they reconnect
Scaling up usually comes from being able to add more servers. For that, the harder problem is usually moving "player's states" between servers so that you can scale up/down.
> If any of them fail, it's probably because the client dropped offline, and the state before that will at least be consistent when they reconnect There's actually no way to know what state the client was in when they disconnected.
this is a hobby project, and requires ML inference on a gpu; I'm serving it from home, and would like a single very beefy server to handle everything for the foreseeable future.
yes, but when they reconnect, everything is reestablished from the last-known-consistent state
Usually, you keep a queue of events and have a way for the client to ask for "everything since message id X" so there's no benefit to waiting for an acknowledgement before committing state updates on the server.
yes; I could do that, and probably will longer-term, but the client is very 'thin', and another benefit of the way I'm doing all-or-nothing state-updates is that bugs in my rapidly changing codebase don't leave state inconsistent, whether client or server-side. all of this is orthogonal to my question
are the messages sending deltas, the full state, or a mix?
The answer to your original question really depends on what the side effects are and what happens if they fail. Since the side effects are messages to the client, I would suggest updating the server state right away and having a separate process that syncs the server state to the client (via deltas, full state, or both)
Unless there's a good reason not to, I would use swap! rather than reset! to make sure your updates are consistent and avoid race conditions.
I guess I didn't explain myself well. I'll try again some other time.
our billing system at work is sort of the same kind of thing, there is a billing state machine, and we run the state machine for each user
the state for each user is essentially an atom (really a custom IAtom type backed by a database row), and we advance through the state machine processing different events using compare-and-set!
we build up a set of side effects to run if a given compare-and-set! succeeds, then do the compare-and-set!, and if it succeeds we execute the side effects, and if not different things happening depending (sometimes there is a retry loop, etc)
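A minimal sketch of that compare-and-set! pattern (`step` and `run-effect!` are hypothetical names, not from the actual billing code):

```clojure
(defn advance!
  "Compute the next state and its side effects purely, commit the state
  with compare-and-set!, and only run the effects if the commit won."
  [state-atom event]
  (loop []
    (let [old @state-atom
          {:keys [next-state effects]} (step old event)]
      (if (compare-and-set! state-atom old next-state)
        (run! run-effect! effects)
        (recur))))) ; lost the race: re-read the state and try again
```

The key property is that side effects only run after the state transition has been committed, so concurrent callers can't both act on the same old state.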
Thanks, Nate. That is, indeed, the scenario in question, except that I'm not currently funneling all events through one 'updater', in a strict fashion. That isn't a problem yet, because it's close enough in practice. Using swap, without changing anything else, would be worse than pointless. I didn't think isolating pure functions from side-effecting ones would be at all controversial here. My pure code is about 100x as complex as the impure: calling into pytorch, etc. I could work on making all the side-effects strictly idempotent (they mostly are, again, in practice), or go the route of reliably replicating state between client and server, but that seems premature to me, and I'd have to explain a lot about what my client currently does (not much), and what all my plans are, to convince anyone of that. I was really just wondering what people were doing within the scope of what I described, these days. I've gotten the impression that core.async and agents have fallen out of 'fashion'.
Are clojure.data.int-map keys sorted? I.e., does the keys function on an int-map return a sorted sequence? Looks like yes, but I'm not sure...
I don't see it in the documentation, but it seems so
(let [numbers (range 1 1e6)
      shuffled (shuffle numbers)
      int-map (into (i/int-map) (map (fn [x] [x x]) shuffled))]
  (= (keys int-map) numbers))
It was a deliberate change, albeit not documented: https://clojure.atlassian.net/browse/DIMAP-4
Asked https://ask.clojure.org/index.php/12778/document-that-clojure-data-int-map-is-sorted.