missionary

Dallas Surewood 2024-11-25T11:06:46.844209Z

If I wanted to use m/race to run 3 http requests simultaneously WITHOUT making new threads, is there a way to do that? Would I need an http library that supports some async and does one exist?

xificurC 2024-11-25T11:28:03.418729Z

yes, yes and yes. Whether a thread gets created depends on the http client's async impl. I expect you're talking JVM

Dallas Surewood 2024-11-25T11:29:18.699939Z

I am yes. What would the implementation need?

leonoel 2024-11-25T12:10:37.179949Z

The implementation just needs to expose success/failure callbacks.

Dallas Surewood 2024-11-25T12:18:38.298559Z

Would that actually prevent it from blocking the thread it runs on though?

Dallas Surewood 2024-11-25T12:20:10.330789Z

I'm wondering what's the difference between a blocking operation like this (which takes success and failure) and something that can genuinely cede control to another function for the thread

(def my-task (fn [success failure] (Thread/sleep 1000) (success 1)))

(time
 (m/?
  (m/join vector my-task my-task)))

leonoel 2024-11-25T12:31:23.784179Z

an async API is not about preventing blocking, it is about allowing non-blocking
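
To make the contrast with the Thread/sleep task above concrete, here is a minimal sketch of a non-blocking counterpart. It assumes only the callback shape used above (a function of success and failure) plus a returned zero-argument canceller. `timer` and `sleep-task` are hypothetical names, and java.util.Timer stands in for whatever event source is actually available. The task function returns immediately, and one shared timer thread fires every callback.

```clojure
(import '(java.util Timer TimerTask))

;; One daemon timer thread shared by every task (an assumption of this sketch).
(def ^Timer timer (Timer. "sketch-timer" true))

(defn sleep-task
  "Hypothetical helper: returns a task that succeeds with `value` after `ms`
  milliseconds without blocking the thread that starts it."
  [ms value]
  (fn [success failure]
    (let [^TimerTask job (proxy [TimerTask] []
                           (run [] (success value)))]
      ;; Registers the callback and returns at once; nothing sleeps here.
      (.schedule timer job (long ms))
      ;; Canceller: fails the task only if the job had not fired yet.
      (fn []
        (when (.cancel job)
          (failure (ex-info "sleep cancelled" {})))))))
```

With missionary loaded, (m/? (m/join vector (sleep-task 1000 1) (sleep-task 1000 1))) should take about one second rather than two, because neither task holds the calling thread while it waits.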

Dallas Surewood 2024-11-25T12:33:30.161379Z

But what would non-blocking look like? In the case of Java, is it a specific primitive? Like if I wanted to write an http/get function that didn't block any thread, what would I be looking at for an implementation?

xificurC 2024-11-25T12:53:19.698609Z

Why is run-on-the-same-thread important? What are the requirements here

xificurC 2024-11-25T13:01:06.465619Z

as an example, java's httpclient has a non-blocking API returning a CompletableFuture. How is the async done? I guess that's an implementation detail, maybe they switched to virtual threads in the newer versions
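
As a rough sketch of how that CompletableFuture could be plugged in: the adapter below turns a future-returning call into a function of success/failure callbacks that returns a canceller. `cf-task`, `client` and `http-get` are hypothetical names invented for illustration, not part of missionary or the JDK; only `HttpClient/sendAsync`, `whenComplete` and `cancel` are real JDK calls.

```clojure
(import '(java.net URI)
        '(java.net.http HttpClient HttpRequest HttpResponse$BodyHandlers)
        '(java.util.concurrent CompletableFuture)
        '(java.util.function BiConsumer))

(defn cf-task
  "Hypothetical adapter: `start` is a zero-argument function that returns a
  fresh CompletableFuture each time the task is run."
  [start]
  (fn [success failure]
    (let [^CompletableFuture cf (start)]
      ;; Route the outcome of the future to the task callbacks.
      (.whenComplete cf
        (reify BiConsumer
          (accept [_ value error]
            (if (some? error) (failure error) (success value)))))
      ;; Canceller: cancelling completes the future exceptionally,
      ;; which reaches `failure` through the callback above.
      (fn [] (.cancel cf true)))))

;; A single shared JDK client (an assumption of this sketch).
(def ^HttpClient client (HttpClient/newHttpClient))

(defn http-get
  "Task that sends a GET with the JDK client and succeeds with the
  HttpResponse (string body)."
  [url]
  (cf-task
    (fn []
      (let [request (.build (HttpRequest/newBuilder (URI/create url)))]
        (.sendAsync client request (HttpResponse$BodyHandlers/ofString))))))
```

With missionary on the classpath, the original question might then be written as (m/? (m/race (http-get url-a) (http-get url-b) (http-get url-c))), where the URLs are placeholders. The JDK client still owns a few internal threads of its own, but requests in flight should not each hold a blocked thread.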

xificurC 2024-11-25T13:02:07.228489Z

vert.x uses a custom event loop. There are http clients wrapping netty, which also uses an event loop

Dallas Surewood 2024-11-25T13:06:56.099359Z

From what I understand there are cases where an event loop is gonna be more efficient than having a few threads going at once handling requests. I am just trying to understand these things a bit more.

Dallas Surewood 2024-11-25T13:20:17.372559Z

There are a lot of examples in Missionary using m/sleep which can park, as if to imply it's a placeholder for other async things that can park. But when I ask about what would be a practical thing in place of m/sleep, I hear I should just be using separate threads anyway. Which makes it unclear why m/sleep is used in so many examples.

xificurC 2024-11-25T13:37:47.020669Z

m/sleep examples are easy to understand and run. Anything that is not immediate (disk read/write, network, expensive computation ...) raises questions like how to write efficient code for it, how to handle failure, how to compose them

Dallas Surewood 2024-11-25T13:40:34.939379Z

But I am wondering those things, because as far as I understand I can't write to the disk or fetch from the network in a way that resembles m/sleep because those things are thread blocking. They can't park, they have to be on another thread. Is that correct?

xificurC 2024-11-25T13:49:53.739579Z

on the JVM, before virtual threads, without third-party libraries, yes, you'd run these in a separate thread, typically on a thread pool to amortize the cost of spawning threads. There were/are libraries like netty or vert.x which use an event loop instead. With the newer JVMs there's virtual threads

xificurC 2024-11-25T13:51:39.485649Z

missionary doesn't care. You plug in what you have. If it uses OS threads fine, if it uses virtual threads or netty that's fine too. Remember, missionary also runs in JS where there aren't traditional threads

Dallas Surewood 2024-11-25T13:57:56.994569Z

So if I wanted to implement an async function myself without creating new threads I should look into making my own event loop essentially?

xificurC 2024-11-25T13:58:57.757929Z

is this for educational purposes? Are you trying to understand the underlying mechanics? Because practically speaking of course not, just use a library

Dallas Surewood 2024-11-25T14:00:22.385319Z

Yes for education. And to clarify, when I use an async library that does make a new thread, like babashka http-client in async mode, is that thread completely blocked while it waits for I/O? If not, how does the JVM handle switching context away from that thread while it waits for I/O?

xificurC 2024-11-25T14:05:21.406549Z

The solution is typically domain specific. There's no one-size-fits-all answer. E.g. the browser has event handlers, you'll have to wrap/orchestrate those. A message queue might also have an onNextMessage handler. OS processes can send signals. OS threads yes, they block waiting for IO. JVM's new virtual threads park. Their state is captured in memory and resumed later (yes, there's a scheduler, i.e. event loop somewhere inside)

Dallas Surewood 2024-11-25T14:10:35.233729Z

In the case of Java a lot of people are using CompletableFuture. This runs on a separate thread (specifically from Java's ForkJoinPool). If I'm on a single core system, does this mean if I run 5 I/O tasks in 5 CompletableFutures, they will only run one at a time because the core can only do one thread at a time? Or is the JVM somehow able to know an I/O operation is inside CompletableFuture and temporarily gives up execution context so one thread doesn't hog the core?

Dallas Surewood 2024-11-25T14:11:34.957849Z

Because if you give a CompletableFuture a callback, how can it run that callback if it gives up execution before it runs it? How does the OS/JVM know to return to that context eventually?

Dallas Surewood 2024-11-25T14:23:35.195359Z

Sounds like I have a misunderstanding of how I/O works. It was my understanding that when we wait synchronously for I/O, it's the same as CPU blocking and the thread is wasting a CPU core. But it sounds like even if the thread is "blocked", the OS/Runtime knows how to suspend it and let other threads do work until the I/O is done. Is this correct?

xificurC 2024-11-25T14:29:03.496269Z

they will run concurrently, subject to OS and JVM scheduling. Having a single core means they will run on that core only but most likely their execution will be interleaved

xificurC 2024-11-25T14:33:18.690289Z

if you're curious how the JVM handles that, you'll have to dig up some resources on thread scheduling. But yes, the JVM knows when to give up execution context, see https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Thread.State.html
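
A small experiment along these lines, as a sketch: the hypothetical helper below starts a few threads that block in Thread/sleep (standing in for any blocking wait) and samples the states the JVM reports for them. One caveat, stated as background rather than something from this thread: a thread blocked in a native socket read is commonly reported as RUNNABLE by the JVM even though the OS has descheduled it, so sleep gives the clearer demonstration.

```clojure
(defn sleeping-thread-states
  "Hypothetical helper: starts `n` daemon threads that each block in
  Thread/sleep for `ms` milliseconds, then samples their states."
  [n ms]
  (let [threads (vec (repeatedly n #(doto (Thread. (fn [] (Thread/sleep (long ms))))
                                      (.setDaemon true)
                                      (.start))))]
    ;; Give every thread time to reach its sleep call before sampling.
    (Thread/sleep 200)
    (mapv (fn [^Thread t] (str (.getState t))) threads)))

;; All five threads are alive but waiting, so they are not competing
;; for the CPU; each is typically reported as TIMED_WAITING.
(sleeping-thread-states 5 2000)
```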

Dallas Surewood 2024-11-25T14:38:20.326079Z

Thank you

whilo 2024-11-25T18:10:39.176929Z

Would cloroutine/missionary break without this custom analyzer https://github.com/leonoel/cloroutine/blob/2f5e0ad6f7b2de56671218cfcbb9d42c000bd942/src/cloroutine/impl/analyze_clj.clj#L9 ?

leonoel 2024-11-25T18:27:03.695429Z

if I remember correctly this is necessary to support nested coroutines when the inner one closes over bindings of the outer one

whilo 2024-11-25T19:12:00.952829Z

Ok, thanks!

whilo 2024-11-25T21:21:13.869849Z

Missionary flows are just functions. Would it be reasonable to define a custom type for them so they can be extended with protocols and interacted with outside of missionary more transparently?

leonoel 2024-11-27T20:21:16.854929Z

for the record, I consider types and protocols as a tool and I'm happy to discuss it as long as it starts with a clear problem statement (e.g. "I want to see information X at the REPL", "I spend too much time debugging errors of the kind Y", "I'd like to prove ahead of time that my program validates property Z")

leonoel 2024-11-27T20:41:03.857999Z

I do not see much value in having core.async channels being flow-compliant out of the box. Compared to an explicit conversion, it just saves one function call (for reading only). Also it's not a future-proof solution because the channel concrete type is not part of the public API, and I'm not even sure there's only one.

Dustin Getz (Hyperfiddle) 2024-11-26T11:54:17.735049Z

I discussed this at length with Leo a few years ago - the current design has a few key benefits:
• using functions as the effect type ("flow protocol") requires zero dependencies, which means users can author missionary-compatible flows in a library without dependency on missionary
• missionary flow pipelines can and do interleave continuous and discrete stages. The exact semantics of a continuous vs discrete flow are difficult (maybe impossible) to model correctly with types without damaging expressiveness

whilo 2024-11-26T18:44:56.372659Z

I see. I am not sure how much flows without a dependency on missionary will generalize, I can see that composition might be tricky, but at the moment it feels a bit like programming in Scheme where everything is an untyped closure and you need to know exactly where it is coming from to be able to interact with it.

Dustin Getz (Hyperfiddle) 2024-11-26T18:48:46.817469Z

it feels like programming in Clojure, yes

Dustin Getz (Hyperfiddle) 2024-11-26T18:49:32.208789Z

that's flamebait, i take it back

whilo 2024-11-26T19:30:49.879669Z

I did not mean this in a bad way, I like Scheme. I just sometimes lose orientation when playing with missionary.

leonoel 2024-11-26T19:34:02.573379Z

@whilo are you challenging:
1. the choice to model tasks and flows as instances of IFn (instead of instances of an ad-hoc defprotocol)
2. the fact that missionary operators return plain functions that happen to be opaque (instead of returning an inspectable deftype implementing IFn)
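
To illustrate the second point, here is a sketch of the two shapes side by side: an opaque closure and an inspectable deftype implementing IFn, both following the same success/failure calling convention. `const-task-fn` and `ConstTask` are hypothetical names made up for this example, and nothing here is how missionary itself is implemented.

```clojure
;; Opaque shape: a plain closure. Callers can invoke it but cannot see `value`.
(defn const-task-fn [value]
  (fn [success failure]
    (success value)
    (fn [] nil))) ; canceller; there is nothing to cancel here

;; Inspectable shape: same calling convention, but the field and a readable
;; string form are available at the REPL.
(deftype ConstTask [value]
  clojure.lang.IFn
  (invoke [_ success failure]
    (success value)
    (fn [] nil))
  Object
  (toString [_] (str "#<ConstTask " (pr-str value) ">")))

(str (->ConstTask 7))   ; "#<ConstTask 7>"
(.-value (->ConstTask 7)) ; 7
```

Either value can be called as (task success failure), so anything that only relies on the IFn convention should not be able to tell them apart.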

whilo 2024-11-26T19:34:53.393619Z

I would have expected 2. to be the design choice.

whilo 2024-11-26T19:35:40.921199Z

I am ok with your design decisions and will work with them, I just bring it up to build my mental model and understand things better. I want to help to make this stack better.

Dustin Getz (Hyperfiddle) 2024-11-26T19:48:21.655499Z

iiuc whilo is asking for
type DiscreteFlow extends Flow
type ContinuousFlow extends Flow
which is different than Leo's #2, which is
type LatestOp extends Fn

whilo 2024-11-26T20:37:31.915489Z

If that makes sense, yes. If this is suggesting a false sense of compositionality then I am happy to keep things as they are.

whilo 2024-11-26T20:38:40.479539Z

Also if it was possible to turn core.async channels to a DiscreteFlow by extending them to this protocol it would be nice. Then all the core.async libraries would be missionary compatible right away.

whilo 2024-11-26T20:39:04.144909Z

And I guess DiscreteFlow could implement the core.async protocols.

whilo 2024-11-26T20:39:58.103409Z

(maybe in an optional namespace that you can explicitly require if you have core.async already as a dependency)

whilo 2024-11-28T17:43:21.495939Z

That is true, you might not want to officially support the core.async conversion, although core.async is barely changing at all and I don't think the channel types and implementation will change at all. Only the go functionality will probably change to fibers now. I think my underlying argument is that if systems like missionary, core.async, manifold, ... expose types that can be extended via protocols then they can be made transparently compatible via polymorphism, which is nice for streaming systems. Having to inject an explicit function call means that systems do not really compose, because they might call each other internally and pass streaming abstractions, at which point you would have to change their code to inject all the calls.

fs42 2024-11-25T23:28:58.360719Z

Is there any example that shows how you could use missionary to model/manage/supervise a websocket interaction from the client: open connection; subscribe; process data; send heartbeat ping; process pong; deal with timeouts on connection and heartbeat, stop/reset/retry interaction, etc.? It feels like missionary is made for this kind of complicated state machinery, but I'd love to see previous "art" before taking a stab myself. (my use case is a data feed where the only data that the client sends is a subscription request for a certain data channel and pings to ensure the connection is still up - the server sends new channel-data back to the client whenever it has an update... could be quiet for hours or spurts of multiple messages with new data every ms)