Is there some way to provide more context for errors like this?
#error {
:cause "Dataflow variable derefence cancelled."
:via
[{:type missionary.Cancelled
:message "Dataflow variable derefence cancelled."}]
:trace
[]}
I have little idea where it is coming from, but I can isolate it manually. I suspect it is happening because threads are somehow swapped at runtime. During the startup of my app I don't see the error, but when the m/reduce call picks up again after processing the initial batch I seem to get this error. I can also see debug messages sometimes showing the right machine hostname and sometimes "Unknown", which I guess comes from them running on a different thread (?).
I think it comes from https://github.com/whilo/simmie/blob/main/src/is/simm/runtimes/rustdesk.clj#L124
I am currently reimplementing this in Electric in a new project; this is an experimental codebase. I still need to understand missionary better and figure out how to make it integrate reasonably well with my core.async legacy code. I also like the pub-sub system of core.async, to be honest, and I haven't seen anything comparable in missionary yet. I can roll my own of course, but I think a dispatch mechanism like this would be good to standardise.
Several people in this channel have reported timbre randomly breaking in missionary, so the first step would be to remove it from all missionary calls.
Ok, cool. Is it understood why this is happening?
Everything I know about the timbre problem is here https://github.com/leonoel/missionary/issues/105
If you suspect an issue with threads I would check:
• interrupted state - it may be true due to https://github.com/leonoel/missionary/issues/115
• anything dynamically scoped that may be required by timbre
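A minimal sketch of the first check, using only plain JVM interop. The helper names `interrupted?` and `thread-context` are hypothetical, not part of missionary or timbre. The idea would be to log the returned map right before the failing logging call and compare the values at startup with those seen after the first batch:

```clojure
;; Hypothetical helper: reads the interrupt flag of the current thread
;; WITHOUT clearing it (unlike the static Thread/interrupted).
(defn interrupted? []
  (.isInterrupted (Thread/currentThread)))

;; Hypothetical helper: a small map describing where the code is running.
(defn thread-context []
  {:thread       (.getName (Thread/currentThread))
   :interrupted? (interrupted?)})

;; Example: (println (thread-context)) just before the call that misbehaves.
```

If `:interrupted?` is true at that point, blocking calls made from that thread (sleeps, timed waits) can throw InterruptedException or return early, which would fit the symptoms described in issue 115.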
I have deployed the fix in b.41, let me know if it fixes your issues
Yep, it fixed my issue. Thanks!
Is this a syntax problem with timbre, in that I should not use its macros inside a missionary call context, or does it also apply when any underlying code executes timbre at runtime? Unfortunately timbre was the only cross-platform logging solution when I started working on the replikativ stack, and at some point I opted for it instead of continuing to roll my own. This would mean I have to change all the replikativ libraries to use them with missionary. I am not necessarily complaining about this in itself, but I think if missionary creates these incompatibilities it effectively becomes its own programming language that is incompatible with standard Clojure semantics. That will make it harder to win people over, and it should be transparent before people pick it up.
I can take a look at missionary and try to help fix the issue. I think this would be better than having to swap out timbre. I am not a big fan of it and am also happy to swap it out (for telemere or a simple vanilla logging solution), but it doesn't feel like the right solution. You mentioned in 115 that thread sleeping causes the InterruptedException (we also had this in datahike calls in missionary before I moved to the CompletableFuture). I will try to study the thread pool executor/runtime code of missionary to better understand why this happens.
I have a fix for 115, hang on
re pubsub - check this thread https://clojurians.slack.com/archives/CL85MBPEF/p1731215950520719
Hey Leonel. Thanks for the context! Does this mean that any JVM dependency could break missionary and I could only realize it later in an edge case? Core.async or JVM fibers don't have this problem. I think it is very tricky to generally define what blocking behaviour is, many computations change their runtime depending on state and inputs and can become blocking (in the CPU sense) at any point. My strategy with core.async is to avoid blocking OS threads with synchronous IO because that is expensive, but still allow it along paths that don't clog many threads. Would I have to offload all functions to the cpu/blk threadpool in missionary then?
The "do not block" rule is not a hard requirement. As in core.async, it should be observed from a practical perspective, more like "do not exceed your time budget". The worst that could happen is that your program will not be as concurrent as it should be. Waiting for a lock under low contention is usually fine. I also recommend leaving CPU-bound operations synchronous by default, and only deferring them to a dedicated thread pool after the bottleneck is identified.
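To make that concrete, here is a rough sketch (JVM only) of offloading just the blocking part with `m/via` on the `m/blk` executor, while a cheap CPU-bound step stays inline. `slow-lookup` and `summarize` are hypothetical stand-ins, not functions from the codebase discussed above:

```clojure
(require '[missionary.core :as m])

;; Hypothetical stand-in for a call that blocks its thread (e.g. synchronous IO).
(defn slow-lookup [id]
  (Thread/sleep 50)
  {:id id :status :ok})

;; Hypothetical cheap CPU-bound step, left synchronous by default.
(defn summarize [record]
  (update record :id inc))

(def lookup-task
  (m/sp
    ;; m/via runs its body on the given executor; m/blk is intended for blocking work.
    (let [record (m/? (m/via m/blk (slow-lookup 41)))]
      ;; Runs inline on whichever thread resumes the sp block.
      (summarize record))))

;; Outside of a coroutine block, m/? blocks the calling thread until the task completes.
(m/? lookup-task)
;; => {:id 42, :status :ok}
```

If profiling later shows that `summarize` is a bottleneck, the same pattern with `m/cpu` instead of `m/blk` would move it to the CPU-bound pool.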