This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2024-04-16
What is the context of the success and error continuation functions, and are their limitations documented somewhere? As I mentioned yesterday, I have noticed cases where some async(?) code in these functions did not seem to run (timbre/log not logging anything but println working fine)
the limitations are:
• don't block the thread
• don't throw exceptions
• return quickly
So in the task specification the sentence "it must return a canceller, must not throw and must not block the calling thread" also applies to the success and error continuation functions?
or is it specific to missionary implementation of the spec
ok, it's an interesting gotcha because one doesn't necessarily think that a log statement might be doing crazy async stuff in the background
that got me worried: over the past few days of experimenting I've had quite a few runs where I thought missionary was not properly terminating the task because it wasn't logging from the success continuation
and I didn't suspect a log statement to fail
So I assume because timbre must be blocking the thread or something, it falls into undefined behavior
I mean it does as we saw the other day from that error https://clojurians.slack.com/archives/CL85MBPEF/p1712591473744209
missionary operators could be more resilient to callback abuse, e.g. redirecting exceptions to stderr
so these callbacks are not regular clojure functions, they run in some coroutine context that might terminate them early
so a recommended pattern would actually be to just do this at the top of the supervision tree?
(task (constantly nil) (constantly nil))
and if you want to do something on success the top-level task would be:
(m/sp (let [res (m/? (actual-task))] (m/? (on-success res))))
and a try/catch for on-error?
Thanks! Is that somewhere in the doc/wiki that I missed?
I can write it down in an issue so you can edit/copy it into the docs. Seems like an important detail, isn't it basically how one would use missionary in prod?
@U053XQP4S here it is: https://github.com/leonoel/missionary/issues/105 I think there are also other ways to handle the error, maybe something with m/compel. Also, would unsuspected async code like a timbre log have unexpected behavior as well if it is run outside of m/via? eg:
(defn task []
  (m/sp
    (let [res (m/? (actual-task))]
      (log/info "SUCCESS!"))))
I'll refine it through the day.
So yes, even inside the task it's not safe to use timbre logging (at least directly). I was logging the error:
(defn task []
  (m/sp
    (try
      (m/? (actual-task))
      (catch Exception e
        (log/error "OOPS")))))
It's actually the same root cause that was preventing the logs from appearing in the success and cancel continuation functions: log can throw a java.lang.InterruptedException from a deref in:
[taoensso.encore$eval3772$get_hostname__3773 invokePrim encore.cljc 5621]
When that happens the exception is silenced. It's interesting that I finally found it by wrapping the log statement in the catch block with a try/catch block
Just to be clear, when you say the exception is silenced, you mean that due to log/error throwing another exception the original e is lost?
No they are both lost if I don't add a try catch around the log statement
Also I have to run a test I'm starting to wonder if httpkit is causing the interruption as it uses loom threads
@U053XQP4S interestingly, without using http-kit's newVirtualThreadPerTaskExecutor I don't get that exception anymore. It still takes more than 30 sec (sometimes minutes) between the print I have right before throwing and the print before the log in the catch statement (so theoretically nothing in between). I will try to make a much more minimal repro
but yeah without virtual thread timbre logs work fine within missionary
@U053XQP4S I refined the issue: https://github.com/leonoel/missionary/issues/107 Writing a repro really helped narrow down the issue; the repro now only needs missionary
So essentially a blocking call is as much of an issue in the task as it was in the success and error continuation functions, but where I got confused is that it seems to only cause an actual bug when these conditions are met:
• throwing an exception
• from the m/blk executor
• in a m/reduce
what do you mean? in the last one in particular, you can see everything prints, but if you remove the try-catch around the deref/future it stops printing after "Printing in future"
That is the correct behavior. deref throws InterruptedException, it is not caught therefore it propagates to the sp task, then you get it in the failure callback
I guess I'm confused that it only throws when all the above conditions are met
I do not consider it a bug the fact that user code can be run on a thread in interrupted state. While missionary generally discourages blocking outside of m/via, it may sometimes be acceptable if the blocking is quick, in this case the blocking call should ignore thread interruption.
that is the case here, it's just printing
log statements arguably fall in this category and I find questionable the decision to make timbre logs interruptible
the error is indeed in the callback but then it will throw again if eg one logs errors with timbre because it will again use deref
I agree with you about that. I don't think it's explicitly interruptible, it just happens to use a promise/future/deref
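A minimal sketch of the mechanism discussed here, using only clojure.core and JVM interop (no timbre, no missionary). The function name deref-while-interrupted is made up for illustration, and setting the interrupt flag by hand merely stands in for whatever interrupted the thread in the real setup. It relies on deref of a promise waiting on an interruptible latch, so even an already delivered promise refuses to deref on a flagged thread, while a plain println goes through:

```clojure
(defn deref-while-interrupted
  "Hypothetical demo: sets the current thread's interrupt flag, then tries a
  plain println and a deref of an already delivered promise."
  []
  (let [p (doto (promise) (deliver :done))]
    ;; Simulate running user code on a thread whose interrupt flag is set.
    (.interrupt (Thread/currentThread))
    (try
      ;; Ordinary printing does not look at the interrupt flag.
      (println "println still works on an interrupted thread")
      ;; deref waits on an interruptible latch and throws immediately.
      (deref p)
      :ok
      (catch InterruptedException _
        :interrupted)
      (finally
        ;; Clear the flag so the demo leaves the calling thread clean.
        (Thread/interrupted)))))

(deref-while-interrupted)
```

Any logging call that derefs a promise or future internally would fail the same way, which would match the symptom of println working while the log line never appears.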
how come the interruptible state is only there when throwing in a blocking task while reducing over a flow? eg this works
((m/sp
   (try
     (m/? (m/reduce
            (constantly nil)
            (m/ap (let [s (m/?> (m/seed [1 2 3]))]
                    (m/? (m/sp (throw (ex-info "BOOM" {}))))))))
     (catch Exception e
       (println "Catching the exception")
       @(future (println "Printing in future"))
       (println "This would never print without catching the exception from deref")))
   @(future (println "Printing in future outside catch"))
   (println "And this prints because the deref above doesn't blow up"))
 (constantly nil) println)
how can one safely ignore thread interruption in exception handling code?
(defn uninterrupted [f]
  (let [i (Thread/interrupted)]
    (try (f)
         (finally
           (when i (.interrupt (Thread/currentThread)))))))
((m/sp
   (try
     (m/? (m/reduce
            (constantly nil)
            (m/ap (let [s (m/?> (m/seed [1 2 3]))]
                    (m/? (m/via m/blk (throw (ex-info "BOOM" {}))))))))
     (catch Exception e
       (println "Catching the exception")
       (uninterrupted #(deref (future (println "Printing in future"))))
       (println "This would never print without catching the exception from deref")))
   @(future (println "Printing in future outside catch"))
   (println "And this prints because the deref above doesn't blow up"))
 prn prn)
this will print past the catch
Thank you very much for helping me debug through that!
Now I'm aware of how it works and won't trip on it anymore, but I'm worried that newcomers might be quite confused by the difference in behavior
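For readers who want to try the uninterrupted helper without missionary, here is a small sketch in plain Clojure. The helper is copied from the message above; the demo function and its result keys are made up for illustration, and the interrupt flag is set by hand as a stand-in for an interrupted m/via thread:

```clojure
(defn uninterrupted
  "Runs f with the interrupt flag cleared, then restores the flag if it was set."
  [f]
  (let [i (Thread/interrupted)]          ; reads AND clears the flag
    (try (f)
         (finally
           (when i (.interrupt (Thread/currentThread)))))))

(defn demo
  "Hypothetical demo: derefs a delivered promise on a flagged thread."
  []
  (let [p (doto (promise) (deliver :done))]
    (.interrupt (Thread/currentThread))  ; simulate an interrupted thread
    (let [result    (uninterrupted #(deref p))
          ;; Thread/interrupted reports the restored flag and clears it again,
          ;; so the demo does not leak an interrupted state to the caller.
          restored? (Thread/interrupted)]
      {:result result :flag-restored? restored?})))

(demo)
```

The helper does not swallow the interruption: it postpones it, so code that runs afterwards still sees the flag.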
technically it is documented in the via docstring: "Cancellation interrupts the evaluating thread."
np, thank you for this feedback I will think about possible ways to mitigate this or at least document it better
yes, but what is not obvious is user code can be called from an interrupted thread even in a non-cancelled process
let me know if you find any quirks with virtual threads, I've not experimented much with project loom yet
trying to write a small issue describing the problem as temporary documentation,
still not sure I understand why this behavior of via only causes the issue when the via is in a flow that is reduced; if it's just in an sp it works fine
((m/sp
   (try
     (m/? (m/via m/blk (throw (ex-info "BOOM" {}))))
     (catch Exception e
       (println "Catching the exception")
       @(future (println "PRINTING"))
       (println "This prints because the above doesn't blow up")))
   @(future (println "PRINTING"))
   (println "And this prints because the deref above doesn't blow up"))
 (constantly nil) println)
oh I think I get it: it's because there's only 1 blocking task and it throws, so it's not cancelled
Right I can reproduce like this as well:
((m/sp
   (try
     (println "Running")
     ;; m/join takes a combining function as its first argument
     (m/? (m/join vector
                  (m/via m/blk (throw (ex-info "BOOM" {})))
                  (m/via m/blk (throw (ex-info "BOOM2" {})))
                  (m/sleep 10000)))
     (catch Exception e
       (println "Catching the exception")
       @(future (println "WONT PRINT 1"))
       (println "WONT PRINT 2")))
   @(future (println "WONT PRINT 3"))
   (println "WONT PRINT 4"))
 (constantly nil) println)
but what does interrupting the evaluating thread bring? what do other missionary constructs do, just throw a missionary.Cancelled exception?
that is the point of process supervision, the pending threads must shut down cleanly to run finally blocks and release resources
Cancelled is just a special value to indicate the process could not run to completion, like InterruptedException
oh I see, the part about finally blocks is mentioned here: https://cljdoc.org/d/missionary/missionary/b.36/doc/readme/tutorials/hello-task?q=finally#parallel-composition I guess last time I read through it I wasn't ready to really understand
wouldn't it make sense that compel also inhibits the interruption and not just cancellation?
not really, all operators rely on the idea that it doesn't matter which thread runs user code (except m/via, but you can still m/compel a m/via to inhibit interruption)
what could make sense is to ensure m/via always clears the interruption flag before running callbacks, but I have to check the implications first
I think the biggest issue is that this can cause the unclean shutdown of supervised processes, since the interrupted state will cause any (accidentally) interruptible code in a finally block to throw
as in:
((m/sp
   (try
     (println "Running")
     ;; m/join takes a combining function as its first argument
     (m/? (m/join vector
                  (m/sp (try
                          ;; the sleep task has to be run with m/? to actually wait
                          (m/? (m/sleep 1000))
                          (finally
                            (println "finally task")
                            @(future (println "preparing cleanup task"))
                            (println "cleanup task DONE"))))
                  (m/via m/blk (throw (ex-info "BOOM" {})))))
     (finally
       (println "finally supervisor")
       @(future (println "preparing cleanup supervisor"))
       (println "cleanup supervisor DONE"))))
 println println)
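The same failure mode can be sketched without missionary, under the assumption that the cleanup code ends up running on a thread whose interrupt flag is set. The function name and the step keywords are hypothetical; the flag is set by hand, and the delivered promise stands in for any cleanup step that blocks interruptibly:

```clojure
(defn cleanup-on-interrupted-thread
  "Hypothetical demo: records which steps of a finally block actually run
  when the current thread's interrupt flag is set."
  []
  (let [steps (atom [])
        p     (doto (promise) (deliver :ready))]
    (.interrupt (Thread/currentThread))       ; simulate an interrupted thread
    (try
      (try
        (swap! steps conj :body)
        (finally
          (swap! steps conj :cleanup-started)
          (deref p)                           ; throws InterruptedException
          (swap! steps conj :cleanup-done)))  ; never reached
      (catch InterruptedException _
        (swap! steps conj :cleanup-aborted))
      (finally
        (Thread/interrupted)))                ; leave the caller's thread clean
    @steps))

(cleanup-on-interrupted-thread)
```

The :cleanup-done step is skipped, which is the unclean shutdown described above: whatever the rest of the finally block was meant to release stays unreleased.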
@U03K8V573EC I released b.37. The continuation of m/via now runs after clearing the interruption flag, which means interruptible user code should not be interrupted anymore.
Thanks it worked!