This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-04-11
Under what circumstances can't I cancel a future?
user=> (clojure.repl/doc future-cancel)
-------------------------
clojure.core/future-cancel
([f])
Cancels the future, if possible.
looking at the clojure source https://github.com/clojure/clojure/blob/clojure-1.10.1/src/clj/clojure/core.clj#L7000
[^java.util.concurrent.Future f] (.cancel f true))
this just calls cancel on the Future class. if you read the java docs for that method https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/Future.html#cancel(boolean)
Returns: false if the task could not be cancelled, typically because it has already completed normally; true otherwise
also
This attempt will fail if the task has already completed, has already been cancelled, or could not be cancelled for some other reason
the "for some other reason" is imo because the JVM delegates to the OS, so there may be other, unknown reasons why a future could not be canceled.
for example, if there is a tight loop running inside the future, it might not be canceled.
Clojure 1.11.1
user=> (def s (atom 0))
#'user/s
user=> (def f (future (while true (swap! s inc))))
#'user/f
user=> @s
344653116
user=> (future-cancel f)
true
user=> @s
701648902
user=> @s
769185846
user=> @s
834034676
user=> @s
894427378
user=> @s
953997173
user=> @s
1003335986
user=> @s
1054499559
user=> @s
1101502513
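For contrast with the transcript above, here is a minimal sketch of a loop that does stop. It relies on the fact that future-cancel calls (.cancel f true), which interrupts the worker thread; the loop has to check the interrupt flag itself. The names counter, f and after-cancel are just for illustration, and the sleep durations are arbitrary.

```clojure
(def counter (atom 0))

;; The loop polls the thread's interrupt flag on every iteration.
;; (.cancel f true) sets that flag, so the loop exits shortly after cancelling.
(def f (future
         (while (not (.isInterrupted (Thread/currentThread)))
           (swap! counter inc))))

(Thread/sleep 100)
(future-cancel f)
(Thread/sleep 100) ; give the loop a moment to observe the flag

;; Take two readings some time apart; they should now be equal.
(def after-cancel @counter)
(Thread/sleep 100)
(println "stopped?" (= after-cancel @counter))
```

When running this as a script rather than in the REPL, call (shutdown-agents) at the end so the JVM exits promptly.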
I was unable to cancel a future with a started ProcessBuilder
cool, if you send some code I can offer more advice. ProcessBuilder will create an OS process, which is abstracted by the Process class, https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Process.html this class has two methods, destroy() and destroyForcibly(). If this is important for you I would suggest handling the Process object directly. My guess is you're using the process inside a Future; canceling the Future would not kill/cancel the Process.
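A rough sketch of handling the Process object directly, assuming a Unix-like system where a sleep command exists. The names start-process, proc and waiter are made up for the example, and the 5 second timeout is arbitrary.

```clojure
(import '(java.util.concurrent TimeUnit))

(defn start-process
  "Starts an OS process for the given command vector and returns the Process."
  [^java.util.List cmd]
  (.start (ProcessBuilder. cmd)))

;; Assumption: `sleep` is available on the PATH.
(def proc (start-process ["sleep" "30"]))

;; A future that only waits for the process to finish.
(def waiter (future (.waitFor ^Process proc)))

;; Cancelling the future stops the waiting, not the process itself.
(future-cancel waiter)
(println "alive after future-cancel?" (.isAlive proc))

;; Ask the process to terminate, and escalate if it does not exit in time.
(.destroy proc)
(when-not (.waitFor proc 5 TimeUnit/SECONDS)
  (.destroyForcibly proc))
(println "alive after destroy?" (.isAlive proc))
```

Keeping a reference to the Process is what makes the cleanup possible; the future on its own gives no handle to it.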
Previous discussion on this: https://clojurians.slack.com/archives/C03S1KBA2/p1627320960378900
It seems like re-matches in CLJ is sometimes 10x slower than the same thing in CLJS. E.g. for:
(def regex-matcher #"[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]-[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]-[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]-[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]-[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]")
(re-matches regex-matcher "399d0134-9629-4bc6-8f8e-437de87eaaea")
Indeed, interesting.
Same if you replace repeating patterns with a single copy followed by {n} and use interop instead of CLJ[S] functions, so it must be a difference between JS and Java.
Stumbled upon this: https://swtch.com/~rsc/regexp/regexp1.html Haven't read it yet but seems relevant.
In JavaScript the multiplicative brace makes 2x speed difference in this case. In Clojure, there is no difference but both are much slower. Seems like there is a lot of potential for improvement somewhere.
The regex engine difference is a classic. There was a Cloudflare outage related to a bad regex fairly recently (2 years ago?).
Seems like Java doesn't use a suitable algorithm for regex? http://www.amygdalum.net/en/efficient-regular-expressions-java.html
Perhaps it is already well known to everyone here, but Clojure/JVM uses JVM regex matching libraries, and ClojureScript uses whatever JavaScript regex matching libraries it uses, and those implementations are different. Clojure/ClojureScript make no attempts to hide those differences from a developer.
Sure, I was just surprised the performance difference on such a simple test is so strikingly large. Isn't it a useful heads-up for people to know about an order-of-magnitude difference?
What would you consider a proper benchmark for this case - same code across different platforms?
@U050ECB92 Just do a match of a UUID. You can see it performs rather poorly.
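For anyone who wants to try the comparison themselves, a minimal sketch using the {n} form of the same pattern. It only uses re-matches, dotimes and time, so the same forms should paste into a ClojureScript REPL as well. The iteration counts and the names uuid-re, sample and time-matches are arbitrary choices for the example, and this is a crude timing loop, not a proper benchmark.

```clojure
;; Same pattern as above, with the repeated character classes collapsed via {n}.
(def uuid-re
  #"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}")

(def sample "399d0134-9629-4bc6-8f8e-437de87eaaea")

(defn time-matches
  "Runs re-matches n times against the sample and prints the elapsed time."
  [n]
  (time (dotimes [_ n]
          (re-matches uuid-re sample))))

(time-matches 100000)  ; warm-up run
(time-matches 1000000) ; timed run
```

Numbers will vary with the JVM or JS engine and its warm-up state, so compare several runs rather than a single one.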
I want to write a system that allows users to submit small programs remotely. But harmful programs should not be allowed to execute. They also should not be turing complete. Is it too dangerous to accept chunks of Clojure code as a string, parse it as a data, use some sort of allow-list of special forms and functions, and then run it as Clojure code?
filtering input code based on some threatening level calculated by some heuristics will be a rat race. Instead you could prepare something like a sandbox environment where you can diminish potential harm. For example you can use https://github.com/babashka/SCI with preconfigured "allowed to use" namespaces and vars. Another option is to try GraalVM where you can run Java, Python, Ruby, JavaScript and other less mainstream programming languages in a sandbox.
second that. There are so many ways even with allow lists to circumvent security measures. Best is to run it in an isolated environment. I've seen GraalVM used as a sandbox where IO/threading etc is forbidden as a configuration option.
I see. This is what I was afraid of: maybe there will always be a way to circumvent it, and it will lead to some situation like log4shell. Maybe I can write my own language/instructions on top of EDN. Being non-Turing-complete is also important for what I want to achieve.
SCI is what I had in mind, thanks!!!!! @U04V4KLKC
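The "own language on top of EDN" idea could look roughly like the sketch below. This is not SCI and not a vetted sandbox, just an illustration: input is read with clojure.edn (which never evaluates code), and a tiny evaluator accepts only numbers and calls to allow-listed operations. There are no loops, functions or definitions, so programs always terminate. The names allowed-ops, evaluate and run-program are hypothetical.

```clojure
(require '[clojure.edn :as edn])

;; Allow-list: the only operations a submitted program may call.
(def allowed-ops
  {'+ +, '- -, '* *, 'min min, 'max max})

(defn evaluate
  "Evaluates a parsed EDN form. Only numbers and (op arg ...) lists
  with an allow-listed op are accepted; anything else throws."
  [form]
  (cond
    (number? form)
    form

    (and (list? form) (symbol? (first form)))
    (if-let [f (get allowed-ops (first form))]
      (apply f (map evaluate (rest form)))
      (throw (ex-info "Operation not allowed" {:op (first form)})))

    :else
    (throw (ex-info "Unsupported form" {:form form}))))

(defn run-program
  "Parses a string as EDN data and evaluates it with the allow-list."
  [s]
  (evaluate (edn/read-string s)))

(println (run-program "(+ 1 (* 2 3))")) ; prints 7
```

Something like (run-program "(slurp \"/etc/passwd\")") throws, because slurp is not in the allow-list. Resource limits (input size, nesting depth) would still need separate handling.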
someone please remind me, is there some fixed hard limit on number of entries in an unbuffered (chan)?
Containing nothing is kind of hand wavy, a channel is basically three places "things" are, a queue of writers, a buffer of values, and a queue of readers
An unbuffered channel mostly has nothing in the buffer of values ever (transducers on channel can mess with this)
The queues of readers and writers on channels are limited, there is a hard coded limit of 1024
A put! is queued as a writer even if you don't pass a callback, which is why most uses of put! are bad (broken flow control that queues writers without bound)
what to use then? >!! instead? iirc >!! just does buffer checking and then calls put! anyway
If you must use put! then you should use the callback arity and pass the continuation of whatever you are doing as a callback
>!! calls put in a way that blocks the current thread until writer is matched with a reader
Bad (loop [] (put! ch (get-work)) (recur))
good ((fn f [_] (put! ch (get-work) f)) nil)
these are mostly polling queues so they tend to be emptied out by some external call every N seconds, or be of a fixed size that is drained over time
I've started to see thread-last pipelines as basically a code smell, telling me that something should be a transducer. I'm curious how common of an opinion that is these days, and if others think there's still a good common reason to prefer a thread-last pipeline (besides maybe readability in performance-insensitive contexts).
Clojure's threading macros (the -> and ->> thrushes) are great for navigating into data and transforming sequences. injest's path thread macros +> and +>> are just like -> and ->> but with expanded path navigating abilities similar to get-in.
Transducers are great for performing sequence transformations efficiently. x>> combines the efficiency of transducers with the better ergonomics of +>>. Thread performance can be further extended by automatically parallelizing work with =>>.
that sounds interesting. Basically a macro that automatically performs a translation for some clojure code in a thread last into a transducer equivalent?
he has some early benchmarks in this post https://clojureverse.org/t/x-x-auto-transducifying-thread-macros-now-with-parallelizing-and/8122
yep, that's what it does
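To make the translation concrete without pulling in injest, here is a hand-written version of the same idea using only clojure.core: a thread-last pipeline and a transducer equivalent of it. The names xs, via-threading and via-transducer are just for the example.

```clojure
(def xs (range 10))

;; Thread-last pipeline: each step produces an intermediate lazy sequence.
(def via-threading
  (->> xs
       (map inc)
       (filter even?)
       (reduce +)))

;; Transducer version: the steps are composed into one reducing pass,
;; with no intermediate sequences.
(def via-transducer
  (transduce (comp (map inc) (filter even?)) + xs))

(println via-threading via-transducer) ; prints 30 30
```

Note that comp applies the transducers left to right here, so the order of steps reads the same as in the ->> form.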
oh, so I see this #(do []) and I'd like to raise #(-> []) which also feels pretty good
no prob!
Disagree with the original post - sequences are totally fine if the size is small or transformations are few or especially, if you don't actually need all the results (in which case transducer is probably slower)
And you may find that timing comparisons won't hold up in future versions
That's good to know, thanks Alex.
I find that transducers are more complicated to use, and have more edge cases. So my personal default is to use sequences unless I specifically need the performance.
I think what can be a code smell is a mixture of ->> and ->, because that implies you've lost the laziness, since the -> will force realize the ->>, and maybe that's then best to just switch to transducers for it all... but again, even there I think it can sometimes be more complex to switch, so I don't know, I'd probably still use sequences all the time unless I need specific performance
Just add an inline meta to all sequence functions and get operators fusion at compile time
hi all - trying to further optimize some pretty heavily optimized clojure code, and thought of trying to build a java set and then "wrap" it in a clojure set - is that something that can be done easily like you can do with vec?
In the sense that they are implemented in java and in the sense that they implement java.util.Set
https://github.com/clojure/data.int-map#sets is an example of a more specialized clojure set implemented in a mix of clojure and java
thanks
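A small sketch of the two directions, using only clojure.core and java.util. As far as I know there is no zero-copy "wrap" for sets, so into (or set) copies the elements into a persistent set; that is an assumption worth checking against your profiler. The names java-set and clj-set are made up.

```clojure
;; Build a mutable Java set.
(def java-set
  (doto (java.util.HashSet.)
    (.add 1)
    (.add 2)
    (.add 3)))

;; Copy it into a persistent Clojure set.
(def clj-set (into #{} java-set))

;; Clojure sets already implement java.util.Set (read-only),
;; so they can be handed to Java APIs that only read from a Set.
(println (instance? java.util.Set clj-set)) ; prints true
(println (contains? clj-set 2))             ; prints true
```

If the set is only read after construction, another option is to keep the java.util.HashSet and use it via interop (.contains), at the cost of losing persistent-collection semantics.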
I have a protocol P defined in namespace A, a type T (record) defined in namespace B which implements a different protocol there, and a namespace C where T is extended to implement P. The code in namespace C only defines the implementation of P; it isn't loaded from anywhere, and when I fail to force it to reload in the REPL I get the error that the type doesn't implement the protocol. When I load it in the REPL, it works as expected. Any suggestions on how to structure things so that C is properly loaded?
Not sure if it makes sense based on the names, but can you require A, B, and C from another namespace? For example, the entrypoint to your app, or some sub-section of it?
you can either always reload C and redefine all instances of P if A is changed, or use a setup like stuartsierra/component or weavejester/integrant that automates that reloading based on the dependency graph
hmm.. maybe. What's interesting here is that it's already being loaded via mount. It implements another protocol also. However, previously the type extended the protocol in the same namespace, but I've moved the implementation out of the namespace where the protocol is defined.
also, maybe a nitpick, but I find the concept of "code that isn't loaded from anywhere" strange. if it's not loaded it doesn't get run. so in that sense it might help to do as @U08JKUHA9 suggests and make a namespace that does need A, B, and C
OK - if you are using mount and seeing this problem, you are using mount wrong (or it could be a bug in mount, I'm not a fan of that particular lib and avoid it)
mount should know about the relationship between the definition of P and the usage of that P by C
and it should ensure the reloading is done correctly for you
Mount is handed an instance of the type.. but how would it know to load other namespaces where that type implements some other protocol?
I think I will create a new namespace here and see where that path leads. Thanks for the suggestions!
as I understand it, the correct solution with mount is to use defstate so that mount knows about the dependencies between the definitions, so that the reloading is done coherently https://github.com/tolitius/mount#the-importance-of-being-reloadable
I'm confused, where do you expect to use the type T with functions from the protocol P ? Wherever you expect to do that, that namespace needs to require A, B and C
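A sketch of the "require A, B and C from wherever you use them" suggestion. The namespace names app.a, app.b, app.c and app.main, the protocol method describe, and the record field x are all hypothetical; in a real project each ns form would live in its own file, but the forms can also be evaluated top to bottom in one REPL session.

```clojure
;; src/app/a.clj : defines the protocol P
(ns app.a)

(defprotocol P
  (describe [this]))

;; src/app/b.clj : defines the record T
(ns app.b)

(defrecord T [x])

;; src/app/c.clj : extends T to implement P, and does nothing else
(ns app.c
  (:require [app.a :as a]
            [app.b])
  (:import (app.b T)))

(extend-protocol a/P
  T
  (describe [this] (str "T with x=" (:x this))))

;; src/app/main.clj : the place that uses P on T requires all three,
;; so loading app.main guarantees app.c has been loaded first.
(ns app.main
  (:require [app.a :as a]
            [app.b :as b]
            [app.c]))

(defn -main []
  (println (a/describe (b/->T 1))))
```

The require of app.c in app.main is there purely for its side effect of registering the protocol implementation.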