This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-07-11
Channels
- # announcements (6)
- # architecture (14)
- # babashka (26)
- # beginners (22)
- # calva (11)
- # clj-kondo (2)
- # clj-on-windows (1)
- # cljsrn (10)
- # clojure (116)
- # clojure-europe (5)
- # clojure-uk (1)
- # clojurescript (5)
- # cursive (9)
- # datomic (21)
- # depstar (1)
- # events (1)
- # fulcro (2)
- # graalvm (17)
- # graalvm-mobile (28)
- # helix (3)
- # introduce-yourself (2)
- # jobs (2)
- # lsp (4)
- # meander (1)
- # off-topic (4)
- # pathom (5)
- # polylith (6)
- # practicalli (5)
- # reagent (67)
- # reitit (1)
- # releases (2)
- # shadow-cljs (24)
- # tools-deps (23)
Can you explain what you have in mind? My use case is: a library uses function x in several places, but I would like to monkey-patch x to do something more. But with direct linking, I'd have to patch the "several places" as well.
defn returns a var. You can alter-var-root defn to wrap the original defn, which gives you the var, do whatever you want to it, then return it like defn would regularly
You might want to limit this hack's scope by creating something like instrumenting-require, which would instrument defn, require the file, then "uninstrument" it
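A sketch of what that instrumenting-require could look like (the name and the on-defn callback are hypothetical; this relies on defn's macro flag living on the var's metadata, so swapping the root value still gets the &form/&env arguments):

```clojure
;; Hypothetical sketch: temporarily wrap clojure.core/defn while loading
;; one namespace, then restore the original root binding.
(defn instrumenting-require [ns-sym on-defn]
  (let [original @#'clojure.core/defn]
    (alter-var-root #'clojure.core/defn
      (constantly (fn [form env fn-name & args]
                    (on-defn fn-name)                         ;; observe each defn
                    (apply original form env fn-name args)))) ;; then expand normally
    (try
      (require ns-sym :reload) ;; compile the lib with defn instrumented
      (finally
        ;; "uninstrument": restore the original defn
        (alter-var-root #'clojure.core/defn (constantly original))))))

;; (instrumenting-require 'my.lib #(prn :defining %))
```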
So first patch defn. Then "watch" for the var you want to patch, and ignore the rest? And then load the actual lib?
I want to give users the ability to patch eval in SCI but I don't want to pay any extra performance hit in the normal case
Similar to this: https://github.com/oxalorg/4ever-clojure/blob/e8c8e9ed513d4cf4339938c78c53333be8d51ca8/src/app/sci.cljs#L23-L27
In CLJS it works nicely since you can just patch the function reference directly, there is no "direct linking"
but perhaps the most important environment in which you want to do this is CLJS, since in JVM settings you can just spawn threads and kill stuff
But sometimes I need this feature myself as well, when I want to patch functions for libraries in babashka. And bb is compiled using direct linking
So when I do this, I have to dig for every usage of that function and also patch those functions
user=> (def old-defn @#'clojure.core/defn)
#'user/old-defn
user=> (alter-var-root #'clojure.core/defn (constantly (fn [form env fn-name & args] (prn :fn fn-name) (apply old-defn form env fn-name args))))
#object[user$eval140$fn__141 0xc3177d5 "user$eval140$fn__141@c3177d5"]
user=> (defn foo [])
:fn foo
#'user/foo
You control eval, no? Can't you do something like:
(def eval
  (if user-eval-fn
    user-eval-fn
    sci-eval))
the problem is that dynamic vars are slow and when you call eval 1M times, it will take significantly longer with a dynamic var
But it won't, no? Because the if will be executed only once when the lib is loaded; afterward eval will be direct-linked with the value of user-eval-fn, no?
Or maybe you have to do:
(def eval
  (if user-eval-fn
    @user-eval-fn
    ...
To force getting the value out and bind eval not to the dynamic var but to its value.
It's slightly awkward in the order of loading stuff though. The user probably wants the normal eval + something extra, and by the time you get the normal eval, everything's already loaded
Ya, so I'm thinking it's like a compile-time dynamic var; the user can set! it before they require the lib
Maybe another way, so it doesn't force the user to have a weird require or to call set! before the call to ns, is to take a JVM property/env variable. So maybe I can set a JVM property pointing to my custom eval, and your def can check whether that property is set and use that one.
Oh, I see what you mean. Like inside their user-eval they might want to use the sci-eval, and so that one would need to already be loaded... Hum..
What if the user provided a factory for creating the eval function. Something like:
(def ^:dynamic eval-factory nil)

(defn get-eval-factory []
  (or eval-factory
      (some-> (System/getProperty "eval-factory") symbol requiring-resolve)
      (some-> (System/getenv "eval-factory") symbol requiring-resolve)))
(def eval
  (letfn [(eval [...] ...)] ;; This is the normal eval
    (if-let [ef (get-eval-factory)]
      (ef eval) ;; And you pass the normal eval to the eval factory
      eval)))
So the user provides a function that returns the eval function, taking the normal eval function as an argument. That way, by the time the user eval is compiled, everything is loaded, the eval Var isn't bound yet, and the user can still reference the normal eval function.
If Clojure had copied Common Lisp packages https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node111.html we would not have forty-line namespaces at the top of every source file. https://github.com/kennytilton/matrix/blob/6300132cc64635922c7b2b484cdb7a52d0c64107/cljs/rxtrak/src/rxtrak/build.cljs#L1 Too late for an RFE? 🤪
Wouldn't we have the same 40 lines of use, import, and many, many more lines that would have things like tiltontec.webmx.html/dom-ancestor-by-tag?
Between that, consistent naming, and cljr-slash, I feel pretty abstracted away from intricacies. I just use aliases directly in place, with no editing of the ns form
@U2FRKM4TW No, with CL packages I would define in one xyz-package.cl source file a package :xyz setting up all the dependencies, aliases, shadowing, whatever, and then just code (in-package :xyz) at the top of source files that rely on those dependencies.
dom-ancestor-by-tag or any symbol would not require the package prefix tiltontec.webmx.html unless a symbol were not exported, and then it seems right and proper that I have to advertise my invasion of the package internals. Even then, CL's defpackage supports so-called nicknames, so that would be (wxhtm::internal-use-only 42).
@U05224H0W Agreed, it is time I started writing smaller programs.
"ns forms auto-collapsed", @U45T93RA6? I like it! But when churning out new code greenfield I find it irritating having to scoop up the ns requirements I need from some existing source.
Fun note: many in the CL community are fans of breaking an app up into packages appropriate to different sets of functions, such as for :ui, :i/o, etc. I did that post hoc once to 40kloc, and surfaced some useful refactoring. A month later I reverted to a monolithic package.
Famous Lisp saying: "It is probably a package problem. It is always a package problem." I can confirm.
I see. If that's your cup of tea, you can already do that with e.g. https://github.com/clj-commons/potemkin with the power of its macros. But personally, I dislike it, at least because of the lack of the tooling support. There were some other negative implications described in the relevant Google Group discussion, but my memory fails me here.
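For reference, potemkin's import-vars (a real macro in that library) is the usual way to build such a facade namespace; the implementation namespaces below are hypothetical:

```clojure
;; One facade namespace that re-exports vars from several implementation
;; namespaces, so consumers only ever require my.api.
(ns my.api
  (:require [potemkin :refer [import-vars]]))

(import-vars
  [my.impl.html dom-ancestor-by-tag]  ;; hypothetical impl namespaces
  [my.impl.util glom])
```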
@U0PUGPSFR you could do all of this with a macro of course 🙂
(def ^:private packages (atom {}))
(defmacro in-package [pkg]
  `(do ~@(get @packages pkg)))

(defmacro make-package [pkg & requires]
  `(swap! packages assoc ~pkg '~requires))

;;

(make-package :xyz
  (require '[foo.bar :as baz :refer [glom]]))
(I haven't tested this, but it should be fine 🙂)
I think it will even work in Cljs, since the changes to support macros that expand to require landed.
You can split your code across multiple files (at least in Clojure, not sure for Cljs) and use (in-ns 'xyz).
So just create a xyz namespace where you require and alias everything as you want, and then in different files just use (in-ns 'xyz) at the top and you'll be inside the context of that namespace.
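A minimal sketch of that layout (file and alias names are made up; note the namespace file has to be loaded before any file that does in-ns into it):

```clojure
;; xyz.clj - define the namespace once, with all aliases set up.
(ns xyz
  (:require [clojure.string :as str]
            [clojure.set :as set]))

;; other.clj - jump into xyz's context; its aliases are available here.
(in-ns 'xyz)

(defn shout [s]
  (str/upper-case s)) ;; uses the str alias declared in xyz.clj
```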
Something else you can do is just create a macro:
(defmacro ns-xyz []
  `(ns xyz
     (:require ...)))
And use ns-xyz instead for your ns declaration.
You can customize this however you want. Pretty sure that would work with Cljs as well. In fact a nice way to do it would be:
(defmacro require-xyz []
  `(require '[foo.bar :as b :refer [baz]]))
Since ClojureScript supports multiple requires at the top of a file, you can now do:
(ns my-file
  (:require ...)          ; Require more stuff
  (:require-macros ...))  ; Require our macro require-xyz

(require-xyz)
So I think this will work in ClojureScript.
Ha-ha, I considered using a macro but then thought, "Nahhh, Clojure and especially the tooling around it would never allow that." Doh!
Ya, ClojureScript is the only one I'm not 100% sure this would work in. Otherwise it should be fine with CIDER and Calva or other REPL-based tooling. With static analyzers there might be issues: you can teach clj-kondo about your macro, but with others like Cursive I don't think you can, so that one might have issues not knowing about your requires.
I asked on the ClojureScript channel, and people seem to think it should work on Cljs as well
suppose I have fibonacci defined like so:
(def fib
  (memoize (fn [i]
             (case i
               0 0
               1 1
               (+ (fib (dec i)) (fib (dec (dec i))))))))
I need to only take the fibonacci numbers that are less than y = 10,000. Is there a way that works for a general y, without knowing in advance for which x the value first exceeds y?
Seeking some help understanding the defprotocol implementation in Clojure.
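For the fib question above, a lazy approach works for any bound y: map fib over an infinite index sequence and cut it off with take-while (a sketch, reusing the fib defined above):

```clojure
;; Lazily walk fib(0), fib(1), ... and stop at the first value >= y,
;; without knowing the cutoff index in advance.
(defn fibs-below [y]
  (take-while #(< % y) (map fib (range))))

;; (fibs-below 10000)
;; => (0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765)
```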
If I create a protocol:
(defprotocol FileSystem
"Methods for interacting with a FileSystem"
(ls [_ path] "List contents of a directory at `path`"))
My understanding is that the emitted method ls extends AFunction, and has a .__methodImplCache property that contains a MethodImplCache mapping between known classes and known implementations of the protocol.
If I then create a new implementation and try to execute the protocol method, my current understanding of the code is that it will add the new method to the MethodImplCache's .table property, caching the result and updating the ls function's cache of known ways to dispatch the method.
When running this in the REPL, what I'm seeing is that the MethodImplCache isn't actually updated. However, if I re-evaluate the -cache-protocol-fn method and step through using debug mode (I'm using Emacs/CIDER), I actually see the MethodImplCache being updated.
(defrecord S3 []
  FileSystem
  (ls [_ path]
    {:baz :qux}))
(ls (->S3) nil)
;; MethodImplCache is not updated
(seq (.table (.__methodImplCache ls)))
;; => nil
;; Instrument -cache-protocol-fn for debugging
(ls (->S3) nil)
;; After instrumenting for debugging - MethodImplCache is now updated
(seq (.table (.__methodImplCache ls)))
;; => (my-ns.S3
;;     #object[clojure.lang.MethodImplCache$Entry 0x740df35d "clojure.lang.MethodImplCache$Entry@740df35d"]
;;     nil
;;     nil)
Could someone help me understand:
(1) Should I expect the MethodImplCache to be updated after calling the ls method for the first time using a new class (the S3 defrecord in this case)?
(2) Why do I get different behavior, where the cache IS updated, when I instrument the -cache-protocol-fn for debugging?
First off, you are deeeep in the implementation weeds 🙂
This might be a better question for #clojure-dev.
My understanding is, like you said, when you dynamically add a type to a protocol (e.g. via extend-protocol), it updates the method cache. However the (defrecord … Protocol impls…) syntax is different from other syntaxes. When you use this syntax, you're actually generating a class that implements a corresponding internal interface generated by defprotocol. Dispatch to those records is implemented by regular JVM dynamic dispatch.
Small example from malli: -schema? is a protocol method
(defprotocol Schemas
  (-schema? [this])
  (-into-schema? [this]))

(defn schema?
  "Checks if x is a Schema instance"
  [x] (-schema? x))
Decompiles to:
public final class core$schema_QMARK_ extends AFunction {
    private static Class __cached_class__0;
    public static final Var const__0;

    public static Object invokeStatic(final Object x) {
        if (Util.classOf(x) != core$schema_QMARK_.__cached_class__0) {
            if (x instanceof Schemas) {
                return ((Schemas)x)._schema_QMARK_();
            }
            core$schema_QMARK_.__cached_class__0 = Util.classOf(x);
        }
        return ((IFn)core$schema_QMARK_.const__0.getRawRoot()).invoke(x);
    }

    @Override
    public Object invoke(final Object x) {
        return invokeStatic(x);
    }

    static {
        const__0 = RT.var("malli.core", "-schema?");
    }
}
It looks like the method cache is only updated after the first invocation: https://github.com/clojure/clojure/blob/master/src/clj/clojure/core_deftype.clj#L587-L626
Also looks like any function which wraps a protocol method call caches the first class it's called on
I don't read it like that, but I also don't fully understand what's going on in this code.
AMA about this. The inline cache on Class -> impl uses a packed array, then falls back to a map when it grows beyond a certain size, or cannot pack an array unambiguously
direct extenders of the protocol's backing interface do not go through any of this lookup process at all, they just invoke the interface
Thank you all for the comments. I'll check out the talk, the Malli example and read through your comments a couple times to see if it sinks in for me!
After the Malli example and the Clojure Futures talk (both of which were super helpful!), here's my mental model based on the Malli example Ben gave above:
Invoking the schema? function (which wraps the -schema? method) will first check if the argument is in the single-item cache attached to the schema? function, __cached_class__0. If the argument's class matches the cached class, then we proceed to get the root of the protocol method var (-schema?) and invoke that.
If the argument was NOT the cached class, we have two potential cases. The first one is that the argument actually implements the underlying interface (Schemas in this case). If that's the case, great! Just use the interface. If not, update the cached class, and then invoke the protocol method var.
If we're invoking the protocol method var instead of the underlying interface, the protocol method var comes with the MethodImplCache. As we continue to invoke the protocol method with Objects that do not implement the underlying protocol, we will continue to add to the MethodImplCache.
What I don't understand:
• Based on the decompiled code, it seems like the schema? function which wraps the -schema? method is responsible for determining whether the argument x implements the underlying Schemas interface or not. Does this imply that all potential consumers of a protocol method are responsible for figuring out whether to delegate to the underlying interface implementation or to invoke the protocol method var? So the var containing the protocol method doesn't actually know how to deal with cases where the object implements the underlying interface?
• Why is the __cached_class__0 present on the core$schema_QMARK_ class? I'm struggling to see how that particular cache gets leveraged for performance. Is (x instanceof Schemas) extremely costly, and does optimizing to avoid it save a lot of cycles?
There are two sets of implementations for a protocol: classes that extend the backing interface (fast path), and classes that don't, which are looked up in a table. The basic logic is: if the target implements the interface, call the interface method on the target; else look up the target's class in the table. But: the calling mechanism remembers the last seen target class when it is a table class, and jumps directly to that impl instead of checking the interface.
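The two paths can be seen in a small sketch (names hypothetical): a defrecord implements the protocol's backing interface directly (fast path), while extend-protocol registers the impl in the lookup table:

```clojure
(defprotocol Greet
  (hello [this]))

;; Fast path: the generated class implements the backing interface,
;; so calls dispatch via ordinary JVM interface invocation.
(defrecord Inline []
  Greet
  (hello [_] :inline))

;; Table path: String does not implement the interface; the impl is
;; registered in the protocol's table and cached on first call.
(extend-protocol Greet
  String
  (hello [_] :extended))

(hello (->Inline)) ;; => :inline
(hello "x")        ;; => :extended
```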
Do any of the datalog dbs / libs (datomic, datalevin, datascript, crux, etc) support any kind of "virtual table" mechanism, as in postgres and sqlite? I would like to be able to incorporate information outside of the database into my datalog queries.
This is done with relations. http://www.learndatalogtoday.org/chapter/3
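In the Datomic/DataScript query grammar, a relation is just a collection of tuples passed as an extra query input and destructured in :in; a sketch using DataScript for illustration (db contents and attribute names are made up):

```clojure
(require '[datascript.core :as d])

;; Tiny in-memory db with two entities.
(def db
  (d/db-with (d/empty-db)
             [{:user/name "alice"} {:user/name "bob"}]))

;; External data, outside the database, passed in as a relation.
(def roles
  [["alice" :admin]
   ["bob"   :user]])

;; [[?name ?role]] binds the collection as a relation; ?name unifies
;; with values from the database, joining the two sources.
(d/q '[:find ?name ?role
       :in $ [[?name ?role]]
       :where [?e :user/name ?name]]
     db roles)
;; => #{["alice" :admin] ["bob" :user]}
```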
This appears to broadly do the same kind of thing, but (iiuc) in an eager way. I'm looking for some kind of functional interface, where the query engine is able to push down part of the query to the external datasource.
I wonder if I can approximate it using rules and some functions.
I implemented them in Asami, and they're incredibly useful, but I didn't know the name 🙂
They also let you feed the results of one query into another query, giving you a "subquery" mechanism. But unlike some approaches to subqueries, they're much more efficient, since they just join into the bindings as if they're part of the current query already
Oh that's cool. I want to figure out how to do that effectively.
I'm also hoping to turn the idea of threading queries into a query syntax (it won't be compatible with other databases, but :woman-shrugging:) https://github.com/threatgrid/asami/issues/147
Oh, I just realized that this is on #clojure. There's a #datalog channel where this will be more appropriate
It'd be really nice if there was a facility in the various other databases to allow "transactional reads" where more than one query is run at the same time.
Well this is a thread that's not getting sent to the main channel, and the base message is about datalog, so this is probably fine.
Or rather, not that multiple queries are run at the same time, but that multiple queries are run without new transactions being applied before the second query.
Why can't you run more than one query at a time? The database is a single immutable value (a snapshot in time).
Maybe I'm misunderstanding something, but in datahike for example (with the new read-only peers) the backend might be across the network, and both queries might require a round-trip, so unless I'm missing something about how calling the db function works, the two queries might have a different set of transactions having been "resolved" to the datastore.
Unless calling the db function (or doing a deref in datahike) "freezes" it to the current max transaction id.
I would need to look at that API to know for sure, but yes, that's what the db function is supposed to do
Okay, cool
So this should be fine then
Thanks!
It returns the current "value" of the database at that point in time. If you do queries against it, then they will be consistent. You also won't see any new data that is inserted until you ask for a new db
I see lots of people doing queries like:
(q '[:find ..... ] (db my-connection))
Because they want the latest version of the database. That's often going to be OK, especially if you're the only process accessing it, but it's a bad habit. If you do that with multiple queries in a row then you can get inconsistent data coming back between the queries
Makes a lot of sense.
I have seen that quite often.
Seems like a reasonable usecase for something like (or *db* (db *connection*)).
My colleagues have been doing it lately, but I haven't complained because they're in ClojureScript (hence, a write can't happen in between), but yes, I'm deeply uncomfortable with this
Much better to use
(let [the-db (db my-connection)
      result1 (q '[:find ....] the-db)
      result2 (q '[:find ....] the-db)]
  ...)
Right, the dynvar approach is only really useful in cases where you don't want to pass things between multiple layers, which is an API choice for sure.
Hi all! Short question: does anyone know a good simple alternative to quartz-based libraries for cron-scheduling?
I have enjoyed using this java library, it's extremely simple and just needs you to create 1 table in an RDBMS: https://github.com/kagkarlsson/db-scheduler
@UEH6VEQQJ Sadly I have no database available in this application
No database at all? So we're just talking in-memory? What about java's native TimerTask et al?
@U45T93RA6 thanks for the link, interesting perspective
It is an interesting perspective - however I believe that only applies to phrases like "Please do something in xyz time". I'm aiming more for "At midnight, do this please". For now without much safety; later I'm considering storing execution state (success/error) in a safe kv-store such as etcd to ensure no execution gets lost. @UEH6VEQQJ Thanks for the link! TimerTask might help me get more simplicity out of this. Nevertheless I'm not sure if it is helpful for "crontab"-like cases such as "At one o'clock every day", especially if the process (hopefully not) sometimes dies and has to restart.
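A sketch of the TimerTask approach for the "At midnight" case, using only JDK classes (java.util.Timer/TimerTask are real APIs; the job body is made up). Note this has none of the persistence discussed above, so a process restart loses the schedule:

```clojure
(import '[java.util Timer TimerTask Calendar])

;; Compute the next midnight as a java.util.Date.
(defn next-midnight []
  (.getTime (doto (Calendar/getInstance)
              (.add Calendar/DAY_OF_MONTH 1)
              (.set Calendar/HOUR_OF_DAY 0)
              (.set Calendar/MINUTE 0)
              (.set Calendar/SECOND 0)
              (.set Calendar/MILLISECOND 0))))

(def timer (Timer. "midnight-job" true)) ;; daemon timer thread

;; Run once at the coming midnight, then every 24 hours.
(.scheduleAtFixedRate timer
  (proxy [TimerTask] []
    (run [] (println "nightly job")))
  (next-midnight)
  (* 24 60 60 1000))
```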
I just found cron4j - I guess the duo of TimerTask and cron4j would cover me pretty well: TimerTask for short re-occurring timers for update-checks etc., and cron4j for the "At midnight" case.
Thanks for the inputs!
I wrote this several years ago and would almost certainly change and decouple some things now but it has forward and backward infinite cron sequences: https://github.com/vodori/chronology