#clojure
2021-04-28
Yehonathan Sharvit07:04:45

A question about using Clojure’s multimethods at scale. Multimethods include a caching mechanism. In my code I called a multimethod with lots of different values (around 20 million) and it caused a memory leak. Is there a way to disable multimethod caching or to limit the cache size? Is it advised not to use multimethods when there are too many different calls?

Ben Sless07:04:47

I'm not sure if it's possible with multimethods but I think https://github.com/camsaul/methodical supports passing a custom cache

p-himik07:04:19

You can call (.reset multi-fn) on each call. But it's quite a bit strange that, seemingly, MultiFn caches all values that end up in calling the default method.

p-himik08:04:18

And of course I'm wrong about using reset because it also resets the hierarchy. So a workaround would be to call defmethod again - it resets the cache as well.

Yehonathan Sharvit09:04:31

My use case is different. The dispatch value is always the same. And the memory leak is not as easy to reproduce as in the case that you reported

p-himik09:04:00

The dispatch value is the same in my case as well - it's :default.

Yehonathan Sharvit10:04:34

in your case the dispatch value is the byte array

p-himik10:04:04

I guess there are two perspectives. :)
> The dispatch value is always the same
"Same" as in identical? or equal??

p-himik10:04:29

Although, judging by the Java code, you would still have to be hitting the :default branch for the issue to surface.

p-himik10:04:33

Another guess that I can hazard without having a minimal reproducible example is that your objects can be equal? without having their hashes be equal. It would mean wrong hash implementation, and can cause all sorts of problems in general, not just memory leaks when using defmulti.

Yehonathan Sharvit12:04:10

Actually, my case was exactly the same as yours but not on purpose. It took me lots of time to understand it. Here is my buggy code

(defmulti foo :a)
(defmethod foo :default [_ _] nil)

(foo {} (Math/random))

Yehonathan Sharvit12:04:49

My intent is to dispatch on the value associated to :a in the first arg.

(defmulti foo (fn [x y]
                (:a x)))

Yehonathan Sharvit12:04:05

But what I wrote was equivalent to this:

Yehonathan Sharvit12:04:10

(defmulti foo (fn [x y]
                (:a x y)))

Yehonathan Sharvit12:04:57

As a consequence, each time I called foo with a different second arg, I increased the cache of the multi method

👍 4
Yehonathan Sharvit12:04:47

Because my code behaved as I expected in terms of functionality

Yehonathan Sharvit12:04:30

Anyway, I think that the cache size of the multi method should have a limit
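
[Editor's sketch] The bug described above can be reproduced in isolation. This is a minimal illustration that assumes only clojure.core: a keyword invoked with two arguments treats the second one as a not-found value, so with a map that lacks :a the dispatch value becomes the second argument itself. The name foo-fixed is hypothetical and only stands for the corrected multimethod.

```clojure
;; A keyword invoked with two arguments behaves like (get m k not-found).
(:a {} 0.42)        ; returns 0.42, the not-found value
(:a {:a :x} 0.42)   ; returns :x

;; Corrected definition: dispatch only on the value of :a in the first argument.
(defmulti foo-fixed (fn [x _y] (:a x)))
(defmethod foo-fixed :default [_ _] nil)

;; The dispatch value is nil here, whatever the second argument is,
;; so repeated calls no longer produce a new dispatch value each time.
(foo-fixed {} (Math/random))
```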

wotbrew15:04:29

Doesn't this represent a potential DOS vulnerability that a lot of people might not be aware of? Presumably the cache entries are not weak-interned, so you could use this to consume all available memory by providing many unique inputs that may happen to be used as dispatch vals? I'm sure many large unique inputs will cause pollution of caches all over the place not just multi-methods in many apps but I would still sleep easier if I knew :default matches were not cached.

wotbrew16:04:35

Not sure how easy it'd be to exploit with most apps but I bet a bunch have multi-methods that take user input as a dispatch val (e.g string "type" json field) assuming that hitting :default comes with no risk.
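
[Editor's sketch] One possible way to limit the exposure discussed above, assuming (as the thread suggests) that dispatch values falling through to :default are cached: normalize user input to a closed set inside the dispatch function, so unknown inputs all collapse to one value. The names known-types and area and the ::unknown keyword are hypothetical.

```clojure
;; Hypothetical closed set of dispatch values the application supports.
(def known-types #{"circle" "square"})

;; Map any unrecognised input to the single keyword ::unknown before dispatch,
;; so arbitrary user strings never become distinct dispatch values.
(defmulti area
  (fn [shape] (get known-types (:type shape) ::unknown)))

(defmethod area "circle" [{:keys [r]}] (* Math/PI r r))
(defmethod area "square" [{:keys [side]}] (* side side))
(defmethod area ::unknown [_] nil)

(area {:type "square" :side 2})   ; => 4
(area {:type "anything-else"})    ; => nil
```

The multimethod then only ever sees three distinct dispatch values, regardless of how many different strings arrive in the "type" field.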

borkdude12:04:44

I want to import some dynamic var from another namespace but I also want to allow users to define a var with the same name, without breaking (it's on them to rename their own vars in order to be able to access the one from the prelude). E.g. my prelude is

(require '[babashka.tasks :refer [*task*]])
but the user's program, which will be appended, may already have a var named *task* and I don't want to crash their program with
(def *task* {})
IllegalStateException
I can work around this using:
(intern *ns* '*task* ...)
but then the *task* var will not hold the value of the dynamic binding of babashka.tasks/*task*. What can I do?

borkdude13:04:55

I feel it could be useful if clojure had something like a proxy-var, which just proxied everything to another var. This would also solve the potemkin/import-vars stuff.

Alex Miller (Clojure team)13:04:41

Or you could just not ever do that

😂 2
Alex Miller (Clojure team)13:04:20

Somehow I’ve managed to avoid it in 10 years of writing Clojure

Aron13:04:49

There is a logical error here 🙂 And I say this with the utmost respect, hoping you don't mind it. https://en.wikipedia.org/wiki/Denying_the_antecedent

borkdude13:04:42

potemkin/import-vars was just an additional benefit that could be solved at the same time, but not the main issue (FWIW: I have never needed potemkin/import-vars myself, not a big fan, but the problem it addresses could be solved better if clojure supported "forwarding" to other vars). Let me try to explain my use case. This is a config-like program, where I want to be able to expose *task* without breaking existing configs. It is not a typical every day Clojure program, but more a framework/DSL-like setup. Not common.

borkdude13:04:26

I guess I could hack around it by making the dynvar part of clojure.core, but I'm not a fan of that either. https://github.com/clojure/clojure/blob/b1b88dd25373a86e41310a525a21b497799dbbf2/src/jvm/clojure/lang/Namespace.java#L87

bronsa13:04:16

why do you need to refer *task* instead of b.t/*task* ?

borkdude13:04:30

Maybe a better way is to make everything opt-in (or namespaced) but that will create some more boilerplate.

vlaaad13:04:35

maybe it’s fair to require user programs to not define *task* for this system? what’s the problem you are trying to solve?

borkdude13:04:09

> maybe it’s fair to require user programs to not define `*task*` for this system?
that is a fair restriction if you can control the future

borkdude13:04:50

there might already be configs around that are using *task* for something else today

bronsa13:04:04

if they do define it it's on them to use an alias instead of referring

vlaaad13:04:21

I realised I don’t know enough about the context where this is defined to give advice 😄

bronsa13:04:24

you can also :rename {*task* *bs-task*}
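
[Editor's sketch] The :rename option mentioned above works together with :refer. In this minimal illustration clojure.string/join stands in for the var being referred, and str-join is a hypothetical local name chosen to avoid a clash.

```clojure
;; Refer clojure.string/join into the current namespace under another name.
(require '[clojure.string :refer [join] :rename {join str-join}])

(str-join "," ["a" "b"])   ; => "a,b"
```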

borkdude13:04:45

What I've done so far is this:

(when-not (resolve 'clojure)
  ;; we don't use refer so users can override this
  (intern *ns* 'clojure babashka.tasks/clojure))

(when-not (resolve 'shell)
  (intern *ns* 'shell babashka.tasks/shell))
This works for normal functions. clojure and shell are functions users can just access without any additional boilerplate. I want to do the same for a dynamic var.

bronsa13:04:15

not a fan ¯\(ツ)

bronsa13:04:19

is a prefix that bad?

borkdude13:04:43

yes, in the sense that that prefix might also be already taken by the user

vlaaad13:04:58

what problem are you trying to solve?

borkdude13:04:07

e.g. I had tasks as the default prefix, but I reverted this because a user already had taken this :)

borkdude13:04:25

@vlaaad I am making a task runner / Makefile like system where people can define tasks using a DSL

borkdude14:04:56

Did you also comment this in the Mach repo?

borkdude14:04:05

Then I've probably seen it

tvaughan17:04:40

Sorry, I don't know what the Mach repo is

Aron13:04:43

oh, like gulp and grunt in js? 😄

Noah Bogart13:04:48

i think requiring users to change existing code is perfectly reasonable, given the relatively young age of babashka's task runner feature

borkdude13:04:29

@nbtheduke The problem is more general: even when stable, it might introduce more of these things in the future.

borkdude13:04:42

I guess the problem is similar to [clojure.test :refer :all] and clojure.test introducing new things in the future which breaks certain programs when upgrading clojure

👍 4
borkdude13:04:51

and if clojure doesn't see this breakage as problematic, maybe I shouldn't either?

borkdude13:04:55

@vlaaad to summarize: the problem I'm trying to solve is: avoid breakage like with the above example

borkdude13:04:54

but maybe my programs shouldn't try to be "don't break"-holier than clojure

Noah Bogart13:04:26

i realize one of the niceties of babashka is that a lot of stuff is imported by default, but could you make this one explicit?

borkdude13:04:26

nothing is imported by default, except in the user namespace it pre-defines a couple of aliases to be used on the command line. e.g. (json/generate-string ...)

vlaaad13:04:21

shouldn’t this also be done with default aliases?

borkdude13:04:51

@vlaaad what do you mean by this?

vlaaad13:04:27

instead of referring *task* you can add default alias to babashka.tasks

borkdude13:04:59

@vlaaad even that was breaking some user's program where the user had already chosen the alias tasks

borkdude13:04:38

we could just make everything explicit via a required [babashka.tasks :refer [shell clojure *task*]] instead.

borkdude13:04:00

or :refer :all at the risk of the user

borkdude13:04:45

I guess we could keep supporting normal fns like previous but additional stuff must be explicitly imported. Perhaps task or current-task could also be a normal fn which derefs *task* which is not uncommon.
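
[Editor's sketch] The last idea above, a normal function that derefs the dynamic var, could look roughly like this. The var defined here is only a stand-in for babashka.tasks/*task*, and current-task is the hypothetical accessor name floated in the message.

```clojure
;; Hypothetical stand-in for babashka.tasks/*task*.
(def ^:dynamic *task* nil)

;; A plain function that reads the dynamic var at call time,
;; so callers see whatever binding is in effect when they call it.
(defn current-task [] *task*)

(binding [*task* {:name "build"}]
  (current-task))   ; => {:name "build"}
```

Because current-task is an ordinary function, it can be interned under the same when-not/resolve guard already used for clojure and shell.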

Jim Newton15:04:50

In clojure, fn allows me to define a multi-function, i.e. a function with several possible lambda lists. at call-time, the most appropriate function is selected by matching the arguments to the appropriate lambda list. Is the mechanism of selection documented somewhere? For example, certain sequences of lambda lists are not allowed. Case in point [a b] and [a [b c]] are not allowed simultaneously. What else is allowed and not allowed?

(defn foo 
  ([a [b c]] a)
  ([a b] (list a b)))
gives the following error
Syntax error compiling fn* at (clojure-rte:localhost:54179(clj)*:133:23).
Can't have 2 overloads with same arity

borkdude15:04:37

The logic is pretty simple: it's only based on arg count, regardless of destructuring.

borkdude15:04:48

And there is a rule around varargs.

potetm15:04:51

@jimka.issy [a b] and [a [b c]] are both 2 args fns.

Jim Newton15:04:53

really? only argument count?

potetm15:04:34

yes. arg count only.

Jim Newton15:04:48

@potetm, yes I understand why this case fails. But is this the only such failing case? that’s my question. according to @borkdude that’s the only selection criteria, the arity

ghadi15:04:12

you should macroexpand the function to see what is happening:

ghadi15:04:38

(macroexpand '(fn ([a [b c]] a)
                  ([a b] (list a b))))

->

(fn* ([a p__140] (clojure.core/let [[b c] p__140] a)) ([a b] (list a b)))

borkdude15:04:41

@jimka.issy

user=> (macroexpand '(fn [[a b]]))
(fn* ([p__138] (clojure.core/let [[a b] p__138])))
exactly
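
[Editor's sketch] Since the arity is chosen by argument count alone, distinguishing [a b] from [a [b c]] has to happen inside a single two-argument body. A minimal illustration, with the hypothetical name foo-by-shape, mirroring the return values of the rejected definition:

```clojure
(defn foo-by-shape [a b]
  (if (sequential? b)
    ;; Corresponds to the [a [b c]] clause: destructure, then return a.
    (let [[_b _c] b] a)
    ;; Corresponds to the [a b] clause.
    (list a b)))

(foo-by-shape 1 [2 3])   ; => 1
(foo-by-shape 1 2)       ; => (1 2)
```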

ghadi15:04:37

term of art here is "function with multiple arities"

ghadi15:04:53

lambda lists

Jim Newton15:04:35

fn* doesn’t seem to be documented in https://clojuredocs.org/

ghadi15:04:05

fn* is a special form in the compiler

ghadi15:04:15

it's the familiar fn minus destructuring

ghadi15:04:35

http://clojuredocs.org is very useful, but non-authoritative

Jim Newton15:04:18

@ghadi, there seems to be some case about varargs, is the “term of art” really multiple arities even in the case of varargs?

ghadi15:04:39

varargs is a type of arity

ghadi15:04:06

you can have fixed arities + a vararg arity (whose fixed parameter count must be at least that of the longest fixed arity)

ghadi15:04:20

[a b] [a b c] [a b c & more]

ghadi15:04:07

for the varargs arity, under the hood there is a mechanism that rolls up extra args and presents it to the arity as a seq bound to "more"
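
[Editor's sketch] The three arglists given above can be tried directly. The function name arity-demo and the keyword tags are hypothetical and only make the selected arity visible.

```clojure
(defn arity-demo
  ([a b] [:two a b])
  ([a b c] [:three a b c])
  ;; Extra arguments beyond a, b and c are collected into the seq `more`.
  ([a b c & more] [:variadic a b c more]))

(arity-demo 1 2)         ; => [:two 1 2]
(arity-demo 1 2 3)       ; => [:three 1 2 3]
(arity-demo 1 2 3 4 5)   ; => [:variadic 1 2 3 (4 5)]
```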

Jim Newton15:04:12

ahhh, now we are getting some rules. If http://clojuredocs.org is not authoritative, where is the authority?

Jim Newton15:04:54

The `IFn` interface defines an `invoke()` function that is overloaded with arity ranging from 0-20. A single fn object can implement one or more invoke methods, and thus be overloaded on arity. One and only one overload can itself be variadic, by specifying the ampersand followed by a single rest-param. Such a variadic entry point, when called with arguments that exceed the positional params, collects them in a seq which is bound to, or destructured by, the rest param. If the supplied args do not exceed the positional params, the rest param will be `nil`. [taken from http://clojure.org]

delaguardo15:04:31

it is a copy-paste from http://clojure.org, isn’t it?

Jim Newton07:04:25

indeed it is. and a good description IMO

ghadi15:04:50

destructuring is orthogonal to any of the above

ghadi15:04:17

purely macro sugar to tear apart the arguments

Jim Newton15:04:34

that destructuring is orthogonal makes sense, yes, just two features that work well together

Jim Newton15:04:56

BTW what is the difference between the following two?

(defn foo 
  ([a b c] a)
  ([^Boolean a b] (list a b)))
vs the following
(defn foo 
  ([a b c] a)
  ([a b] (list a b)))

borkdude15:04:21

In that example: in practice nothing, since the type hint is only relevant to Java interop here

borkdude15:04:47

e.g. ^String a (.length a)

Jim Newton15:04:56

does the compiler compile a special branch for the case that a is a Boolean? and another for when it is not a Boolean?

borkdude16:04:29

no, the type hint is just leveraged when doing interop and in other cases, just not used

✔️ 2
borkdude16:04:53

there are other hints which can prevent boxed math
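
[Editor's sketch] A small illustration of a type hint being used only for interop. The function names are hypothetical. Both functions return the same result; the difference is in how the method call is compiled, which can be observed by evaluating (set! *warn-on-reflection* true) at a REPL before defining them.

```clojure
;; With the hint, (.length s) compiles to a direct call on java.lang.String.
(defn hinted-len [^String s] (.length s))

;; Without it, the method is looked up by reflection at run time.
(defn unhinted-len [s] (.length s))

(hinted-len "abc")     ; => 3
(unhinted-len "abc")   ; => 3
```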

Jim Newton16:04:17

someone mentioned above that lambda-list is not a term used in the clojure community. what is the correct word for the vector which specifies the parameters of a function and their semantics wrt position, destructuring, and optionality?

ghadi16:04:22

It’s a vector, but often called the arglist

borkdude16:04:02

@jimka.issy Although not entirely accurate, a Clojure decompiler could be educational to see how the Clojure compiler turns s-expressions into bytecode in these cases https://github.com/clojure-goes-fast/clj-java-decompiler

borkdude16:04:32

arglists is a term commonly used and this is also the name in the metadata on the var

Jim Newton16:04:55

and what is the list of arguments called, if arglist has a different meaning?

Jim Newton16:04:51

((fn [a b c] …) 1 2 3) for me [a b c] is the lambda list and (1 2 3) is the arglist

Jim Newton16:04:38

arglist != list of arguments ???? curious

borkdude16:04:48

user=> (defn foo [a b])
#'user/foo
user=> (meta #'foo)
{:arglists ([a b]), :line 1, ...

Jim Newton16:04:59

A few years back I used one lisp which called (a b c) the formal parameter list and (1 2 3) the arg list

didibus06:04:37

Ya, this would have made more sense, but I think the two are just called arg-list and the context tells you which one is being referred to.

Jim Newton07:04:12

I thought the term lambda-list came from lambda calculus. but an associate of mine who teaches lambda calculus says he’s never heard of the term.

borkdude16:04:00

don't get caught up on names, just accept them, they won't ever change :)

Jim Newton16:04:13

what are the type annotations called in a definition like this: (defn [^Boolean a b] …) are they called type hints, or type annotations or what?

noisesmith16:04:00

they are not annotations, annotations are a bytecode feature

noisesmith16:04:17

(at least I'd like to keep that term specific)

noisesmith16:04:30

type hint is the term I've seen

Alex Miller (Clojure team)17:04:38

the only type hints that affect the actual compilation of the function signature are ^long and ^double

vncz17:04:11

Oh interesting. Is there a technical reason for this?

em17:04:23

For more details and official documentation of this behavior: https://clojure.org/reference/java_interop#primitives

noisesmith17:04:37

see also the definition of IFn - the specializations are only on long and double args, otherwise the sig is always Object https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/IFn.java

vncz17:04:19

Aha that clarifies

Jim Newton07:04:36

@U064X3EF3, you said ^long and not ^Long. Do I understand that correctly?

Jim Newton08:04:13

If I provide a ^double type hint, will the compiler compile a special path for double and another path for everything else?

Jim Newton08:04:47

I’m giving a talk at ELS 2021, I’m going to mention some clojure things, and it will be very easy for me to claim something that’s not true in passing.

borkdude08:04:15

@jimka.issy You can check yourself like this: https://gist.github.com/borkdude/d1bf6a20650862c38ee43c0656d72e39 Note that a decompiler isn't always accurate but it gives some insights.

borkdude08:04:37

It seems like what you are saying is true: it has one method specialized for the double and one for Object

Jim Newton08:04:06

interesting so it does compile a specialized code path for double.

Alex Miller (Clojure team)17:04:53

otherwise they are always Object
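
[Editor's sketch] The effect of ^double on the compiled signature can be checked without a decompiler, by testing whether the function object implements one of the primitive interfaces nested in clojure.lang.IFn. The function names add-d and add-o are hypothetical.

```clojure
;; Primitive hints on both args and on the return value.
(defn add-d ^double [^double x ^double y] (+ x y))

;; Boxed Double hints do not change the signature; args stay Object.
(defn add-o [^Double x ^Double y] (+ x y))

(instance? clojure.lang.IFn$DDD add-d)   ; => true
(instance? clojure.lang.IFn$DDD add-o)   ; => false
(add-d 1.5 2.0)                          ; => 3.5
```

This also bears on the ^long versus ^Long question above: in this sketch only the lower-case primitive hint produces the specialized interface.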

borkdude21:04:31

Why is there an asymmetry between assoc! and conj!?
(conj!) returns a transient vector
(assoc!) does not return a transient map
(conj! .. 1 2 3) is not varargs
(assoc! (transient {}) :a 1 :b 2) is varargs

👍 1
Alex Miller (Clojure team)21:04:39

I am surprised if assoc! does not return a transient map

Alex Miller (Clojure team)21:04:51

how could it work if not?

Alex Miller (Clojure team)22:04:43

I think there is a ticket to make conj! match conj

bronsa22:04:45

he means the 0-arities of conj/conj! are defined

bronsa22:04:52

while of assoc/assoc! aren't

Alex Miller (Clojure team)22:04:13

I know there are some with other proposals that would interact with this one

Alex Miller (Clojure team)22:04:45

ah, sorry - I see you meant specifically those, I think that's covered in this ticket

Alex Miller (Clojure team)22:04:20

I think the answer to the original question is: there is no good reason

ghadi22:04:23

Calling conj! with no arguments is meaningful for its usage as a reducing function

ghadi22:04:57

transduce calls that arity when no init provided

ghadi22:04:22

there is no such kv reducing context that calls the no-arity, so i suspect that there was no pressure in giving assoc! the same treatment
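
[Editor's sketch] The behaviours discussed above can be checked at a REPL using only clojure.core:

```clojure
;; The zero-arity of conj! yields a fresh transient vector.
(persistent! (conj!))                               ; => []

;; transduce without an init value calls (conj!) to obtain one.
(persistent! (transduce (map inc) conj! [1 2 3]))   ; => [2 3 4]

;; assoc! accepts multiple key/value pairs but has no zero-arity.
(persistent! (assoc! (transient {}) :a 1 :b 2))     ; => {:a 1, :b 2}
```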

Alex Miller (Clojure team)22:04:49

that ask.clojure issue has 0 votes, feel free to vote if you'd like to see it move up the priority list!

borkdude22:04:32

voted thanks.

Alex Miller (Clojure team)22:04:35

Jira has now force migrated people to their “new issue view” which inexplicably does not contain vote information. All the jira jiras about it are closed and seem to suggest it is available but perhaps only on “next gen” projects, not “classic” projects. To migrate, you have to create a new project, presumably with a different project id, which would break all existing links. Cool.

picard-facepalm 5
andy.fingerhut22:04:39

"All the jira jiras ..." 🙂

Alex Miller (Clojure team)22:04:33

I’m just teeing you up to ask how I voted for that jira jira

andy.fingerhut22:04:51

"How did you vote for that jira jira, Alex?"

Alex Miller (Clojure team)22:04:53

Because they didn’t use the new ui

Alex Miller (Clojure team)23:04:02

Because it sucks

❤️ 2
andy.fingerhut23:04:40

Sounds like a case of "normally we would eat our own dog food here, but in this case our own dog food wasn't good enough."

Alex Miller (Clojure team)23:04:07

Eating your own dog food and holding your nose or something

Alex Miller (Clojure team)23:04:56

I mean like, I get why software is hard. I really get it. But this new ui has been like 2 years in the making and it just moves some fields to the right and got rid of actually useful things (afaict). I dunno.

Alex Miller (Clojure team)23:04:53

Clubhouse (the Clojure project thing, not the audio social media thing) is pretty cool

seancorfield23:04:05

FWIW, if I’m logged into Jira, I can still see the voting button for Clojure issues:

seancorfield23:04:24

(That’s CLJ-2556)

Alex Miller (Clojure team)23:04:59

oh, I might have just missed that completely

Alex Miller (Clojure team)23:04:59

oh, I was on one with no votes; there's no number, so it's easy to miss

seancorfield23:04:26

I guess it’s a good job we now have http://ask.clojure.org as a proxy for Jira so we can still vote on things by proxy :rolling_on_the_floor_laughing:

😂 2