#clojure
2021-12-08
Drew Verlee04:12:31

Really loving this Clojure video series focused on the idea of a rate-limited function: https://youtu.be/IsS8ZCSUTUQ Does anyone see another way besides using swap-vals!? Is it possible to just use swap! and be thread-safe? I can't think of how.
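
For readers who have not seen the video, here is a minimal sketch of what a swap-vals!-based rate limiter could look like. The name rate-limited and the interval-ms parameter are assumptions made for illustration and are not taken from the video. swap-vals! (Clojure 1.9+) returns both the old and the new value from one atomic update, so the caller can tell whether its own call was the one that advanced the timestamp.

```clojure
(defn rate-limited
  "Hypothetical helper: returns a wrapper that calls f at most once per
  interval-ms milliseconds (interval-ms must be positive)."
  [f interval-ms]
  (let [last-executed (atom 0)]
    (fn [& args]
      (let [now       (System/currentTimeMillis)
            ;; Atomically advance the timestamp only if the interval elapsed.
            [old new] (swap-vals! last-executed
                                  (fn [last]
                                    (if (>= (- now last) interval-ms)
                                      now
                                      last)))]
        ;; The value changed only for the call that won the update.
        (when (not= old new)
          (apply f args))))))
```

With plain swap! only the new value comes back, so two threads that race within the same interval cannot tell which of them performed the update; that is the gap the question is about.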

emccue06:12:44

I made an atom thing for Java and i ended up making my own swapComplex that could also return “context” inferred from the swap

emccue06:12:47

(defn swap-complex! [atom f & args]
  (loop []
    (let [old-value               @atom
          {:keys [new-value 
                  derived-value]} (apply f (cons old-value args))]
      (if (compare-and-set! atom old-value new-value)
        {:new-value     new-value
         :derived-value derived-value}
        (recur)))))

emccue06:12:06

so for this example

emccue06:12:35

(let [last-executed (atom 0)]
  (fn wrapper
    [& args]
    (let [now (System/currentTimeMillis)
          {last-executed-changed :derived-value} (swap-complex! last-executed
                                                                (fn [last] ... {:derived-value true :new-value ...}]
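
The snippet above is elided, so the following is only one possible way to fill it in, not necessarily what emccue had in mind. It repeats swap-complex! so the block stands alone; rate-limited-complex and interval-ms are hypothetical names.

```clojure
(defn swap-complex! [atom f & args]
  (loop []
    (let [old-value @atom
          {:keys [new-value derived-value]} (apply f old-value args)]
      ;; old-value is the exact object read from the atom, so the
      ;; identity-based comparison in compare-and-set! is safe here.
      (if (compare-and-set! atom old-value new-value)
        {:new-value new-value :derived-value derived-value}
        (recur)))))

(defn rate-limited-complex
  "Hypothetical helper: calls f at most once per interval-ms milliseconds."
  [f interval-ms]
  (let [last-executed (atom 0)]
    (fn wrapper [& args]
      (let [now (System/currentTimeMillis)
            {allowed? :derived-value}
            (swap-complex! last-executed
                           (fn [last]
                             (if (>= (- now last) interval-ms)
                               {:new-value now  :derived-value true}
                               {:new-value last :derived-value false})))]
        (when allowed?
          (apply f args))))))
```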

Ivar Refsdal09:12:34

I'd warn against using compare-and-set! for primitives (such as longs). See the comment by Favila at clojuredocs: https://clojuredocs.org/clojure.core/compare-and-set!
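
A small sketch of the pitfall, based on my reading of that comment: compare-and-set! compares by object identity, so passing a number that merely equals the stored one can fail once the value is boxed. Reading the current value from the atom and passing that same object avoids the problem. cas-inc! is a hypothetical name.

```clojure
(def counter (atom 1000))

;; Relies on identity: the literal 1000 is not guaranteed to be the same
;; boxed object as the one stored in the atom, so this can return false.
(compare-and-set! counter 1000 1001)

(defn cas-inc!
  "Hypothetical helper: one CAS attempt using the exact object read from a."
  [a]
  (let [current @a]
    (compare-and-set! a current (inc current))))
```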

wombawomba10:12:10

FWIW I wrote a library for rate limiting a while back: https://github.com/aeriksson/kilderkin

👍 2
Carsten Behring10:12:52

I have a question about a pattern for writing a Clojure function where the caller can decide "how much information" to return. Let's assume that this function could return a deeply nested map, including some "binary" stuff, together 1 MB of information. In one use case, I call it once, so the 1 MB is fine and lets me get all the details. In another use case, I map over something and would call the function 10000 times. In that use case it would only need to retain a subset of the 1 MB, maybe just one float, 10000 times. The issue is that in use case 2 I want the user of the function to also say "which subset" to return, as this could change on every use. Can somebody point me to a "pattern", or give some hints to make this idiomatic?

wotbrew10:12:37

I think https://github.com/wilkerlucio/pathom3 might be designed to address this use case. If you don't want to embrace some kind of declarative dsl (kind of necessary to derive graphs that let the computer figure this stuff out) if you only have to deal with this a couple of times, you could just include some (documented) :skip-x, :omit-y type options to the function. Sometimes worse-is-better.

Carsten Behring10:12:01

Without any form of "filter", 10000 * 1 MB would explode the heap.

p-himik10:12:40

Why would it be 10000 * 1 MB? Suppose you have a function like

(defn get-data []
  {:a (generate-huge-random-blob)
   :b 1})
If a caller of that function calls it 10000 times as (:b (get-data)), there won't be 10000 blobs since they are not retained in memory.

Carsten Behring10:12:27

It is true that the 10000 calls share "some data" returned, but some not. (The function returns a sequence):

(defn get-data-vector [v]
  (map get-huge-data v))
The data blob contains some byte arrays (which are different each time) and some other complex records, which are variations of each other, and the amount of sharing is unclear. And it might get called 1 000 000 times. So I need some form of "ultimate control" for the caller over what to return. The caller knows "how many" (1 vs 10000 vs 1000000), so he should be able to "restrict" as much as needed (given the RAM he has). It's a tradeoff: if I know I pass a v of size 1, I don't need to bother specifying the filter; if I know I pass a v of size 1 000 000, I should specify a filter.

p-himik10:12:43

I think there's some misunderstanding between us, but not sure in which direction. JVM won't hold on to the data that's not referenced, right? So, if the caller of that function gets all the data, all the time, and then simply selects the desired data after the fact, the unneeded data will not be retained. The only downside here that I can see is that the unneeded data will still be generated, time and time again - but from your description of the problem only the RAM usage seems to be the problem (which it's not, if unused data is not referenced) and not CPU.

Carsten Behring11:12:10

It is more complex. The calculation is time-consuming, minutes overall for each vector element (so one million would take a week). So I decided to force non-lazy... But in another use case, it's only seconds. We are talking about training ML models (some models are very fast, some take minutes for each call).

Carsten Behring11:12:36

Let's say this function has eventually very different performance characteristics, depending on use case. Seconds to days. (short input seq and fast ML model vs long input sequence of slow ML model)

Carsten Behring11:12:52

A "solution" is probably this:

(defn get-data-vector [v filter-fn]
  (map #(filter-fn (get-huge-data %)) v))
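
To make the idea concrete, here is a sketch under assumptions: get-huge-data below is a made-up stand-in for the expensive call, and keep-fn is a hypothetical parameter name. mapv is used because eager evaluation was wanted, and the one-argument arity keeps everything for the small case.

```clojure
(defn get-huge-data
  "Stand-in for the real, expensive computation (assumption)."
  [x]
  {:blob  (byte-array 1024)
   :score (* 2.0 x)})

(defn get-data-vector
  ;; Small inputs: keep the full result of every call.
  ([v] (get-data-vector v identity))
  ;; Large inputs: keep-fn shrinks each result before it is retained.
  ([v keep-fn]
   (mapv #(keep-fn (get-huge-data %)) v)))

;; Only the scores are retained; each blob becomes garbage right away.
(get-data-vector [1 2 3] :score)
```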

p-himik11:12:43

One very easy approach then is to wrap all the costly data in delay - but the downside is that the client of that function will have to wrap all concrete data access in force. So, borrowing from the example above, it will become

(defn get-data []
  {:a (delay (generate-huge-random-blob))
   :b 1})
A function that needs just :a will use (force (:a (get-data))) (or @(:a (get-data)) if it knows for sure that it's a delay there) and a function that needs just :b will use (force (:b (get-data))) (or just (:b (get-data)) if it knows for sure that there is no delay there). If you end up having a large tree of delays that's returned from your function, you can use something like specter to:
• Help the caller specify concrete paths in the data object that it's interested in
• Automatically resolve all delays along that path
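
Without pulling in specter, the "resolve delays along a path" idea can be sketched in plain Clojure. get-in-forced is a hypothetical helper name, and the counter exists only to show that the blob is never built unless someone asks for it.

```clojure
(def blob-builds (atom 0))

(defn get-data []
  {:a (delay (swap! blob-builds inc)   ; count how often the blob is built
             (byte-array 1024))
   :b 1})

(defn get-in-forced
  "Hypothetical helper: like get-in, but forces any delay met on the way.
  force returns non-delay values unchanged, so plain data works too."
  [m path]
  (reduce (fn [acc k] (force (get acc k)))
          (force m)
          path))
```

Calling (get-in-forced (get-data) [:b]) never touches the delay under :a, so the expensive part is skipped entirely.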

Carsten Behring11:12:29

OK, using delays. Thanks, will think about it.

Carsten Behring11:12:38

My solution feels somehow non-idiomatic: writing a function which allows the caller to "configure" the return value.

Carsten Behring11:12:01

It's still a pure function, but ...

Carsten Behring11:12:00

Justified by "reduction of RAM usage" ... and the "need" of non lazy due to "time consuming". maybe I should go to "actors" instead of a simple "map" function?

p-himik11:12:08

Not sure what you mean by "actors". But the query approach would work (after all, it's similar to SQL, which we have had for ages). Alternatively, if your data generation can be split into tiny parts that could be called separately, that would definitely be a more idiomatic approach, and for good reasons.

Carsten Behring11:12:50

Clojure "agents", I meant. Maybe each calculation should be done by an "agent".

p-himik11:12:10

You can use them but I don't see how it would help. There's nothing about the agents that would make your task simpler. Also, I'm pretty sure that that functionality is left there for backwards compatibility. IIRC, Rich at some point stated that he overestimated their usefulness and that if he were to reimplement Clojure from scratch, he would start with core.async and would never implement agents. But don't quote me on that.

cddr14:12:07

Maybe you can use a higher-order function and pass it to the get-data function. In the case you need all the data, you pass a fn that does all the expensive stuff. And in the case you want a small answer back, the fn omits all the expensive stuff

Joshua Suskalo18:12:20

If your problem is that it's potentially expensive to get each piece of data, then why not just return it as a lazy sequence? Only small chunks will get realized at once, how long it takes is irrelevant, and you can use map or filter to get down to a subset of the data and produce another lazy sequence that when realized fits into the heap. Since the JVM is garbage collected if you iterate over this huge structure and don't retain the head, then you'll produce a lot of GC pressure which will slow things down somewhat, but as long as no single chunk of data blows the heap then you won't have memory issues. And if you map to only a subset of the data where it would fit in memory, you can retain the head without issue as well. @U7CAHM72M
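
A rough sketch of that suggestion, with a made-up get-data that takes an index and returns a blob plus a small value. The lazy pipeline is consumed by reduce and its head is never bound to a name, so the blobs can be collected as the reduction moves along.

```clojure
(defn get-data
  "Stand-in for the expensive call (assumption): a large blob and a number."
  [i]
  {:a (byte-array (* 64 1024))
   :b i})

(defn sum-of-b
  "Reduces over a lazy sequence of results while keeping only :b."
  [n]
  (reduce + (map :b (map get-data (range n)))))
```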

Fahd El Mazouni11:12:00

Hi ! any idea what this does ?

^{ClassName true} someObj

p-himik11:12:55

Sets metadata of {ClassName true} on the var/symbol someObj, depending on the context.

Fahd El Mazouni11:12:07

but why would you do that ?

p-himik12:12:32

Maybe someone mistook it for just ^ClassName, as a type hint. Or maybe there's some tooling that supports it. Or maybe it's processed in the code itself at some point. Impossible to say for sure without having the full context.
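
The difference between the two forms can be checked at the REPL by reading them and inspecting the metadata. This sketch uses read-string on literal strings only; symbol-meta is a hypothetical helper.

```clojure
(defn symbol-meta
  "Hypothetical helper: reads one form from s and returns its metadata."
  [s]
  (meta (read-string s)))

;; ^ClassName is shorthand for {:tag ClassName}, which is the type-hint form.
(symbol-meta "^String some-obj")

;; A map literal is attached as-is, with the symbol ClassName as a key.
(symbol-meta "^{ClassName true} some-obj")
```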

Fahd El Mazouni13:12:15

yeah you're right just wanted to check if there was something in clojure I didn't know about. thanks for your help ! I think it was a mistake

👍 1
harryvederci11:12:08

I'd like to store the inputs of a function and its output to a file, so I can reuse it a week later, for example. Evaluating this:

'(1 2 3)
Results in:
(1 2 3)
Which I can use with read-string. Is "spitting", "slurping" and using read-string the way to go here?

cddr11:12:21

@mail089 It would work but there's a note in the docs recommending use of edn/read-string for this type of thing. https://clojuredocs.org/clojure.core/read#example-542692d5c026201cdc327056

harryvederci12:12:21

Thanks @cddr, that seems to work:

(spit "my-file"
      '(1 2 3))
(-> (slurp "my-file")
    (clojure.edn/read-string))
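
A slightly fuller sketch for the "inputs plus output" case. save-call! and load-call are hypothetical names. pr-str is used instead of relying on spit's implicit str, since str on a lazy sequence does not print its elements while pr-str does.

```clojure
(require '[clojure.edn :as edn])

(defn save-call!
  "Hypothetical helper: writes a function's inputs and output as EDN."
  [path inputs output]
  (spit path (pr-str {:inputs inputs :output output})))

(defn load-call
  "Reads back what save-call! wrote."
  [path]
  (edn/read-string (slurp path)))
```

This round-trips plain data (numbers, strings, keywords, collections); values without an EDN representation, such as functions or Java objects, will not read back.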

🚀 2
Sam Ritchie12:12:33

TIL that doseq can take multiple sequences like for… somehow I had installed in my brain that this was not the case! Thanks to discussion here teaching me about dorun, I made a PR to convert my (doall (for …)) cases in tests to (dorun (for …))… then suspicion got me to look for a doseq version that acted like for. and of course that was just doseq. 🙂
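
For reference, a small sketch of doseq with two bindings, which iterates like the equivalent for but runs the body for side effects and returns nil. all-pairs is a hypothetical name.

```clojure
(defn all-pairs
  "Collects every [x y] combination via doseq, purely for side effects."
  [xs ys]
  (let [seen (atom [])]
    (doseq [x xs
            y ys]
      (swap! seen conj [x y]))
    @seen))

(all-pairs [1 2] [:a :b])
```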

borkdude13:12:02

to save a tiny bit of performance?

Alex Miller (Clojure team)13:12:43

don't know, seems unlikely perf would be that much of a concern there

borkdude21:12:27

Maybe performance is the reason.

user=> (let [args (seq [1 2 3])] (time (dotimes [i 10000000] (.applyTo ^clojure.lang.IFn + args))))
"Elapsed time: 779.743179 msecs"
nil
user=> (let [args (seq [1 2 3])] (time (dotimes [i 10000000] (apply ^clojure.lang.IFn + args))))
"Elapsed time: 1463.516917 msecs"
Seems to matter almost x2. Perhaps also due to the avoidance of calling seq on the args when using applyTo

wombawomba15:12:24

Is there any way at all to generate a named Java class with public fields in Clojure, without involving libraries for writing custom bytecode?

emccue15:12:47

You could generate java source code with something like javapoet and feed it to javac

Alex Miller (Clojure team)15:12:50

from a philosophical point of view, no by design

Alex Miller (Clojure team)15:12:36

that is not a problem Clojure is trying to solve (arbitrary Java class making). it does have class making capabilities that mesh well with the Clojure approach to datatypes etc

wombawomba16:12:17

okay, thanks

pataprogramming18:12:42

I am pulling my hair out over a Java interop issue. I have not seen anything like this before, and am at the end of my tether. I started with clj-swipl7, and a compiled JAR file (the JPL interface from SWI Prolog), and over the troubleshooting process have ended up with just the extracted class files from the jar in a directory that the REPL has on its classpath.

% ls org/jpl7/Integer.class org/jpl7/fli/Prolog.class org/jpl7/fli/LongHolder.class
org/jpl7/Integer.class org/jpl7/fli/LongHolder.class org/jpl7/fli/Prolog.class
% javap org.jpl7.Variable | head -2
Compiled from "Variable.java"
public class org.jpl7.Variable extends org.jpl7.Term {
% javap org.jpl7.fli.Prolog | head -2
Compiled from "Prolog.java"
public final class org.jpl7.fli.Prolog {
And then in the REPL:
prologclj.core> (import 'org.jpl7.Variable)
org.jpl7.Variable
prologclj.core> (Variable. "X")
#object[org.jpl7.Variable 0x5fa6d108 "X"]

prologclj.core> (import 'org.jpl7.fli.LongHolder)
org.jpl7.fli.LongHolder
prologclj.core> (LongHolder.)
#object[org.jpl7.fli.LongHolder 0x4fdf2e8d "org.jpl7.fli.LongHolder@1f"]

prologclj.core> (import 'org.jpl7.fli.Prolog)
org.jpl7.fli.Prolog
prologclj.core> (Prolog.)
Execution error (NoClassDefFoundError) at prologclj.core/eval11881 (form-init331134564092188469.clj:615).
Could not initialize class org.jpl7.fli.Prolog

pataprogramming18:12:12

Why on Earth can't the JVM find org.jpl7.fli.Prolog? As a comparison, if I import a class that doesn't actually exist, I get an immediate error (not deferred until I try to use the class).

prologclj.core> (import 'org.jpl7.fli.Prologxxx)
Execution error (ClassNotFoundException) at java.net.URLClassLoader/findClass (URLClassLoader.java:445).
org.jpl7.fli.Prologxxx

pataprogramming18:12:04

I've got this down to as minimal a case as I can, and I'm out of ideas.

dpsutton18:12:23

i think this error doesn’t mean it cannot FIND the class but that it cannot initialize it
> Could not initialize class org.jpl7.fli.Prolog

pataprogramming18:12:09

Ah, that's extremely helpful! Thanks, that's undoubtedly it. I'll chase that next.

ghadi18:12:39

always print out the whole error with *e

ghadi18:12:04

you'll probably have seen that the ClassNotFoundException is chained to an ExceptionInInitializerError, or something like that

ghadi18:12:18

which would clue you into the native library issue
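
Besides evaluating *e directly, the chain of causes can be listed with a few lines of interop. cause-chain is a hypothetical helper; at the REPL it would be called as (cause-chain *e).

```clojure
(defn cause-chain
  "Hypothetical helper: class names of a throwable and all of its causes,
  outermost first."
  [^Throwable t]
  (->> (iterate (fn [^Throwable e] (.getCause e)) t)
       (take-while some?)
       (mapv (fn [e] (.getName (class e))))))

;; Example with a hand-built exception chain:
(cause-chain (ex-info "outer" {} (RuntimeException. "inner")))
```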

dpsutton18:12:38

@U050ECB92 I imagine clj is wrapping the initialization error in a class not found exception? If so, is that worth a bug report and fixing on http://ask.clojure.org or jira?

dpsutton18:12:19

i had this recently with a driver for vertica on the new m1 pro mac

ghadi18:12:53

the JVM is wrapping it

ghadi18:12:09

usually CNFE is chained to something

ghadi19:12:06

out of curiosity, @U054FA810 you should post the whole exception (`*e` ) in a slack snippet (CMD-Shift-Enter)

ghadi19:12:20

that way we all can see the unadulterated data

pataprogramming19:12:28

Though, I get a much more helpful error message if I try to do a lein repl from the command line.

pataprogramming19:12:45

But, I've also got this in my project.clj:

:jvm-opts ["-Djava.library.path=/usr/local/opt/swi-prolog/libexec/lib/swipl/lib/x86_64-darwin"]

pataprogramming19:12:30

So either lein is ignoring that option, or I'm missing something else.

phronmophobic19:12:37

what version of java are you running? do you happen to be running on an M1 mac?

phronmophobic19:12:58

it does look like the java.library path isn't being set. can you share your project.clj? I would double check to see if :jvm-opts isn't accidentally nested in the wrong place

pataprogramming19:12:45

It's pretty rudimentary. I had stripped out everything else, and extracted the clj deps from their JAR into /resources.

pataprogramming19:12:00

It doesn't appear to be nested, but java.library.path absolutely is not getting set.

ghadi19:12:22

how are you confirming that java.library.path is not getting set?

pataprogramming19:12:25

prologclj.core=> (System/getProperty "java.library.path")
"/Users/username/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:."

phronmophobic19:12:43

just to double check something, if you use ["-foo"] as your jvm opts, does it complain?

pataprogramming20:12:17

Hm, no it doesn't.

pataprogramming20:12:02

And not using M1, by the way. Five-year-old MacBook Pro.

phronmophobic20:12:23

it should be unhappy about the ["-foo"]

pataprogramming20:12:41

I just changed the version of Clojure in the deps to 1.10.2, and it picked it up. So it's definitely reading this project.clj.

pataprogramming20:12:59

This seems like it has to be something really basic that I'm missing.

phronmophobic20:12:11

did you copy and paste the :jvm-opts arg from somewhere? sometimes a weird unicode character can sneak in

pataprogramming20:12:20

Well, from the CLI.

pataprogramming20:12:08

I just tried :jvm-opts ["-Dfoo=bar"], and there's no foo system property. So it does look like those jvm opts are not getting applied at all.
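
Another way to check this from inside the REPL is to ask the JVM which flags it was actually started with, using the standard java.lang.management API. jvm-input-arguments is a hypothetical helper name.

```clojure
(import 'java.lang.management.ManagementFactory)

(defn jvm-input-arguments
  "Returns the flags the running JVM was started with, as a vector of strings."
  []
  (vec (.getInputArguments (ManagementFactory/getRuntimeMXBean))))

;; If an option from :jvm-opts never shows up here, it was dropped or
;; overridden before the REPL's JVM was launched.
(jvm-input-arguments)
```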

pataprogramming20:12:38

% lein version
Leiningen 2.9.8 on Java 17.0.1 OpenJDK 64-Bit Server VM

phronmophobic20:12:51

from terminal, what happens if you run, lein change :jvm-opts set '["-foo"]'

pataprogramming20:12:48

Nothing visible.

% lein change :jvm-opts set '["-foo"]'
% echo $?
0

phronmophobic20:12:34

it should have updated your project.clj

pataprogramming20:12:49

Ah, just saw that, had to revert the emacs buffer it was up in.

phronmophobic20:12:04

does lein repl show an error now?

pataprogramming20:12:49

Nope. The project.clj was changed, but still just the loading error.

phronmophobic20:12:31

have you tried lein clean and retrying?

pataprogramming20:12:34

Still no dice, same behavior.

phronmophobic20:12:10

well, I'm out of ideas. at this point, I would start a brand new project, check to make sure adding system properties works, and then re-add the dependencies again

pataprogramming20:12:34

Setting LEIN_JVM_OPTS and checking the process list does show the library path being set on the Lein JVM.

pataprogramming20:12:20

OK, so, setting the JVM_OPTS environment variable does, in fact, get the java.library.path set on the REPL's JVM.

pataprogramming20:12:54

But that would be really annoying to try to do with Cider...and this should just work in Lein.

phronmophobic20:12:30

yea, super weird

pataprogramming20:12:37

And, incidentally, with the library path set it does fix the loading issue, and I can instantiate org.jpl7.fli.Prolog just fine.

phronmophobic20:12:11

is there a way to get lein to print out the project it ends up seeing?

phronmophobic20:12:49

I guess one other thing to check is your ~/.lein/profiles.clj and see if there's a profile that overrides the :jvm-opts

pataprogramming20:12:05

Hm, that's a pretty likely candidate, in fact.

phronmophobic20:12:36

it's weird that it would have precedence over your local project, but we did check a bunch of other potential causes

phronmophobic20:12:35

and it probably isn't relevant, but :java-opts is an alias for :jvm-opts

pataprogramming20:12:52

Well, there is indeed an ancient :jvm-opts override in there...probably more than five years old, probably followed me from another machine in my dotfiles. It was in there to fix a problem with Overtone at the time, according to the comment.

pataprogramming20:12:43

profiles are a menace. Thanks, that was a brilliant thought.

pataprogramming20:12:50

Yep, that seems to have sorted it! The problem was too surgically specific, and had to be something to do with my own machine's setup. Brilliant, thanks for the support.

phronmophobic20:12:10

glad it got resolved!

peterdee19:12:18

Is CamelCase okay in namespaces? I don’t see anything against it in the Clojure style guide.

Alex Miller (Clojure team)19:12:18

not considered idiomatic but not restricted

Cale Pennington19:12:10

I’m a complete beginner when it comes to core.logic (and a relative beginner to clojure). I have a veery set of logic constraints, and am trying to figure out how to speed it up, any suggestions on where to read up?

hiredman20:12:50

the order of your goals can make a large difference to performance

hiredman20:12:46

a core.logic program is sort of two things, a generator that creates a search tree, and constraints which prune the tree

hiredman20:12:24

a logic program that generates a big search tree and then prunes it all at the end will generate the correct result, but be slower than a logic program that prunes as immediately as it can
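
A toy illustration of the ordering point, assuming org.clojure/core.logic is on the classpath. Both functions describe the same relation; the first enumerates every x/y combination before checking equality, while the second unifies x and y first so only one membero has to branch. The function names and the candidates list are made up for the example.

```clojure
(require '[clojure.core.logic :as l])

(def candidates (vec (range 20)))

(defn pairs-prune-late []
  (l/run* [q]
    (l/fresh [x y]
      (l/membero x candidates)   ; 20 branches
      (l/membero y candidates)   ; 20 more branches under each of those
      (l/== x y)                 ; most branches are discarded only here
      (l/== q [x y]))))

(defn pairs-prune-early []
  (l/run* [q]
    (l/fresh [x y]
      (l/== x y)                 ; constrain before generating
      (l/membero x candidates)   ; 20 branches in total
      (l/== q [x y]))))
```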

hiredman20:12:16

that is my general advice for a first pass speeding up core.logic programs, I don't know of any write ups

hiredman20:12:51

tabling, which I don't really understand, but is some kind of core.logic specific memoization thing, might help as well

Cale Pennington20:12:46

inside core.logic, which things are generating vs constraining? Right now, it’s a big tree of ands and ors where the leaves are membero calls

Ben Sless20:12:29

fresh variables generate linearly , disjunctions (or, conde clauses, etc) generate forks in the search tree. conjunctions (clauses in the same fresh or the same conde group) constrain. membero is a constraint, too

hiredman20:12:31

membero is also a linear search, slow

hiredman20:12:00

everything is both

hiredman20:12:11

membero generates branching searches as well

hiredman20:12:36

if the member is not ground but the list is, it will branch the search where the member is unified to every entry in the list

Ben Sless20:12:16

my mistake. It's a conde

hiredman20:12:23

the whole thing with relational programming, and why things can be run forward or backwards, is everything is both generating and constraining

hiredman20:12:28

depending on what the inputs are

hiredman20:12:02

if the inputs are all ground, most goals are constraining, if the inputs are not ground most goals are generating
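
The two roles of membero can be seen in a small sketch (again assuming core.logic is available; members-of and member? are hypothetical names). With a fresh variable it generates one answer per list entry; with a ground value it only checks membership.

```clojure
(require '[clojure.core.logic :as l])

(defn members-of
  "q is fresh, so membero acts as a generator over coll."
  [coll]
  (l/run* [q] (l/membero q coll)))

(defn member?
  "x is ground, so membero acts as a check that either succeeds or fails."
  [x coll]
  (boolean (seq (l/run 1 [q] (l/membero x coll)))))
```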

hiredman20:12:18

https://clojurians.slack.com/archives/C0566T2QY/p1636397873008300 is a discussion in #core-logic a while back about speeding up a program that was using membero

Cale Pennington21:12:27

Based on that thread, seems like having lots of memberos is probably what's slowing it down