#clojure
2021-06-17
Kuldeep Sengar05:06:03

hey everyone, What are the common use-cases where we should (should not) use defmacro expanding to a function like below:

(defmacro n-defn [fn-name fn-args fn-body]
  `(defn ~fn-name ~fn-args
     (if (not (check-some-condition?))
       (setup-few-things))
     ~fn-body))

(n-defn conditional-function 
        []
        (println "Did something"))

seancorfield06:06:01

Please do not cross-post between #beginners and #clojure

🆗 2
quoll06:06:57

I’d probably avoid the macro:

(defn fn-checker
  [f]
  (fn [& args]
    (when-not (check-some-condition?)
      (setup-few-things))
    (apply f args)))

(defn normal-function
  []
  (println "Did something"))

(def conditional-function (fn-checker normal-function))

quoll06:06:57

This makes normal-function easier to test in isolation of the setup, and it lets you compose on existing functions. It probably has other advantages that I’m forgetting because it’s 2am and I should be asleep instead of looking at my phone 😉
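A minimal sketch of the testing benefit described above. The atom and the bodies of check-some-condition? and setup-few-things are hypothetical stand-ins, since the thread never shows them; add is likewise just an illustrative function.

```clojure
;; Hypothetical stand-ins for the helpers named in the thread.
(def initialized? (atom false))

(defn check-some-condition? [] @initialized?)
(defn setup-few-things [] (reset! initialized? true))

;; Same wrapper as in the message above.
(defn fn-checker
  [f]
  (fn [& args]
    (when-not (check-some-condition?)
      (setup-few-things))
    (apply f args)))

;; The plain function can be exercised without any setup taking place.
(defn add [a b] (+ a b))

;; The wrapped version runs the setup on first use, then delegates to add.
(def checked-add (fn-checker add))

(add 1 2)         ;; => 3
(checked-add 1 2) ;; => 3
```

Because the wrapper is an ordinary higher-order function, it can also be applied to functions that already exist, which a defn-generating macro cannot do.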

Kuldeep Sengar06:06:05

thanks @quoll it makes sense. a detailed explanation through your mobile at 2 am 🙌

Adam Helins13:06:03

Criterium experts (or other benchmarking tool), I would like to benchmark some code that needs some preparation prior to each run but I don't want that preparation to be factored in. What would you recommend? 🙂

Ben Sless13:06:33

What kind of preparation? you need to set up some state or just big complicated inputs?

Ben Sless13:06:51

Generally, jmh is good at setting up a variety of states and inputs

Adam Helins13:06:56

Essentially prepare a Java object that cannot be reused. It has to be created anew before each run but that preparation is not relevant to what needs to be measured.

Adam Helins13:06:53

I have never used JMH but I believe it requires generating a class? In this instance I need something like Criterium where you can measure things dynamically.

Ben Sless13:06:03

it needs to be recreated before each call or before each benchmarking run?

Ben Sless13:06:15

jmh can be used dynamically as well

Adam Helins13:06:46

Before each call

Adam Helins13:06:47

Whereas from what I see Criterium only accepts a function. Unless I am mistaken, it doesn't accept anything around that function.

Ben Sless13:06:59

I'm not sure jmh lets you do that, either

Ben Sless13:06:22

what I'd do in this case is measure function+setup then only setup and subtract

Ben Sless13:06:27

silly but works

borkdude13:06:42

exactly. perhaps tufte could also be used, it's a profiling tool where you can bucket things

Adam Helins13:06:55

Yup I thought about doing (setup + fn) - setup but it sounds really suboptimal and can potentially be inaccurate when setup takes significantly more time than fn.

Adam Helins13:06:07

I used Tufte in the past and it might serve the purpose since it's more of a "profiler". However I don't know how "good" it is for a task that is closer to "benchmarking" (just one expression). Eg. I expect quite a lot of GC, which Criterium tries to mitigate, I don't think Tufte does that kind of work.

Ben Sless13:06:27

Another option is to create a pool of objects to be used in advance then consume it, but you'll need a lot of them
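A rough sketch of the pool idea, assuming the number of calls is known up front. It uses System/nanoTime directly rather than Criterium, so it has none of Criterium's warm-up, GC handling, or overhead estimation; time-with-pool, make-ctx, and f are hypothetical names.

```clojure
(defn time-with-pool
  "Builds n fresh contexts with (make-ctx) before timing starts, then
  times n calls of (f ctx), one per context. Returns the mean time per
  call in nanoseconds. Context creation is outside the timed region."
  [n make-ctx f]
  (let [^objects pool (object-array (repeatedly n make-ctx))
        start         (System/nanoTime)]
    (dotimes [i n]
      (f (aget pool i)))
    (/ (- (System/nanoTime) start) (double n))))

;; Hypothetical usage: a fresh ArrayList stands in for the one-shot context.
(time-with-pool 1000
                #(java.util.ArrayList. ^java.util.Collection (range 100))
                (fn [^java.util.ArrayList l] (.clear l)))
```

The whole pool has to fit in memory at once, which is the "you'll need a lot of them" cost mentioned above.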

Adam Helins14:06:49

A huge amount! Funny no one had this issue before.

Ben Sless14:06:31

usually the setup is not one-off. can you describe the case a bit more?

Adam Helins14:06:34

It is about allowing users to benchmark code written in another Lisp. It also runs on the JVM and the "runner" is written in Clojure. Prior to evaluating the benchmark code, a "context" object must be copied (that's the setup) so that each call is run as-if for the first time.

borkdude14:06:17

what is the other Lisp? just curious

Adam Helins15:06:57

Convex Lisp from https://convex.world/ It's a distributed, immutable Lisp closely modelled on Clojure. Still pre-alpha so no big fuss about it (concept is from the creator of core.matrix 😉 ).

andy.fingerhut15:06:19

I am sure people have had the issue before, at least in the form of misinterpreting the results of criterium benchmarks :-)

andy.fingerhut15:06:02

The creation of many objects before the benchmarking runs is the only way I can think of that would not require modifying criterium code

Adam Helins15:06:52

Naively, I guess modifying Criterium to have such a "preparatory" phase might mess with its base design (the work it does with overhead estimation, GC, ...)

andy.fingerhut19:06:19

If the run time of the thing you want to measure is, for example, 1/10 of the time required to set up each one, then your preparation time will be 10x longer than the time measured during Criterium measurements of the code you want to measure.

andy.fingerhut19:06:50

Criterium is very flexible on how long you can spend doing measurements, but by default it spends about 10 seconds running the code you want to measure, as many times as it can run during that time, just for its 'warm up' period where it tries to ensure your JVM code has been JITed, and then 30 or 60 seconds of repeatedly running it after that. It discards the stats from the warmup time and reports based only on the (usually) longer measurement period that happens after that.

tvaughan13:06:43

I have a long running server process that needs to execute shell commands. I'm running into the problem where these calls don't return, i.e. https://clojuredocs.org/clojure.java.shell/sh#example-542692d6c026201cdc3270e7. However I can't call shutdown-agents or exit which would terminate the server process. I've tried other approaches (babashka.process, conch, clj-shell-utils) without success. I need to set environment variables too so turtle wouldn't work as-is, for example. What have others done in the same situation? Thanks

borkdude13:06:25

@tvaughan babashka.process lets you set environment variables with :extra-env, clojure.java.shell has this too (`:env`, sets the entire env). babashka.process doesn't need to spawn a thread for a process (if used correctly), so it would not affect shutdown-agents.

tvaughan13:06:30

What does "if used correctly" look like? I wasn't able to get babashka.process to terminate

borkdude13:06:48

It depends on your use case

borkdude13:06:24

Feel free to make a repro and I can probably say more about it. It's hard to speak in general terms.

tvaughan13:06:26

A long-running server process that needs to run a command and capture stdout

borkdude13:06:22

E.g. when doing @(babashka.process/process ["ls"] {:inherit true}) , there is no future used.

tvaughan13:06:34

I'll give inherit a try. I missed that in the docs. Thanks

tvaughan13:06:33

I also used babashka.process/check instead of a deref

borkdude13:06:00

With :inherit you're not able to capture the string though. check also does a deref, but throws when the exit value isn't 0

tvaughan13:06:53

Ah ok. That won't work then

borkdude13:06:55

what you can do is:

(slurp (:out (babashka.process/process ["ls"] {:err :inherit})))
This won't use a future either, and will return the output

borkdude13:06:09

and this will forward stderr to System/err, so you can see if anything goes wrong

tvaughan13:06:24

Cool. I'll try that. Thanks

borkdude13:06:50

E.g.:

user=> (slurp (:out (babashka.process/process ["ls"] {:err :inherit})))
"CHANGELOG.md\nLICENSE\nREADME.md\ndoc\nproject.clj\nresources\nsrc\ntest\n"

tvaughan13:06:20

So omit check or an explicit deref?

borkdude13:06:57

yeah, the slurp will block until the process finishes

tvaughan15:06:01

This appears to be working now. Thanks for your help @borkdude

wombawomba14:06:27

I'm trying to come up with a (clean) way to join a set of possibly overlapping intervals into an equivalent set of nonoverlapping intervals (e.g. #{[1 2] [3 5] [4 6]} -> #{[1 2] [3 6]}). Any ideas?

Endre Bakken Stovner15:06:20

The problem is called interval merge in bioinfo at least. Sort values on their start/ends. As long as interval i overlaps with i+1 extend interval i with the end of i+1. If i does not overlap start a new one. Never solved that problem functionally though.

wombawomba15:06:56

yeah that's what I came up with

wombawomba15:06:12

I'm having a hard time writing it out functionally though

Endre Bakken Stovner15:06:48

A reduce where you keep track of the current interval you are extending and the list of intervals up until now?

Endre Bakken Stovner15:06:09

I have little time today, but will write it out tomorrow

Ed15:06:33

(let [i #{[1 2] [3 5] [4 6]}]
    (->> i
         (sort-by first)
         (reduce (fn [r [a b :as n]]
                   (let [[x y] (last r)]
                     (if (and y (< a y b))
                       (conj (vec (butlast r)) [x b])
                       (conj r n))))
                 [])))
maybe something like that? That'll probably get slow with all the last/butlast calls, you could use subvec for that or something instead?

😎 2
wombawomba15:06:11

nice thanks 🙂 it's pretty similar to what I came up with:

(defn merge-intervals
  [is]
  (if (empty? is)
    []
    (let [is' (sort-by first is)]
      (reduce (fn [xs [a b]]
                (let [[a' b'] (last xs)]
                  (if (<= (dec a) b')
                    (conj (vec (butlast xs)) [a' (max b b')])
                    (conj xs [a b]))))
              [(first is')]
              (rest is')))))

wombawomba15:06:58

and yeah, I'd like to find a way to get rid of the (conj (vec (butlast xs)))

wombawomba15:06:59

...although I suppose it might not matter unless I have lots of intersections (which I don't)

Ed15:06:14

(let [i #{[1 2] [3 5] [4 6]}]
    (->> i
         (sort-by first)
         (reduce (fn [r [a b :as n]]
                   (let [c     (dec (count r))
                         [x y] (get r c)]
                     (if (and y (< a y b))
                       (conj (vec (subvec r 0 c)) [x b])
                       (conj r n))))
                 [])))
using subvec instead?

🙏 3
wombawomba15:06:58

yeah okay good idea

wombawomba15:06:09

you can remove the vec call there as well
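One possible way to drop the butlast/subvec step altogether is peek and pop, which work on the tail of a vector in constant time. This is a sketch rather than either poster's code: it treats intervals that share an endpoint as overlapping, it does not join adjacent integers such as [1 2] and [3 4] (unlike the (dec a) version above), and it takes (max y b) so that an interval fully contained in the previous one does not shrink it.

```clojure
(defn merge-intervals
  "Merges possibly overlapping [start end] intervals into a vector of
  non-overlapping intervals sorted by start."
  [intervals]
  (reduce (fn [acc [a b :as iv]]
            (let [[x y] (peek acc)]          ; last merged interval, or nil
              (if (and y (<= a y))           ; overlaps the previous one
                (conj (pop acc) [x (max y b)])
                (conj acc iv))))
          []
          (sort-by first intervals)))

(merge-intervals #{[1 2] [3 5] [4 6]}) ;; => [[1 2] [3 6]]
```

The result is a vector; wrap the call in (set ...) if a set is needed, as in the original question.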

emccue14:06:11

@tvaughan there is always just straight ProcessBuilder

tvaughan14:06:14

Thanks. That's been my backup plan. I didn't see a way to set environment variables with this, so I've been reluctant to reinvent the wheel if there's a ready made solution out there
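For reference, java.lang.ProcessBuilder exposes a mutable environment map through its environment method, so variables can be added before the process starts. A minimal interop sketch, assuming a Unix-like system with sh on the PATH; run-with-env and the GREETING variable are hypothetical names.

```clojure
(defn run-with-env
  "Runs cmd (a vector of strings) with extra-env (a map of string to
  string) added to the inherited environment. Returns the exit code and
  the combined stdout/stderr output."
  [cmd extra-env]
  (let [pb  (ProcessBuilder. ^java.util.List cmd)
        env (.environment pb)]               ; mutable Map<String,String>
    (doseq [[k v] extra-env]
      (.put env k v))
    (.redirectErrorStream pb true)           ; merge stderr into stdout
    (let [proc (.start pb)
          out  (slurp (.getInputStream proc)) ; read until the stream closes
          exit (.waitFor proc)]
      {:exit exit :out out})))

(run-with-env ["sh" "-c" "echo $GREETING"] {"GREETING" "hello"})
```

Everything here runs on the calling thread, so this code starts no futures or agents of its own.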

borkdude14:06:47

babashka.process is just a thin layer over processbuilder

borkdude14:06:27

it also exposes the underlying process as :proc so you can do anything you want without being blocked by a "wrapper"

kennytilton15:06:44

So a client questioning our choice of Clojure just challenged me: "I did some googling. The error messages suck." 🤣 Took me ten minutes to fight that one off. Thanks, Clojure!

Ben Sless15:06:23

I know it might be a hot take, but "no they don't"

Ed15:06:30

how did they feel about using jboss or some other jee type app server? I'm pretty sure I recall the error messages being way worse in Java land.

p-himik15:06:35

And in particular, Clojure 1.10 has introduced a lot of improvements in this area. Random 5 minute googling might easily give results from older versions.

hkjels16:06:14

Well. They are not great, but getting better. Also a few libraries that try to improve the situation

kennytilton20:06:43

@UK0810AQ2 https://www.youtube.com/watch?v=Yt4zQqndLdQ Every annual Clojure survey trashes the stack traces. Java? Excellent choice of bar to clear! 👏 Sadly, the real bar is set by Common Lisp. @U2FRKM4TW Yes, old Google hits should be vacuumed, agreed. The weird thing? No one asked how I saved the sale. Or congratulated me on doing so. You yobbos never change....

kennytilton06:06:25

FYI, part of the win was figwheel's error reporting, right down to the line number and code, IIRC. Sweet. Should be the default for Clojure.

👍 3
Kyle Ferriter15:06:32

One person's "error messages suck" is another person's "error messages encourage more robust fault handling"

🥑 2
dabrazhe20:06:08

How can I add calculation on a value of the keyword within this function?

(map (juxt #(get-in % [:SELL :safetyP]) :s-premium :margin ) collection ) 
eg. (* :margin 1000) ?

seancorfield20:06:00

@dennisa It’s not really clear what you’re asking here. map .. juxt is going to produce a sequence of vectors. What input/output are you talking about here?

seancorfield20:06:03

Perhaps you’re looking for (comp (partial * 1000) :margin) (instead of just :margin)? Or #(* 1000 (:margin %))

✔️ 2
dabrazhe20:06:00

Indeed, I am looking for #(* 1000 (:margin %)) For some reason my function: ) #(* 1000 :margin %) did not work 🙂

seancorfield21:06:03

Not sure where you are in your Clojure journey but perhaps #beginners might be a more helpful channel for questions like this, since folks there have opted in to doing more than just answering Qs — they’re happy to help teach folks about the reasons and alternatives and idioms…

dpsutton20:06:09

do you understand why it doesn't work? Not clear if that means you caught your mistake or you are still confused

dabrazhe21:06:22

I think I do now. Strangely, I did not have any issues with using get-in in the same context in juxt

dpsutton21:06:45

it's not the context that is incorrect. (* 1000 :margin <anything here>) will always blow up because you cannot multiply 1000 by the keyword :margin
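A small worked example of the working form. The sample collection is made up for illustration; only the keys come from the question.

```clojure
;; Hypothetical sample data using the keys from the question.
(def collection
  [{:SELL {:safetyP 0.9} :s-premium 1.5 :margin 2}
   {:SELL {:safetyP 0.8} :s-premium 2.0 :margin 3}])

;; Each function given to juxt receives the whole map, so the keyword
;; has to be applied to the map before its value is multiplied.
(def rows
  (map (juxt #(get-in % [:SELL :safetyP])
             :s-premium
             #(* 1000 (:margin %)))
       collection))

rows ;; => ([0.9 1.5 2000] [0.8 2.0 3000])
```

get-in worked in the same position because it is called with the map as its argument, whereas (* 1000 :margin %) passes the keyword itself to the multiplication.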

Joshua Suskalo22:06:47

I would like to be able to validate in-flight data in a production service against runtime-generated schemas. I know spec2 can do this, as can malli, and potentially a few other libraries. Is there an established and robust way to do this that's preferred? Is spec2 or malli a good solution despite both being marked as alpha software?

hiredman22:06:17

there is always "spec1" as well, which ships with clojure

emccue22:06:07

schema still exists

hiredman22:06:29

in my mind the question is how and to whom these schemas are going to be communicated (assuming they are some sort of interface documentation)

hiredman23:06:27

is the audience (human and machine) all going to be able to understand the schemas?

hiredman23:06:51

do the tools you want to use understand them (maybe swagger, etc)

Joshua Suskalo23:06:03

The audience is just machines. The intended purpose of this is to detect when in-flight data stops conforming to an existing spec (because the data is provided by external sources which may experience deployments that cause changes). The reason to disprefer spec1 is that it relies so heavily on macros and has little way to produce specs that can be altered at runtime easily.

ghadi23:06:19

Akita Software is doing something in this space ^

ghadi23:06:06

(I’m not affiliated, just think it’s interesting. There will be a talk at Strange Loop about it)

Joshua Suskalo23:06:35

I'll take a look at that, thanks!

Joshua Suskalo23:06:53

It seems like they may be working on a different part of this problem than I'm dealing with, but I'll research some more.

hiredman23:06:55

spec (even spec2), bottoms out on predicates, functions, arbitrary blobs of code. it is an open world, so to deal with new things you'll need to create new code. depending on the data format you might be better off with something like json schema, as the set of things (json values) is closed, so you just mix match and combine those
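To illustrate the "closed set of things" point, here is a sketch of a validator built at runtime from plain data, using only clojure.core. It is not the spec, malli, or JSON Schema API; type->pred, schema->validator, and the type names are hypothetical.

```clojure
;; A closed vocabulary of type names, each mapped to a core predicate.
(def type->pred
  {"string"  string?
   "integer" integer?
   "boolean" boolean?})

(defn schema->validator
  "Takes a schema such as {:id \"integer\"} that may arrive at runtime and
  returns a function of a map. That function returns a map of each failing
  key to its expected type name, or an empty map when the data conforms."
  [schema]
  (fn [m]
    (into {}
          (keep (fn [[k type-name]]
                  ;; Unknown type names never validate.
                  (let [pred (get type->pred type-name (constantly false))]
                    (when-not (and (contains? m k) (pred (get m k)))
                      [k type-name]))))
          schema)))

(def valid-user? (schema->validator {:id "integer" :name "string"}))

(valid-user? {:id 1 :name "a"}) ;; => {}
(valid-user? {:id "x"})         ;; => {:id "integer", :name "string"}
```

Because the schema is just a map of strings, it can be stored, diffed, or replaced at runtime without defining new code, at the cost of only expressing what the fixed vocabulary allows.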
