#clojure
2018-07-23
curtosis00:07:20

I’m looking at someone else’s component code and I’m a little confused: their start method doesn’t assoc anything to the component, but dependencies seem to pick it up … except when I change it.

curtosis00:07:01

I didn’t think it would work at all without associng your runtime state in, but the original code seems to belie that.

didibus01:07:36

I was wondering about the performance of lazy sequences. Say you do: (->> [1 2 3] (map inc) (map #(* 2 %)) (filter odd?)). If this was eager (and was not using transducer machinery and its loop fusion logic), it would require three loops over the collection, where clearly it could all be done within a single loop. When using lazy sequences, where the computation actually only happens when the next value is read, that's a kind of loop fusion too, no? My understanding is that if you take 1, it will increment it, double it, and filter it in one go. So effectively, the laziness kinda rewrites the computation as if it had been written in one loop where all three computations happen one after the other. Am I correct to believe that?

mfikes01:07:39

@didibus Yeah; if you evaluated that in a REPL you could imagine a single loop / recur in the REPL printing logic that loops over the sequence in order to print it, and realizes that sequence by “pulling” on it.

mfikes01:07:03

There is overhead with the need to realize the intermediate sequences involved, but often that is reduced drastically if chunked sequence processing is involved (essentially doing things in batches of size 32 at a time).

mfikes01:07:27

In short, I agree with your assessment: It is like fusion, with just some overhead mixed in.

didibus01:07:11

@mfikes Can you tell me more about this overhead, I've heard of it before, but I'm not able to really understand, what requires creating intermediate sequences?

mfikes01:07:03

Let’s say that you wrote a simple REPL that is going to print the result, and in your printing code, assume you aren’t doing things like first checking whether the sequence satisfies chunked-seq?, and the very first thing your REPL code does is calls first.

mfikes01:07:46

You are calling first on the lazy sequence produced by the call to filter… which is itself operating on a lazy sequence produced by the map #(* 2 %), and so on, perhaps calling first down the chain…

mfikes01:07:44

So perhaps to directly answer your question: (map inc ...) is producing a lazy sequence, and so is (map #(* 2 %) ...) and so is (filter odd? ...).

didibus01:07:30

Right, so we're literally talking about the object creation overhead of wrapping all things in a lazy sequence?

didibus01:07:53

Ok, and, every time you "pull" one more value, does that mean every function creates a new lazy sequence which is now +1 realized and a pointer to get the next value? So, would this overhead grow as the size of coll grows? Or is it more constant in terms of how many intermediate lazy operations there are before reaching the real collection?

mfikes01:07:10

I’d say it is O(n). If you take a look at this line it might explain a lot: https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L2747

mfikes01:07:18

That is part of map

mfikes01:07:58

So it has to create a cons, do a call to first (which is just a fn call, no big deal), and then call map on the rest, which makes another lazy-seq.

mfikes01:07:35

Really, one way to look at this is just some allocation overhead as it marches down the sequence.

didibus01:07:40

Oh, right I see, so effectively a new lazy sequence returned by map on the rest is created on every "pull". And maybe this is where the chunked seq comes into play, and only have this happen every X chunk size?

mfikes01:07:12

Yep. If you look in that code right above that line, you can see a dotimes that rips through 32 at a time.

mfikes01:07:41

In short, this gets a “chunk” of 32, does a quick loop / recur over it, and then gets the next chunk of 32, thus reducing the overhead drastically
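
The batching described here can be observed directly. The following is a minimal sketch, assuming JVM Clojure, where `(range 100)` is a chunked sequence and a plain list is not; the helper name `realized-count` is made up for illustration. It counts how often the mapping function runs when a consumer asks only for the first element.

```clojure
;; Hypothetical helper: counts how many elements `map` actually processes
;; when the consumer pulls just the first item of the lazy result.
(defn realized-count [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

;; Chunked source: map processes a whole chunk of 32 up front.
(realized-count (range 100))              ;=> 32

;; A list is not chunked: only the requested element is processed.
(realized-count (apply list (range 100))) ;=> 1
```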

mfikes01:07:37

It is quite analogous to a buffered reader, or any sort of buffered I/O, that does stuff in larger chunks, instead of a character at a time.

didibus01:07:02

Cool, I think it all makes more sense to me now. So in a way, if you increased that chunk size to be equal to the size of the collection, you'd get pretty close to the eager behavior of transducers' loop fusion

mfikes02:07:33

Yeah, that’s an interesting thought: If the chunk size was not 32 but, say big enough to hold all of the collections, then perhaps in your example, you’d end up with 3 large chunks, being processed eagerly at each stage.

mfikes02:07:12

Maybe it is the same as if you were to put vs on the end of each of those in your example

(->> [1 2 3] (mapv inc) (mapv #(* 2 %)) (filterv odd?))

mfikes02:07:01

With transducers, there are no intermediate sequences involved 🙂 Just a bunch of function calls going to town on the [1 2 3]

didibus02:07:19

Hum, actually, maybe not, with a v, then you'd process all elements and inc them, then double them all, and then filter them all, I believe. Maybe that's also what a big chunk would end up doing.

mfikes02:07:57

Yeah prove it to yourself with

(->> [1 2 3] (map (fn [x] (prn 'm1) x)) (map (fn [x] (prn 'm2) (* 2 x))) (filter (fn [x] (prn 'f1) (odd? x))))

mfikes02:07:56

Change the [1 2 3] to (range 37) and that’s instructive.

didibus02:07:05

Hum, it looks like even with the chunks, every element is processed through all computations one by one. Or am I reading the result wrong

didibus02:07:23

Oh, also running it in CLJS probably does not help

didibus02:07:01

Not sure if cljs has chunked seq

mfikes02:07:22

Hah, yeah, in ClojureScript (chunked-seq? (range 37)) is false

didibus02:07:47

Eh, but now it does make sense. So actually, with chunks of 1, or no chunks, lazy sequences naturally do loop fusion, but incur the overhead of wrapping the rest in a lazy sequence. With chunks, it's kind of a hybrid: it trades away some of the fusion to lower the lazy-sequence wrapping overhead.

didibus02:07:57

With the v variant, it loops three times, one after the other.

didibus02:07:02

No fusion at all

didibus02:07:19

And transducers give you perfect fusion

(into [] (comp
           (map (fn [x] (prn 'm1) x))
           (map (fn [x] (prn 'm2) (* 2 x)))
           (filter (fn [x] (prn 'f1) (odd? x))))
          (range 37))
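
The four cases summarized above (unchunked lazy, chunked lazy, the v variants, transducers) can be compared by recording call order instead of reading printed output. This is a sketch only, assuming JVM Clojure; `traced`, `call-order` and the three `*-steps` functions are hypothetical names introduced for the comparison.

```clojure
;; Hypothetical helper: wraps f so each call appends `tag` to the log atom.
(defn traced [log tag f]
  (fn [x] (swap! log conj tag) (f x)))

;; Runs a two-stage pipeline over coll and returns the order of the calls.
(defn call-order [pipeline coll]
  (let [log (atom [])]
    (pipeline (traced log :m1 inc) (traced log :m2 #(* 2 %)) coll)
    @log))

(defn lazy-steps  [f g coll] (doall (->> coll (map f) (map g))))
(defn eager-steps [f g coll] (->> coll (mapv f) (mapv g)))
(defn xform-steps [f g coll] (into [] (comp (map f) (map g)) coll))

(call-order lazy-steps  '(1 2 3)) ;=> [:m1 :m2 :m1 :m2 :m1 :m2]  list seq is unchunked: fused
(call-order lazy-steps  [1 2 3])  ;=> [:m1 :m1 :m1 :m2 :m2 :m2]  vector seq is chunked
(call-order eager-steps [1 2 3])  ;=> [:m1 :m1 :m1 :m2 :m2 :m2]  one full pass per stage
(call-order xform-steps [1 2 3])  ;=> [:m1 :m2 :m1 :m2 :m1 :m2]  transducers: fused
```

With only three elements the whole vector fits in one chunk, so the chunked lazy case looks the same as the eager one; with a longer input the interleaving would happen per chunk.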

mfikes02:07:51

Yeah, and, importantly, if you are in ClojureScript, no sequences pile up in RAM (because ClojureScript holds head; it has no locals clearing)

mfikes02:07:24

(That is about a few more functions becoming directly reducible)

didibus02:07:52

Oh, interesting. I didn't know cljs retained head. How is that even practical? I'm surprised that, with all the sequence operations out in the wild and head retention, cljs even performs that well

mfikes02:07:24

Well, if you are processing stuff in a browser, I suppose you generally have all of your data “with you” anyway.

didibus02:07:52

True, I guess no one is really doing big data with cljs. This would probably be a bigger problem as Node adoption and Cljs grows

mfikes02:07:00

This is not true if you are using ClojureScript in Node.js to process a lot of data, or, say, doing Advent of Code problems in ClojureScript, processing large sequences 🙂

mfikes02:07:29

Even though ClojureScript doesn’t have locals clearing, a “secondary” benefit of the fact that transducers eliminate sequences is: There is no sequence to hold the head of 🙂

didibus02:07:40

What's the constraining factor for locals clearing? How is that handled on the JVM, which can't work over JavaScript? (its great to know, will use transducers more often in cljs)

mfikes02:07:05

Clojure explicitly clears certain things; ClojureScript just hasn’t taken on the complexity to do the same. It could, but there has been no huge need. https://dev.clojure.org/jira/browse/CLJS-705

didibus02:07:47

Ah, fascinating, happy to know its not a technical restriction, so if it ever becomes a bigger problem, its just a matter of man effort to bring it over.

didibus02:07:32

Thanks for good learning session

mfikes02:07:35

If you are in a corner on it, you can do something like the hack here http://blog.fikesfarm.com/posts/2016-01-15-clojurescript-head-holding.html

didibus02:07:12

Oh, cool article, I'll have to go over it a few times to really grok it

mfikes02:07:37

Yeah, it’s a hack, written before I grokked that transducers are often a cleaner solution. But even then, you run into the fact that a function like repeatedly doesn’t produce a directly reducible result.

didibus02:07:42

Another thing that's been bothering me, that I'd like to understand better. Clojure reads code on load, and then evals it on compile. A function thus receives evaluated code as arguments, and whatever it returns is evaluated on return. So code is first a data structure, and after being evaluated, it's turned into values. A macro will take in code as a data structure, so the text has been read into code as data, and will return code as data, which will then be macroexpanded further until there's nothing more to expand, and then it will evaluate the resulting code.

didibus02:07:18

Now, I think this is all correct. And please anyone correct me if I'm wrong.

didibus02:07:53

Where I'm confused is with reader literals. It used to be my understanding that they would take in evaluated code, and return evaluated code. So for example, if a symbol was included in the input to it, the symbol would get resolved, as it would be evaluated first. But now I see I was wrong: it gets code as a data structure as input, so the text is just read, and sent to your data reader.

mfikes02:07:13

Sounds like a good summary of the machinery behind functions vs. macros.
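
The function vs. macro distinction summarized above can be checked with two tiny definitions. This is an illustrative sketch; `arg-value` and `arg-form` are made-up names.

```clojure
;; A function receives the value of its argument...
(defn arg-value [x] x)

;; ...while a macro receives the unevaluated form. Quoting that form in the
;; expansion lets us see exactly what the macro was handed.
(defmacro arg-form [x] `(quote ~x))

(arg-value (+ 1 2)) ;=> 3
(arg-form (+ 1 2))  ;=> (+ 1 2)
```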

didibus02:07:01

But, what does a data reader return? Does it act as a macro, and return code as data, which gets further macro-expanded and afterwards evaluated?

Alex Miller (Clojure team)06:07:07

Data reader functions are called during the read phase. They take read but unevaluated values. They return a read value of your choice. You can think of them as a way to do eval during read phase.

Alex Miller (Clojure team)06:07:30

Because they happen at read time (before macroexpansion), macros will receive the values returned by data reader fns
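
To see that a data reader function is handed read but unevaluated data, one can bind a reader for a single `read-string` call. A minimal sketch, assuming JVM Clojure; the tag `demo/capture` and the function `capture-reader` are hypothetical and exist only for this example.

```clojure
;; Remembers what the reader passed to our data reader function.
(def seen (atom nil))

;; Hypothetical data reader: records its input and returns a form of its choice
;; (here, the input wrapped in quote).
(defn capture-reader [form]
  (reset! seen form)
  (list 'quote form))

(def read-result
  (binding [*data-readers* {'demo/capture capture-reader}]
    (read-string "#demo/capture [1 x 3]")))

@seen        ;=> [1 x 3], with x still a plain, unresolved symbol
read-result  ;=> (quote [1 x 3]), the value the reader hands on to later phases
```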

didibus02:07:25

This sentence confuses me: "by invoking the Var #'my.project.foo/bar on the vector [1 2 3]. The data reader function is invoked on the form AFTER it has been read as a normal Clojure data structure by the reader."

didibus02:07:50

Does that mean it behaves like a macro, or a function, or something different?

mfikes02:07:57

I think it is just saying that the reader turns

#foo/bar [1 2 3]
into
(#'my.project.foo/bar [1 2 3])

didibus02:07:23

So, if the function was called before the form had even been read, its my understanding we would be receiving a string of text, or is there a stage that I'm unaware of

mfikes02:07:52

Since you are in ClojureScript, try (def x 3) and then evaluate

'#queue [1 x 3]
to see what you’d get

mfikes02:07:31

Oh, your function my.project.foo/bar would be receiving a vector to operate upon

didibus02:07:41

#queue [1 3 3]

mfikes02:07:03

If you include the quote you can see that you get this

(cljs.core/into cljs.core.PersistentQueue.EMPTY [1 x 3])

mfikes02:07:30

But perhaps that is a different example of what you were asking about… I was thinking of the case when you have a symbol in the vector.

didibus02:07:13

Right, ya that's a good example. So it appears then that it behaves like a macro, since it would take code as data, and it returns code as data that is later evaluated in the context

mfikes02:07:06

Yeah, I think in other Lisps they have similar things: “reader macros,” which matches what you are describing

didibus02:07:51

That's kind of what I'm trying to understand, is the difference with a reader macro. I guess its just that you have to prefix it with a #

Alex Miller (Clojure team)06:07:44

Data readers in Clojure still get passed read Clojure data so are much narrower than what reader macros can do in Lisp. Even if you don’t have a data reader function installed, Clojure can still read it as a tagged literal. That’s not necessarily true in Lisp afaik.
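
One way to observe the "read it as a tagged literal" behavior is to bind `*default-data-reader-fn*` to `tagged-literal`, so that a tag with no registered reader is kept as data. This is a sketch under that assumption; the tag `demo/unknown` is made up and deliberately has no reader installed.

```clojure
;; No reader is registered for demo/unknown, so the default reader fn is used.
(def parsed
  (binding [*default-data-reader-fn* tagged-literal]
    (read-string "#demo/unknown [1 2 3]")))

(:tag parsed)   ;=> demo/unknown
(:form parsed)  ;=> [1 2 3]
(pr-str parsed) ;=> "#demo/unknown [1 2 3]"
```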

mfikes02:07:06

To be honest, the case where there is a symbol in the mix is something I haven’t thought about

didibus02:07:18

Ya, I guess it means you can do things like: #infix(a + b + c) if you wanted

didibus03:07:32

Hum, okay, so it seems in other Lisps, reader macros actually receive code as text. And that's why they are more powerful. So I think that sentence really means, you are not receiving code as text, it has been read into a data structure already and that's what you get. Since I never imagined that something would ever give me raw code as text, I thought that meant it would behave like a function, and the form would be evaluated beforehand, but it's just a normal macro.

didibus03:07:58

The biggest difference seems that because of that, you can not create new delimiters. So your DSL must be wrapped in existing delimiters.

didibus03:07:24

Don't find them particularly useful to be honest, not any major difference between (queue [1 2 3]) and #queue[1 2 3], but they sometimes have a slightly nicer visual flair. Best part is probably that they print to themselves and can be read back as themselves I guess, so they really make sense for EDN

benzap03:07:30

Do the print functions use a protocol to determine how a particular class should be printed out?

didibus04:07:44

They use a multi-method

didibus04:07:17

print-method is for human readable printing, and print-dup is for machine readable printing.

didibus04:07:12

@benzap So the long functions like print, println, print-str can be extended with adding a defmethod to print-method, and the short ones like pr, prn, str can be extended by adding a defmethod to print-dup

didibus04:07:47

Well, actually, I always get confused which one to extend for what, you might have to try it out

didibus04:07:47

I think you can extend print-method, but then you have to check *print-readably*, and if true, print in a way that can be read back, if false, print in a human pleasant fashion.

didibus04:07:19

Or, you can extend print-dup, and make it print in a readable way, and then bind *print-dup* to true when you print.
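
The first option (one `print-method` that checks `*print-readably*`) might look like the sketch below. The record `Temperature` and the tag `demo/temperature` are hypothetical, and no matching data reader is defined here, so this only shows the printing side.

```clojure
(defrecord Temperature [degrees])

;; One print-method that honors *print-readably*:
;; pr/prn/pr-str leave it true, print/println/print-str bind it to nil.
(defmethod print-method Temperature [t ^java.io.Writer w]
  (.write w (if *print-readably*
              (str "#demo/temperature " (pr-str (:degrees t)))
              (str (:degrees t) " degrees"))))

(pr-str (->Temperature 21))    ;=> "#demo/temperature 21"
(print-str (->Temperature 21)) ;=> "21 degrees"
```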

benzap04:07:16

Thanks, i'll check it out

henrik06:07:40

Can someone recommend a deps.edn/Clojure tooling tutorial that assumes a very high level of stupidity/inexperience on my part?

dominicm06:07:41

What kind of thing are you looking for?

henrik06:07:41

The very basics. I run lein repl, and running (nlp/classify input) gives me the expected output of…

[{:category ["People & Society" "Religion & Belief"], :confidence 0.99}]
I run clojure -A:dev, and try to change into the proper namespace, (in-ns 'exp.core), to try the same function. Run (nlp/classify input). Alas, No such namespace: nlp. Just to check, (str "hello" "world"): Unable to resolve symbol: str. Alright, I’m not allowed to change into namespaces. Restart clojure -A:dev and just paste the code from exp.core into it, I guess. Seems to work. Run (nlp/classify input) to check that namespaces are loaded etc. No dice, returns [], which is wrong.

dominicm06:07:27

@henrik you may need to require before in-ns, I think lein does that for you 🙂.
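
A rough illustration of why `str` was unresolvable: `in-ns` on a namespace that has not been loaded just creates an empty namespace, without `clojure.core` referred into it. The sketch below imitates that with `create-ns`; the namespace name `demo.bare` is made up.

```clojure
;; At the REPL the fix is to load the namespace before switching into it
;; (exp.core is the namespace from the conversation):
;;   (require 'exp.core)
;;   (in-ns 'exp.core)

;; What an unloaded namespace looks like: it exists, but nothing from
;; clojure.core is referred, so even `str` does not resolve.
(def bare (create-ns 'demo.bare))
(def before (ns-resolve bare 'str))   ;=> nil

;; Referring clojure.core (which the ns macro does when a file is loaded)
;; makes the usual names resolvable.
(binding [*ns* bare] (refer 'clojure.core))
(def after (ns-resolve bare 'str))    ;=> #'clojure.core/str
```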

henrik06:07:58

Ah, thank you! However, (nlp/classify input) returns the wrong stuff.

dominicm06:07:57

does input return the right thing?

henrik06:07:03

Yeah, input is just a def with a lot of text that I pass to a classifier.

henrik06:07:25

Hang on, it doesn’t seem like the Clojure repl is picking up the same environment variables as lein repl.

dominicm06:07:53

ah! 🙂 custom ones or some kind of standard?

henrik06:07:54

It wouldn’t be able to authenticate to the nlp service.

henrik06:07:20

Got it, I need to run export [variable] before starting Clojure repl.

henrik06:07:51

Apparently also something done automagically by Leiningen, perhaps.

dominicm06:07:49

That should be the same as normal, do you have something in profiles.clj to define environment variables?

dominicm06:07:59

Or rather, how were you defining them otherwise?

henrik06:07:08

Well, I do have a :dev {:env {…}}, but I thought that was solely for the benefit of the environ library.

dominicm06:07:16

just to check, your nlp code doesn't use environ, it does (System/getenv "BLAH")?

henrik06:07:10

I’m calling out to Google Cloud, which couldn’t give a beep about environ 🙂

dominicm06:07:14

Now I'm double suspicious about the environ stuff.

henrik06:07:02

I’ll rip out environ, just in case. It ended up in there when I was trying to get the REPL to pick up on the environment variables. I set the variable in about 15 different places in MacOS, trying to communicate my intention to Leiningen. One of them randomly worked. I’m sure it wasn’t environ.

dominicm06:07:34

oh, maybe you have the environment variables in one terminal already?

henrik06:07:01

That would be silly of me. Most likely true.

henrik06:07:06

Goodness. OK, I’ve put off setting up a proper dev environment for too long. Can you recommend an editor/environment, given that I need to interact with the deps.edn stuff? Atom and proto-repl clearly won’t do anymore.

dominicm07:07:18

I'm a hardcore vimmer 🙂 So that's my advice to all, of course

dominicm07:07:56

What's wrong with atom?

henrik12:07:07

To be honest, nothing is wrong with Atom, it has served me well. If it has a problem, it’s primarily that it doesn’t seem to be as widely used as the alternatives.

dominicm13:07:08

Yeah, I think it could really benefit from a round of Clojurists Together.

dominicm13:07:14

But otherwise I think it's pretty great

henrik13:07:23

For me, it’s the only environment that has struck a balance between approachability and power. I’m using proto-repl, along with parinfer, and it has been working great for the level I’m at. I know there’s a lot more power to be had by going vim/emacs + paredit and the rest of it. I’m grateful that Atom & friends have taken me this far.

dominicm13:07:46

I use parinfer in vim ¯\_(ツ)_/¯

👍 4
henrik14:07:54

I was trying to think, though, if I were to recommend a setup for someone coming from, say VSCode and JS-land to Clojure, where do I point them?

henrik14:07:49

Asking them to make an up-front investment of the hours needed to understand advanced tooling seems a bit rich.

dominicm14:07:20

ProtoREPL is my go-to right now, although I comment that emacs is better trodden.

dominicm14:07:34

VSCode does have a really neat integration though, so that advice is out of date.

dominicm14:07:26

The author is in #editors and seems quite active

henrik18:07:24

I had a whack at VSCode earlier this week, but was unable to get the REPL up and running. Also, the parinfer plugin is a bit out of date. No smart mode.

dominicm19:07:13

Might be worth asking @pez. I think there's two plugins.

pez21:07:55

There are four, at least. 😀 Which one did you try with, @henrik? I've heard the parinfer extension is getting some attention now, btw.

henrik04:07:11

@pez Clojure Code. But wow, there’s certainly a lot more happening than last time I checked! Which admittedly was some time ago. What setup do you recommend for VSCode?

henrik04:07:43

I’m glad to hear that an updated #parinfer is coming. I love that thing.

pez13:07:47

I use Calva, which I also build and maintain. Please give it a try. I'd love myself some feedback. 😀

henrik09:07:12

Ooh, gotta check that out. Thanks!

henrik07:07:15

Exactly what I was looking for, thank you

👌 4
iGEL08:07:31

Hi there! Can someone provide me with good pointers about analyzing memory leaks in production? Background: We run small clojure services with ring/jetty/uberjar in Docker in production and some show always increasing memory usage - it almost never drops, while running for days and weeks. From staring at the code, I fail to find any reason why not all the objects should be freed after the end of a request

👀 4
chrisblom09:07:34

i’d use the jvm profiling tools jvisualvm/java mission control to make a dump of the heap and analyse it with https://www.eclipse.org/mat/

👀 4
iGEL09:07:30

Thanks. I haven't watched it completely yet, but when I tried the commands, I realized that the processes see the free memory of the machine, not the limited memory of the container. I will investigate whether this causes the problem: The jvm sees a lot of free memory and thus doesn't run the GC
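
A quick way to check what the JVM itself believes about its limits, from inside the running container, is to ask `java.lang.Runtime`. This is a small sketch; `heap-report` is a made-up helper name. If `:max-mb` is larger than the container's memory limit, the heap ceiling was probably derived from the host, and setting an explicit maximum heap (for example with `-Xmx`) is one option to investigate.

```clojure
;; Hypothetical helper: reports the JVM's own view of heap and CPU limits.
(defn heap-report []
  (let [rt (Runtime/getRuntime)
        mb #(quot % (* 1024 1024))]
    {:max-mb   (mb (.maxMemory rt))    ; upper bound the heap may grow to
     :total-mb (mb (.totalMemory rt))  ; heap currently reserved by the JVM
     :free-mb  (mb (.freeMemory rt))   ; unused part of the reserved heap
     :cpus     (.availableProcessors rt)}))

(heap-report)
```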

alex31415912:07:20

Hello, I have a problem with a future using CPU after it's finished. Looks a lot like the issue reported here: https://stackoverflow.com/questions/11520394/why-do-cancelled-clojure-futures-continue-using-cpu/14540878#14540878 In a nutshell:

(def a (atom 10))
(def c (future (while (pos? @a) nil) @a))
(reset! a -1)

The above works in the REPL, i.e. CPU usage goes down to 0 after the last line. In my real app, while the future finishes with the right value, CPU usage stays at 100%. Here's another weird issue that seems linked to the former:

(def a (atom 10))
(def c (future (while (pos? @a) nil) @a))
(future-cancel c)

Last line returns true but CPU usage remains 100%. Any idea what to do? Thank you,

noisesmith17:07:08

cancelling only works for specific methods, your while doesn't call any cancellable method
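
Since `future-cancel` works by interrupting the worker thread, a busy loop only stops if it checks for that interruption itself. A minimal sketch, assuming JVM Clojure; the names `started`, `stopped` and `worker` are made up for the example.

```clojure
(def started (promise))
(def stopped (promise))

(def worker
  (future
    (deliver started true)
    ;; Busy loop that polls the interrupt flag, which future-cancel sets.
    (while (not (.isInterrupted (Thread/currentThread)))
      (Thread/yield))
    (deliver stopped :exited)))

@started                 ; wait until the loop is actually running
(future-cancel worker)   ; interrupts the worker thread

;; Give the loop up to two seconds to notice the interrupt and exit.
(def outcome (deref stopped 2000 :still-running)) ;=> :exited
```

The same check could be added to the `while` condition in the snippet above, alongside `(pos? @a)`.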

pgarrett13:07:12

Any recommendation for a clojure library that wraps java 8 completablefuture and also supports specifying the executor for individual operations? Promesa does the first but not the second (you can set one executor for everything).

borkdude14:07:21

The core.async example on this page https://clojure.org/guides/core_async_go is a bit weird: it uses an async http client, but the requests are not executed in parallel?

henrik14:07:55

Well, kind of. It’s a bit weird in that it’s a recursive function that doesn’t use recur. But beyond that, you would typically have a sequential handler like that, which hands off the incoming stuff to a channel as quickly as it can. You can then have multiple gizmos reading from the channel in parallel.

pgarrett17:07:13

Unfortunately I don't think this applies, since it targets java 1.6 (no CompletableFuture)

bwstearns18:07:58

does anyone have something a bit higher level for issuing sh commands than clojure.java.shell/sh? I'm in escaping hell trying to get optional args and args with escapes in the values working.

noisesmith18:07:08

be aware that clojure.java.shell/sh doesn't use the shell, if you want something more intuitive you probably want to invoke "sh" "-c" "other args go here"

noisesmith18:07:52

for example to get globs, subshells, etc. etc.
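
The difference can be seen side by side. This sketch assumes a POSIX environment with `echo` and `sh` on the PATH, and uses `sh -c`, which takes the command as a single string.

```clojure
(require '[clojure.java.shell :as shell])

;; Passed straight to the program: no shell is involved, so nothing is expanded.
(def direct (:out (shell/sh "echo" "$((1 + 2))")))           ;=> "$((1 + 2))\n"

;; Handing one command string to sh lets the shell perform the expansion.
(def via-shell (:out (shell/sh "sh" "-c" "echo $((1 + 2))"))) ;=> "3\n"
```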

bwstearns18:07:59

ah ok. I might just go that route for now.

noisesmith18:07:28

for a lower level interface, ProcessBuilder can be used to create a Process

noisesmith18:07:59

there's also the conch library, but there's a lot of gotchas to that

bwstearns18:07:26

yeah, that's the only one I found but it didn't look quite like what I was looking for.

noisesmith18:07:30

honestly for lower level stuff I find Process easier to use directly compared to conch, because the way conch mixes laziness and IO is problematic
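
Using `ProcessBuilder` directly from Clojure could look roughly like this. It is a sketch, not a full replacement for `clojure.java.shell/sh`: the helper name `run-process` is made up, stderr is merged into stdout, and the output is read eagerly as one string. It assumes an `echo` program on the PATH for the usage line.

```clojure
;; Hypothetical helper: runs a program with its arguments (no shell involved)
;; and returns the exit code plus the combined stdout/stderr text.
(defn run-process [& args]
  (let [builder (doto (ProcessBuilder. ^java.util.List (vec args))
                  (.redirectErrorStream true))
        process (.start builder)
        output  (slurp (.getInputStream process)) ; read until the stream closes
        exit    (.waitFor process)]
    {:exit exit :out output}))

(run-process "echo" "hello") ;=> {:exit 0, :out "hello\n"}
```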

bwstearns18:07:16

I kinda was hoping to find something like a macro that allows you to do something like this (though now I'll make one at some point):

;; => (sh-plus name-of-resulting-fn
;;             ["some-program" "command"]    ;; Literals
;;             [--env --foo --bar]   ;; Required args 
;;             [--some-optional-arg] ;; Optional args
;;             [--something])   ;; boolean flags

ghadi18:07:40

there's no getting around strings needing to be quoted

bwstearns18:07:30

true but if I can solve it once in the macro then my code won't be littered with \\\s

ghadi18:07:14

macros can't invent new reader literals

bwstearns18:07:26

I don't think a new reader literal is required.

noisesmith18:07:28

if you want a new string escape added to an existing string, you don't even need a macro for that (string/replace s "\\" "\\\\") etc.

bwstearns18:07:06

the motivation is to be able to generate the tedious wrapper functions by specification instead of by hand. the call above would result in something like

(defn name-of-resulting-fn [args]
  (apply sh (filter #(not (= "" %))
    ["some-program"
    "command"
    "--env" (:env args)
    "--foo" (:foo args)
    "--bar" (:bar args)
    .... etc

bwstearns18:07:02

with some function wrapping the input args to sort out whether they need escaping etc.
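
The spec-driven wrapper idea does not strictly need a macro; a plain function over a spec map gets most of the way. The following is only a sketch of that alternative: `build-args`, `option`, and the spec keys (`:literals`, `:required`, `:optional`, `:flags`) are all hypothetical names, and any value escaping for the target CLI tool is left out.

```clojure
;; Turns :env "prod" into ["--env" "prod"].
(defn- option [k v] [(str "--" (name k)) (str v)])

;; Hypothetical builder: expands a spec plus an options map into the argument
;; vector for a command, failing fast when a required option is missing.
(defn build-args
  [{:keys [literals required optional flags]} opts]
  (vec
   (concat
    literals
    (mapcat (fn [k]
              (if-some [v (get opts k)]
                (option k v)
                (throw (ex-info "Missing required option" {:option k}))))
            required)
    (mapcat (fn [k] (when-some [v (get opts k)] (option k v))) optional)
    (keep (fn [k] (when (get opts k) (str "--" (name k)))) flags))))

(def spec {:literals ["some-program" "command"]
           :required [:env :foo :bar]
           :optional [:some-optional-arg]
           :flags    [:something]})

(build-args spec {:env "prod" :foo "1" :bar "2" :something true})
;=> ["some-program" "command" "--env" "prod" "--foo" "1" "--bar" "2" "--something"]
```

The resulting vector can be passed along with `(apply clojure.java.shell/sh (build-args spec opts))`.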

noisesmith18:07:41

also, since this isn't passed to sh, there's little to no escaping needed

bwstearns18:07:57

It's the recipient CLI tool that needs it

noisesmith18:07:16

OK - that's dependent on what the tool expands

noisesmith18:07:47

just saying, no need to look out for $ or * etc. unless the command you invoke looks for those

noisesmith18:07:20

user=> (sh/sh "echo" "$(ls ) *")
{:exit 0, :out "$(ls ) *\n", :err ""}

noisesmith18:07:29

usually people have the opposite problem - they want those expansions (thus my suggestions to use sh above)

bwstearns18:07:35

Yeah, it's currently limited to dealing with passing in " literals and \' literals, it just makes for some ugly reading.

bwstearns18:07:08

"\"\\'foo\\'\"" isn't great to read

noisesmith18:07:22

right, but clojure.string/replace can fix that easily

noisesmith18:07:47

if you know which substrings need escaping

noisesmith18:07:47

what command are you using that treats ' and " as special btw?

bwstearns18:07:56

It's an internal CLI tool that interacts with some PG dbs

bwstearns19:07:07

someone built it for manual use and now someone wants it tied in to an automated process so rather than reinvent the wheel I'm just going to have it call it directly.

bwstearns19:07:31

btw the (sh "-c" command) appears to be working. Thanks for the pointer.

noisesmith19:07:38

oh, if it does what you want then there you go - I assume that's

(shell/sh "sh" "-c" "...")

bwstearns19:07:55

ah yeah, typo manually typing

bwstearns19:07:28

was generating the full string of the command for debug purposes anyhow so that was an easy switch lol

lisovskyvlad19:07:04

Hey guys, do you have a prepared answer to a dumb question? How do I start learning Clojure? The best way is probably to try building something with it, but maybe you can recommend a tutorial first?

seancorfield19:07:47

@lisovskyvlad Probably a good place to start is with Clojure for the Brave and True which is both a book and a free online website.

👍 4
manutter5119:07:00

Also join the #beginners channel here

pyr21:07:07

@bwstearns not sure if I understood what you were going for exactly, but this might help: https://github.com/pyr/stevedore

bwstearns15:07:31

Thanks for pointing me to this. It looks interesting (though not really documented enough to feel comfortable using it in my team but I'll dig into it later).