clojure 2017-11-29 | Slack Archive

noisesmith00:11:09

(ins)user=> ,(def foo:bar 1)
#'user/foo:bar
(ins)user=> foo:bar
1
(ins)user=> {::a foo:bar}
#:user{:a 1}

bfabry00:11:40

stop this madness 😱

noisesmith00:11:50

@bfabry the docs specifically say it’s allowed… but it definitely feels weird

gdeer8100:11:49

That's the thing about having a library that processes other people's code, it has to handle any possibility

tomelkin00:11:14

Hey all. If anyone else is interested in having official New Relic support for Clojure, please upvote my support ticket. 🙂 https://discuss.newrelic.com/t/feature-idea-current-state-of-clojure-instrumentation/53105

gonewest81800:11:27

I’d prefer to see this done via their http://opentracing.io compatibility. The opentracing api is pretty easy to deal with via java interop. I have been messing with a clojure wrapper as well, but pretty much it all converges to a with-open that opens the trace context, and some (still being debated) machinery to propagate tracing contexts in-process across threads and such.

gonewest81801:11:29

https://blog.newrelic.com/2017/09/13/distributed-tracing-opentracing/

tomelkin01:11:08

@gonewest818 Interesting. Thanks, will check it out

gonewest81801:11:29

The downside to opentracing is how early it is. Elements of the api are not settled, like the in-process propagation thing I mentioned, but basic tracing is in place and multiple backends are available.

localghost02:11:42

warning: clojure n00b here. i'm trying to process data from a gzipped file that has JSON documents concatenated together in it (no delimiter, not even a newline). so far I have two functions that can gunzip the file and produce a reader. i've been trying to find ways to parse this into JSON. one idea is to process the character stream from the reader, keep track of curly braces and emit/print when i'm at the appropriate place ("}{"). but I'm having a hard time keeping track of state in the absence of mutable variables. another idea is to use something like re-find but it works only on strings. I don't want to read in the whole file since the data I have is too large and I'd rather process it lazily since i plan to run this in aws lambda (limited memory). the functions I have so far (apart from various broken experiments):

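;; The two call names were lost from the archive; clojure.java.io/input-stream
;; and clojure.java.io/reader are assumed here because they fit the pipeline.
(require 'clojure.java.io)

(defn in [f]
  (->
   (clojure.java.io/input-stream f)
   (java.util.zip.GZIPInputStream.)
   (clojure.java.io/reader)))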

(defn char-seq
  [^java.io.Reader rdr]
  (let [chr (.read rdr)]
    (if (>= chr 0)
      (cons (char chr) (lazy-seq (char-seq rdr))))))

i could make the process and algorithm work in python (very slow) but would prefer to use clojure. i've spent too much time (2 days) trying to figure out a way but no luck so far.. any pointers/advice at all would be greatly appreciated.

ghadi02:11:26

You're on the right track @clojurian! You have the reader all ready to go. Let's focus on the other half of the problem: parsing a series of JSON values (i.e. concatenated)

ghadi02:11:08

Take an example of concatenated JSON and represent it not as a stream but as a string:

ghadi02:11:19

(def example "{}{}{}")

ghadi02:11:33

What you want at the end is a seq or vector like so:

ghadi02:11:45

(def expected [{}{}{}])

ghadi02:11:11

i.e. a Clojure sequence of 3 empty Clojure maps.

ghadi02:11:10

To parse the example string, we want to turn this back into a stream, because most JSON parsers take a stream (i.e. java Reader) as their input. And you also said you want to parse this lazily.

ghadi02:11:56

So import cheshire (a common JSON reading library):

ghadi02:11:42

user=> (require '[cheshire.core :as json])
nil
user=> (doc json/parse-stream)
-------------------------
cheshire.core/parse-stream
([rdr] [rdr key-fn] [rdr key-fn array-coerce-fn])
  If multiple objects (enclosed in a top-level `{}' need to be parsed lazily,
  see parsed-seq.

localghost02:11:52

cheshire is ready 🙂

ghadi02:11:08

I've elided some useless docs, and it's pointing us to parsed-seq for your/this use case.

ghadi02:11:27

user=> (doc json/parsed-seq)
-------------------------
cheshire.core/parsed-seq
([reader] [reader key-fn] [reader key-fn array-coerce-fn])
  Returns a lazy seq of Clojure objects corresponding to the JSON read from
  the given reader. The seq continues until the end of the reader is reached.

ghadi02:11:37

OK, we got wheels.

ghadi02:11:26

Note that because it returns us a lazy seq, you'll have to consume it fully before closing the underlying Reader.

ghadi02:11:55

Usually you close the reader when it leaves lexical scope, using with-open

ghadi02:11:13

(with-open [r (some-stream)] (do all the stuff))

ghadi02:11:17

(This doesn't mean you have to load the whole stream eagerly into memory, we'll still be mindful of efficiency)

ghadi02:11:21

Typical ways to consume the seq fully are: 1. do some collection operations using the seq, and wrap that code in doall 2. reduce / transduce (collecting accumulation) 3. doseq

ghadi02:11:02

4. (into []) or vec wrapping collection operations (variation of 1.)
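
A minimal sketch of options 1 to 4, using line-seq over an in-memory reader as a stand-in for any lazy seq that depends on an open Reader. The sample-reader helper and its text are made up for illustration. Each form finishes its work before with-open closes the reader.

```clojure
(require '[clojure.string :as str])
(import '(java.io BufferedReader StringReader))

(defn sample-reader
  "Hypothetical stand-in for a real file or network reader."
  []
  (BufferedReader. (StringReader. "a\nb\nc")))

;; 1. collection operations wrapped in doall
(def upper
  (with-open [r (sample-reader)]
    (doall (map str/upper-case (line-seq r)))))

;; 2. reduce: only the accumulator is kept, never the whole input
(def total-chars
  (with-open [r (sample-reader)]
    (reduce + (map count (line-seq r)))))

;; 3. doseq: side effects only, returns nil
(with-open [r (sample-reader)]
  (doseq [line (line-seq r)]
    (println line)))

;; 4. into [] (or vec): like 1, but the result is a vector
(def as-vector
  (with-open [r (sample-reader)]
    (into [] (map str/upper-case) (line-seq r))))
```

Dropping the doall in option 1 would hand back an unrealized seq, and realizing it after the reader has closed typically ends in an IOException.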

ghadi02:11:05

Ok... so to make this concrete let's parse that example above. (I'll leave the task of gluing it to the gzip reader to you)

ghadi02:11:22

user=> (with-open [r (java.io.StringReader. example)] (vec (json/parsed-seq r)))
[{} {} {}]

ghadi02:11:18

user=> (= *1 expected)
true
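
One possible way to glue this to a gzipped source, as a sketch only. It assumes cheshire is on the classpath; gzip-bytes, gunzip-reader and count-docs are hypothetical names, and the in-memory byte array stands in for the real file. The reduce keeps only a counter, so the documents are never all in memory at once.

```clojure
(require '[cheshire.core :as json]
         '[clojure.java.io :as io])
(import '(java.io ByteArrayOutputStream)
        '(java.util.zip GZIPInputStream GZIPOutputStream))

(defn gzip-bytes
  "Test helper (hypothetical): gzip a string into a byte array."
  ^bytes [^String s]
  (let [out (ByteArrayOutputStream.)]
    (with-open [gz (GZIPOutputStream. out)]
      (.write gz (.getBytes s "UTF-8")))
    (.toByteArray out)))

(defn gunzip-reader
  "Reader over gzipped data. source is anything io/input-stream accepts,
  such as a file path, a File, a URL or a byte array."
  [source]
  (-> source io/input-stream GZIPInputStream. io/reader))

(defn count-docs
  "Counts concatenated JSON documents without holding them all in memory."
  [source]
  (with-open [rdr (gunzip-reader source)]
    ;; the lazy seq is fully consumed inside with-open
    (reduce (fn [n _doc] (inc n)) 0 (json/parsed-seq rdr))))

(def doc-count
  (count-docs (gzip-bytes "{\"a\":1}{\"a\":2}{\"a\":3}")))
```

Swapping the reducing function for one that processes each document keeps the same shape.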

localghost02:11:02

@ghadi thank you so much! this is very educational. that parsed-seq function is great. that combined with one of the 4 things you listed should do the trick. i'm trying to digest what you said and seeing some light. i'll come back after a little while after I've tried a few things.. thank you again.

ghadi02:11:24

ya @clojurian. clojure is stupidly fun

ghadi02:11:41

enjoy

localghost02:11:56

it is but it's incredibly humbling too.

localghost02:11:54

i really like how you built that small example to test and show the idea.

localghost02:11:35

my brain needs to be disinfected of procedural thinking. unlearning is hard.

localghost03:11:51

i don't know whether to be happy or be sad... after hours upon hours of frustration, all of it works in just a couple of lines.

localghost03:11:06

you're awesome! @ghadi

seancorfield03:11:31

@clojurian I think one of the things with Clojure is that simplicity is king. If you're struggling with a problem and your code seems overly complex, then there's probably a much simpler, more idiomatic solution 🙂

localghost03:11:27

i'll keep that in mind.

localghost03:11:26

on that, a related question. i don't have any local user group or colleagues who do clojure. i was super frustrated, had a headache, dozens of tabs open and what not.. and had no one I could ask. normally I hesitate to bother but i find that my progress is also super slow. when is it appropriate (or inappropriate) to come ask for help here?

eriktjacobsen04:11:22

@clojurian #beginners

localghost04:11:17

sounds like the place for me. thanks @eriktjacobsen

hawari09:11:37

Edit: Sorry I notice that there's a dedicated aws-lambda channel, I'll move my question there

dpsutton17:11:31

i see that there's brandon bloom's backtick lib, but is there any way to make clojure not impose namespaces on syntax-quoted forms?

bfabry17:11:19

qualifying symbols with a namespace is half of the point of syntax quote

bfabry17:11:07

you can always regular quote the symbols you don't want syntax quoted

`(~'a)
=> (a)
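
A small sketch of the difference, for anyone following along. The var names are made up; the qualifying namespace depends on wherever the code is read (user at a default REPL).

```clojure
;; Syntax quote resolves a bare symbol against the current namespace.
(def qualified `?task)

;; ~' unquotes a plain quoted symbol, so it stays unqualified.
(def unqualified `~'?task)

(def qualified-ns (namespace qualified))     ; name of the current namespace
(def unqualified-ns (namespace unqualified)) ; nil

;; The same trick inside a larger syntax-quoted form:
(def form `(~'a))
```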

dpsutton17:11:25

i need it without it to send to datomic

dpsutton17:11:48

so all of the unification symbols have to be marked. quite a pain and ugly

dpsutton17:11:26

`[:find [(~'pull ~'?task ~tasks/task-shape) ...]
    :in ~'$ ~'?docref
    :where
    [~'?task :property1 ~'?docref]
    [~'?task :property2 ~'?tc]
    [~'?tc :tags "tag"]]

bfabry17:11:19

why not use regular quote?

dpsutton17:11:47

the tasks/task-shape. I'm trying to put the query for what a task looks like in one place and then let others query against this common shape however they like

bfabry17:11:50

oh I see you're using one var in there

dpsutton17:11:52

so i need to resolve

dpsutton17:11:53

yeah

bfabry17:11:29

probably easiest to build it up in pieces I guess, but yeah bit annoying

[:find [`(~'pull ~'?task ~tasks/task-shape) '...]
    :in '$ '?docref
    :where
    '[?task :property1 ?docref]
    '[?task :property2 ?tc]
    '[?tc :tags "tag"]]

dpsutton17:11:08

yeah good point. use the vectors and keywords to my advantage. still not sold on it. gonna roll it around a bit

noisesmith17:11:49

@dpsutton it seems like it would be easier to just not use ` at all there

[:find [(list 'pull '?task tasks/task-shape) '...]
  :in '$ '?docref
  :where
  '[?task :property1 ?docref]
  '[?task :property2 ?tc]
  '[?tc :tags "tag"]]
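
A runnable check of that approach, as a sketch. task-shape here is a hypothetical stand-in for tasks/task-shape, and nothing is sent to Datomic; the point is only that the resulting data contains no namespaced symbols.

```clojure
(def task-shape
  "Hypothetical stand-in for tasks/task-shape."
  [:task/id :task/title])

(def tasks-query
  [:find [(list 'pull '?task task-shape) '...]
   :in '$ '?docref
   :where
   '[?task :property1 ?docref]
   '[?task :property2 ?tc]
   '[?tc :tags "tag"]])

(defn symbols-in
  "All symbols anywhere inside a nested form."
  [form]
  (filter symbol? (tree-seq coll? seq form)))

(def all-unqualified?
  (every? (comp nil? namespace) (symbols-in tasks-query)))
```

A check like all-unqualified? could sit in a test to catch the silent failure mentioned later in the thread.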

rauh17:11:57

@dpsutton I just use https://github.com/gfredericks/misquote/blob/master/src/misquote/core.clj for that.

dpsutton17:11:58

thanks. that's looking the cleanest so far. just makes me nervous that this stuff just silently and happily fails on datomic if it's namespaced. i'm deciding which is worse: the super important syntax or the duplication of the structure

dpsutton17:11:38

thanks @rauh i saw bbloom's backtick as well. any reason to choose one over the other that you know of? I think i saw brandon handles gensyms but not sure how important that is for my purposes

rauh17:11:34

@dpsutton Not sure, but I use the above for exactly that purpose: Creating datomic queries. It works well. Though I changed it slightly to make Cursive happy

dpsutton17:11:49

well thanks for the suggestion. gonna mull on it

rauh17:11:04

@dpsutton Here's the full example: https://gist.github.com/rauhs/d2575e77e6e063ae94abbd9f1bca226d

misha19:11:04

are type hints (in clojure) used only to avoid introspection?

tanzoniteblack19:11:46

yes, type hints are only to avoid reflection

misha19:11:03

or is there some way to enforce them, apart from {:pre [(instance? MyRecord arg)]}?

tanzoniteblack19:11:36

if you're interested in enforcing types, you should probably consider spec or plumatic schema

tanzoniteblack19:11:00

or pre/post/raw assertions, if you're only using them on occasion
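
To make the distinction concrete, a sketch with a made-up Task record: the hinted function accepts a plain map without complaint, while the :pre version rejects it (as long as *assert* is left at its default of true).

```clojure
(defrecord Task [id title])

(defn hinted-title
  "The hint only helps the compiler avoid reflection; it is not checked."
  [^Task t]
  (:title t))

(defn checked-title
  "The precondition is what actually rejects non-Task arguments."
  [t]
  {:pre [(instance? Task t)]}
  (:title t))

(def from-plain-map (hinted-title {:title "plain map"}))

(def rejected?
  (try
    (checked-title {:title "plain map"})
    false
    (catch AssertionError _ true)))
```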

misha19:11:47

I use spec, but I also have a bunch of defrecords hanging around, wanted to squeeze some extra value out of those

misha19:11:48

right now I am getting familiar with a module in an app, and sprinkling type hints all over fn signatures helps to understand the mess, also IDE's "show usages"

misha19:11:38

but it seems like spec is the ultimate way to go, if records are not really used for extending protocols, etc.

ghadi19:11:14

Spec is the way to go with this stuff. You can turn it on and off. I think you are going to regret extensive manual assertions and hints.

misha19:11:30

getting decent spec coverage and "infrastructure" is a project in itself, though

ghadi19:11:51

it can be a lot of work, but pays off immensely

misha19:11:57

yeah, agree on manual assertions

ghadi19:11:01

esp. wrt generative testing

misha19:11:04

true

qqq19:11:29

A :: map
B :: set
I want to remove all (k,v) pairs from A where k is in B.
Is the best way to do this (apply dissoc A B), or is there a more efficient way?

nathanmarz19:11:34

@qqq (reduce dissoc A B)

bronsa19:11:30

reduce won't perform better here

nathanmarz19:11:11

user=> (def A {:a 1 :b 2 :c 3})
#'user/A
user=> (def B #{:a :c})
#'user/B
user=> (time
  #_=>   (dotimes [_ 1000000]
  #_=>     (reduce dissoc A B)
  #_=>     ))
"Elapsed time: 120.139233 msecs"
nil
user=> (time
  #_=>   (dotimes [_ 1000000]
  #_=>     (apply dissoc A B)
  #_=>     ))
"Elapsed time: 335.459923 msecs"

bronsa19:11:32

interesting

bronsa19:11:19

I would imagine that with a larger sized B apply would eventually win, as the vararg version can iterate over the coll w/o having to pay an invocation price

bfabry19:11:55

yeah seems heavily dependent on the size of A and B

bronsa19:11:59

but maybe the seq path is that much slower than the native reduce of sets

ghadi19:11:43

from the peanut gallery over here: neither choice will/should dominate in your codebase

qqq19:11:18

(def s1 (range 1000000))

(def s2 (map #(* 2 %) (range  500000)))

(def m (into {}  (for [x s1] [x x])))

(time  (do (apply dissoc m s2)
           nil))

(time (do (reduce dissoc m s2)
          nil))

I'm not getting a noticeable difference

ghadi19:11:18

you're measuring cold code on the JVM

ghadi19:11:37

need some JITing action to make a better benchmark

qqq19:11:51

yeah, using criterium bench now

qqq19:11:16

i'm still getting used to the concept of having to 'warmup' the system first

qqq19:11:00

(def s1 (range 1000000))

(def s2 (map #(* 2 %) (range  500000)))

(def m (into {}  (for [x s1] [x x])))

(cc/quick-bench 
 (do (apply dissoc m s2)
     nil))

(comment
              Execution time mean : 456.384541 ms
    Execution time std-deviation : 69.565195 ms
   Execution time lower quantile : 411.034707 ms ( 2.5%)
   Execution time upper quantile : 545.198448 ms (97.5%)
                   Overhead used : 2.943666 ns)




(cc/quick-bench 
 (do (reduce dissoc m s2)
     nil))

(comment
              Execution time mean : 469.385020 ms
    Execution time std-deviation : 83.522059 ms
   Execution time lower quantile : 393.009549 ms ( 2.5%)
   Execution time upper quantile : 546.717419 ms (97.5%)
                   Overhead used : 2.943666 ns)

qqq19:11:03

what surprises me most: (apply dissoc ...) doesn't cause a stack overflow for having so many args on the 'stack frame'

bfabry19:11:27

I can explain that

bfabry19:11:50

https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/IFn.java

bfabry19:11:00

after 20 arguments they just get shoved in an array
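
A quick way to see this from the REPL (count-args is a made-up name): apply hands the collection to a variadic fn as its rest-args seq, so the argument count is not bounded by stack slots.

```clojure
(defn count-args
  "Variadic fn: every argument lands in the rest-args seq."
  [& args]
  (count args))

(def many (apply count-args (range 100000)))
```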

qqq20:11:08

lol, there's something hilarious about that code

nathanmarz20:11:52

with criterium not seeing any cases where apply version outperforms reduce version

nathanmarz20:11:26

still seeing reduce version out-perform for small inputs by ~10%

qqq20:11:19

https://github.com/clojure/clojure/blob/clojure-1.9.0-alpha14/src/clj/clojure/core.clj#L1501-L1513

qqq20:11:26

I'm disappointed the apply version doesn't end up using transients

bfabry20:11:25

transients do actually give the reduce version a significant edge

(criterium.core/quick-bench (apply dissoc m s2))
Evaluation count : 6 in 6 samples of 1 calls.
             Execution time mean : 336.386302 ms
    Execution time std-deviation : 19.057339 ms
   Execution time lower quantile : 319.084193 ms ( 2.5%)
   Execution time upper quantile : 360.243222 ms (97.5%)
                   Overhead used : 1.529080 ns
=> nil
(criterium.core/quick-bench (reduce dissoc m s2))
Evaluation count : 6 in 6 samples of 1 calls.
             Execution time mean : 330.780186 ms
    Execution time std-deviation : 30.145095 ms
   Execution time lower quantile : 291.699187 ms ( 2.5%)
   Execution time upper quantile : 360.677617 ms (97.5%)
                   Overhead used : 1.529080 ns
=> nil
(criterium.core/quick-bench (persistent! (reduce dissoc! (transient m) s2)))
Evaluation count : 6 in 6 samples of 1 calls.
             Execution time mean : 226.415054 ms
    Execution time std-deviation : 7.032915 ms
   Execution time lower quantile : 217.464830 ms ( 2.5%)
   Execution time upper quantile : 233.430482 ms (97.5%)
                   Overhead used : 1.529080 ns

bfabry20:11:46

(for large inputs, for small they incur a small cost)
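
The transient variant wrapped up as a helper, as a sketch (dissoc-all is a hypothetical name). It returns the same map as apply dissoc; note that, unlike into, this version does not carry the original map's metadata over.

```clojure
(defn dissoc-all
  "Removes every key in ks from map m using a transient."
  [m ks]
  (persistent! (reduce dissoc! (transient m) ks)))

(def A {:a 1 :b 2 :c 3})
(def B #{:a :c})

(def result (dissoc-all A B))
```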

qqq20:11:42

@bfabry: you beat me to it, was just about to post transient code

qqq20:11:24

(def lst (for [x (range 1000000)] [x x]))

(defn massoc! [o [k v]]
  (assoc! o k v))


(cc/quick-bench (into {} lst)) 
(comment
              Execution time mean : 591.524037 ms
    Execution time std-deviation : 68.112686 ms
   Execution time lower quantile : 555.200521 ms ( 2.5%)
   Execution time upper quantile : 709.664574 ms (97.5%)
                   Overhead used : 2.943666 ns
Found 1 outliers in 6 samples (16.6667 %)
	low-severe	 1 (16.6667 %)
 Variance from outliers : 31.1510 % Variance is moderately inflated by outliers)




(cc/quick-bench (persistent! (reduce massoc! (transient {}) lst))) 
(comment
              Execution time mean : 515.178533 ms
    Execution time std-deviation : 69.452690 ms
   Execution time lower quantile : 482.675140 ms ( 2.5%)
   Execution time upper quantile : 635.692003 ms (97.5%)
                   Overhead used : 2.943666 ns
Found 2 outliers in 6 samples (33.3333 %)
	low-severe	 1 (16.6667 %)
	low-mild	 1 (16.6667 %)
 Variance from outliers : 31.7413 % Variance is moderately inflated by outliers)

Why doesn't clojure core use transients by default ?

arttuka21:11:47

@qqq: into does use transients

arttuka21:11:44

see here (relevant part of into source from core.clj):

(defn into
  ([to from]
     (if (instance? clojure.lang.IEditableCollection to)
       (with-meta (persistent! (reduce conj! (transient to) from)) (meta to))
       (reduce conj to from))))

josh.freckleton21:11:26

I have a long list of validation checks, and steps that happen interleaved among them. The validation checks can throw errors: right now I have:

(let [_ (check)
      a (f)
      _ (check2)
      b (g)
      ...]
  z)

basically, those validations/throwing errors are a way to short circuit the logic.
the main alternative I see is a series of ifs, each with a nested let.
is there a cleaner way of achieving this sort of Either monad pattern?

qqq21:11:16

@arttuka: good call, thanks for the correction

manutter5122:11:21

@josh.freckleton some-> ?

noisesmith22:11:25

yeah, I usually use some-> or some->> for that kind of thing
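
For reference, a minimal some-> sketch (email-domain and the sample data are made up). The chain stops at the first nil, which also means the caller cannot tell which step failed.

```clojure
(require '[clojure.string :as str])

(defn email-domain
  "Lower-cased domain of a user's email, or nil if any step yields nil."
  [user]
  (some-> user
          :email
          (str/split #"@")
          second
          str/lower-case))

(def found (email-domain {:email "Ann@Example.COM"}))
(def missing-email (email-domain {}))
(def missing-at (email-domain {:email "no-at-sign"}))
```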

noisesmith22:11:51

but reduce, then returning a reduced as appropriate can work too

noisesmith22:11:30

(reduce (fn [acc v] (if (borked? acc) (reduced acc) (frob acc v))) init coll)

manutter5122:11:47

The nice thing about reduce/`reduced` is you can return something other than nil if any of the steps fail.

noisesmith22:11:53

indeed

josh.freckleton22:11:55

re some->: kind of, but I need to report details of where it failed.
re reduced: interesting, I hadn't considered that, I kind of like that.

noisesmith22:11:29

your coll is likely to be a vector of checks to run in that case

josh.freckleton22:11:29

I'm not really moving across a collection though...

manutter5122:11:39

It’s that weird fp stuff, instead of reducing a function over a collection of args, you reduce an arg over a collection of fns. 😉

noisesmith22:11:09

so (reduce (fn [context check] (let [checked (check context)] (if (OK? checked) checked (reduced checked)))) {} [check1 check2]) sort of like this

noisesmith22:11:42

and then the hash map can hold a, b, etc. under keys
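
A fuller sketch of that pattern, with made-up steps and an :error key as the assumed failure marker. Checks and computations are all functions from context to context; the first one that adds :error stops the reduce, and the returned context says where it failed.

```clojure
(defn check-positive [ctx]
  (if (pos? (:n ctx))
    ctx
    (assoc ctx :error "n must be positive")))

(defn add-double [ctx]
  (assoc ctx :a (* 2 (:n ctx))))

(defn check-small [ctx]
  (if (< (:a ctx) 100)
    ctx
    (assoc ctx :error "a must be below 100")))

(defn add-label [ctx]
  (assoc ctx :b (str "value-" (:a ctx))))

(defn run-steps
  "Threads init through steps, stopping at the first context with :error."
  [init steps]
  (reduce (fn [ctx step]
            (let [next-ctx (step ctx)]
              (if (:error next-ctx)
                (reduced next-ctx)
                next-ctx)))
          init
          steps))

(def steps [check-positive add-double check-small add-label])

(def ok (run-steps {:n 5} steps))
(def failed-late (run-steps {:n 60} steps))
(def failed-early (run-steps {:n -1} steps))
```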

josh.freckleton22:11:57

and then the collection I'm accumulating would be my otherwise let bound vars. ya, the context

noisesmith22:11:13

though at this point loop/recur might be easier to read?

noisesmith22:11:21

your call

josh.freckleton22:11:44

@noisesmith that's a genius pattern, I'll see if it works for this, thanks

noisesmith22:11:53

👍

noisesmith22:11:02

it’s something I have used to success in my own code

Alex Miller (Clojure team)22:11:04

you might also want to look at the new halt-when transducer in 1.9 (although it’s a little tricky to use well)
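
A small halt-when sketch (requires Clojure 1.9 or later; the inputs are made up). Without a second argument the halting input itself becomes the result; with one, the function receives the result so far plus the offending input. Part of what makes it tricky is that the caller gets two differently shaped results.

```clojure
;; No input triggers the predicate: the normal result comes back.
(def all-fine
  (transduce (halt-when neg?) conj [] [1 2 3]))

;; The first negative input ends the transduction and is returned as-is.
(def halted
  (transduce (halt-when neg?) conj [] [1 2 -3 4]))

;; With a return fn, report both the progress so far and the culprit.
(def halted-with-details
  (transduce (halt-when neg? (fn [so-far bad] {:ok so-far :failed bad}))
             conj
             []
             [1 2 -3 4]))
```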

the2bears23:11:33

Has anyone used a clojure wrapper around git's cli? There are some that wrap JGit, but as far as I can tell JGit does not yet support 'clone --mirror ...' and I need that functionality.

noisesmith23:11:34

@the2bears I’d generally look at doing interop before opting for a wrapper anyway - if the API is a total pain I then check for a wrapper, but often just using interop is fine

the2bears23:11:07

agreed, in that I don't need any more functionality other than 'clone --mirror' and then 'push --mirror'

the2bears23:11:28

maybe even running the shell command...

the2bears23:11:35

shell interop 🙂

noisesmith23:11:47

that’s straightforward to use, if it’s good enough
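
A sketch of the shell route using clojure.java.shell, which ships with Clojure. git-args and run-git! are hypothetical helper names, the URLs and paths are placeholders, and authentication is left to whatever git itself is configured to use.

```clojure
(require '[clojure.java.shell :as shell])

(defn git-args
  "Builds the argument vector for a --mirror clone or push."
  [subcommand & more]
  (into ["git" subcommand "--mirror"] more))

(defn run-git!
  "Runs git with args (plus optional sh options such as :dir).
  Throws when the exit code is non-zero."
  [args]
  (let [{:keys [exit err] :as result} (apply shell/sh args)]
    (when-not (zero? exit)
      (throw (ex-info "git failed" {:args args :exit exit :err err})))
    result))

(comment
  ;; placeholder URLs and paths
  (run-git! (git-args "clone" "https://example.com/source.git" "/tmp/source.git"))
  (run-git! (conj (git-args "push" "https://example.com/target.git")
                  :dir "/tmp/source.git")))
```

In a standalone script (rather than a REPL), calling (shutdown-agents) at the end avoids the delayed exit that sh's background threads can cause.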

the2bears23:11:50

plus authentication

the2bears23:11:25

yeah, I'll try that out. Thanks! And by the way, you're absolutely one of the busiest sources of help here... really appreciated.