clojure 2017-03-26 | Slack Archive

qqq00:03:57

@joshjones @noisesmith : I failed to clarify this in my original question, but noisesmith is right: I don't care about arbitrary modifications. The situation is: I have some state (say a DB). I want to modify the state (say, get the next available counter). So I want a way to simultaneously modify the state and return some aux data. When other people are modifying the state/DB, I don't care.

qqq00:03:46

@tbaldridge: again, I failed to mention this in my original question, but that looks like it may be expensive if my data is an eav store with indexes 🙂

noisesmith00:03:29

but swap! is implemented in terms of compare and swap

noisesmith00:03:40

so it saves you the expense of a second atom or volatile

qqq00:03:23

so I think I'm going to add an extra field:

(defn eav-new []
  (atom
    {:eav {}
     :ave {}
     :next-id 1
     :ret nil}))

then, when I use swap!, I put the return value in :ret, and since swap! returns the value swapped in, I can just call :ret on it to get the return value
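
A minimal sketch of the :ret idea described here, assuming a hypothetical update function `take-id` (the name and the exact update are illustrative, not from the conversation):

```clojure
;; The map with the extra :ret slot, as in the snippet above.
(defn eav-new []
  (atom {:eav {}
         :ave {}
         :next-id 1
         :ret nil}))

;; Hypothetical update fn. It is pure, so it stays correct if swap! retries it.
(defn take-id [state]
  (-> state
      (assoc :ret (:next-id state))   ; stash the aux value in the state itself
      (update :next-id inc)))

(def db (eav-new))

;; swap! returns the new map, so the aux value can be read straight off it.
(def first-id  (:ret (swap! db take-id)))   ; 1
(def second-id (:ret (swap! db take-id)))   ; 2
```

The cost of this design is that :ret lives inside the data model even though it only matters to the caller of a single update.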

qqq00:03:16

https://clojuredocs.org/search?q=swap! <-- whoa, was not aware until now, that f may be called multiple times and need to be side effect free; unaware of this optimistic concurrency
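
A small sketch of why the function passed to swap! should be free of side effects. The call counter below is a deliberate side effect added only to make retries visible; the thread and iteration counts are arbitrary assumptions:

```clojure
(import 'java.util.concurrent.atomic.AtomicLong)

(def counter (atom 0))
(def ^AtomicLong calls (AtomicLong. 0))

(defn counted-inc [n]
  (.incrementAndGet calls)   ; side effect, present only to expose retries
  (inc n))

;; Eight plain threads hammer the same atom.
(let [threads (doall (repeatedly 8 #(Thread. (fn []
                                               (dotimes [_ 10000]
                                                 (swap! counter counted-inc))))))]
  (run! #(.start ^Thread %) threads)
  (run! #(.join ^Thread %) threads))

;; @counter is exactly 80000: every logical update is applied once.
;; (.get calls) is at least 80000: under contention, swap! may call
;; counted-inc more than once for a single update.
```

How many extra calls happen depends on scheduling, so the second number varies from run to run.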

tbaldridge00:03:20

Ick, don't do that please. Just use compare-and-set! Swap is just a wrapper for CAS. So write your own variant of swap that does what you need. Don't conflate the needs of a different return value with your data model.

qqq00:03:14

@tbaldridge : if I do CAS won't I have to loop and/or decide how long to wait on contention ?

qqq00:03:26

that seems highly non trivial to get the right performance on

tbaldridge00:03:09

Um...that's literally what swap! is: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Atom.java#L75-L87 And no, it's actually quite fast, CAS on a x86 machine compiles to a single CPU instruction. CAS is the underpinning of every single synchronization construct, locks, atoms, refs, vars, namespaces, etc. they all use CAS.
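
A sketch of the kind of swap! variant suggested here, built on compare-and-set!. The name `swap-returning!` and the convention that the update function returns a `[new-state ret]` pair are assumptions made for illustration, not part of clojure.core:

```clojure
;; Hypothetical helper: like swap!, but f returns [new-state ret]
;; and only ret is handed back to the caller.
(defn swap-returning!
  [a f & args]
  (loop []
    (let [old-val (deref a)
          [new-val ret] (apply f old-val args)]
      (if (compare-and-set! a old-val new-val)
        ret
        (recur)))))   ; another thread won the race: re-read and retry

(def id-db (atom {:eav {} :ave {} :next-id 1}))

;; Pure update fn: the new state plus the aux value to return.
(defn next-id-with-ret [state]
  [(update state :next-id inc) (:next-id state)])

(def allocated-id (swap-returning! id-db next-id-with-ret))   ; 1
```

The state map carries no extra return slot; the aux value travels only through the return value of the loop. Clojure 1.9, released after this conversation, added `swap-vals!`, which returns both the old and the new value and covers many of the same cases.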

Oliver George00:03:30

I have a half-baked idea relating to error messages. It's one of those "surely someone's done this" things but I'm finding it difficult to google.

Oliver George00:03:58

Aim: see/map the range of errors a programmer may encounter as a way to measure how specific/friendly/nasty they are. This might provide data suitable to score for how good errors are. Something objective and which we can improve on.

Oliver George00:03:13

Method: take a working program, randomly change one thing. compile and record if error occurs. run tests and see what errors occur. (vaguely inspired by the way spell checking happens but backwards)
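
A toy sketch of this method at the s-expression level, assuming the "program" is a single quoted form and each "change" is a hand-picked substitution (a real tool would generate the changes randomly). All names are hypothetical:

```clojure
(require '[clojure.walk :as walk])

;; Evaluate a form and record either its value or the error it produced.
(defn outcome [form]
  (try
    {:status :ok :value (eval form)}
    (catch Throwable t
      {:status :error
       :class (.getName (class t))
       :message (.getMessage t)})))

(def original '(+ 1 (count [1 2 3])))

;; Each mutant changes exactly one thing in the original form.
(def mutant-forms
  [(walk/postwalk-replace {'count 'cuont} original)   ; misspelled symbol
   (walk/postwalk-replace {'+ 'str} original)         ; still runs, different value
   (walk/postwalk-replace {[1 2 3] 5} original)])     ; count applied to a number

(def report (mapv outcome mutant-forms))
```

Each entry in `report` records whether the mutant failed and with which exception class and message, which is the raw data the idea would need.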

Oliver George00:03:54

"Change one thing" might mean a single character change (add/remove/replace/swap) or sexpr based (raise/barf/change brackets/swap). "Error score" might relate to specificity (NPE would score low) and proximity (did it report near the file/line/col we changed).

seancorfield01:03:58

Sounds a bit like Mutation Testing https://en.wikipedia.org/wiki/Mutation_testing @U055DUUFS

Oliver George01:03:35

Thanks Sean.

Oliver George01:03:07

"Test suites are measured by the percentage of mutants that they kill." sounds like the plot to an X-men movie.

Oliver George01:03:58

And even has a quote from The Watchmen "Who will guard the guards?"

Oliver George00:03:31

Thus ends my plea to the #lazyweb

Oliver George01:03:32

Heh, I guess you could score your own program code on how error friendly it is with that technique. Adding specs and tests would improve your score (and thus developer happiness)

tagore01:03:22

Hmm- I'm interested in generative testing, but...

tagore01:03:38

I'm not sure how we'd measure that..

tagore01:03:39

So for instance if an error message was all like "Your Momma so big when she sits around the house she sits around the house" we'd mark that as unfriendly? Or friendly?

seancorfield01:03:55

I think that falls foul of the Code of Conduct… you might want to delete it.

tagore01:03:33

(I have been tempted to return that as an error message, in cases of great stupidity, but have so far manfully resisted doing so.)

qqq01:03:02

@tbaldridge : this part I don't understand: how can CAS be fast if there is a comparison involved?

qqq01:03:24

@tbaldridge: if I give you two maps, how can you check whether they're equal or not with just one x86 instr ?

qqq01:03:33

this seems like it should take worst case O(n) time

tagore01:03:33

@qqq are you under the impression that your processor knows about maps?

qqq01:03:56

@tagore: no; I'm saying clojure equality has to be expensive

qqq01:03:09

I'm familiar with machine arch / have written x86 assembly before.

tagore01:03:36

@qqq It's not at all easy to follow things through to the machine, but...

tagore01:03:16

In the case of clojure, since data structures are immutable a simple pointer comparison is an equality check, right?

joshjones01:03:19

@qqq there are usually cpu ops for registers (x86 is CMPXCHG), but obviously when comparing something like two maps, this will not be a single op

qqq01:03:40

@tagore @joshjones: I should clarify my question.

qqq01:03:49

Does CAS just do pointer comparison or does it check for values?

qqq01:03:10

i.e. as far as CAS is concerned, is [1 2 3] and (let [x 1 y 2 z 3] [x y z]) the same or different ?

qqq01:03:27

because if they're the same, it seems like CAS equality checks can be very very expensive

qqq01:03:39

especially if it's a large map and I'm only updating one key

tagore01:03:51

I don't know, and that's why the stuff I write that absolutely has to be as fast as possible is written in C.

tagore01:03:21

I'm not saying it's not a good question.

qqq01:03:09

One more thing: by "CAS" I mean clojure-CAS, not x86-cas

tagore01:03:41

I am saying that I have written some software that must be as fast as possible, and I didn't use Clojure for it.

tagore01:03:47

If you are concerned with performance...

tagore01:03:20

First of all, how are you going to measure and fix cache misses?

tagore01:03:02

And how are you going to measure and fix branch misprediction?

qqq01:03:07

Actually, I'm just curious whether it's worst case O(1) or worst case O(n)

tagore01:03:48

O(something) tends to obscure more than it reveals in many scenarios.

tbaldridge01:03:55

@qqq it's an (identical? ...) check

tbaldridge01:03:19

it's literally a pointer comparison so it's the same cost as doing an equality check on a 64bit int

qqq01:03:35

@tbaldridge : that clarifies it; thanks!

tagore01:03:41

I'm essentially unwilling to talk low-level performance without examples, because real performance questions are very specific, and have to do with looking at how cache is used, etc.

joshjones01:03:54

@qqq to flesh it out, the javadoc for AtomicReference.compareAndSet says:

Atomically sets the value to the given updated value
     * if the current value {@code ==} the expected value

... just confirming what tb said: yes, it is a pointer comparison (`==`), and if you follow that to the Unsafe java class, it is a native method, which lends credence to the idea that it uses a cpu op made specifically for that
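
A short sketch of the difference between value equality and the identity check that compare-and-set! performs, using the two vectors from the question above (the var names are illustrative):

```clojure
(def v1 [1 2 3])
(def v2 (let [x 1 y 2 z 3] [x y z]))

(= v1 v2)            ; true: same contents
(identical? v1 v2)   ; false: two distinct objects

(def a (atom v1))

;; The expected value is compared by reference, not by contents.
(def cas-with-equal-copy (compare-and-set! a v2 [4 5 6]))   ; false, atom unchanged
(def cas-with-same-ref   (compare-and-set! a v1 [4 5 6]))   ; true, atom updated
```

This is why a CAS loop reads the current value with deref and passes that same reference back as the expected value: the comparison stays a constant-time pointer check no matter how large the map or vector is.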

tagore01:03:58

If you're not down to the level of measuring branch mis-prediction and cache performance, you should likely be working on your algorithms instead.

tbaldridge01:03:18

@tagore read the conversation history and it will make a bit more sense

tbaldridge01:03:37

this all started because @qqq wanted different return semantics than swap! provided. My assertion was that writing your own loop and using compare-and-set! was not significantly slower than swap!

tagore01:03:48

Hmm...

tagore01:03:06

Have you measured?

tbaldridge01:03:48

I don't need to, I understand how the JVM works.

tbaldridge01:03:18

Swap calls CAS in a loop, I'm suggesting that @qqq call CAS in a loop. If anything his code might be faster due to the removal of the .apply in the middle of his loop, perhaps leading to better inlining. But in the end, any work being done inside the loop is going to dominate the execution time.

tagore01:03:21

But you say "not significantly slower"

tagore01:03:25

Hmm- I'll defer to your knowledge of the JVM, as I know little about it, but...

tbaldridge01:03:29

Especially, if any work is being done with PHM. Persistent collections are "fast", but way more expensive than any overhead of doing a CAS loop in Clojure.

tagore01:03:23

That sort of ordering certainly matters when writing C, especially when cache is a consideration.

tagore01:03:04

I can't say I've ever written anything for the JVM that had to be all that performant though.

tbaldridge01:03:17

Well one of the biggest problems with C is it doesn't have a memory model. So there isn't much to compare between the two.

tagore01:03:53

Indeed- on the other hand that is one of C's greatest strengths 😉

tbaldridge01:03:49

heh, some would disagree. Lock free structures are really hard when the compiler is allowed to shoot you in the foot.

tagore01:03:56

C is very good at being what it is. It's too bad so many people treat it as something else.

tagore02:03:17

Hmm- where I want to lay things out very precisely in memory, in order to do lots of vector math and not incur any overhead, I'm hard-pressed to think of a reasonable alternative to C.

tagore02:03:26

Especially if I want to measure lots of things about cache-misses, branch-mis-prediction, etc.

tbaldridge02:03:59

Well if you're interested in all of that, have you listened to talks by Martin Thompson?

tagore02:03:06

Nope.

tagore02:03:51

I've just spent a lot of time optimizing things to the bone.

tbaldridge02:03:15

https://www.youtube.com/watch?v=MC1EKLQ2Wmg

tbaldridge02:03:34

He talks a lot about this stuff and the vast majority of his work is on the JVM.

tagore02:03:45

Ah, I see...

tbaldridge02:03:18

so anyway, good link for anyone here interested in high performance code.

tbaldridge02:03:28

but I think we may be a bit #off-topic now 🙂

tagore02:03:06

I guess what I'd say is that I spent something like 6 months optimizing one particular algorithm for Intel CPUs.

tagore02:03:09

I think that is sometimes worthwhile- but only rarely.

tagore02:03:07

I like your link, but I'm familiar with what he's talking about- modern pipelined architectures are complicated, but if you want to get every cycle from them...

tagore02:03:34

OTOH, I might be missing something here, but I don't see how you can do so if the JVM is in the way

tagore02:03:19

That said, I think it is very rare that people need to be as fast as possible

tagore02:03:22

And it's likely that, unless you are willing to put months into measuring things, a smart compiler will produce faster code than you can.

tagore02:03:50

But, if your code must run as fast as possible, humans are still able to contribute something, I think.

tagore02:03:07

I can't write assembly anywhere near as efficient as a C compiler does, but if I'm willing to put months into optimizing things I can use my knowledge of the problem, the compiler, and the machine to produce something that is very efficient, and that relies on my understanding to be so.

tagore02:03:12

This has to do mainly with how things are laid out in cache, etc.

tagore02:03:18

And how I avoid branching.

tagore02:03:11

One amusing consequence of modern architectures is that caching calculated values is often a loss, where it would have been a win 20 years ago.

tbaldridge02:03:52

I'd look into some of Martin's work, he delves into stuff like CPU pinning, sending messages between sockets in the pattern best supported by the motherboard, RAM pinning, all on the JVM.

tbaldridge02:03:41

You can do most of what you describe on the JVM, it just takes some understanding of the machine

tagore02:03:15

Perhaps- it seems easier to do it in C though, if you really care about cache misses and branch mis-prediction.

tagore02:03:54

And it has been my experience that you either do or you don't.

tagore02:03:50

I think it is really very rare that you ought to care,

tagore02:03:22

And in most cases things that are slow are slow for entirely different reasons (bad string handling is a common suspect, and that's a long way from bare-metal)

tagore03:03:28

Bad algorithms too...

cristobal.garcia08:03:23

Hi, I have an issue with a small application I am creating using the reloaded workflow, component and the reloaded.repl package. Basically, when I issue a (reset) function, which refreshes namespaces and re-starts the component again, old function definitions are still in place. The component includes an immutant web server whose ring-handler is created by a function. What I am seeing is that the function that creates this handler is not being refreshed despite (reset) being issued. I am running all this from emacs (cider + scratch buffer). Any ideas where I might look? Thanks in advance!

noisesmith15:03:05

The idea with reloaded is that when you shut down your component, the web server should be stopped. The old definitions are literally destroyed - as in the namespace is deleted and a new one created. The problem here is that if the old code is still running, it's still using the old definitions, and as the old namespace was deleted and recreated, the new definition is something the running process will never find.
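
A plain-Clojure sketch of the effect described here, with no component or immutant involved. The namespace `demo.handler` and the var names are hypothetical; `captured` stands in for a handler held by a web server that was never stopped:

```clojure
;; Define a handler in a throwaway namespace.
(intern (create-ns 'demo.handler) 'handler (fn [] "old"))

;; A long-running process grabs the function value once and keeps it.
(def captured @(resolve 'demo.handler/handler))

;; A reload deletes the namespace and creates it again with new code.
(remove-ns 'demo.handler)
(intern (create-ns 'demo.handler) 'handler (fn [] "new"))

(def from-old-reference (captured))                          ; "old"
(def from-fresh-lookup  ((resolve 'demo.handler/handler)))   ; "new"
```

Anything still holding the old function keeps running the old code, which is why the component's stop step needs to actually shut the server down before the namespaces are refreshed.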

deadghost11:03:15

anyone used the debug interceptor in re-frame?

deadghost11:03:36

the output https://www.refheap.com/134063 doesn't look all too helpful to my eyes

deadghost11:03:54

"The output produced by clojure.data/diff can take some getting used to, but you should stick with it -- your effort will be rewarded."

mikethompson11:03:04

Do you have https://github.com/binaryage/cljs-devtools installed ?

deadghost11:03:31

nope, I've been using firefox as my main browser

deadghost11:03:40

I'll set it up on chromium

mikethompson11:03:02

The comment about taking time to get used to the output is related to the use of https://clojuredocs.org/clojure.data/diff

mikethompson11:03:26

when showing you the difference between state before the event, and state after.

mikethompson11:03:39

Yeah, having cljs-devtools installed is pretty essential for all clojurescript work

lmergen13:03:20

is anyone using weavejester’s new integrant library already ? i’m a bit unsure where to put my entire system config — since it’s now just data, should this perhaps be in resources/config.edn ?

dominicm14:03:13

@lmergen You can integrate it with aero if you're inclined 😉

dominicm14:03:00

@lmergen https://github.com/weavejester/integrant/issues/12 is of note to you 🙂

lmergen14:03:36

@dominicm that is exactly what i did!

lmergen14:03:56

or rather, i did this:

(defmethod aero/reader 'ref
  [{:keys [profile] :as opts} tag value]
  (ig/ref value))

lmergen14:03:50

i was just wondering whether that approach made sense, and/or whether it could be improved

dominicm14:03:19

@lmergen If you're calling it 'ref you're overriding aero's built-in ref!

lmergen14:03:03

yeah, you’re right, i should namespace it

dominicm14:03:20

I liked the name ig/ref 🙂. Short, but namespaced.

lmergen14:03:13

makes sense

yonatanel15:03:02

How did I miss the Integrant lib? Randomly checking #clojure proves beneficial.

dominicm16:03:33

@yonatanel the trick is to channel everything straight to your brain instead of trying to filter 😁

qqq18:03:59

What's a good blob store db to use with clojure? I'm looking for something that would serve a similar role (need not be API compatible) as S3 or GCP/CloudStore

noisesmith18:03:57

you can use s3 from clojure

qqq18:03:09

I need to clarify: I need something that I can run locally. I want a blob store. It'll serve a similar role (need not be API compatible) to S3 / GCP/Cloudstore.

qqq18:03:33

This is a high volume local setup. I don't want to deal with S3 bandwidth costs.

lmergen19:03:02

I would go for https://github.com/replikativ/konserve with the file store

tbaldridge19:03:49

@qqq so there's this thing called a file system... 😀

joshjones21:03:06

question about testing using lein -- i have this working, but am guessing there is a much better way: I have some setup/teardown code that must run before/after ALL tests, not just tests within a namespace. the goal is to run lein test and have it run the setup and teardown only once. so, use-fixtures won't help. Using a simplified version of what's at https://stuartsierra.com/2016/05/19/fixtures-as-caches , I created a new namespace called my.app.setup-tests, and in there put a single (deftest ^:special-tag setup-runtests-teardown ... and in there I call my setup code, call run-all-tests, then call teardown code. I get the results of run-all-tests which will tell me if there were failures or errors, and then call (is true) or (is false) to propagate the errors/failures to originally calling lein test. In my project.clj, I add :test-selectors {:default (fn [m] (:special-tag m))} to have lein test only call my special deftest, and it works. Now, I'm sure there's an easier way. Thoughts?

noisesmith21:03:23

there are some simplifications possible - for example, if a function invokes is, it counts as a success/failure when a deftest invokes it (even indirectly)

noisesmith21:03:00

so you could have a "single test namespace" that actually loads test definitions from all the others, and has the appropriate fixtures

joshjones21:03:17

cool - how would I "load test definitions from all the others" ?

noisesmith21:03:10

joshjones they are namespaces, require them, and call the functions

noisesmith21:03:23

if the function invokes is, it shows up as a test case
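
A small sketch of this point: an ordinary function that calls `is` contributes assertions to whichever deftest ends up calling it. The function and test names are hypothetical:

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; A plain function, not a test, that happens to make an assertion.
(defn check-positive [n]
  (is (pos? n) (str n " should be positive")))

;; The deftest only calls the helper; the assertions are still recorded.
(deftest numbers-are-positive
  (check-positive 1)
  (check-positive 2))

;; run-tests returns a summary map with :test, :pass, :fail and :error counts.
(def summary (run-tests))
```

Run in a namespace that contains only this test, the summary reports one test and two passing assertions.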

joshjones21:03:35

excellent, this was what i was looking for -- i suppose i can get all functions in the namespace, and then see if they have :test as metadata on them, in order to know they're tests and call them. or is there an easier way you can think of? @noisesmith

noisesmith21:03:01

I mean - that's what clojure.test itself is doing...
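
A sketch of the lookup being discussed, assuming the goal is just to list the vars in a namespace that carry `:test` metadata (which is what `deftest` attaches). The helper name `test-vars` and the sample definitions are hypothetical:

```clojure
(require '[clojure.test :as t])

(t/deftest addition-works
  (t/is (= 2 (+ 1 1))))

(defn plain-helper [] :not-a-test)

;; Vars interned in ns whose metadata contains a :test function.
(defn test-vars
  [ns]
  (->> (vals (ns-interns ns))
       (filter #(:test (meta %)))
       (sort-by #(str (:name (meta %))))))

;; Names of the test vars found in the current namespace.
(def found (map #(:name (meta %)) (test-vars *ns*)))
```

Each var returned this way can be run with `clojure.test/test-var`, though calling `run-tests` on the namespace does the same discovery for you.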

noisesmith21:03:20

there might be something even simpler

noisesmith21:03:00

for example, what about a leiningen plugin that loads all your test files (like clojure.test does), runs your setup, runs clojure.test/run-all-tests, then runs your tear-down?

noisesmith21:03:45

then you don't need to do anything unusual in your tests - finding all files under test/ and then running clojure.test/run-all-tests is not all that complicated

joshjones21:03:08

well, i'm already calling run-all-tests in my current iteration

noisesmith21:03:31

right, so why not just do the file-seq / load-file of your test directory too? seems simpler than the other options

joshjones21:03:41

sorry, i don't follow

noisesmith21:03:53

you don't even need lein test at this point - call (file-seq "test/") and for everything that comes back with a name ending in ".clj" call load-file on it; then run your pre-test-hook, then call (clojure.test/run-all-tests) then call your post-test-hook

noisesmith22:03:18

then your test namespaces don't have to be weird - they can be normal test namespaces

noisesmith22:03:37

you avoid complex higher order code that tries to figure out how to construct tests

joshjones22:03:54

well, we have a travis script that calls lein test

noisesmith22:03:11

then write a lein plugin that does this instead, it's like 15 lines of code max

noisesmith22:03:40

and it's a lot simpler to make that than it is to make N indirect test namespaces with automatically generated code

noisesmith22:03:25

correction: it's (file-seq (io/file "test")) - that gives all the eligible files, just select the ones that have names ending in clj
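
A rough sketch of the alternate runner outlined above, written as a plain function rather than a Leiningen plugin. The function name, its parameters, and the namespace regex are assumptions made for illustration:

```clojure
(require '[clojure.java.io :as io]
         '[clojure.test :as t])

;; Hypothetical runner: load every .clj file under test-dir, run setup once,
;; run all tests in namespaces matching ns-regex, then always run teardown.
(defn run-with-global-fixture
  [test-dir ns-regex setup! teardown!]
  (doseq [^java.io.File f (file-seq (io/file test-dir))
          :when (and (.isFile f)
                     (.endsWith (.getName f) ".clj"))]
    (load-file (.getPath f)))
  (setup!)
  (try
    (t/run-all-tests ns-regex)   ; returns the summary map
    (finally
      (teardown!))))
```

A call such as `(run-with-global-fixture "test" #"my\.app\..*" start-db! stop-db!)` would leave the test namespaces themselves untouched; `start-db!` and `stop-db!` are placeholders for whatever the global setup and teardown are. Wiring this into a build task or CI script is left out here.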

noisesmith22:03:08

or you can do your other plan, I just think making an alternate test runner is simpler

joshjones22:03:31

alright, will give both a shot and see which is less dirty 🙂 thanks for your help

joshjones22:03:30

one more thing though -- if I do the lein plugin, can i hook into lein test and have it run?

noisesmith22:03:47

it would be more like lein mytest

noisesmith22:03:02

you would have replaced lein test altogether (in order to have a wrapper around it)

joshjones22:03:14

well, i need lein test to work

noisesmith22:03:52

you can't ask CI to run lein mytest ? - lein test isn't magic, it just runs a file-seq, selects the clj files and loads them, then uses clojure.test

joshjones22:03:34

maybe so, but we're open-sourcing this at some point and really would rather have the standard commands work as people are familiar with them

noisesmith22:03:19

make a PR so lein test accepts a global test wrapper 😄

noisesmith22:03:29

(as an optional config of course)

noisesmith22:03:18

it just seems like it's asking for trouble to do logic that should surround test running, from underneath inside the test runner

lvh22:03:53

So, I find myself literally never using letfn, and instead always using let. Is that just me? I know about mutual recursion; I just can’t remember the last time I did mutual recursion that it wasn’t an explicit exercise in attempting to do mutual recursion...

seancorfield22:03:17

@lvh I'm in the same place as you on letfn. I don't think I've ever needed to use it.

lvh22:03:23

I guess maybe a recursive descent parser counts as mutually recursive, but I’m guessing I’d use toplevel production rules instead of a letfn. (Not to mention, for anything nontrivial that needs parsing: a parser generator.)

mobileink23:03:16

you never need to use it, it's just a convenience op, no? it does nothing you could not do in other ways, but it could in principle make your code more expressive? tons of stuff like this in clojure.

noisesmith23:03:50

it specifically does something you can't do in other ways: it allows direct mutually recursive calls of anonymous functions

mobileink23:03:59

anonymous fns? you have to name your fns in letfn, no?

noisesmith23:03:43

functions can be anonymous and named - anonymous as in not vars interned to a namespace

noisesmith23:03:32

😄 maybe we need a different word to reflect what these functions really are

mobileink23:03:08

huh? if you can refer to them by name, who cares what the mech is? my pt is that the fnspec in letfn requires a name. it's not let*.

noisesmith23:03:49

OK "it specifically does something you can't do other ways: it allows mutually recursive calls of first class functions"

noisesmith23:03:25

it's the only thing in clojure that allows doing this directly without using deref or mutation of some sort
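
A minimal sketch of the case being argued about: two local functions that call each other, bound with letfn. The names are illustrative:

```clojure
;; Both names are in scope inside both bodies, so each can call the other.
(def ten-is-even
  (letfn [(my-even? [n] (if (zero? n) true  (my-odd?  (dec n))))
          (my-odd?  [n] (if (zero? n) false (my-even? (dec n))))]
    (my-even? 10)))   ; true

;; The same bindings written with let do not compile: in
;;   (let [my-even? (fn [n] ... (my-odd? ...)) my-odd? (fn [n] ...)] ...)
;; my-odd? is not yet bound when the body of my-even? is compiled.
```

These calls are not tail-call optimized, so very deep mutual recursion written this way can still overflow the stack.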

mobileink23:03:53

ok. but i guess that would count as other ways, obviously.

noisesmith23:03:53

sure, if you allow indirect methods of doing something, nothing in cs is unique - it's the only thing in clojure that implements this functionality without hacks or the necessity of namespace level bindings

noisesmith23:03:08

~~this has a historical importance, because common lispers complained about clojure not having generalized tail call optimization~~

noisesmith23:03:40

~~letfn and trampoline both attempt to address this~~

noisesmith23:03:51

sorry wait that's totally wrong

mobileink23:03:35

well my point was just that it is a convenience op. i don't see why that is controversial. can you give an example where use of letfn could not be replaced? iow it is not primitive.

mobileink23:03:19

been there done that

noisesmith23:03:21

mobileink can you show an example of locally bound functions that mutually call one another without making a namespace level binding or using deref?

mobileink23:03:21

i asked you first! 😉 lemme thunk about it.

mobileink23:03:46

ok i thought about it. you can do that without using letfn?

noisesmith23:03:20

that's something you can't do without letfn or def or mutables

noisesmith23:03:50

and sometimes def or mutables aren't good solutions

mobileink23:03:53

ok, but i did not say "letfn or x or y or z". i said letfn. you've just acknowledged my point.

noisesmith23:03:59

(ok, self nitpick, you could do it with a promise, which is not as bad as a mutable, but at least as ugly)
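
A sketch of the promise-based workaround mentioned in this message, assuming the same even/odd pair as a stand-in for any two mutually recursive local functions. The names are illustrative:

```clojure
(def ten-is-even-via-promise
  (let [odd-p    (promise)
        ;; my-even? cannot name my-odd? directly, so it derefs the promise.
        my-even? (fn [n] (if (zero? n) true  (@odd-p (dec n))))
        my-odd?  (fn [n] (if (zero? n) false (my-even? (dec n))))]
    (deliver odd-p my-odd?)   ; tie the knot before the first call
    (my-even? 10)))           ; true
```

It works without a namespace-level binding, but every call from one side goes through a deref and the wiring has to be done by hand, which is the ugliness letfn avoids.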

noisesmith23:03:34

mobileink you could do everything with asm.java, you don't need anything else that is in clojure.jar

noisesmith23:03:40

the thing is, it would suck

noisesmith23:03:53

not being able to do an algorithm without def, sucks

bcbradley23:03:15

I want to ask for some directional advice: I began working on a function that would convert the lwjgl javadocs into a data structure that describes the library with the intention of using that to generate a thin clojure wrapper over the library (with a few changes, like stripping out extraneous 'gl-' and 'm-' and other prefixes) but i've got a non-program problem i'm being faced with-- how should i package it? I had planned to make each class its own namespace; java doesn't have namespaces so it abused classes to be namespaces in lwjgl. C doesn't have namespaces, so it prepends junk to the front of each function name. Clojure has them, so I figured it would be wise to just use namespaces where lwjgl is using classes. However, if I do that then it naturally seems like each package (class container) would be a namespace container (library). But there are over 30 packages in lwjgl. Should i just make 30ish libraries?

mobileink23:03:24

it might suck, but that's not televant. i mean relevant, but i'm kinda liking "televant" ;)

noisesmith23:03:00

mobileink - so you're agreed that clojure could just ditch everything but asm.java and we'd be fine?

mobileink23:03:57

get a grip, dude! where the hell did that come from? all i said is that letfn is a convenience op.
