beginners 2022-05-11 | Slack Archive

Jon Olick03:05:29

I don’t understand why atoms and refs are separate things in clojure? Why can’t STM work with atoms (which are refs), and then refs just go away entirely?

hiredman03:05:25

atoms are significantly simpler than refs

Jon Olick03:05:43

go on

seancorfield03:05:52

In production Clojure you will almost never see ref but you will see some atom instances.

hiredman03:05:36

For refs to work in transactions they need try locking and ordered ids

hiredman03:05:10

Which is more complicated than just an atomic compare and exchange

Jon Olick04:05:43

I see

phronmophobic04:05:56

Part of the reason refs are less common is that they target a use case somewhere in between "embarrassingly parallel" and Amdahl's law. Even for the use cases where refs might apply, I think the fact that they're less commonly used makes it harder to apply them since there aren't many examples or references.

Jon Olick04:05:57

so basically just technical details with implementation of STM and atoms

devn04:05:26

I don't have the view personally that what you see in a typical production app is what you should aspire to. Sean is correct that it's common to not see refs or agents but of course that really doesn't say whether they're suitable to your problem space. When I read your question, I took it as you asking why there isn't some universal abstraction for concurrency primitives. Is that what you're asking?

Jon Olick04:05:13

it just seems that atoms and refs could be merged, but it seems just not the way that Clojure implements them all on the back-end

Jon Olick04:05:15

not that they can’t

Jon Olick04:05:18

just that they don’t

Jon Olick04:05:42

I just want to make sure I’m making the right call

Jon Olick04:05:49

with regards to merging them in jo_lisp

Jon Olick04:05:01

and I believe from this conversation that it is indeed the right call

devn04:05:27

What are the trade offs in making them distinct versus a shared abstraction?

Jon Olick04:05:33

none as far as I can see

Jon Olick04:05:41

thats the root question though

devn04:05:56

I feel like hiredman described a very real trade off.

Jon Olick04:05:08

yeah, the trade-off it seems is more of an implementation detail inside clojure than language design

devn04:05:25

I don't view it that way, personally. It is not an implementation detail but rather a very specific set of guarantees.

👆 1

devn04:05:42

If you want an STM that does it all with a single concurrency primitive, you're going to need to describe why it's not great for X workloads, how to avoid contention, etc.

phronmophobic04:05:38

Atoms and stm have different use cases and make different trade-offs. For a similar example, while you could use lists as your only datastructure and forgo having maps and vectors, that's probably not a great idea in practice.

Jon Olick04:05:07

I think they are fundamentally diametric, in that if your program uses STM, you really don’t want Atoms anywhere, as any use of Atoms inside STM transactions will happen multiple times, so it seems that they are, at least for the most part, an either-or scenario

Jon Olick04:05:28

However,

Jon Olick04:05:08

if Atoms worked properly inside a STM transaction, then they could be intermixed freely without concern for using atoms inside a transaction - as that’s OK and not a side effect

Jon Olick04:05:53

thus, I would conclude that if they could be worked together, they should be - to make a more coherent language design

Jon Olick04:05:32

however, if you have atoms working properly inside a STM transaction - then what’s the point of refs?

Jon Olick04:05:48

and now we are back to my question 🙂
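A minimal sketch of the concern raised above, that a swap! inside a transaction runs again on every retry. The names balance and attempts are hypothetical, and the retry is forced artificially by letting another thread commit to the same ref during the first attempt:

```clojure
(def balance  (ref 0))    ; hypothetical names, for illustration only
(def attempts (atom 0))

(dosync
  ;; Side effect on an atom: the STM does not roll this back on retry.
  (swap! attempts inc)
  (when (= 1 @attempts)
    ;; First attempt only: another thread commits to `balance` and we wait
    ;; for it, so this transaction's view of `balance` is stale and it retries.
    @(future (dosync (alter balance inc))))
  (alter balance inc))

[@balance @attempts] ; => [2 2]
```

The ref ends up incremented exactly once per committed transaction, while the atom records both attempts of the outer transaction.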

devn04:05:31

I feel like your supposition is that when you can unify two disparate guarantees you should, and that this is the hallmark of “good” language design. I don't agree.

phronmophobic04:05:59

You could make the same argument for unifying all data structures to be backed by a list implementation, but then you just end up with really slow maps and vectors.

Jon Olick04:05:43

you're presuming you cannot, with good performance, merge atoms and refs

phronmophobic04:05:03

I am

Jon Olick04:05:22

cools 🙂 then we are on the same page then

Jon Olick04:05:29

if it can be done, with good perf, it should be

phronmophobic04:05:20

even if you could, the different types inform quite a bit for the reader of the code about how to use them. With an atom, there's no question whether you should make changes within a transaction or how transactions should be grouped. With refs, that's an important part of the design

devn04:05:23

there's no presumption of impossible, only highly improbable, speaking for myself. I hope that gives you motivation!

Jon Olick04:05:12

well I think its worth a shot

Jon Olick04:05:22

I may not be able to get there, but I think its worth trying

devn04:05:44

I'm all for an experiment.

phronmophobic04:05:05

One fundamental difference between atoms and stm is that stm keeps some history for readers and atoms do not. That will fundamentally change your memory and computational characteristics.

Jon Olick04:05:08

This thread was mostly just double checking my logic and making sure I wasn’t missing something dumb here

devn04:05:11

@phronmophobic could you explain what you mean by history?

Jon Olick04:05:16

atoms inside a STM transaction would function exactly the same, from the transaction’s perspective

Jon Olick04:05:47

just that their changes get reverted if the TX needs to retry

Jon Olick04:05:18

(and shoved in the TX history buffer if I get your meaning to give consistent views of the atom from the outside)

Jon Olick04:05:32

and the inside

devn04:05:27

https://github.com/cgrand/megaref

devn04:05:54

This isn't exactly what you're talking about but it might be interesting given this conversation.

👍 1

devn04:05:00

Though just to be clear, that description is awfully specific, and imo for good reason.

phronmophobic04:05:48

All the reference types have functions for set and update (eg. reset! and swap!). It's common to come to clojure and try to provide a reference agnostic interface. I'm sure there's a discussion somewhere that does a better job explaining why that hasn't been done.

seancorfield04:05:20

@Jon Olick Just to clarify, you're creating an alternative "Clojure-like" Lisp but it won't have compatible semantics?

seancorfield04:05:47

Ah, this is it, right: https://github.com/Zelex/jo_lisp/blob/main/README.md

Jon Olick04:05:30

Correct, initially intended to just rewrite clojure totally

Jon Olick04:05:40

I made some changes to IO and Atoms/STM

Jon Olick04:05:51

IO was lacking imo

Jon Olick04:05:11

I don’t know how people get by without more io

seancorfield04:05:13

Because you don't like Java's I/O ecosystem?

Jon Olick04:05:44

I guess that’s what people do, yeah, just not use clojure itself to do IO and instead interact directly with Java classes

Jon Olick04:05:51

seems the only sane way anyway

Jon Olick04:05:22

since this isn’t hosted on Java, and instead on C/C++, I needed something more self contained

seancorfield04:05:44

Clojure is designed as a hosted language -- on the JVM, on the JS engines, and on the CLR, and now on Dart.

seancorfield04:05:11

I think there have been attempts to write Clojure-to-C in the past -- have you looked at prior art?

Jon Olick04:05:16

yeah

Jon Olick04:05:22

large work-in-progress, don’t use for any real purpose yet pls

devn04:05:40

too late, deploying to prod

Jon Olick04:05:46

doh!

seancorfield04:05:57

I won't use it for anything if it doesn't have compatible semantics to Clojure on other platforms, don't worry 🙂

Jon Olick04:05:04

deal

Jon Olick04:05:31

I’m specifically trying to innovate in the concurrency/parallelism department (eventually)

Alex Miller (Clojure team)04:05:42

the tradeoff between atoms and refs is perf. atoms are half implemented in hardware and super fast particularly if uncontended. refs give you coordination, but are way more complicated.

💯 1

👍 1
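A small sketch of the trade-off described above, using hypothetical names (counter, account-a, account-b, transfer!). The atom covers a single independent value; the refs are for values that have to change together:

```clojure
;; Uncoordinated state: one atom, updated atomically on its own.
(def counter (atom 0))            ; hypothetical name
(swap! counter inc)

;; Coordinated state: two refs that must change together or not at all.
(def account-a (ref 100))         ; hypothetical names
(def account-b (ref 0))

(defn transfer!
  "Moves amount from one ref to another inside a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! account-a account-b 30)

[@counter @account-a @account-b] ; => [1 70 30]
```

No reader can observe the withdrawal without the matching deposit, which is the coordination guarantee that an atom alone does not give across two values.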

devn04:05:21

an alex miller appears

seancorfield04:05:55

I can sort of imagine a Clojure implementation not implementing ref at all, but implementing atom using as close to the metal features as possible.

Alex Miller (Clojure team)04:05:23

that's what atoms are now

devn04:05:36

@Alex Miller what do you mean half-implemented in hardware?

seancorfield04:05:42

I can't imagine the opposite approach -- trying to make atom work in the larger, coordinated, more complex (and slower) context of ref...

seancorfield04:05:08

compare-and-swap under the hood right? So machine code level for the JVM.

Alex Miller (Clojure team)04:05:37

clojure atoms rely on java Atomic objects, which rely on spin locks implemented in Java intrinsics that are really reaching hardware stuff almost directly
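For illustration, the retry loop behind swap! can be approximated in user code with compare-and-set!. This is a sketch of the idea, not Clojure's actual implementation, and cas-swap! and hits are hypothetical names:

```clojure
(defn cas-swap!
  "Hypothetical re-implementation of swap! using compare-and-set!."
  [a f & args]
  (loop []
    (let [old-val @a
          new-val (apply f old-val args)]
      ;; compare-and-set! succeeds only if the atom still holds old-val;
      ;; otherwise another thread won the race and we recompute.
      (if (compare-and-set! a old-val new-val)
        new-val
        (recur)))))

(def hits (atom 0))

(cas-swap! hits + 5)          ; => 5
(compare-and-set! hits 0 99)  ; => false, the atom no longer holds 0
@hits                         ; => 5
```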

Jon Olick04:05:17

ah, where-as the C++ hosted language just like does the assembly commands to do it

devn04:05:19

I knew about the spin loop but didn't know it reached so deep.

Alex Miller (Clojure team)04:05:58

we're just stealing the existing 1000 person-years of effort Java has already put in here

Jon Olick04:05:04

x86 uses

lock  cmpxchgq

Jon Olick04:05:14

so can leverage that directly

Alex Miller (Clojure team)04:05:17

so does clojure :)

Jon Olick04:05:21

cools!

Jon Olick04:05:22

🙂

Alex Miller (Clojure team)04:05:47

but refs are not that

Jon Olick04:05:53

no, they aren’t

Jon Olick04:05:01

the way I do it is something like

Jon Olick04:05:24

if(in_transaction) { do thing but store to TX buffer} else do normal CAS

Jon Olick04:05:30

roughly

Alex Miller (Clojure team)04:05:42

and that if is way slower than the cas

Jon Olick04:05:44

its more complicated than that

Jon Olick04:05:52

not really

Jon Olick04:05:13

the if(in_transaction) should pretty much always be correctly predicted by the CPU, so speculative execution should take care of that

Jon Olick04:05:37

in fact, the cost might be almost entirely hidden

Alex Miller (Clojure team)04:05:34

if there is any lesson to learn from Clojure's design, it's about taking things apart, this is just yet another case of it

Jon Olick04:05:55

except, in this case, I don’t think thats a benefit

Jon Olick04:05:16

it means the programmer has to choose between STM or Atoms, and not intermix, and god help you if you use libraries made by others

Jon Olick04:05:31

thus, I believe this is in this case a language flaw

Alex Miller (Clojure team)04:05:54

and that is your design prerogative :)

devn04:05:03

I prefer that choice, personally.

Jon Olick04:05:21

I believe I could correctly argue that it’s a design flaw (a small one, but a design flaw nonetheless)

devn04:05:26

I'm not a big fan of floor wax + dessert topping scenarios.

Jon Olick04:05:35

Clojure in general is a really great language

Jon Olick04:05:46

if this is the only thing I’m complaining about, thats really really good

Jon Olick04:05:54

you should hear me complain about C++ design choices

Jon Olick05:05:05

Just think about it a bit and keep an open mind 🙂 The purpose of STM in a language and how it interacts with everything in a consistent way is what’s important to make programming simpler. (using a rich term)

seancorfield05:05:38

"you should hear me complain about C++ design choices" -- would love to hear... after my eight years on the ANSI C++ Standards Committee (and three years as its secretary) 🙂

👍 1

Alex Miller (Clojure team)05:05:32

I look forward to seeing how it turns out. Making new choices is how we get new (and occasionally better) things.

🙂 1

seancorfield05:05:58

@Jon Olick Do you have any docs describing the I/O approach you're taking/what functions you're providing as an alternative to Clojure's Java-based stuff? (and of course file I/O is different in cljs on Node.js and on the CLR etc because the underlying libraries are different)

seancorfield05:05:47

I created #jo_lisp in case anyone is interested in following up on this -- seems worth a channel for folks who might want to "kick the tires" on jo

didibus05:05:32

I think the main distinction is performance of atoms over refs. The other thing I can see is programmer ergonomics: refs are just a whole lot more complicated. When do you use ensure, commute, alter? Atoms are simpler to understand: it'll retry the change until it wins the commit, it's eventually consistent, you use it with swap! to be able to atomically read something and update it. Or you use it with reset! if you can get away with overwriting whatever is currently there. Also, they are more distant: are you sure your function isn't called in a transaction? Better not accidentally add IO or non-idempotent behavior since your function could be retried.

didibus05:05:36

And in practice, I feel STMs are even slower than a lock, so you might as well just use locking around your two or three atoms when you care for that.
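A sketch of the locking alternative mentioned above, assuming a hypothetical shared lock object and two hypothetical atoms, stock and sold:

```clojure
(def lock  (Object.))   ; hypothetical shared monitor
(def stock (atom 10))   ; hypothetical names
(def sold  (atom 0))

(defn sell!
  "Updates both atoms while holding the same lock."
  [n]
  (locking lock
    (swap! stock - n)
    (swap! sold + n)))

(sell! 3)

[@stock @sold] ; => [7 3]
```

This only coordinates code that takes the same lock. A reader that derefs the atoms without locking can still observe the state between the two swap! calls, which a dosync over refs would not allow.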

didibus05:05:04

Having said that, if you could have atoms work as refs inside of transactions, so you'd use atoms most of the time, they'd be just as fast, and only if you want to atomically make changes to multiple atoms instead of locking you could put that in a dosync, and if arguably that's faster than locking, ya it be cool.

didibus05:05:09

Question to you though @Jon Olick what happens if I'm calling reset! or swap! outside a transaction at the same time I'm also making use of the same atom inside a transaction?

seancorfield05:05:32

Maybe a better Q for #jo_lisp at this point @didibus?

seancorfield05:05:04

(this thread has gone way off-topic for #beginners at this point!)

didibus06:05:36

I wonder if atoms could have an extra function like swap! that also takes a rollback function, and if the transaction could use that to rollback if it failed and is retrying.

didibus06:05:30

Oh sorry, ya, can you move a thread? Or should I repost, though my last comment was directed at Clojure actually, but ya this whole thread seems too advanced for beginners

seancorfield06:05:17

No, we can't move threads here. Feel free to have that discussion in another channel, at this point. Or not. 🙂

oly07:05:08

Another Java translation query: in Method getFaultInfoMethod = exception.getClass().getDeclaredMethod("getFaultInfo", new Class[]{}); how does the new Class[]{} part translate to Clojure?

oly08:05:03

From a bit of googling I have come up with this as the translation: (.getDeclaredMethod (.getClass e) "getFaultInfo" (into-array Class []))

👍 2
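The same pattern can be tried at a REPL against a class that is always available. This sketch swaps in java.lang.String and its zero-argument length method instead of the getFaultInfo method from the question, which is not available here:

```clojure
;; (into-array Class []) builds an empty Class[] array, the Clojure
;; counterpart of Java's `new Class[]{}`.
(def no-params (into-array Class []))

(class no-params) ; => [Ljava.lang.Class;  (the JVM name for Class[])

;; Look up String.length(), which takes no parameters, and invoke it.
(def length-method
  (.getDeclaredMethod (.getClass "hello") "length" no-params))

(.invoke length-method "hello" (object-array 0)) ; => 5
```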

Noah Bogart15:05:11

Is there a way to set *warn-on-reflection* that only checks my code and not external dependencies?

Alex Miller (Clojure team)15:05:11

in short, no - it's a dynamic binding scope so it will apply to anything that gets loaded on the thread

Alex Miller (Clojure team)15:05:11

but one common thing to do is to put it in your own namespace, after the ns declaration. because of how the stack is handled during ns loading, that does what you want in most cases

Alex Miller (Clojure team)15:05:42

you'll see this in many clojure core/contrib namespaces for example

Noah Bogart15:05:46

ah so it only applies to the written code and not the required code. that's clever.

dpsutton15:05:48

one thought though is that reflection is a penalty and possible error at runtime regardless of its origin. So there’s not much difference in “it’s slow but the cause is in a 3rd party dep”

Noah Bogart15:05:06

sure, but I can't control the code in those dependencies, so getting the warnings requires either forking and fixing it myself (and maybe or maybe not upstreaming the fixes), or filling my terminal/log with unhelpful messages

👍 1

seancorfield16:05:09

Opening issues on those libraries is probably worthwhile (and would be less effort than forking/fixing) and would encourage library maintainers to avoid reflection.

seancorfield16:05:57

(at work, we have a checker that flags any source file that doesn't have (set! *warn-on-reflection* true) after the ns form and we consider reflection to be a "must fix" issue in general)

👍 1
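A sketch of the per-namespace convention described above, using a hypothetical namespace example.app and a hypothetical function len:

```clojure
(ns example.app)                   ; hypothetical namespace

;; Placed right after the ns form, so it is in effect while the rest of
;; this file is compiled.
(set! *warn-on-reflection* true)

;; With the ^String hint the compiler resolves .length at compile time.
;; Without the hint it would print a reflection warning for this call.
(defn len [^String s]
  (.length s))

(len "hello") ; => 5
```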

popeye17:05:47

How can I dissoc keys from each map if the value of that key is a collection? Here :person-address has a collection as its value

[{:person-name "john"
  :person-id 1234
  :person-address ["Holand"]}
 {:person-name "hari"
  :person-id 3456
  :person-address ["NY"]}]

to this

[{:person-name "john"
  :person-id 1234}
 {:person-name "hari"
  :person-id 3456}]

seancorfield18:05:09

dissoc doesn't care what the value is, it just removes the key (& whatever value it has)

seancorfield18:05:26

(mapv #(dissoc % :person-address) data)

popeye18:05:26

thanks for the response @seancorfield, but the keys will be dynamic here; the requirement is that if the value of a key is a collection, then ignore that key

seancorfield18:05:17

Ah, misunderstood your question. So, for a given hash map, you can do:

(reduce-kv (fn [m k v] (if (coll? v) m (assoc m k v))) {} my-map)

seancorfield18:05:06

Then you can mapv that across your data:

(mapv #(reduce-kv (fn ...) {} %) data)
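Spelled out in full, the reduce-kv approach might look like the sketch below. The helper name drop-coll-vals and the var people are hypothetical; the data is the sample from the question:

```clojure
(defn drop-coll-vals
  "Returns m without the entries whose value is a collection."
  [m]
  (reduce-kv (fn [acc k v]
               (if (coll? v)
                 acc                 ; skip collection values
                 (assoc acc k v)))   ; keep everything else
             {}
             m))

(def people
  [{:person-name "john" :person-id 1234 :person-address ["Holand"]}
   {:person-name "hari" :person-id 3456 :person-address ["NY"]}])

(mapv drop-coll-vals people)
;; => [{:person-name "john", :person-id 1234}
;;     {:person-name "hari", :person-id 3456}]
```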

seancorfield18:05:36

Instead of reduce-kv, you could do

(into {} (remove (comp coll? val)) my-map)

(untested but I think that will work)

seancorfield18:05:07

user=> (def data [{:person-name "john"
                   :person-id 1234
                   :person-address ["Holand"]}
                  {:person-name "hari"
                   :person-id 3456
                   :person-address ["NY"]}])
#'user/data
user=> (mapv #(into {} (remove (comp coll? val)) %) data)
[{:person-name "john", :person-id 1234} {:person-name "hari", :person-id 3456}]

popeye18:05:34

yeah that worked! but (remove (comp coll? val)) wanted to understand how keys are also getting removed here?

popeye18:05:40

Thanks for the help @seancorfield, will check your answer tomorrow and respond. Going to bed now!

seancorfield18:05:41

When you apply a sequence function to a hash map, you get a sequence of MapEntrys that look like 2-element vectors. key and val are the functions that pull the key and value respectively from a MapEntry. (comp coll? val) is an anonymous function equivalent to #(coll? (val %)) so it tests if the value in the MapEntry is a collection. So remove leaves just the MapEntrys where the value is not a collection, and then into {} "pours" that sequence of MapEntry back into a hash map.

👍 1
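The pieces of that explanation can be tried one at a time at a REPL. A small sketch, using a hypothetical two-key map person:

```clojure
(def person {:person-name "john" :person-address ["Holand"]}) ; hypothetical sample

;; Treating the map as a sequence yields MapEntry values.
(def entry (first person))

(key entry)               ; => :person-name
(val entry)               ; => "john"

;; (comp coll? val) is the same as #(coll? (val %)).
((comp coll? val) entry)  ; => false, a string is not a collection

;; remove drops the entries whose value is a collection, key included,
;; and into {} pours the remaining entries back into a map.
(into {} (remove (comp coll? val)) person) ; => {:person-name "john"}
```

The key disappears together with its value because remove works on whole entries, not on values alone.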

popeye18:05:55

i see, Thank you for your detailed explanation, This will help to explore more on the hashmap
