#clojure
2020-11-17
zhuxun200:11:41

After I made a clojure.spec.alpha/fdef, there don't seem to be any run-time validity checks. Do I have to put the same specs for args and ret again into :pre and :post?

didibus00:11:04

You need to call instrument from clojure.spec.test.alpha

didibus00:11:35

(require '[clojure.spec.test.alpha :as stest])
(stest/instrument)

👍 3
zhuxun200:11:49

Got it. Thanks.

didibus00:11:58

That will instrument all functions that have an fdef so when called, they auto-validate.
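A minimal sketch of what instrumenting does, assuming Clojure 1.9+ with spec available. The function `add-tax` and its spec are hypothetical, made up for illustration only.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical function, used only to have something to instrument.
(defn add-tax [amount]
  (* amount 1.2))

(s/fdef add-tax
  :args (s/cat :amount number?)
  :ret number?)

;; Until this call, the fdef is not consulted when add-tax is invoked.
(stest/instrument `add-tax)

;; With instrumentation on, non-conforming args throw an ExceptionInfo
;; whose ex-data describes the spec problems.
(def instrumented-error
  (try
    (add-tax "100")
    nil
    (catch clojure.lang.ExceptionInfo e
      (ex-data e))))
```

Calling `(stest/instrument)` with no arguments, as in the message above, instruments every var that has an fdef instead of a single one.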

didibus00:11:38

It's not recommended to have them instrumented in prod (because having it everywhere hurts performance). So in prod, it is better to selectively call s/valid? in your code (in a :pre or just inside your function, as you please), and only validate user/client/DB input and the data that you output to a user, a DB, or some other system like a client.
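A rough sketch of the selective validation described above, either in a :pre condition or inside the function body. The specs `::email` and `::signup` and both functions are hypothetical names chosen for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical specs for data arriving from a client.
(s/def ::email (s/and string? #(re-matches #"[^@\s]+@[^@\s]+" %)))
(s/def ::signup (s/keys :req-un [::email]))

;; Option 1: validate in a :pre condition (throws AssertionError on failure).
(defn register! [signup]
  {:pre [(s/valid? ::signup signup)]}
  (assoc signup :registered? true))

;; Option 2: validate inside the function and return an explanation instead.
(defn register-or-explain [signup]
  (if (s/valid? ::signup signup)
    {:ok (assoc signup :registered? true)}
    {:error (s/explain-str ::signup signup)}))
```

One thing to keep in mind: :pre and :post conditions are compiled out when `*assert*` is false, so the second style may be the safer choice for checks that must always run.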

👍 3
didibus00:11:03

But at the REPL and in tests, do call instrument.

didibus00:11:54

Also, small caveat, this won't instrument return values, just input args. Clojure assumes you will assert valid output for a given valid input using generative testing. But because that can sometimes be inconvenient to set up for all functions, there's a lib with a drop-in replacement for instrument that will instrument return values as well: https://github.com/jeaye/orchestra
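To illustrate the caveat, here is a small sketch in which a hypothetical function `user-name` deliberately violates its own :ret spec. With plain `stest/instrument` the call still succeeds, and the return value only fails when checked explicitly.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Hypothetical function whose body deliberately breaks its :ret spec.
(defn user-name [id]
  42)

(s/fdef user-name
  :args (s/cat :id int?)
  :ret string?)

(stest/instrument `user-name)

;; The args conform, so the instrumented call goes through and returns 42.
(def result (user-name 1))

;; Checking the return value against the fdef's :ret spec by hand.
(def ret-ok? (s/valid? (:ret (s/get-spec `user-name)) result))
```

Here `ret-ok?` ends up false even though the instrumented call did not throw, which is the gap that generative testing or orchestra is meant to cover.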

👍 6
zhuxun201:11:44

Thanks for the tips. These are very helpful! @U0K064KQV

dgb2301:11:05

I also use orchestra for repl sessions. It is very simple and convenient.

zhuxun200:11:06

Is there a way to use the fdef directly as a run-time validity check?

Alex Miller (Clojure team)01:11:00

due to the perf costs, no. fdef specs are for test-time checking.

defa14:11:27

Is there a shorter form for testing if v is a list/vector or LazySeq?

(or (vector? v) (list? v) (instance? clojure.lang.LazySeq v))

defa14:11:18

🙈 of course... thank you! I got confused with seq? which is of course something completely different.

souenzzo18:11:16

seqable? also a thing
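The reply defa is thanking is not preserved in this log. As an assumption about what was suggested, the sketch below compares `sequential?` with the `seq?` and `seqable?` predicates mentioned above on a few sample values (`seqable?` needs Clojure 1.9+).

```clojure
(def samples
  {:vector   [1 2 3]
   :list     '(1 2 3)
   :lazy-seq (map inc [1 2 3])
   :set      #{1 2 3}
   :string   "abc"})

(defn classify
  "Returns what each predicate says about v."
  [v]
  {:sequential? (sequential? v)
   :seq?        (seq? v)
   :seqable?    (seqable? v)})

;; sequential? is true for vectors, lists and lazy seqs, false for sets and strings.
;; seq? is false for vectors.
;; seqable? is true for everything in `samples`.
(def classified
  (into {} (map (fn [[k v]] [k (classify v)])) samples))
```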

dev-hartmann14:11:52

hey folks, is anyone aware of if it is possible to dynamically add routes to a running server without restarting the application?

dev-hartmann14:11:26

what I'm trying to achieve is the following

dev-hartmann14:11:24

create a minimal service running one post endpoint that accepts an endpoint path and a json file to be returned, to dynamically create fake endpoints

dev-hartmann14:11:44

thought of using mount to restart the api component after the endpoint received a post with valid data, but maybe there is a more elegant way

delaguardo14:11:32

or you could have two endpoints: one POST as you described, another (with lower precedence) to catch all other requests. Something like /:path with stateful handler which will hold an atom (as an example) with all routes posted via first POST. This is relatively easy to achieve with reitit. I’m saying that with high confidence because I have a project that does that 😉

borkdude14:11:35

@dev-hartmann This is pretty simple using no routing library at all and just ring. Just look at the :uri and :request-method keys of the request and act accordingly.
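A minimal sketch of the two suggestions above combined: a plain Ring-style handler (a function from request map to response map) that keeps posted routes in an atom and needs no restart. The `/__routes` path and the shape of `:body` are assumptions for illustration; in a real Ring app the body arrives as an InputStream and would have to be parsed first.

```clojure
;; Registered fake endpoints: path -> JSON string to return.
(defonce routes (atom {}))

(defn handler
  "Ring-style handler: takes a request map, returns a response map."
  [{:keys [uri request-method body]}]
  (cond
    ;; POST /__routes registers a fake endpoint. `body` is assumed to be an
    ;; already-parsed map with hypothetical keys :path and :json.
    (and (= request-method :post) (= uri "/__routes"))
    (do (swap! routes assoc (:path body) (:json body))
        {:status 201 :headers {} :body "created"})

    ;; Catch-all: serve whatever was registered for this path.
    (and (= request-method :get) (contains? @routes uri))
    {:status 200
     :headers {"Content-Type" "application/json"}
     :body (get @routes uri)}

    :else
    {:status 404 :headers {} :body "not found"}))
```

Because the handler reads the atom on every request, a newly posted route is visible to the next request without touching the running server.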

dev-hartmann14:11:42

@delaguardo thought about that too, good to know someone else already validated that 😄

dev-hartmann14:11:27

@borkdude will look into that, ultimately that would remove two dependencies, which for a minimal dev tool would be awesome!

dev-hartmann14:11:36

thanks for the replies folks

borkdude14:11:43

@dev-hartmann you can even do this with babashka only

dev-hartmann14:11:00

that's what I was ultimately eyeing at

borkdude14:11:01

just one small script

Clojavu16:11:37

Hi, I am migrating my application from lein to clojure cli. I have two questions:
1. Would deps.edn support profiles? I have a few in my project.clj. Is there a reference for all the available options for deps.edn?
2. How can I call clojure functions from my deps.edn file? I used to do this with a ~ in my project.clj.
Thanks

borkdude16:11:01

@dejavuakku 1) deps.edn aliases would be the closest equivalent to lein profiles. 2) you can't.
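As a rough illustration of aliases standing in for profiles, a deps.edn might look like the sketch below. The alias names, paths, JVM option, and versions are assumptions, not taken from the project being migrated.

```clojure
;; deps.edn (illustrative only)
{:paths ["src" "resources"]
 :deps  {org.clojure/clojure {:mvn/version "1.10.1"}}
 :aliases
 {;; roughly what a lein :dev profile would hold
  :dev  {:extra-paths ["dev"]
         :jvm-opts    ["-Dmy.app.env=dev"]}
  ;; roughly what a lein :test profile would hold
  :test {:extra-paths ["test"]
         :extra-deps  {org.clojure/test.check {:mvn/version "1.1.0"}}}}}
```

Aliases are combined on the command line, for example something along the lines of `clojure -A:dev:test` to start a REPL with both applied. Since the file is plain EDN data, there is no equivalent of lein's `~` unquote for evaluating code in it.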

Clojavu16:11:50

Okay. Thanks @borkdude

👍 3
hipster coder18:11:15

can someone tell me if they have tested Clojure and any concurrency models against Golang CSP, Communicating Sequential Processes

hipster coder18:11:42

I am going into a consultation… and need to debate if we use another concurrency model… and use the JVM with possibly Clojure

hipster coder18:11:55

The CSP model is not fault tolerant. for 99% uptime, for web dev

phronmophobic18:11:29

isn't CSP the basis for erlang's concurrency model as well? based on the http://erlang.org/download/armstrong_thesis_2003.pdf, I would say that fault tolerance doesn't come from having a queue of backlogged messages, but from building a principled system that can tolerate partial failure

hipster coder20:11:07

I think of fault tolerance as.. a message broker queuing up messages if the other side (receiver) fails for a time… then that queue (channel) can continue sending messages to other receivers, or just wait… CSP does not offer this by default

hipster coder20:11:36

if the channel fails on CSP… the receivers fail… and there’s no queue like GraphQL offers. Not that I am aware of.

hipster coder20:11:25

GraphQL would be slower, over http/tcp… not as fast as a message broker or concurrency model… but it’s fault tolerant

hipster coder20:11:22

> Unfortunately, Golang does not support distributed execution of goroutines on clusters or distributed systems. In this paper, we extend the concurrency capabilities of Golang to a distributed cluster by providing a library called Gluster that is simple and easy to use.

hipster coder20:11:52

They had to use Gluster library to achieve distributed computing… so if a machine fails… there’s another machine… That’s what I mean by fault tolerant.

hipster coder18:11:13

when channels fail, the receivers fail, and there’s no queue of backlogged messages

hipster coder18:11:53

Netflix hit this problem with Go… and had to use Hystrix

noisesmith18:11:26

@nathantech2005 sounds like what you want is something that spans multiple processes / services, where CSP is usually done inside the scope of one process (we have core.async which is also CSP)

hipster coder18:11:02

Yes. And I favor the actor model, for web dev… it’s like Elixir… Very fault tolerant.

hipster coder18:11:20

CSP only uses 1 system process?

noisesmith18:11:40

clojure doesn't offer any inter-process / inter-service model out of the box, that's outside the language design scope, but various libraries can be used

hipster coder18:11:06

I’d be willing to use a library for Clojure for actor model

hipster coder18:11:13

or mix Clojure with other JVM langs

noisesmith18:11:18

@smith.adriane right, CSP isn't fault tolerant, it's about guaranteeing that certain concurrency errors within one process won't occur

noisesmith18:11:24

it's a different problem

hipster coder18:11:38

@smith.adriane Elixir is based on Actor Model… which is almost the same as CSP

noisesmith18:11:54

actor model and CSP are very different, they address completely different problems

hipster coder18:11:59

so CSP is better at controlling 1 process

noisesmith18:11:01

and the models are totally different

hipster coder18:11:15

being able to monitor the 1 system process, for problem

hipster coder18:11:31

opposed to sending off messages to actors on other machines… and not knowing what happened

hipster coder18:11:07

I can’t think of a domain problem… CSP is great for… off the top of my head? any ideas?

phronmophobic18:11:40

I thought erlang was "communicating sequential processes?"

hipster coder18:11:44

maybe like in a warehouse… controlling the flows of goods, machines

noisesmith18:11:56

CSP solves async coordination problems (deadlock, livelock)

hipster coder18:11:58

Golang can monitor 1 process… over many steps… in a warehouse

noisesmith18:11:08

the CSP model is sync

noisesmith18:11:19

(in a specific way...)

hipster coder18:11:33

CSP is sync… that’s news to me too… did not realize that

noisesmith18:11:45

"sequential" processes

hipster coder18:11:04

so you can control the order, steps

noisesmith18:11:13

it has an official order that the events were processed, unlike erlang which can have different orders for different observers

hipster coder18:11:29

ok, thanks, yes, that is totally different problem sets

noisesmith18:11:58

the CSP paper isn't super difficult and it clears up a lot: https://www.cs.cmu.edu/~crary/819-f09/Hoare78.pdf

hipster coder18:11:12

yep, I printed that… was reading that one

noisesmith18:11:18

oops, I said "concurrent" above, and it's "communicating"

hipster coder18:11:40

hahaha… you clarified a lot

hipster coder18:11:49

when you explained, ordered vs non ordered

hipster coder18:11:01

because I was under assumption, CSP was nearly like Actors

phronmophobic18:11:21

> A consequence of ignoring time is that we refuse to answer or even to ask whether one event occurs simultaneously with another. When simultaneity of a pair of events is important (e.g. in synchronisation) we represent it as a single-event occurrence; and when it is not, we allow two potentially simultaneous event occurrences to be recorded in either order.

noisesmith18:11:58

what's the source there?

noisesmith18:11:42

perhaps I'm misunderstanding what "sequential" means then...

hipster coder18:11:46

are you saying… you think CSP ignores order?

hipster coder18:11:59

because everyone I talked to, using GO, told me, they can control the order

phronmophobic18:11:20

sequential refers to the "events" within a single process.

phronmophobic18:11:55

if you're focusing on fault tolerance, I highly recommend the erlang paper

phronmophobic18:11:26

one of the major ideas in the erlang paper is that you can't have true fault tolerance on a single machine since machines can fail

hipster coder18:11:30

yes… because it’s 2 different domains 1. IoT devices, that can crash 2. 99% web uptime

hipster coder18:11:56

yes, I like the erlang model, how it can mirror processes onto different machines

phronmophobic18:11:01

if you have multiple machines, then you can't really guarantee the latency of messages between them

hipster coder18:11:03

so if one crashes, another machine just steps in

hipster coder18:11:15

ya, so you can’t control the order

hipster coder18:11:22

which is ok, for say… a chat system

hipster coder18:11:48

I read… Whatsapp has 13 main engineers, managing billions of messages

hipster coder18:11:55

that’s how well the actor model worked for them

hipster coder18:11:46

here’s my question though… how do I scale beyond 1 process, on Go…. haha

hipster coder18:11:51

I will research that next

phronmophobic18:11:15

if you're interested in how to build system, I also highly recommend https://www.youtube.com/watch?v=ROor6_NGIWU

noisesmith18:11:25

distributed systems introduce a whole set of failure modes that CSP doesn't handle, including the one you mentioned where a backlog of unhandled messages are dropped

hipster coder18:11:20

k, watching now… Rich Hickey gives incredible talks

phronmophobic18:11:24

I'm not sure what you mean by doesn't handle. you can certainly model a system that drops a backlog of handled messages using CSP, but you have to explicitly model that part of your system

hipster coder18:11:34

honestly, I always go to Clojure community, for discussing these types of problems

hipster coder18:11:56

he means, CSP does not handle dropped channels, by default

hipster coder18:11:12

if a channel fails, all the receivers fail

hipster coder18:11:27

so the data is lost, the messages are lost

phronmophobic18:11:42

that's how CSP is implemented, but I'm referring to the CSP model from Hoare's paper

phronmophobic18:11:47

which is more abstract

noisesmith18:11:27

@smith.adriane what I mean is that CSP isn't designed for distributed failure tolerance, that's not what it's even for

👍 6
noisesmith18:11:00

not that a distributed CSP system couldn't be made (maybe it could), but it would rely on something else for failure states, retries, monitoring, time outs, etc. etc. and you'd need a layer that removed those things conceptually

noisesmith19:11:43

also, going back and reading the context, and reading the quote more carefully:
> A consequence of ignoring time is that we refuse to answer or even to ask whether one event occurs simultaneously with another. When simultaneity of a pair of events is important (e.g. in synchronisation) we represent it as a single-event occurrence; and when it is not, we allow two potentially simultaneous event occurrences to be recorded in either order.
the distinction I was making is that in an actor model, you don't even have a global order the way CSP defines it. there are various theories and rules about ordering with actors, but it is not well defined or simple the way it is in CSP https://en.wikipedia.org/wiki/Actor_model_theory

noisesmith19:11:35

in an actor model, actors have orderings of events (relativistic), in CSP events have ordering

phronmophobic19:11:10

but couldn't you explicitly model that within CSP?

phronmophobic19:11:52

for example, if you made the sending/receiving/dropping of a message an explicit process in CSP

phronmophobic19:11:42

like CSP doesn't inherently have buffers, but you can model buffers using CSP. you just have to make the buffer an explicit process
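As a loose analogy for "the buffer is an explicit process", the sketch below uses JVM threads and `java.util.concurrent.SynchronousQueue` as stand-ins for CSP processes and unbuffered channels. This is not core.async and not Hoare's notation; the names `in`, `out`, and `buffer-process` are made up for illustration.

```clojure
(import '(java.util.concurrent SynchronousQueue))

;; Unbuffered "channels": a put blocks until another thread takes.
(def in  (SynchronousQueue.))
(def out (SynchronousQueue.))

;; The buffer is its own process. Between its take and its put it holds
;; exactly one message, so the producer can run one message ahead.
(def buffer-process
  (future
    (loop []
      (let [msg (.take in)]
        (.put out msg)
        (when (not= msg :done) (recur))))))

(def producer
  (future
    (doseq [m [1 2 3 :done]]
      (.put in m))))

;; The consumer (this thread) reads until it sees :done.
(def received
  (loop [acc []]
    (let [m (.take out)]
      (if (= m :done) acc (recur (conj acc m))))))
```

Chaining several such processes gives a larger buffer, and a variant that discards messages under some condition would be one way to model the message-dropping process discussed above.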

noisesmith19:11:19

what I'm saying is that in order to have one CSP system that's distributed, you need something that pretends that you can't read a message then be disconnected indefinitely from the rest of the system

noisesmith19:11:32

that's a distributed failure mode that CSP doesn't model

noisesmith19:11:21

you can of course have multiple CSP systems and distribute them, but CSP doesn't deal with any of the distributed failure modes, that's not what it does or is for

phronmophobic19:11:32

why can't that be modeled?

phronmophobic19:11:22

you have a process A (the sender), the process B (the sending) , and process C( the receiver). process B conditionally forwards the message from A to C.

noisesmith19:11:48

you can model anything in a CSP system, what I'm saying is that none of the CSP features address distribution, it's not a distributed correctness model

noisesmith19:11:25

in order to model distributed processing, you'd need to implement some other model designed for that use case

phronmophobic19:11:58

yea, I wouldn't call it a distributed correctness model

noisesmith19:11:49

so you could a) build CSP on top of an already distributed system (it might help with coordination?) b) use individual CSP systems to implement something like raft (would it help here? not sure) but you could swap out CSP with "state machine", or "continuation passing" or "petri net" and get the same result

phronmophobic19:11:35

I guess i'm not sure what you mean by "build" CSP

phronmophobic19:11:41

CSP is an abstract model

noisesmith19:11:59

implement a system which follows the rules of CSP

noisesmith19:11:52

in order to follow those rules, the failure modes of distribution that CSP isn't designed to account for need to be abstracted away

noisesmith19:11:08

in other words, something else needs to solve those problems before you start doing CSP

noisesmith19:11:28

the thing that's important here to me is that we don't confuse the "async programming" problem with the much larger, and harder, "distributed programming" problem

phronmophobic19:11:57

I agree that fault tolerance has many facets to it.

phronmophobic19:11:59

I also think that reading and trying to understand CSP will help you understand erlang better

phronmophobic19:11:55

and learning CSP (the abstract model) can also help you reason about and model distributed systems in a more rigorous way

noisesmith19:11:24

I'd argue that petri-nets are closer to erlang than CSP is (in a petri net there's no shared clock - two units don't need to operate in any specific timing, which better models network distance)

noisesmith19:11:39

but even petri-nets are not a distributed state model

phronmophobic19:11:36

CSP can help you model erlang, but erlang processes aren't 1:1 matches to CSP processes

noisesmith19:11:46

erlang doesn't do CSP - but I do agree that learning more models improves understanding

hipster coder20:11:49

yes, that’s also what I am getting at… I think financial trading systems need distributed systems, so transactions can be sent across machines, to different AWS centers.. this is an international system

hipster coder20:11:19

I don’t see CSP as a good fit for an international trading system… it’s limited to the processes on one machine, and needs libraries and tons of work to get it set up for distribution

hipster coder20:11:38

I see Go CSP a better fit for say… 1 big warehouse

noisesmith20:11:46

I think this is an a/b problem, neither csp nor actors solve distributed state, and what you need is distributed state

hipster coder18:11:27

Actors work great for Whatsapp. chat…

hipster coder18:11:34

Go works great for robotics, warehouse control

potetm19:11:30

From a practical perspective (as opposed to theory), processes and actors are in a similar solution space: Shared resource management in user space.

potetm19:11:07

Neither model inherently addresses distributed failure modes.

noisesmith19:11:48

@potetm it's worse than that though: you can't make the CSP guarantees across a network

noisesmith19:11:07

(without solving distributed state first, outside CSP)

potetm19:11:08

Erlang built in networked messaging in order to extend the model to multiple machines.

noisesmith19:11:19

"the model" - which one?

potetm19:11:25

the actor model

potetm19:11:54

But the actor model doesn’t inherently address network failures.

noisesmith19:11:57

right, the actor model makes very few guarantees, so distribution doesn't inherently break it in that way

potetm19:11:53

This is true, but from a practical perspective, you still want those guarantees under normal circumstances.

potetm19:11:10

You have to construct the application in a very particular way to get particular guarantees.

potetm19:11:35

So, tl;dr — I think @nathantech2005 is asking for something that isn’t built-in to either model.

potetm19:11:54

Nothing magically removes the need to think about various failure modes.

noisesmith19:11:59

right - but I can call multiple systems on multiple hosts one actor model - that remains coherent

potetm19:11:47

I’m not sure what distinction you’re trying to draw here. This depends on the actor impl iiuc.

noisesmith19:11:01

I'm saying it's possible

potetm19:11:07

it happens that erlang is async i/o by default, so this is roughly true

noisesmith19:11:11

the rules for what CSP is make the equivalent impossible under its model

potetm19:11:28

but in practice you can swamp the nic and be in the same boat as CSP — some msgs get delivered, some don’t

potetm19:11:44

you can also use dropping buffers in csp to emulate that behavior

phronmophobic19:11:13

I think when you refer to CSP, you're assuming a specific implementation running on a set of computers rather than an abstract model

noisesmith19:11:15

I'm not talking about using CSP to model network failure, I'm talking about implementing one CSP system that spans a network

noisesmith19:11:18

I'm talking about a system that fulfills the rules to be a CSP system, so it offers the guarantees that CSP offers

phronmophobic19:11:23

it's like the difference between prolog and horn clauses

noisesmith19:11:30

you can't do that in a way that spans a network without solving distributed state first

potetm19:11:39

@U051SS2EU What difference do you perceive between CSP and “processes + queues”?

potetm19:11:09

And how does modeling a networked queue as a dropping buffer not emulate the actor system?

👆 3
potetm19:11:14

i.e. “All writes to the network queue go through this dropping-buffer channel. The writer process will handle i/o.”

noisesmith19:11:03

once again, I'm not saying "CSP can't model x", I'm saying "you need to solve x before you can have CSP"

noisesmith19:11:35

CSP describes a set of operations that are possible, they don't really stay coherent across distribution

phronmophobic19:11:30

there's a difference between the abstract concept of "turing machine" and the implementation

phronmophobic19:11:08

sure, you can't implement a turing machine because there's no such thing as infinite memory, but the turing machine model is useful for thinking about computers

potetm19:11:33

So, just to be clear: You’re saying that, as long as i/o is async, actors need not consider the network directly. You have to do something (like what I just suggested) to get that behavior in CSP.

noisesmith19:11:43

but, pragmatically speaking, we use CSP because it gives us guarantees about async behavior (preventing deadlock, livelock, and race conditions), and network failures undermine all of those guarantees

noisesmith19:11:22

which is why I'm emphasizing that I can't have one CSP system that spans a network

phronmophobic19:11:17

i'm saying "one CSP system that spans a network" isn't even a concept

potetm19:11:21

Another way of saying that is: You cannot have reliability behind a queue without ACK semantics?

noisesmith19:11:24

(without first creating an abstraction that lets us pretend network failures don't exist)

potetm19:11:28

(for example)

noisesmith19:11:15

@potetm it goes beyond that - what do you do with a message that is accepted but the node fails or gets disconnected before any result is calculated

potetm19:11:37

right — ACKing solves this, right?

potetm19:11:51

(which formal CSP does not have, I agree)

potetm19:11:29

Wait to be even more clear: ACKing + Retries by the channel

noisesmith19:11:33

if what you mean by "ACK" includes timeouts, retries, and fail states, sure

potetm19:11:38

right, exactly
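A rough sketch of what "ACKing + retries + timeouts" can look like around a send, using only promises from clojure.core. The function names, option keys, and the flaky transport are hypothetical; a real transport would do network I/O rather than deliver a promise in-process.

```clojure
(defn send-with-ack
  "Calls (send! msg ack) with a fresh promise per attempt and waits up to
  timeout-ms for it to be delivered. Retries until max-attempts is reached."
  [send! msg {:keys [timeout-ms max-attempts]}]
  (loop [attempt 1]
    (let [ack (promise)]
      (send! msg ack)
      (if (= ::timeout (deref ack timeout-ms ::timeout))
        (if (< attempt max-attempts)
          (recur (inc attempt))
          {:status :failed :attempts attempt})
        {:status :acked :attempts attempt}))))

;; Hypothetical flaky transport: silently drops the first two sends and
;; acknowledges from the third one on.
(defn flaky-transport []
  (let [calls (atom 0)]
    (fn [_msg ack]
      (when (>= (swap! calls inc) 3)
        (deliver ack :ok)))))
```

Even in this tiny form the caller has to handle the `:failed` result at every call site, which is the point made above about the logic running through the whole program.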

potetm19:11:53

yeah I mean, it’s a total apples to oranges comparison tho?

👆 3
potetm19:11:17

Do actors include those sorts of semantics somewhere?

noisesmith19:11:51

@smith.adriane I think we've been talking past each other, I wasn't trying to disparage CSP, just point out a very common mistake (one OP was making) of thinking that a distsys failure would somehow be solved or addressed by CSP (or that a failure to handle a distsys state problem was a problem with CSP)

noisesmith19:11:04

yes, actors include timeouts, retries, etc.

potetm19:11:25

I genuinely don’t know about formal actor models. I have a passing knowledge of Erlang’s model (mostly from reading Armstrong).

potetm19:11:56

erlang does?

potetm19:11:04

or are we not including that as a formal actor model?

noisesmith19:11:18

I'm actually not an expert on actor systems... I should be more careful about the claim I'm making here

noisesmith19:11:46

I'm saying that you can do actor system, and add things like retries and timeouts, and still meaningfully use an actor model

noisesmith19:11:00

partly because the actor model makes very weak guarantees

potetm19:11:19

Okay I think I’m following you. But the end result is roughly the same either way: You have to deal with the problem at every call site somehow. It’s not magically solved.

potetm19:11:47

But your point is different: Yes, you have to deal with it everywhere, but the semantics of the system hold. (e.g. the actor is in a HOLD state until it receives the ACK)

phronmophobic19:11:48

i guess my point is that CSP is meant as a way to model concurrent processes. so the fact that CSP can model a distributed system means that CSP can handle distributed systems. the fact that the model itself doesn't magically turn into a working distributed, fault tolerant system running on AWS is moot

potetm20:11:42

@smith.adriane I mean, if you’ve ever tried to build in acking semantics to a core.async program, you’ll know that it’s a real PITA. I think noisesmith has a fair point there.

potetm20:11:09

Like, it’s def worth noting if you’re gonna compare the approaches.

noisesmith20:11:32

right, I once tried to make core.async in one service communicate with core.async in another, and it was a huge waste of effort and source of unneeded complexity

phronmophobic20:11:35

i'm thinking of CSP as an abstract reasoning tool, not a library implementation that uses the concepts of CSP

noisesmith20:11:51

the failure modes and problems across services are not the ones CSP help me with

phronmophobic20:11:15

it's like lambda calculus vs. clojure

noisesmith20:11:31

@smith.adriane right and that's why I waited until my first specific example to mention a specific implementation

noisesmith20:11:59

the source of complexity wasn't "core.async isn't networked yet" it was "the things CSP does for me didn't help any more at the network boundary"

potetm20:11:09

@smith.adriane The problem isn’t with a specific impl. The problem is in the model. Namely, it doesn’t include resiliency semantics.

phronmophobic20:11:10

CSP is to lambda calculus as core.async is to clojure

phronmophobic20:11:30

you can model processes that fail

noisesmith20:11:46

that's why I've been comparing it to actors, something that does mix well with resiliency constructs

potetm20:11:28

@smith.adriane You cannot model channels that ack.

💯 3
potetm20:11:46

Because they are a primitive in CSP and acking is not included in the definition.

phronmophobic20:11:48

is it impossible to model a "sending process" that acks?

noisesmith20:11:53

@smith.adriane I do agree that you can construct a go block that sends then times out and retries / fails, but I found these sorts of things were a bad match for inter-system failures, and the complexity of this logic bleeds

noisesmith20:11:22

you can do that - make a channel op that doesn't return until you get an ack

noisesmith20:11:29

but you can't do that in a go block

phronmophobic20:11:53

i'm not talking about an actual implementation, I'm talking about using the stuff from Hoare's paper

noisesmith20:11:55

(right that's an implementation specific problem, but it's indicative of the sorts of mismatches you run into)

noisesmith20:11:37

all I'm saying is that CSP isn't designed for modeling these sorts of failure states, and other abstractions fit them quite well

noisesmith20:11:53

not that CSP can't be used in these cases

phronmophobic20:11:21

like church numerals would be completely impractical for implementing a math library, but you can do arithmetic with church numerals

potetm20:11:21

I’m not sure what you’re getting after. The point isn’t that you couldn’t rig something together that approximates the behavior you want. The point is that doing so runs through your entire program.

potetm20:11:49

So it’s not like, “build a CSP thing, tack on some finishing touches.”

potetm20:11:04

No one’s saying it’s impossible. It’s a fair point that resiliency is not included in CSP by default — it’s an exercise for the reader.

phronmophobic20:11:18

the point of learning CSP to me isn't only for making a library that maps closely to the semantics in the paper (like core.async), but also for formal reasoning. For example, Leslie Lamport built on and referenced Hoare's work for reasoning about distributed systems. Just because the model doesn't magically turn into a library that spits out distributed systems doesn't mean it's not useful for thinking about distributed systems. Having good mental models and being able to reason about systems is important too!

potetm20:11:46

did anybody say “do not learn it, it’s garbage”?

potetm20:11:34

I don’t know what you’re responding to. The comment was, “It has these shortcomings.” The comment was NOT “it’s total crap. use actors n00b.”

phronmophobic20:11:44

I didn't try to imply that. I'm sorry if I did.
> I’m not sure what you’re getting after.
I was just trying to clarify what I was getting after

noisesmith19:11:26

I can't call multiple systems on multiple hosts one CSP model without solving distributed failure first

noisesmith19:11:30

to me that's a big difference

noisesmith19:11:59

(of course this doesn't mean the actor model is that much better than CSP - it also doesn't make many promises...)

hiredman19:11:45

(which isn't inherently distributed or fault tolerant)

noisesmith19:11:30

absolutely - the distinction I'm drawing is that unlike CSP it doesn't make promises that are voided once you distribute your algorithm

hiredman19:11:03

the actor model doesn't handle unreliable message sends

jimmy19:11:05

Just wanted to inject in here that many problems don't need csp, the actor based model, or anything like it. Many problems work perfectly fine in a stateless model. And if you can get away with that you should. Don't underestimate the usefulness of a stateless api layer, a persistent database, and some external queue/log for communication between subsystems. Debating which concurrency model is better can often take you down a rabbit hole that wasn't really needed. I ran into this as a consultant before where there was a real obsession with elixir's really cool model. But for something that was 100% stateless.

hiredman19:11:35

the actor model and csp are both early sort of process formalisms

hiredman19:11:49

the pi calculus or petri nets are newer

noisesmith19:11:22

yeah sorry this has gone way off topic, and I'm not even saying the actor model is better, I'm just trying to assert a distinction (one I was perhaps wrong about...)

hiredman19:11:35

well, I guess the pi calculus is contemporaneous with csp, but continued to evolve

hipster coder19:11:46

I was only asking… because I am talking to CoinBase… About Go (CSP) versus Actors on the JVM. That’s all. if it gives you better context of the domain problems.

hiredman20:11:35

that is a false dichotomy because there are both csp libraries and frameworks for the jvm, and the possibility of using go to implement actors

hiredman20:11:25

the big reason why actors and the actor model get talked about for distributed systems, is in the actor model you don't have channels/queues/whatever as values, you have process addresses, and it is far easier to distribute a simple value like a process address than it is to distribute something as complex as a channel

hipster coder20:11:25

yes, I just read a paper, Go does not have ability to distribute across machines, by default… have to use a library to do it

noisesmith20:11:30

right - to distribute CSP you'd need a coherent channel whose behavior / state crosses a network boundary, and that's a distsys complete problem, once you can do that your problem with distribution is solved already anyway

hipster coder20:11:36

this is also part of what I am getting at with fault tolerance

noisesmith20:11:42

where actors just need data to cross a network, much simpler

hipster coder20:11:06

There is a library called Gluster… to do that

hipster coder20:11:23

it seems like a ton of work, to make Go fault tolerant

hipster coder20:11:38

it seems like the only advantage… is the control of the order of the workload

hipster coder20:11:04

example: an assembly line on a factory floor

mpenet20:11:21

one big plus of csp is easier flow control management. With actors you have to deal with it backwards, have producers get to know if they can continue to send messages or not via some more complex mechanism (something like credit based flow) than just blocking or checking the return value of a channel op; unbounded mailboxes make it all more convoluted imho
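To make the flow-control point concrete, here is a small sketch that uses a bounded `java.util.concurrent.ArrayBlockingQueue` as a stand-in for a bounded channel. This is an analogy built on JVM queues, not core.async, and the capacity of 2 is arbitrary.

```clojure
(import '(java.util.concurrent ArrayBlockingQueue TimeUnit))

;; A bounded "channel" with room for two messages.
(def ch (ArrayBlockingQueue. 2))

;; Non-blocking put: the return value tells the producer whether there was
;; room, which is the "check the return value of the op" style.
(def results (mapv #(.offer ch %) [:a :b :c]))

;; Timed put: wait briefly for room, then give up.
(def timed (.offer ch :c 10 TimeUnit/MILLISECONDS))

;; Once a consumer takes a message, the producer can proceed again.
(def first-taken (.take ch))
(def after-drain (.offer ch :c))
```

A producer that calls `.put` instead of `.offer` simply blocks while the queue is full, which is the other flow-control option mentioned above. Ignoring a false return from `.offer` behaves like a dropping buffer.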

hipster coder20:11:29

so if I was building a crypto trading platform… peer to peer selling… which model?

hipster coder20:11:33

CSP or Actors?

didibus22:11:42

They are equivalent in terms of what you can do with them. So the choice is more about expressiveness, usability, etc. Which is a bit of a personal preference.

didibus22:11:05

In each model, you have independent processes doing work independently (and thus possibly in parallel). In CSP it's called a Process, in Actor it's called, well, an Actor.

didibus22:11:50

So first thing, you can have "things that do sequential work on something on their own." In CSP these are called Processes. Each process is a thing doing some work sequentially. In Actor, it is an Actor. Each actor is a thing doing some work sequentially.

didibus22:11:36

That's the building block, and at this point, the two models are the same (minus their different names).

didibus22:11:55

Now, here comes the major part. What if you need to have those things work together collaboratively? What if one thing depends on another? Now you need a way to orchestrate their interaction so that they are coordinated in what they are each doing. It means they are not fully independent anymore. For example, if one thing is "Sending Emails for each new customer registration" and another thing is "Registering new customers", you can see that the thing sending emails needs to wait for the thing registering customers to tell it that a new customer was registered.

didibus22:11:15

So where CSP and Actors differ, is in how they will coordinate those seemingly independent processes/actors when their operation is dependent on another process/actor

didibus22:11:49

In the case of Actors, they will communicate with each other directly through messages.

didibus22:11:27

In the case of CSP, they will communicate with each other indirectly through a Channel (a queue of messages).

didibus22:11:10

One big difference is that a Channel is a many to many communication. Many processes can put messages in the same channel, and many processes can read from the same channel. You can't do that with Actors, Actors always communicate one on one. So you'd need an Actor who itself would be responsible for fanning out a message from one actor to many others if you wanted a similar kind of one to many or many to many communication.
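
A minimal sketch of the many-to-many point, written in Go since Go's CSP model is what is being compared in this thread. The function name `fanInOut` and its parameters are made up for illustration. Several producer goroutines put onto one shared channel and several consumer goroutines take from it; each message is received by exactly one consumer.

```go
package main

import (
	"fmt"
	"sync"
)

// fanInOut (hypothetical name) starts `producers` goroutines that each put
// `perProducer` messages on ONE shared channel, and `consumers` goroutines
// that all take from that same channel. It returns how many messages were
// received in total.
func fanInOut(producers, consumers, perProducer int) int {
	ch := make(chan int) // a single channel shared by every process
	var prodWG, consWG sync.WaitGroup
	var mu sync.Mutex
	total := 0

	for c := 0; c < consumers; c++ {
		consWG.Add(1)
		go func() {
			defer consWG.Done()
			for range ch { // each message goes to exactly one consumer
				mu.Lock()
				total++
				mu.Unlock()
			}
		}()
	}
	for p := 0; p < producers; p++ {
		prodWG.Add(1)
		go func(id int) {
			defer prodWG.Done()
			for i := 0; i < perProducer; i++ {
				ch <- id // producers do not know who will receive this
			}
		}(p)
	}

	prodWG.Wait() // all producers are done sending
	close(ch)     // lets the consumers' range loops finish
	consWG.Wait()
	return total
}

func main() {
	fmt.Println(fanInOut(3, 2, 4)) // prints 12
}
```

Note that neither side holds a reference to the other; the only thing they share is the channel.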

didibus22:11:18

Another big difference is that Channels can be used to handle back-pressure. Basically, they buffer the communication between Processes. So if one Process is sending messages faster than the other can consume them, you can have the Channel tell the sender to back off. So the sender could wait for the Channel to have more room before sending more messages, or it could choose to go do something else and come back to trying to send messages later.
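
A small Go sketch of the two sender options described here, assuming a buffered channel of capacity 2. The helper name `trySend` is hypothetical. A plain `ch <- v` would make the sender wait for room; wrapping the send in a `select` with a `default` branch lets the sender find out the channel is full and go do something else.

```go
package main

import "fmt"

// trySend (hypothetical helper) attempts a non-blocking send.
// It returns false when the channel's buffer is full, which is the
// channel's way of telling the sender to back off.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	ch := make(chan int, 2) // bounded buffer: at most 2 pending messages

	fmt.Println(trySend(ch, 1)) // true
	fmt.Println(trySend(ch, 2)) // true
	fmt.Println(trySend(ch, 3)) // false: buffer is full, sender backs off

	<-ch // a consumer takes one message, freeing a slot

	fmt.Println(trySend(ch, 3)) // true: there is room again
}
```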

didibus22:11:31

With Actors you'd need to implement this yourself. So you'd need to introduce an intermediate Actor that basically replicates this functionality, and it gets a bit more complicated.

didibus22:11:55

Ok, so now I hope you understand why CSP is called "Communicating Sequential Processes". You have these Processes that can each be doing their own sequential work, and you have a means to have them communicate with one another if they need to coordinate themselves or exchange data between each other through the Channel abstraction.

didibus22:11:22

Now, you might think... Ok I've only mentioned pros for CSP, and they just seem like they have more features than Actors? So what gives? Well, it's true, CSP is more featured. That also means it is harder to implement CSP. Since the processes communicate through Channels, you need to implement Channels, and Channels have a lot of features: many to many and back-pressure handling. It is much more difficult to build an implementation of Channels than the direct async fire and forget message passing of Actor communication.

didibus22:11:46

Especially challenging if the Processes are running on different machines. Now you need to implement a robust cross-machine distributed Channel, and that's harder than building a distributed direct message passing that Actors use.

didibus22:11:31

So in general, CSP is used on a single machine, and Actors are used across machines. That said, some people say that taking Actors and distributing them, while "easier", for any robust implementation, will also need to implement on top of them a distributed back-pressure mechanism, and possibly distributed many to many messaging. And doing that above Actors might not be any easier than implementing a distributed Channel.

didibus22:11:53

Another thing is, while Channels are more feature-full, if you don't need those features, they can be more cumbersome. Having to go through a Channel indirection if all you need is one to one fire and forget communication is just more convoluted.

A -> B
vs
A -> C <- B
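
For contrast, a rough Go sketch of the direct `A -> B` style: an actor-like counter where senders hold only the actor's address and send to it directly. All names here (`Counter`, `Spawn`, `Tell`, `Ask`) are made up for illustration, and this is not a full actor system. One assumption to flag: the mailbox is a bounded Go channel of size 16, so `Tell` blocks when it is full, whereas classic actor mailboxes are usually unbounded.

```go
package main

import "fmt"

// msg is what gets delivered to the actor's mailbox.
type msg struct {
	add   int
	reply chan int // non-nil means "send me the current total"
}

// Counter is the actor's "address": callers can only send it messages.
// The running total lives inside the actor's goroutine and is never shared.
type Counter struct {
	mailbox chan msg
}

// Spawn starts the actor and returns its address.
func Spawn() *Counter {
	c := &Counter{mailbox: make(chan msg, 16)} // assumption: bounded mailbox
	go func() {
		total := 0 // private state, handled one message at a time
		for m := range c.mailbox {
			total += m.add
			if m.reply != nil {
				m.reply <- total
			}
		}
	}()
	return c
}

// Tell is fire and forget: A -> B, no intermediate channel chosen by the caller.
func (c *Counter) Tell(n int) { c.mailbox <- msg{add: n} }

// Ask sends a message carrying a reply address and waits for the answer.
func (c *Counter) Ask() int {
	r := make(chan int)
	c.mailbox <- msg{reply: r}
	return <-r
}

func main() {
	c := Spawn()
	c.Tell(2)
	c.Tell(3)
	fmt.Println(c.Ask()) // prints 5
}
```

Messages from a single sender are handled in the order they were sent, which is why `Ask` sees both earlier `Tell` calls.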

didibus22:11:59

So finally, my advice to you... If you really need to go distributed, what you want isn't Actor vs CSP, but a solid framework that has already implemented all the difficult parts of distributing your work, be it based on Actors, CSP or anything else.

didibus22:11:39

This is where Erlang shines, because the Beam VM has already focused on all these hard problems, and the Erlang standard library is full of things specifically to handle distributed systems. And the Erlang community is full of best practices and all just for that problem.

didibus22:11:39

The JVM not so much, but it doesn't mean it has nothing. Akka is quite well built, and a lot of effort has gone into it already to handle a lot of the hard problems of distributed programming for example. Depending on what your problem is, Spark or Storm are also good options.

didibus22:11:10

Going solo, and not leveraging an existing framework with years of man power behind it is probably going to bite you.

didibus22:11:38

So when it comes to core.async in Clojure, it is not for distributed systems, but for single machine concurrency. It offers nothing to distribute your Processes and Channels across machines.

didibus22:11:13

Hope this helps!

herald18:11:40

This is the clearest CSP vs Actor model comparison I've read, and I hope you consider turning it into a blog post/article so others can find it as well! @U0K064KQV

didibus20:11:43

Hum... Good idea. I'll add it to my TODO, hopefully I'll get around to it sooner rather than later.

👍 3
mpenet20:11:03

Then usually on the jvm you rely on other means for fault tolerance. Plenty of dist systems on the jvm do just fine without actors

hipster coder20:11:08

do I really care about controlling the order of transactions? I am thinking no… unless I offer a premium level, complete your order first

hipster coder20:11:55

yes, that’s also 1 thing I totally love about JVM… I have access to every possible concurrency model there is

hipster coder20:11:08

plus, I can run almost any language on it

hipster coder20:11:26

so all this talk about Java becoming the next Cobol… blah blah

hipster coder20:11:57

maybe I don’t write so much Java… Write Clojure or some Kotlin… which I love

hipster coder20:11:29

then… look at Go… I’d rather write in a JVM lang… This makes my eyes bleed

hipster coder20:11:06

Clojure 🙂 🙂 🙂

hipster coder20:11:13

my intuition is telling me, use Actors, distributed computing, for a trading system

hipster coder20:11:27

sending transactions across the world, trading cryptos

hipster coder20:11:30

I am afraid… if I commit to using the Go CSP model… and this fails… this will damage my career… big decision coming up

potetm20:11:50

Actors vs CSP is basically akin to “what language should I use?” It’s important sure. It has potentially substantial knock-on effects (e.g. hiring filtering and ecosystem). But this decision alone will not sink you. There are about a bajillion more important decisions that are much more likely to sink you.

hipster coder20:11:16

so I can either make CSP work more like Actors (distributed)… but not make Actors work more like CSP?

potetm20:11:37

You can do both or neither.

potetm20:11:18

(You can make a Queue Actor and ape queue semantics.)

potetm20:11:44

(This doesn’t magically solve problems of backpressure, but you can do it if you choose.)

hipster coder20:11:16

ok, good to know… that’s what I was worried about… if I go down one path, like CSP… and it totally fails… too rigid

potetm20:11:49

This is all pretty vague and amorphous if you ask me. “too rigid” “sending txns across the world”

potetm20:11:35

There is a world where you can be quite confident that your approach will suffice.

potetm20:11:50

What does it take for you to get to that world?

hipster coder20:11:14

just knowing that I can adjust the concurrency model, or even switch to a different model

hipster coder20:11:23

and not being locked into one

hipster coder20:11:52

yes, it’s amorphous, because I can’t predict the future that far

hipster coder20:11:30

I am pretty new to concurrency, you geniuses have more experience from the field with this

potetm20:11:11

There is nothing that I know about CSP that you cannot figure out yourself in short order. (e.g. just by reading and experimenting)

hipster coder20:11:30

yea, I would just write two proofs of concept… to test it

hipster coder20:11:43

I am going to ask the architects if they have done that

potetm20:11:18

But there is something you know that I could never tell you: The particulars of the problem you’re solving.

ghadi20:11:16

Maybe take this conversation to #core-async or #architecture

☝️ 12
potetm20:11:38

That’s where the meat is. The computer stuff will come as needed, and it will come quickly compared to the particulars of your problem.

hipster coder20:11:56

k, thanks, I joined #architecture