Fork me on GitHub
#architecture
<
2020-11-12
>
mbjarland16:11:30

In the java world there seems to be a trend with a lot of buzz around reactive extension and implementations like rxjava/reactor. I have a proprietary and fairly large java api which uses this "reactive pattern" and I would like to write a clojure version of it. For context, the java api is really just a wrapper around a web service api and all actual actions result in http calls over the wire. As an example, we could have code on this shape (in this case using the jvm CompletableFuture class, but the shape applies to rx as well):

var pipe = userService.getUserByKey("some-user-key")
          .thenCombineAsync(
            userService.getUserGroupByKey("some-group-key"),
            (user, group) ->
              userService.assignUserToGroup(user, group))
          .thenComposeAsync(this::checkUserForFraud)
          .exceptionally(throwable -> {
            ("log some error", throwable)
          })

pipe.get()
where the last get would trigger the execution of the "asynchronous chain of operations" defined above (the get is contrived, just here to indicate that the pipe is only executed if somebody asks for the result). How would one go about translating this pattern into an idiomatic clojure api? Use core.async pipelines? Use plain core.async? Transducers? Plain old fp? Use something else entirely? Java interop with the completable future (yeah I'm not going there but figured I'd throw it into the competition)? Pros, cons? In a perfect world I would like to retain the ability to declaratively define chains of operations as in the above, but if the downsides drown you, living without it might be an option.

Ben Sless17:11:53

https://github.com/funcool/promesa aims to work with completable futures idiomatically in Clojure. I think it does the job rather well

Ben Sless17:11:04

doesn't solve the code being async-colored

mbjarland08:11:16

@UK0810AQ2 thanks for the pointer, this is useful. Will add it to my list of things to consider. Though I am starting to lean towards the downsides with the reactive pattern weighing more than the upsides, i.e. perhaps a clojure api would be best served staying away from this pattern.

Ben Sless09:11:11

I share the feeling, however, until we get project loom (unless you want to run experimental JDK in production) your options are: - synchronous code - buy in to async world which colors all your code - buy into streaming abstraction (such as core async pipelines) which will dictate your entire architecture If all of this happens in a web handler perhaps you could try interceptors as well. I also saw you mentioned synchronous db access due to jdbc. You can try to square that circle with core async, or alternatively give the vertx client a try. I saw metosin used it in porsas but haven't tried it myself. (http://github.com/metosin/porsas)

mbjarland10:11:27

The sync db comment was not from me. In my particular case for this particular api it would be making calls to rest apis to assemble a response to a client http request

mbjarland16:11:07

We can for the sake of this example assume only single valued things in the pipe, i.e. not reactive streams like Flux et al

seancorfield16:11:15

Looking at the code above, I'm struggling to understand how you benefit from async operations, if you're calling get straight after?

mbjarland16:11:28

get is contrived

seancorfield16:11:43

OK, so I'd probably just reach for future and deref at first, but core.async for coordinating results might be useful in larger examples.

mbjarland16:11:51

so for example in micronaut which is a web service framework, you can return a reactive type (Flux or Mono, but the shapes look similar to the above) from a controller method and this lets the web server use the request thread for something else, have the lengthy operation run on some thread pool and get notified when the long running operation is done

mbjarland16:11:20

I get the sneaking feeling that we would somehow drop to a lower level of abstraction with future and friends vs let’s say project reactor

☝️ 1
seancorfield16:11:50

We have some macros that wrap CompletableFuture https://github.com/worldsingles/commons/blob/master/src/ws/clojure/extensions.clj#L124-L150 but I don't think we're using them at work right now.

seancorfield16:11:15

(those exist because the callback version of future never got added to the language)

mbjarland16:11:13

my somewhat uninformed read is that the whole point of CompletableFuture and later the reactive extensions of which reactor and rxjava are implementations is the composability, i.e. we're not just saying here's a future value but defining a chain of operations which can be passed around

seancorfield16:11:35

I think that unless you're genuinely getting performance benefits from async code, you're just making stuff more complex.

👍 2
mbjarland16:11:54

yeah I'm leaning in that direction as well

emccue16:11:32

The whole point is to juggle N tasks between M threads where N >> M

mbjarland16:11:37

my exposure to the api is not large enough for me to say with certainty but the code looks horrible and what it is doing is really not rocket science

emccue16:11:49

If you don't need that performance, don't do it

mbjarland16:11:32

well this would be living in a cloud micro service environment so not tying a thread to each request might be relevant

seancorfield16:11:57

If you mapped the above example onto futures and derefs, you'd get something like:

(check-for-fraud @(assign-user-to-group @(get-user-by-key "some-user-key") @(get-user-group-by-key "some-group-key")))
(assuming all those functions returned future values)

seancorfield16:11:17

I find it hard to believe it's really worth running such low-level tasks on threads...

mbjarland16:11:19

and error handling?

seancorfield16:11:38

Just wrap it in try/`catch` 🙂

seancorfield16:11:30

When you have a reactive framework, the temptation is to write everything in that style which often doesn't make sense, IMO.

seancorfield16:11:09

Some things definitely benefit from async execution but the granularity of that is important.

emccue16:11:28

There are kinda two things going on here 1 - Doubt at you needing to do the reactive way 2 - How to do the reactive way anyways

mbjarland16:11:03

right and I totally hear you with the doubt part, I have a healthy dose of it myself so could be it's where this ends up.

mbjarland16:11:47

But to get the options clear, going with 2 you would look at future as a building block with possible some async channels and coordination thrown in?

seancorfield16:11:00

Given that I/O is going to be synchronous, you're going to either be blocked on that or use up threads to try to get them async. But are two simple "get" calls (presumably to a DB? or maybe an API, I guess) going to be so slow that using two threads to run them and then joining the results is faster than just running them sequentially?

emccue16:11:11

well, that is one way

emccue16:11:31

you can also just use the CompletableFuture api like the code already does

emccue16:11:40

and just interop

mbjarland16:11:55

right, yeah, that is an option...perhaps one I was hoping to avoid

seancorfield16:11:03

(or some syntactic sugar to make it easier to read)

emccue16:11:09

and there are two issues with doing that

emccue16:11:17

1. The unavoidable async-ness of the code

emccue16:11:27

2. The syntax

emccue16:11:33

2 is eminently solvable

emccue16:11:15

what is really important to note regardless is that your database access is going to be synchronous

emccue16:11:21

no matter what you do

emccue16:11:32

(thanks jdbc)

emccue16:11:10

so if most of what your app does is talk to a db, there is reason to believe that the actual performance benefits of "async-ifying" small tasks like that are gonna be pretty small

mbjarland16:11:14

in this case the api-to-be wraps a rest service and the java side decided to build it based on CompletableFuture

mbjarland16:11:54

so the scenario would be that there is a user request to a cloud server, the code we are talking about sits on that server and is used to talk to another web service over http. Scale could potentially be quite large so performance might come into the picture.

mbjarland16:11:45

I'm simplifying but that's essentially the scenario

mbjarland16:11:17

so most of what the app would do is talk to other web services and then serve a http result

seancorfield16:11:39

Hitting a REST API for every "low-level" operation could be slow enough to warrant async/futures but that's what our obsession with microservices has led us to 😞

emccue16:11:46

and it is now that we all say a prayer for the swift arrival of project loom amen

🙏 1
seancorfield16:11:42

Even with Loom, you're still going to have code that is full of threads and derefs. They're just cheaper.

emccue16:11:27

I mean, yeah but in the context of a web service you can just thread-per-request and then the core logic wont need it

mbjarland16:11:39

guess loom talks about tco as well but alas they seem to want to do most everything else first, tco last (totally beside the point here, I would just love to see tco)

emccue16:11:05

anywho, thats probably not going to come before economic collapse from exponential global warming

emccue16:11:42

so you are stuck with futures/completable futures

emccue16:11:10

slash core.async channels

mbjarland16:11:56

so let's assume we are, how does the web server situation look in clojure? can you tell say ring to handle io things on a separate pool of threads somehow? How would you actually do that part or would you have to code that yourself?

mbjarland16:11:19

(and apologies, I have little production exposure with clojure)

mbjarland16:11:55

would you even run a clojure web server at scale?

emccue16:11:27

oh i have no real world experience with clojure, i'm just a vocal idiot - but the "simplest" way is just to run a Jetty server

emccue16:11:12

and your core logic works with the ring request map

emccue16:11:19

and returns a ring response map

lukasz16:11:35

FWIW Ring supports async handlers

mbjarland16:11:46

right I was wondering about the "return a reactive type" or "return a future" and have the server realize what to do with it

mbjarland16:11:00

@lukaszkorecki ah ok, yeah that is what I was fishing for

emccue16:11:20

I know in pedestal (which is a few more layers of kool aid deep) you can have handlers return a core.async channel

mbjarland16:11:34

and I guess in this context a core.async channel would be analogous (Rich will kick me in the nuts) to say a reactor Flux

emccue16:11:41

the ring system seems slightly different, but fundamentally similar - you call a callback with the value at some point in the computation

lukasz16:11:58

but! We run a service doing some DB calls and processing every single user request at 15ms 95p at 130 req/s. Not sure if that's a scale

seancorfield16:11:29

The async model in Ring is kind of hard to use and is sort of "fake". But running Jetty in production -- via the default Ring adapter -- scales pretty well without needing fancy async stuff.

lukasz16:11:39

☝️ this

mbjarland16:11:04

well that is nice news, I have some mileage with jetty from previous lives

lukasz16:11:12

Personally, I'm still dubious about core.async in general - maybe the code we write just doesn't need it or something, but our IO heavy services just use good old thread pools (with sugar provided by claypoole).

mbjarland16:11:35

assuming your user requests can be either paritioned predictably or do not have any node affinity, you can always scale horizontally, not sure about the perameters here yet

emccue16:11:18

in that case it really is a "where do you place value" problem

emccue16:11:44

lets say you can horizontally scale, but it costs the company 5x as much than if you wrote it high performance

mbjarland16:11:48

yeah, like I said, at this point not sure if horizontal scaling will be simple or not

emccue16:11:59

is living with worse code worth your life energy

seancorfield16:11:05

@mbjarland At work, almost all our services and apps are stateless and we run three instances of most of them. I wouldn't say we are "high scale" but we have thousands of concurrent users 24x7. And that's on Jetty, without much multi-threaded assistance.

emccue16:11:22

when in a year or 2 you get loom for free

emccue16:11:31

(or more, eternal pessimism)

mbjarland16:11:36

I'm old enough to value my life energy

emccue16:11:50

and all your async work is now useless

mbjarland16:11:59

@seancorfield very useful information, thank you

emccue16:11:19

(assuming requests do just delegate to other servers, that would probably be the case)

mbjarland16:11:06

also that model seems alluringly simple and we would love to not complicate things if not necessary

emccue16:11:02

I remember a relevant talk, uno momento

lukasz16:11:01

Old ops mantra is to have two of everything, no matter how performant ;-) Your super scalable is not so scalable if it has to go down when deploying new code

mbjarland16:11:02

I was thinking of the cognitect aws api as an example of a idiomatic api

mbjarland16:11:24

have not looked at it much but I have to say I love the data only approach...and no completable future in sight

mbjarland17:11:01

also very repl-self-documenting which is nice

seancorfield17:11:16

An anecdote about concurrency: years ago, when we were early on our Clojure path, we built a process that scanned our DB for updates, ran searches against a (proprietary) search engine, and then produced HTML emails. It worked nicely but we were curious about how much volume we could run through the system so I turned a few map calls into pmap calls... and crashed the search engine because it couldn't handle the volume. We ended up standing up two more search engine instances, just for this process. When we started to analyze the effectiveness of sending (by that point) millions of emails a day, we figured out that we got better "bang for our buck" by sending about an order of magnitude fewer emails and being more targeted about the audience (and the content) -- and at that level, we didn't need the pmap's (and, ultimately, we didn't need those extra search engine instances). Sometimes, "scale" is not what you really need 🙂

👏 5
💯 1
Ivan10:11:14

exactly; from a technical perspective it is cool and interesting, but from a business perspective it is not always what you're after - it is key to get your engineer to think towards the business focus

leonoel09:11:35

I don't buy the "boring code" argument. This problem is not essential to reactive/functional programming, it's purely about syntax, and it's only an issue in languages with poor metaprogramming support. The infamous snippet shown around 9:33 could be written like this :

(require '[missionary.core :as m])
(m/ap
  (let [user (m/? (find-user-by-name ws name))
        cart (m/? (m/aggregate conj (load-cart user)))
        uuid (m/? (pay (transduce (map get-price) + cart)))]
    (m/? (send-email (m/?= (m/enumerate cart)) uuid))))
It's arguably as much readable and maintainable as the "boring" version, concurrent email sending is made explicit, and it's still fully non-blocking.

☝️ 1
emccue17:11:49

yeah this is the one^

mbjarland17:11:41

ok I have an interrupt, thank you everybody, this was enlightening. The clojure community is really one of the many reasons to love this language

seancorfield17:11:12

@emccue That talk looks interesting. Added to my "Talks to Watch" collection 🙂

mbjarland17:11:35

oh and I will watch the above, thanks @emccue

emccue17:11:58

The only reason I pass as halfway competent is that I watch a bunch of talks

emccue17:11:10

This one is by the person who wrote Reactive programming with RxJava

emccue17:11:21

(I had to skim to find the reason I thought it had some authority)

seancorfield18:11:35

That is a great talk -- thanks for the pointer @emccue!

greglook18:11:41

Late to the party, but we use manifold for async and it works well. We use yada/aleph for our service APIs, so it was a natural fit. https://github.com/aleph-io/manifold/

mbjarland10:11:08

Any chance you can share the container model? I.e. are you running this inside of say jetty as well or are you running an aleph server which is world facing?

greglook22:11:01

For our public APIs, the aleph endpoints are fronted by a pretty standard loadbalancer setup (with some machinery to handle dynamically-allocated instances). Aleph is built on Netty, so I’m not sure how you’d use it with Jetty anyway. Our internal services also use Aleph, but those endpoints get routed to directly by the service mesh.