#clojure
2022-03-12
pinkfrog03:03:32

Hi. Is it possible to perform network requests with async/await like paradigm in Clojure?

Noah Bogart03:03:29

What do you mean by “async/await like paradigm”? On its face I’d say the answer is “core.async”

pinkfrog04:03:01

Yup. I can create a macro to wrap core.async and achieve a similar purpose. Might be the only way.

domparry04:03:22

You can use a clojure promise too.

domparry04:03:16

And future might be useful too, depending on what you're trying to do.
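
For reference, a minimal sketch of both (the URL is just illustrative; note it's the deref that blocks):

```clojure
;; promise: created empty, delivered exactly once from anywhere
(def result (promise))
(.start (Thread. #(deliver result (slurp "https://example.com"))))
@result  ; blocks the *calling* thread until delivered

;; future: runs its body on a pooled thread immediately
(def f (future (slurp "https://example.com")))
@f       ; likewise blocks until the body has finished
```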

pinkfrog04:03:22

Promise and future tie up the underlying thread, though.

hiredman04:03:35

core.async doesn't do io at all

hiredman04:03:31

You can connect to existing async things (netty, nio, etc) but there is nothing in core.async itself out of the box

hiredman04:03:36

Aleph and Manifold are core.async-ish things (I prefer core.async) that started out built more directly on top of Netty, so they have more direct IO support

hiredman04:03:04

If your touch point is JavaScript, I would not worry about it too much to start. The JVM and Clojure have multiple threads; they work great
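
To make that concrete, here is a hedged sketch of bridging an existing async API (JDK 11+'s java.net.http, whose sendAsync returns a CompletableFuture) onto a core.async channel; `get-async` is a made-up helper name:

```clojure
(require '[clojure.core.async :as a])
(import '(java.net URI)
        '(java.net.http HttpClient HttpRequest HttpResponse$BodyHandlers)
        '(java.util.function Function))

(def client (HttpClient/newHttpClient))

(defn get-async
  "Fires off an async GET and immediately returns a channel that
   will eventually receive the response body."
  [url]
  (let [ch  (a/chan 1)
        req (.build (HttpRequest/newBuilder (URI/create url)))]
    (.thenApply (.sendAsync client req (HttpResponse$BodyHandlers/ofString))
                (reify Function
                  (apply [_ resp] (a/put! ch (.body resp)))))
    ch))

;; async/await-ish: <! parks the go block instead of blocking a thread
(a/go
  (let [body (a/<! (get-async "https://clojure.org"))]
    (println "got" (count body) "chars")))
```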

craftybones09:03:22

I ask here as #reitit doesn’t seem to be terribly active. What is the canonical way to refer to route data within handlers/interceptors/middleware with reitit? I see a lot of things being passed around in the context, so I am not sure what the right way to access it is

p-himik09:03:14

What do you mean by "route data"? Path parameters specifically, all kinds of parameters, or something else?

p-himik09:03:02

Ah, it allows you to attach arbitrary data to a route, I see. Interesting, never noticed it before.

craftybones09:03:36

Yeah. Arbitrary data

craftybones09:03:19

A core premise is that one should have the ability to associate arbitrary data. However, there is no clearly documented way to access this data within interceptors/middleware etc.

craftybones09:03:10

The data is clearly included as part of the context, but accessing it is incredibly verbose.

craftybones09:03:59

The context is structured in a way that the data can be accessed through multiple keys. I just wish to know what the right way is

p-himik09:03:28

Are there usages of the route data in reitit's source code itself? It can easily be that interceptors should access that data in a way that's different from the one middleware should use, or handlers.

craftybones09:03:09

Yeah, I am looking 😄

craftybones09:03:44

They have some info about it in their middleware section, but that pertains to ring

craftybones09:03:26

They have a reitit.ring/get-match function that gets the matched route along with its associated data
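
For what it's worth, a minimal sketch of that accessor inside a handler (the /admin route and :role key are made up for illustration):

```clojure
(require '[reitit.ring :as ring])

(def app
  (ring/ring-handler
    (ring/router
      ;; :role is arbitrary route data living next to the handler
      ["/admin" {:role    :admin
                 :handler (fn [req]
                            (let [data (:data (ring/get-match req))]
                              {:status 200
                               :body   (str "role: " (:role data))}))}])))

(app {:request-method :get, :uri "/admin"})
;; => {:status 200, :body "role: :admin"}
```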

Richie16:03:58

The section "Linked Lists Considered Harmful" from https://www.forrestthewoods.com/blog/memory-bandwidth-napkin-math/ emphasizes that "pointer chasing is bad". "Pointer chasing is 10 to 20 times slower." I was surprised at how large a difference it made using pointers. I feel a little embarrassed that I'm surprised at the magnitude. Aren't all jvm objects behind a pointer? My reasoning is that the GC needs some indirection there so that it can move things around. Does the jvm suffer any less? Thanks!

Ben Sless16:03:04

That's indeed a weakness of the JVM, although the JIT compiler can often inline a lot of this indirection

Richie16:03:11

Huh. Do you know where I can read more about it? I didn't find what I was looking for on the internet but my search terms may have been too specific.

Richie16:03:31

I guess I should try to recreate the example benchmark. Thanks.
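
A rough sketch of such a micro-benchmark in Clojure, assuming the criterium library is on the classpath: summing a primitive long array (contiguous memory) versus a vector of boxed Longs (every element behind a pointer):

```clojure
(require '[criterium.core :as crit])

(def n 1000000)

;; Contiguous primitives: the CPU streams straight through memory.
(let [^longs arr (long-array (range n))]
  (crit/quick-bench
    (areduce arr i acc 0 (+ acc (aget arr i)))))

;; Boxed Longs: every element is a separate heap object to chase.
(def boxed (vec (range n)))
(crit/quick-bench (reduce + boxed))
```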

dgb2316:03:15

Look for articles and talks about data-oriented design in video game development. Or some of the talks of Martin Thompson, to get a glimpse.

Ben Sless16:03:57

You can read about value types on the JVM, the proposal illustrates the difference

Richie17:03:49

Ok, thanks! Those suggestions sound helpful.

Alex Miller (Clojure team)17:03:56

“Mechanical sympathy” is a good search term

Richie17:03:12

Oh yea, that's good; thanks!

emccue20:03:15

I read a hot take that most of the GC complexity of java is "apologizing" for that "everything is behind a pointer and able to be shared" property.

Ben Sless20:03:31

Let my hot take be "and that's a good thing". I don't want to fight with a compiler; leave that to runtime :)

dgb2323:03:45

From reading the Clojure source, there seems to be a surprising amount of “mechanical sympathy”. Sure, there is a tradeoff leaning towards functional programming, but runtime awareness sneaks in left and right.

emccue20:03:19

At work we are starting to build out a structure for inter-service communication. For some stuff HTTP calls are appropriate, but others require some queue-like semantics. Our current approach for building out those "inter-service queues" is to use Kafka and have a dedicated Kafka consumer on every service that needs to read messages out, plus producers that just write values. Before we go too deep I think it makes sense to dig for experience reports - how have others solved this sort of issue? (backend is ~99% Clojure with a handful of independent services)

vemv20:03:14

a common report (which I can +1) is that Kafka is conceptually sound but operationally painful. E.g. what's the point of immutability if you're gonna have bugs that can essentially mean lost/never-processed data anyway? I don't have a favorite alternative, but I most certainly wouldn't pick Kafka right off the bat

Drew Verlee20:03:27

Why do they require queue-like semantics? Kafka is a distributed queue, so your question is a bit self-fulfilling. Are you asking how Kafka compares to other queue-like options? At a glance the answer is obvious: they have done less marketing 🙂.

Jon Boone21:03:23

@U3JH98J4R -- following up on @U0DJ4T5U1's response -- are you trying to implement a distributed message bus with queue semantics, that may have multiple writers and 1+ reader per topic?

Jon Boone21:03:47

If so, would you ever need the peer-to-peer HTTP calls?

Ben Sless21:03:55

It's not solved at our place, but I've been eyeing AsyncAPI

Ben Sless21:03:15

To the extent that I want to invest in writing Clojure tooling around it

emccue21:03:33

Our only explicitly enumerated use cases right now are enqueueing things for asynchronous processing - a few flavors of notification service and background jobs like PDF generation - and reliably handling webhooks from services we integrate with for money stuff

emccue21:03:47

The side benefits we're hoping to get are around decoupling, encouraging services to have more well-defined interfaces, etc.

vemv21:03:38

there's also the approach of Polylith for inter-module communication, and SQS etc. for simple stuff like background jobs or webhook retention - it buys you a lot of simplicity (vs distributed services, Kafka deployment, etc.). You might also see a decoupling in my approach: architectural/maintainability gains are decoupled from distsys matters

emccue21:03:16

The reasoning for Kafka internally right now is:
• We don't want multiple queue systems, and we are pretty sure Kafka can be moulded to whatever use case
• The perception that it's the "industry leader", and so we will be in a better position when issues arise

emccue21:03:50

We aren't going full polylith, but we have split out shared components as local dependencies

🙂 2
emccue21:03:19

(very much in progress though, as we are splitting a sizeable monolith)

emccue21:03:52

@U025L93BC1M I think there will likely still be use cases for http calls as a synchronous api

emccue21:03:48

If we don't have that ability in some places, it would make implementing our frontend needlessly awkward

Drew Verlee23:03:26

Kafka is a lot more than a queue, though. Do you really need persistence, replication, multiple readers and writers, eventual consistency, etc.? Those are costly. I don't have answers, just more questions ;).

Ben Sless04:03:13

Once you get to a point where your services need to scale you'll need support for multiple readers and writers, and persistence lets you have back pressure semantics between services. Kafka's pretty great

devn05:03:14

an additional skeptical-of-Kafka-as-your-answer reply

devn05:03:13

the operationalization of it is not trivial by any stretch, and in my experience it is frequently overkill

devn05:03:59

my read on adding it to a stack has always been something like: “do you intend on having a team manage it?”

devn05:03:59

in really big orgs, maybe it's a cost worth eating, but in sub-1000-person orgs it's a cost center more than a capability

devn05:03:15

I suppose the other elephant in the room is whether splitting your monolith is what you think you ought to be doing. Splitting is not always useful. I've endured multiple “split it up” and then a year later “perhaps we ought to put it back together” runs.

☝️ 1
Ben Sless05:03:59

1000 is a bit of a stretch, no? When our R&D was well below 100, everything was already on Kafka

Ben Sless05:03:14

The distributed monolith is the real concern imo

devn05:03:25

Fair, just trying to paint a picture of the operational cost.

devn05:03:37

It's sometimes a very bad idea if your org isn't geared towards dedicating “resources” to managing the whole thing.

devn05:03:40

I've seen places add it and it was everyone’s job to manage Kafka, and that did not work out particularly well IMO.

Ben Sless05:03:05

True, although the operational costs aren't exorbitant. A "platform" group is good as the organization grows

Ben Sless05:03:38

Any distributed system introduces a supervision overhead. If you don't have Kafka you'll need to manage DNS and service discovery

devn05:03:02

err, I'll just add that “exorbitant” is a business sort of call

devn05:03:29

some businesses are happy to burn barrels of cash

Ben Sless05:03:50

And one that you can't make beforehand :)

🙂 1
Ben Sless05:03:16

"We are all trapped in causality"

devn05:03:19

where I struggle is that there is often some kind of boring non-Kafka solution that every dev can participate in managing, without leveling up on Kafka

devn05:03:48

so that's where I'd be reaching first if I could make it work

devn05:03:21

if you want to spawn a platform team or whatever, that's cool; I just would be careful about having buy-in on the biz side before I said hell yeah

devn05:03:26

that growth from small to medium org often comes with some pain

devn05:03:42

again very much in my experience

noisesmith07:03:49

> kafka is distributed queue
This is false - if what you need is a queue, there are much simpler alternatives than Kafka, and Kafka isn't really a good queue

Ben Sless07:03:55

Interesting. Can you break it down a bit? It seems like a conclusion of knowledge and experience and it could benefit from context

noisesmith07:03:23

I've successfully used kafka on a very small (never more than 5 devs) team, and the operations aspect slowed down product development, but my main complaint in that case was that what we needed was a job queue, and kafka put a lot of complexity into offering features (durability, replication) that didn't help our use case and required us to roll a lot of features you'd expect from a work queue (task assignment, restarts) by hand. At another job kafka was used as an immutable log and event source, which is exactly what it is meant to do. It was still a pain in the ass but it actually matched the task at hand in that case.

noisesmith07:03:11

Kafka gives you the ingredients that can hypothetically make a job queue - topics, reader offsets, etc. - but they aren't actually put together as a queue; what you actually have is an immutable (though garbage-collectable) distributed log. Something that's actually meant to be a queue is a lot easier to use and maintain, and it's what I'd advise in cases where you aren't event sourcing and don't require an immutable log
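
To make the "ingredients, not a queue" point concrete, a rough sketch with the plain Java client (assumes org.apache.kafka/kafka-clients on the classpath, a local broker, and made-up topic/group names):

```clojure
(import '(org.apache.kafka.clients.consumer KafkaConsumer)
        '(java.time Duration)
        '(java.util Properties))

(def props
  (doto (Properties.)
    (.put "bootstrap.servers" "localhost:9092")
    (.put "group.id" "jobs")              ; offsets are tracked per group
    (.put "enable.auto.commit" "false")   ; commit only after real work
    (.put "key.deserializer"   "org.apache.kafka.common.serialization.StringDeserializer")
    (.put "value.deserializer" "org.apache.kafka.common.serialization.StringDeserializer")))

(with-open [consumer (KafkaConsumer. props)]
  (.subscribe consumer ["jobs-topic"])
  (doseq [record (.poll consumer (Duration/ofSeconds 1))]
    (println (.key record) "->" (.value record)))
  ;; "task done" is just "offset advanced": retries, restarts, and
  ;; task assignment all have to be layered on top by hand.
  (.commitSync consumer))
```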

Ben Sless07:03:41

> Kafka as a job queue
*shudders* I've seen the same mistake. Getting rid of it will take time

Jon Boone13:03:52

@U3JH98J4R I anticipated you might say that. I would like you to consider the following: if your core is asynchronous, but your front-end is synchronous, you are creating a friction point. Synchronous API calls into an asynchronous core are a facade that limits the front-end’s perception of the system so that it appears to be synchronous only.

Jon Boone13:03:09

The result will be unnecessary buffering/latency and resultant scaling of resources at the API layer.

Jon Boone13:03:39

My advice (as an overall strategy; introduced piecemeal for transition purposes) is “go async or stay sync” — don't bother mixing and matching the two as a long-term approach.

jumar15:03:25

Somebody shared this on another Slack: https://redpanda.com. I know nothing about it and have no experience with Kafka; I'm just sharing it out of curiosity. If I needed something queue-like, I would probably use Amazon SQS, SNS, or Kinesis

lukasz16:03:07

If you need a queue, use a queue system like RabbitMQ - much simpler to operate IMHO. We used Kafka at one point, and handling things like retries and backoff was so much harder. Kafka has upsides though: you can consume messages in order and "rewind" topic history - something that Rabbit cannot do just yet. Or you can use the database as a queue - until a certain scale, that's also a viable option.

jumar17:03:48

Again, I have little experience with this stuff, but what I would consider a useful feature is the ability to go offline, possibly for many minutes, then come back and easily consume events that arrived in the meantime - that is, they wouldn't be lost if the consumer(s) were offline when they arrived.

jumar17:03:21

Clubhouse published a nice blog post about this topic: https://shortcut.com/blog/more-reliable-webhooks-with-queues

lukasz17:03:49

Yeah, you can do that with Rabbit just fine - set your queues to be durable and persistent. In fact, that's how all our consumers operate - we have open sourced our framework for safe RMQ consumers: https://github.com/nomnom-insights/nomnom.bunnicula/

lukasz17:03:31

and one of the primary use cases is exactly that: accepting webhooks and processing them out of band - I forget how many millions of msg/s we're processing right now with a mid-size managed instance in AWS

didibus22:03:57

Might sound dumb, and I don't know your company's policies about this, but cloud vendors are great for that use case. Use, say, AWS SQS and SNS, and all your problems become trivial.