#off-topic
2021-02-16
mauricio.szabo02:02:17

Unpopular opinion: after working with Kafka for some time, I'm having trouble thinking of it as anything other than a horrible piece of software. Most of the time, it's compared against RabbitMQ or other message brokers. Then you start to use it, and it's not a message broker - there's no retry or DLX mechanism, nor individual commit of messages (making retries really hard). Its consumer is also not thread-safe, so you have to poll and process messages in a single thread... Except that this also does not work, because if your consumer is slow, Kafka drops you (even if it's still connected), so you have to poll from time to time; but well, it's not thread-safe, so you have to poll on a thread, send messages to another thread, coordinate things to ack (commit) on the "poll thread", and handle states like "pause/unpause". It also leaves it to clients to handle what happens when new clients connect or disconnect, defaulting to a very poor implementation of "wait for everyone to finish"... Am I missing some magic that makes Kafka so popular? If someone knows, please tell me...

ghadi02:02:13

@mauricio.szabo I have been knee deep in Kafka for a few weeks looking at rebalancing behavior, and understanding how the timeouts compose with each other is complex

ghadi02:02:06

The full lifecycle of consumer groups is never described anywhere

ghadi02:02:41

But you can find bits and pieces by reading source code, KIPs and config docs

noisesmith02:02:03

my understanding was that a consumer group was a mapping from a topic to a known offset, but what you say makes me think I'm missing something

noisesmith02:02:13

oh, it also maps clients to partitions within that group - that is complex

noisesmith02:02:21

I never used kafka in a way that relied on that aspect

ghadi02:02:23

A consumer group is a set of offsets into topic partitions, but the consumers that constitute a group are dynamic and change during a failure or deploy. How that works is tricky but critical to grok

noisesmith03:02:50

there are a lot of features in kafka that seem like attractive nuisances - the high level description sounds like a good thing to have for your app, but the way it maps to the system of behaviors in kafka turns them into tarpits (or maybe nobody I've worked on kafka apps with is using the features quite right)

noisesmith03:02:51

someone more qualified than me should write "kafka: the good parts"

mauricio.szabo03:02:39

Yes, exactly - everything is incredibly low-level, and all the complexities that other systems did solve are delegated to clients. AFAIK there's no official library that handles all these edge cases, lifecycles, etc...

noisesmith03:02:04

for example, I had an app that worked great with single partitions per topic and programmatically created topics taking their place, we had a long term plan to work with partitions if we were bottlenecked on throughput but that didn't happen

noisesmith03:02:59

or maybe I'm remembering it slightly wrong because it was more rewarding working on my own multiplex, rather than trying to figure out the docs and config and missing source code around the one built in

mauricio.szabo03:02:38

Well, currently I'm facing a problem: I have slow consumers (they need to call a slow API for each message) and Kafka does not like slow consumers at all... I tried multiple libraries, and most (if not all) suffer from not implementing something that Kafka wants... And I also have no idea how people work with Kafka on single-threaded envs like Node, or ones that have a GIL like Ruby or Python (considering that you need to keep polling in the background)

noisesmith14:02:29

my understanding is that confirming read isn't meant to be a high level "my app state is that this message is done with", it's a low level "I'm letting the broker know that my offset can move forward" with a limited time to reply. this is not just a bad design kafka wise (though they could maybe abstract it better): if you didn't have these sorts of timeout limits it provably could not provide the correctness guarantees that it pretends to offer. this is annoying to deal with, but still easier than doing distributed consensus by hand yourself

mauricio.szabo22:02:53

It's not just commit. In fact, commit is the least of the problems. It's the low level API, the consumer that is not thread-safe but still needs threads, the pause/unpause dance, the API for healthcheck where you need to handle the gain and loss of partitions...

mauricio.szabo22:02:45

When you combine all of these, and the lack of a higher level API, it's incredibly awkward to work with, and it seems like Kafka was sold like a Ferrari but in the end is more like a "here are all the pieces, assemble it yourself, and the instruction manual is fragmented across multiple places, sites, and forums"

hiredman03:02:25

I mean, they use C extensions and run it outside the gil?

em06:02:09

I've heard a lot of good things about https://pulsar.apache.org/, and the architecture and usability seems far improved from surface level inspection. E.g. actual separation of storage and serving concerns, pull based messaging support so no more long polling, stateless broker and ease of replication, etc. Definitely less of a community compared to Kafka, but I'm wondering if anyone has had real experience and can chime in about whether or not it's as good of a straight upgrade as a lot of the material suggests it to be

mauricio.szabo12:02:58

It's already incredibly hard to coordinate everything on a high level language. Using C extensions to run outside the GIL and getting it right seems close to impossible

borkdude12:02:29

There is a GraalVM Polyglot implementation for Python which allows you to interop from Java/Clojure to Python. Even in that implementation (implemented on the GraalVM, which is a JVM) they use a GIL because else they could not support C extensions.

mauricio.szabo12:02:34

Especially because Kafka seems to think that all this manual coordination is a feature that you want to use and customize to your own liking

mauricio.szabo12:02:37

I really want to try pulsar. I had good experiences with rabbitmq and SQS (sqs only for smaller number of messages).

David Pham12:02:47

It is really off-topic, but I wanted to thank the Clojure community. The past two years have been wonderful thanks to you all. My first kid was born last Sunday, and I wanted to honor Clojure and name him Rich or Richard, but I got a strong veto from my wife, so I went with the next best thing: Loan-ISaac Pham (whose initials will forever be LISP). :)

❤️ 45
lisphug 21
🎉 12
👍 7
borkdude12:02:42

Congrats David!!

Aron14:02:57

> 12. How important have each of these aspects of Clojure, ClojureScript, or ClojureCLR been to you and your projects? and in the answers "ease of development", together this question answer sounds like the old communist joke: Write an essay answering the following two questions: Who is your role model? Why Stalin?

alexmiller14:02:35

I'll be taking note of any negative responses and you will be punished

😂 15
borkdude14:02:00

with a JIRA ticket?

borkdude14:02:50

Alex knows what I'm referring to probably ;)

alexmiller14:02:36

seriously though, like many of the questions/answers we have retained them over many years to get longitudinal data and I think this dates back to the very early years when Chas was running it

alexmiller14:02:06

https://cemerick.com/blog/2011/07/11/results-of-the-2011-state-of-clojure-survey.html - you can see this in the "What have been the biggest wins for you in using Clojure?" question

Aron14:02:52

that's all relative, I don't have the same experience, never had

Aron14:02:20

I think preferences can be updated, changed

Aron14:02:42

And you must realize how strong a signal this is to everyone who feels differently. It makes it an either-or question, with no room for subtlety. I either already like it and say it's important, or I can say that it's not important. Seems weird that one has to argue why a survey question needs to cover all possible answers.

borkdude14:02:32

There is enough room in the survey for free text. If one looks at this as non-benevolent propaganda, one can probably find what one is looking for.

alexmiller14:02:05

well, as it says at the top, the only required questions are the first 2, you can skip it

Aron14:02:25

Not going quite there to "propaganda" 🙂, it reminded me of the joke, not the oppression 😛 I can totally agree that ease of development is part of the clojure experience, but it's not an overall feature and some things are missing that were not important in 2015.

Aron14:02:45

So if you asked me several years ago if ease of development is something that I gained by adopting clojurescript, I would've said yes because there is much less incidental complexity, but today ease of development means competition with typescript and vscode, that is, that language's sole purpose is "ease of development", while you can do everything they do, I am not sure it's worth the effort. But using typescript it's "easy", so obviously, even though I don't like typescript, I can't say that clojurescript is anywhere close to easy to use compared to that ecosystem.

Aron14:02:13

also, wizards saying that using spells is easy doesn't count.

wizard 3
Mno14:02:04

(abracadabra problem)
;=> "Solved"

borkdude14:02:41

I'm expecting this discussion to turn into an argument about simple vs easy now

Aron14:02:45

who am I to argue anyway

borkdude14:02:23

I would take simplicity of development over easy any time of the day.

Aron14:02:25

exactly, but it's not really an either or, that's why i dared to mention it 🙂

borkdude14:02:54

True, it's not either / or, but it can be a priority thing.

noisesmith14:02:30

@ashnur
> wizards saying that using spells is easy
rhickey is on record comparing clojure to a cello vs. a piano (a tool that requires much more work to find fluency in, in exchange for a kind of simplicity of mastery)

dpsutton14:02:58

framing ts as easy and cljs as simple is kinda priming and begging the question here i think. also not how i would frame it

Aron14:02:55

@noisesmith that only makes sense if we forget that clojure is software, its "shape" is much more malleable

noisesmith14:02:18

of course not all musical problems call for cello (but they don't all call for piano either). I like this comparison because (hey this is "#off-topic" right?) I see a similarity between the way type systems aid and constrict development and the way fixed pitch instruments like piano aid and constrict music

borkdude14:02:24

How I see it: How did X help? A. B. C. This isn't framing, you can just skip the question or leave a remark that X didn't help you at all, because TSZ is now your favorite thing.

borkdude14:02:36

Oh, sorry, I misquoted that, that wasn't on purpose.

borkdude15:02:03

I removed the message as it didn't add much anyway.

Aron15:02:06

makes sense, it was originally phrased like that but then it morphed

noisesmith14:02:14

@ashnur in terms of resulting behavior sure, but in terms of development flow and experience, languages are not equal or equivalent

Aron14:02:21

who said they are equal? I only said that it doesn't make sense to make this harsh distinction, since the way we use programming languages is completely different than how we use musical instruments. Obviously there are differences, otherwise we wouldn't use different names to them... if they are equal, there is no way to tell them apart, so using different names would be only by personal preference...

noisesmith14:02:58

I didn't mean the distinction as harsh, but rather as extremely important. when I'm turning my ideas into running processes I want the set of abstraction tools that best fits the domain.

noisesmith15:02:23

or perhaps we are talking about different domains - I was talking about the domain of using the programming language not the domain of the program written (though there is something to be said for making those align)

Aron15:02:46

everyone here (> 80%) loves the experience working with clojure, but I think this also means that those who don't quite do the same things the same way or need to adapt to other constraints as well, are basically filtered out before they (we) can have a voice that is anywhere close to authentic

noisesmith15:02:30

the reason I use clojure is that it fits the way I want to work, if I want strong static types I switch to ocaml, for low level stuff I'm loving zig lately

Aron15:02:10

the discussion with static typing is different, typescript is not about static typing because they don't have that at runtime

Aron15:02:20

my only argument is that simple and easy are not mutually exclusive. One is about opportunity cost and the other about maintenance and development costs. They can't even be paid by the same currency, just to drive home that analogy.

noisesmith15:02:24

static typing is by definition not runtime

Aron15:02:49

if you say so

noisesmith15:02:53

I never said simple and easy were mutually exclusive, and I'm confused by the idea that "static" could mean something other than "during a build step"

noisesmith15:02:35

I mean, OCaml or Haskell don't have run time type checks (unlike js and the jvm which do)

noisesmith15:02:09

and I could be confused, but my understanding was this distinction was what static typing referred to

noisesmith15:02:50

what are you referring to as "authentic" above?

Aron15:02:29

just the dictionary definition

Aron15:02:35

"credible", "convincing"

noisesmith15:02:58

oh, so people won't be listened to / won't be taken seriously because they want some behavior or experience clojure doesn't offer?

Aron15:02:31

are you trying to annoy me with that reading of what I said?

Aron15:02:06

I am talking about a delicate shift in perspective in a complex domain. Please don't try to reduce it to some simplistic linear narrative.

noisesmith15:02:22

no, I'm trying to understand you, instead of being annoyed

Aron15:02:58

Do you know what the difference between complex and complicated is?

noisesmith15:02:59

I still am not sure what shift of perspective or complex domain you are talking about - I thought I understood but now I'm not sure

noisesmith15:02:17

no, please explain

noisesmith15:02:27

sure - btw based on google complex vs. complicated looks like the sort of thing I'd use systems language to address (first order vs. second order systems)

noisesmith15:02:23

btw I was confused by the usage of "authentic" above because to me authenticity is about veracity of some type, and I wasn't able to map what would be real vs. fake in that context

noisesmith14:02:52

otherwise we could all save ourselves a lot of hassle and write everything in the one objectively perfect language

alexmiller14:02:04

so, Clojure?

😄 12
clojure-spin 27
😏 3
gottem 3
✔️ 3
alexmiller14:02:06

sorry, I apologize for on-topic comments in #off-topic

lol 9
Mno14:02:06

The “What is on-topic for #off-topic?” debate is a classic as well. I personally love it.

☝️ 6
alexmiller15:02:27

that debate is definitely on-topic for off-topic

12
jjttjj17:02:41

I want to be able to dynamically register subscriptions on a stream of incoming messages. Incoming messages are assumed to be a flat map, and subscriptions can match either exact values on the keys of the incoming message, or a set of possible values.

;;; subscriptions
{1 {:x :a        ;; :x must be :a, :y must be :b
    :y :b}
 2 {:x #{:c :d}  ;; :x can be :c or :d
    :y :b}
 3 {:y :b}
 4 {:y :b :d 2}}

;;; incoming messages
{:x :c, :y :b} ;; => matches 2, 3
{:y :b}        ;; => matches 3
{:x :a}        ;; => matches nothing

Is there a term for doing this kind of matching and/or a way to keep the subscriptions organized so that I can add more subscriptions while keeping the computation to go from message->subscriptions optimal? I could figure out a way to do this but I have a feeling this might be a thing already that I just don’t know the name of?
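
For reference, a minimal brute-force sketch of the matching described in this message, usable as a baseline before any indexing. It is not from the thread: the names `constraint-ok?`, `matches?`, and `matching-ids` are hypothetical, and it assumes a subscription matches only when every one of its keys is present in the message.

```clojure
;; Subscriptions from the message above, keyed by id.
(def subscriptions
  {1 {:x :a, :y :b}
   2 {:x #{:c :d}, :y :b}
   3 {:y :b}
   4 {:y :b, :d 2}})

(defn constraint-ok?
  "A set constraint means 'any of these values';
  anything else means 'exactly this value'."
  [constraint v]
  (if (set? constraint)
    (contains? constraint v)
    (= constraint v)))

(defn matches?
  "True when every key of sub is present in msg (assumption)
  and its value satisfies the constraint."
  [sub msg]
  (every? (fn [[k constraint]]
            (and (contains? msg k)
                 (constraint-ok? constraint (get msg k))))
          sub))

(defn matching-ids
  "Checks every subscription against msg: O(number of subscriptions)."
  [subs msg]
  (into (sorted-set)
        (keep (fn [[id sub]] (when (matches? sub msg) id)))
        subs))

(matching-ids subscriptions {:x :c, :y :b}) ;; => #{2 3}
(matching-ids subscriptions {:y :b})        ;; => #{3}
(matching-ids subscriptions {:x :a})        ;; => #{}
```

This scans all subscriptions per message, so it only serves to pin down the intended semantics; anything faster should return the same ids.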

jjttjj17:02:31

This seems related to and probably a subset of rules engines or datalog/EAV stores but I'm wondering if there's something more basic or specific to just keeping a tree of ANDs and ORs

noisesmith17:02:48

there's also similarity to graph style execution (eg. what make(1) and plumatic/plumbing do), except those are single dispatch (deep and not wide) and it seems like you want full traversal

noisesmith17:02:04

this is also the dominant pattern in audio synthesis: you make a graph of which nodes process the result of which other nodes, you start with your output node, and walk up the graph to find all the things to calculate, then propagate the data back down the graph

hiredman17:02:30

rete is sort of like a database in reverse. in a database you have a bunch of data and you want to organize it for fast lookup when you are given a query, with rete you have a bunch of rules and you want to organize them for fast matching when you are given data
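
A small sketch of the "organize the rules for fast matching" idea applied to the flat-map subscriptions from earlier in the thread. This is not Rete and not how clara-rules works; it is a plain inverted index with hit counting, and the names `empty-index`, `add-subscription`, and `lookup` are hypothetical. It assumes a subscription matches only when all of its keys are present in the message.

```clojure
(def subscriptions
  {1 {:x :a, :y :b}
   2 {:x #{:c :d}, :y :b}
   3 {:y :b}
   4 {:y :b, :d 2}})

;; :by-kv  maps [key value] -> set of subscription ids accepting that pair
;; :needed maps subscription id -> number of keys it constrains
(def empty-index {:by-kv {} :needed {}})

(defn allowed-values
  "A set constraint lists its allowed values; a plain value allows only itself."
  [constraint]
  (if (set? constraint) constraint #{constraint}))

(defn add-subscription
  "Registers one subscription; can be called at any time to add more."
  [index id sub]
  (reduce (fn [idx [k constraint]]
            (reduce (fn [idx v]
                      (update-in idx [:by-kv [k v]] (fnil conj #{}) id))
                    idx
                    (allowed-values constraint)))
          (assoc-in index [:needed id] (count sub))
          sub))

(defn lookup
  "Counts, per subscription, how many of its keys the message satisfies,
  and keeps the ids whose count equals the number of keys they need."
  [{:keys [by-kv needed]} msg]
  (let [hits (frequencies (mapcat (fn [[k v]] (get by-kv [k v])) msg))]
    (into (sorted-set)
          (keep (fn [[id n]] (when (= n (get hits id 0)) id)))
          needed)))

(def index (reduce-kv add-subscription empty-index subscriptions))

(lookup index {:x :c, :y :b}) ;; => #{2 3}
(lookup index {:y :b})        ;; => #{3}
(lookup index {:x :a})        ;; => #{}
```

The `mapcat` step only touches subscriptions that share at least one key/value pair with the message, but the final pass here still walks every entry in `:needed`; iterating over `hits` instead (and tracking empty subscriptions separately) would avoid that. Removing a subscription is left out of the sketch.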

hiredman17:02:28

http://www.clara-rules.org/ might be something to look at

Mno19:02:59

Daaaang now I’m looking for an excuse to use that… I guess it’d be useful for core.logic stuff?