#clojure
2021-11-05
FiVo09:11:52

I have the following problem. I have n machines which are both consumers and producers of some data. I want to distribute the data in a consistent hashing manner across the machines. Is there a go-to library for something like this in Clojure/Java? I looked at ZeroMQ and JGroups. ZeroMQ is a hassle to set up, and neither has a well-maintained Clojure client. Would prefer something brokerless.

didibus15:11:03

There's Apache Ignite, which I believe you could check out. It defaults to TCP/IP multicast for node discovery, or you can just statically tell it the IPs and ports of the other nodes; both are relatively easy to set up. You can also scale it to thousands+ nodes by switching to ZooKeeper discovery, but that's more annoying to set up. It's brokerless, nodes talk to each other directly. Not sure about a nice Clojure wrapper, but I tend to use interop directly with stuff like this, and find it just as easy most of the time as using a Clojure wrapper, or I guess I tend to wrap the parts I use myself haha. There's also Hazelcast, but not sure if it's free.

respatialized18:11:16

For something like this I might try just dropping both machines on to a https://tailscale.com/ network and use a CRDT to distribute the data instead of reaching for a message queue or something with a more traditional client/server setup https://github.com/smothers/cause https://github.com/aredington/schism

respatialized18:11:59

You might even get away with just using something like unison if you prefer to rely more on the filesystem https://www.cis.upenn.edu/%7Ebcpierce/unison/

dharrigan09:11:22

Would you consider Kafka? That is designed for such a thing, and would allow you to have consistent partitioning based upon the partition key.

FiVo09:11:39

I mean it feels a bit overkill to set up a Kafka cluster for this, that's why I was wondering if there is something without a broker in the middle.

emccue11:11:16

@finn.volkel can you go more in depth into your requirements?

FiVo13:11:25

I actually don't have a super good grasp of the requirements. I expect less than 1000 msg per second per machine and msg size less than 200 bytes.

FiVo13:11:29

It's for distributed crawling. So the messages are essentially urls to be crawled by a different worker.
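
(For reference, a minimal sketch of the consistent hashing idea applied to routing URLs to workers. This is not taken from any of the libraries mentioned; the names ring and node-for, the worker names, and the replica count are all hypothetical. It assumes every machine builds the same ring from the same list of node names.)

```clojure
(defn ring
  "Builds a hash ring: a sorted map from hash points to node names.
  Each node gets `replicas` virtual points to smooth the distribution."
  [nodes replicas]
  (into (sorted-map)
        (for [node nodes
              i (range replicas)]
          [(hash (str node "#" i)) node])))

(defn node-for
  "Returns the node owning key `k`: the first ring point at or after
  (hash k), wrapping around to the first point when none is larger."
  [ring k]
  (let [h (hash k)]
    (val (or (first (subseq ring >= h))
             (first ring)))))

;; Hypothetical worker names; every machine would build the same ring.
(def workers (ring ["worker-a" "worker-b" "worker-c"] 50))

;; The same URL always routes to the same worker.
(node-for workers "https://clojure.org/")
```

With this layout, removing a node only reassigns the keys that node owned; keys owned by the remaining nodes stay where they were.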

rutledgepaulv16:11:32

If you're in a cloud environment this might be a good use case for a queue service like AWS SQS. Your workers just read messages off the queue and delete them from the queue after successfully processing the task

rutledgepaulv16:11:18

Certainly easier if you can avoid having your processes coordinating with each other and instead pull tasks from a shared queue

emccue12:11:26

> don’t have well maintained clojure clients
I’d lower your expectations there. The clojure community is very small in the scope of things and there isn’t enough time in our lives to give a proper wrapper for every useful java library. Just calling whatever well maintained java library isn’t bad at all in most cases

👍 5
💯 1
flowthing15:11:10

Why does sequence, when given a transducer and a coll, realize (only) the first element of the coll?

user=> (defn prn-inc [n] (prn n) (inc n))
#'user/prn-inc
user=> (def xs (sequence (map prn-inc) [1 2 3 4 5]))
1
#'user/xs

Ed15:11:07

sequence returns a lazy-seq of the application of the transducer to the collection. However, it realises the first element on initialisation, which a normal lazy-seq doesn't do.

Ed15:11:34

if you (count xs), for example, it'll realise the whole thing

flowthing15:11:41

Yes, sorry, imprecise question — I'm wondering specifically about realizing the first element on initialization.

flowthing15:11:13

That is, why sequence does that when normal lazy seqs don't.

Ed15:11:22

I think that's to do with the implementation of transducers ... I think it was explained in one of Rich's transducer talks, but I forget the actual reason right now

Alex Miller (Clojure team)15:11:33

the whole point of it is to lazily compute

Alex Miller (Clojure team)15:11:53

but sequence with a transducer is NOT a lazy seq

Alex Miller (Clojure team)15:11:34

it is incrementally computed but not lazy in the intermediate steps like other lazy seqs

dpsutton15:11:00

> When a transducer is supplied, returns a lazy sequence of applications of the transform to the items in coll(s)
from the docstring of sequence.

flowthing15:11:06

Hmm. Does that mean the docstring for sequence is wrong?

Alex Miller (Clojure team)15:11:26

it means there's too much to explain in a docstring :)

dpsutton15:11:27

Haven’t dug into the RT bits but i suspect that the reason you are seeing the first element “computed” is the logic (or (seq x) ()) somewhere in the code.

dpsutton15:11:31

or some analog to that.

dpsutton15:11:26

([xform coll]
 (or (clojure.lang.RT/chunkIteratorSeq
      (clojure.lang.TransformerIterator/create xform (clojure.lang.RT/iter coll)))
     ()))
from the source of sequence. and the first thing chunkIteratorSeq does is call iter.hasNext() to see whether to return null or a lazy seq
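
(A small sketch of the behaviour discussed here, recording realized inputs in an atom instead of printing. The names realized and note-inc are made up for the example.)

```clojure
;; Records which inputs have been transformed so far.
(def realized (atom []))

(defn note-inc [n]
  (swap! realized conj n)
  (inc n))

;; A plain lazy seq: nothing is computed until an element is requested.
(def lazy-xs (map note-inc [1 2 3 4 5]))
(def after-lazy @realized)      ; => []

;; sequence with a transducer: the hasNext check at construction time
;; pushes the first input through the transducer before anything is read.
(def eager-xs (sequence (map note-inc) [1 2 3 4 5]))
(def after-sequence @realized)  ; => [1]
```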

Ed15:11:46

https://youtu.be/4KqUvG8HPYo?t=2165 ... I think this is the bit where Rich explained it

flowthing15:11:29

@U11BV7MTK Right, that makes sense, thanks. :thumbsup:

flowthing15:11:01

@U0P0TMEFJ thanks, will take a look.

flowthing15:11:42

(I'm pretty sure I've watched that talk, but I must've forgotten much of it.)

Ed15:11:03

yeah ... me too ... I find the details leaking out of my brain these days ... I'm clearly getting old 😉

flowthing15:11:53

But that bit does explain it nicely. :thumbsup:

didibus15:11:29

I always forget the difference, every time lol. If I remember, instead of having each chunk go through each step one by one, it has each element of the chunk go through all steps.
So lazy-seq goes:
  do step1 for next-chunk
  do step2 for next-chunk
And sequence goes:
  for each element in next chunk:
    do step1 for element
    do step2 for element

Alex Miller (Clojure team)15:11:08

the easiest example to think about is something like mapcat range - lazy seq will just compute elements as needed whereas each intermediate range will be fully computed for the transducer sequence

Alex Miller (Clojure team)15:11:05

and in the extreme case, an infinite intermediate sequence is fine for a lazy seq, and very not fine for transducers :)
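
(A sketch of the mapcat case with a finite intermediate, so it terminates. The names seen, note and inner are made up; inner is deliberately unchunked so chunking does not blur the comparison.)

```clojure
;; Counts how many intermediate elements reach the step after mapcat.
(def seen (atom 0))
(defn note [x] (swap! seen inc) x)

;; An unchunked intermediate seq of n elements.
(defn inner [n] (take n (iterate inc 0)))

;; Lazy seq version: asking for the first element touches one element.
(reset! seen 0)
(def lazy-first (first (map note (mapcat inner [100]))))
(def lazy-count @seen)  ; => 1

;; Transducer version: the whole intermediate seq is pushed through
;; the remaining steps before the first element is handed back.
(reset! seen 0)
(def xf-first (first (sequence (comp (mapcat inner) (map note)) [100])))
(def xf-count @seen)    ; => 100
```

Replacing inner with something infinite like (range) is fine for the lazy version and never returns for the sequence version.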

didibus15:11:22

Ya, I think that's the case that makes more sense. Because the lazy-seq will force realize each chunk as well. So the real difference in behavior from a user's point of view is short-circuit, no?

Alex Miller (Clojure team)15:11:42

it's not short circuit as much as actually lazy vs not lazy

Alex Miller (Clojure team)15:11:34

in practice, infinite intermediate seqs is pretty unusual :)

didibus18:11:52

Ya, and I think that concept confuses me too much haha. I tried with a (range) as intermediate, and then a take 10 at the end, and the transducer didn't go into an infinite loop. And then I thought, well that makes sense, since each element goes through the whole chain one by one. So I'm not sure how to construct an infinite intermediate that the transducer chokes on.

didibus18:11:44

(->> [1 2 3]
     (map inc)
     (mapcat #(do (println %) (range)))
     (map inc)
     (take 10))

didibus18:11:33

(into []
      (comp
       (map inc)
       (mapcat #(do (println %) (range)))
       (map inc)
       (take 10))
      [1 2 3])

Alex Miller (Clojure team)18:11:50

so that doesn't use sequence, which is what we're talking about

Alex Miller (Clojure team)18:11:01

(take 10 (sequence (comp (map inc) (mapcat #(do (println %) (range))) (map inc)) [1 2 3]))

didibus22:11:48

Oh... ok for some reason I thought sequence would still be lazy chunk by chunk, except each chunk is realized through the transducer pipeline. You're right though, that spins.

jumar05:11:23

From the discussion I understand that sequence realizes at least one element (chunk) on initialization. But I was still surprised that this realized the first 64 elements (2 chunks):

;; this prints 0 .. 31
(def xs-seq (sequence (map inc) (map prn-inc (range 128))))

;; this prints  32 .. 63
(first  xs-seq)

didibus01:11:30

I think that's different, try

(def xs
 (sequence
  (comp
   (map println)
   (map inc)) 
  (map
   #(do
     (println ":: " %)
     %)
   (range 128))))

didibus01:11:35

What happens is that sequence will pull the first chunk from the lazy-seq. Then when you grab elements from the sequence, it will run the transducer on the full chunk, and since it ran over that chunk, it will grab the next chunk from the lazy-seq

didibus01:11:16

So you'll see (first xs) will print the first 32 from the lazy-seq, then it'll print the first 32 from the transducer, and finally the next chunk, 32-64 from the lazy-seq will print, but they won't go through the transducer.

restenb15:11:02

any way to "clojure" this? new TypeToken<Watch.Response<V1Namespace>>() {}.getType()

didibus18:11:33

I think that's creating an anonymous child class no? So maybe with proxy?

didibus18:11:59

(-> (proxy [TypeToken] []) .getType)

Alex Miller (Clojure team)15:11:04

that's some wizard stuff :)

restenb15:11:25

imagine having something on the level of cognitect/aws-api for that too

rutledgepaulv13:11:25

That is the goal of my library: https://github.com/RutledgePaulV/kube-api . The core module is usable at this point but still work to be done in the others and I haven't touched it in a while

hiredman16:11:23

87k lines of json describing the api

😲 1
hiredman17:11:44

for reference google's android publisher api description is about 4k lines of json (but it isn't a completely valid spec)

pyr16:11:02

@restenb I hope we can finish cleaning up our library and manifest build tooling. Our managed Kubernetes offering at Exoscale runs on top of Kubernetes itself and is built in Clojure

pyr16:11:50

one thing I discourage is doing plain Clojure codegen from the openapi spec, it results in namespaces that are too big and causes issues

ivana18:11:40

Hello. How can I convert a .clj file with a set of functions into a .jar file and use the latter in another Clojure project?

Ben Sless18:11:38

Using your build tool, build and deploy to some repository

ivana18:11:09

Thanks, but can I do it locally?

Ben Sless18:11:47

Both lein and clojure CLI support local install
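
(A sketch of the Leiningen route, assuming the library's project.clj names it trivial-library-example at version 0.1.0-SNAPSHOT with no group id. After running lein install in the library project, the jar lands in the local ~/.m2 repository and another project can depend on it. The consuming project's name and the Clojure version below are placeholders.)

```clojure
;; project.clj of the consuming project (hypothetical name).
(defproject consumer-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.3"]
                 ;; Resolved from the local ~/.m2 repository
                 ;; once `lein install` has been run in the library.
                 [trivial-library-example "0.1.0-SNAPSHOT"]])
```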

ivana18:11:28

I have this code in a separate project

(ns trivial-library-example.core
  (:gen-class))

(defn add-doc [db e doc] [[:db/add e :db/doc doc]])

ivana18:11:05

I ran lein do pom, jar, install in its working directory and got /home/ivana/pet-projects/Clojure/trivial-library-example/target/trivial-library-example-0.1.0-SNAPSHOT.jar

ivana18:11:39

In the main project's Dockerfile I added ENV DATOMIC_EXT_CLASSPATH=/home/ivana/pet-projects/Clojure/trivial-library-example/target/trivial-library-example-0.1.0-SNAPSHOT.jar

ivana18:11:31

Rebuilt the container, ran @(d/transact conn [[trivial-library-example.core/add-doc 17592186045423 "this is foo's doc"]]) and got Syntax error (ClassNotFoundException)

ivana18:11:47

What am I doing wrong?

ivana19:11:05

Is my .jar file correct? How can I test it without the datomic & docker interaction, just to localise the problem? Or should I try to get rid of SNAPSHOT in its name, or something else?

ivana19:11:51

Just checked it via :resource-paths in project.clj - it works, at least in the REPL. So it seems the .jar is fine; I'll continue to fight with docker/datomic

emccue20:11:12

@U0A6H3MFT what does the separate Jar get you?

ivana20:11:15

@U3JH98J4R ability of using my own util & helper functions inside datomic transactions, not only stdlib and core

ivana20:11:32

Oh my, seems that it works!... Just trying not to forget all the steps how to reproduce it 🙂

ivana23:11:35

Finally found usable way of using datomic classpath functions as standalone tx-creators or database transactor function helpers. That was all I wish 😊

West03:11:09

@U0A6H3MFT Would you consider sharing this mini library?

ivana20:11:29

@U01KQ9EGU79, looks like it's not a case for a library in the sense of code; actually it's a step-by-step process for getting your custom Clojure code into the datomic transactor and all the ways of using it - directly or inside db-functions. Maybe I'm not smart enough, but I couldn't do it from reading the docs only, and had to spend several hours on experiments and googling, but finally I got it.

West20:11:09

Either way, the knowledge is valuable. I’d love to see this shared online.

ivana20:11:51

Maybe it's a good case for a blog post or a short YouTube stream. I'll think about how to share it, but if you're not the only one interested in it, I'd like to see all of the participants - maybe I'll organize a small stream with online code & screen sharing and realtime feedback from all the participants. What do you think about it? Or is this topic too specialized and not relevant for other people?

JohnJ19:11:39

Is there a transcript of the 'a history of clojure' talk? (I know there is a paper)