#babashka
2021-06-21
Sathiya06:06:30

Hi. I have a requirement to periodically push data into a Kafka topic in batches. If it were a one-time run, I could have used a bb script only to construct the data. But since I need to run it on a periodic basis, I need to push to Kafka from the bb script itself. But I keep running into dependency problems with every Clojure jar I use. The Java import

(import '[org.apache.kafka.clients.producer KafkaProducer ProducerRecord])
also didn't help. Can anyone help me out with this? Is it possible to use babashka for Kafka?

lispyclouds07:06:18

Hey @U016GSFNDNY so the lib you're using is written in Java, and bb is meant primarily for interpreting Clojure code, not for loading Java bytecode. Some Java classes have been baked into the bb builds for convenience and to help out libs, but not the Kafka ones.

lispyclouds07:06:32

The way you're expecting to use it won't be possible in bb unless we compile bb with it baked in. I had a similar need once and got around it by shelling out to https://github.com/deviceinsight/kafkactl and sending things. Not so smooth, but it gets the job done

Sathiya07:06:51

I tried adding various Clojure dependencies too @U7ERLH6JX. They error out on missing dependencies

Sathiya07:06:12

Is there any Kafka client library that is compatible with babashka?

lispyclouds07:06:43

Most of the kafka libs are wrappers around the canonical java lib and we need to bake that in to make it work like you want

Sathiya07:06:19

Thanks @U7ERLH6JX. That helps.

lispyclouds07:06:51

as far as I know, none of the popular libs would work unless they are somehow implemented in pure Clojure, and even then there could be issues

Sathiya07:06:39

Yes, that's exactly what happened. Thanks for your help. I will try a different approach

lispyclouds07:06:16

kafkactl with babashka.process sort of solved my usecase, which was quite simple though 😄
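
A minimal sketch of that workaround, assuming kafkactl is installed, on the PATH, and already configured for the target cluster. The produce subcommand with --key and --value follows the kafkactl README; the helper names and the topic "events" are made up for illustration and are not from the conversation.

```clojure
(require '[babashka.process :refer [process check]])

(defn produce-cmd
  "Builds the kafkactl argument vector for one message (hypothetical helper)."
  [topic k v]
  ["kafkactl" "produce" topic (str "--key=" k) (str "--value=" v)])

(defn produce!
  "Shells out to kafkactl. `check` throws if the exit code is non-zero."
  [topic k v]
  (-> (process (produce-cmd topic k v) {:out :string :err :string})
      check
      :out))

(defn produce-batch!
  "Sends each [key value] pair in turn. One process per message: simple, not fast."
  [topic pairs]
  (doseq [[k v] pairs]
    (produce! topic k v)))

(comment
  ;; Assumed topic name and payloads, for illustration only.
  (produce-batch! "events" [["id-1" "{\"n\":1}"] ["id-2" "{\"n\":2}"]]))
```

Starting one process per message is probably fine for small periodic batches; for larger volumes a pod or a JVM client would likely be a better fit.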

Sathiya07:06:09

Cool thanks. Will give it a go

lispyclouds07:06:22

you could also implement it as a Pod: https://book.babashka.org/#pods Would be a really useful one! 😄

👍 3
borkdude07:06:20

There is a pod here: https://github.com/tzzh/pod-tzzh-kafka But it doesn't have any releases yet. @UJADF897F?

Sathiya07:06:49

nice. thanks @U04V15CAJ. will try it out

Sathiya07:06:09

yeah @U04V15CAJ there is no manifest available for this

borkdude07:06:13

There is no manifest because there is no release

👍 3
borkdude07:06:26

You could try to compile it locally

borkdude07:06:46

and then load it with (load-pod "my-local-binary")
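
A sketch of loading a locally built pod, assuming the binary was written to ./pod-tzzh-kafka by go build (the path and the helper name are assumptions). Which namespaces the pod exposes after loading is not stated here, so check the pod's README.

```clojure
(require '[babashka.pods :as pods]
         '[clojure.java.io :as io])

(defn load-local-pod
  "Loads a pod from a local executable, or returns nil if the file is missing.
  Hypothetical helper; point `path` at wherever the build wrote the binary."
  [path]
  (when (.exists (io/file path))
    (pods/load-pod path)))

(comment
  ;; Assumed path. After loading, the pod's namespaces can be required
  ;; like ordinary Clojure namespaces.
  (load-local-pod "./pod-tzzh-kafka"))
```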

Sathiya05:06:53

There were errors while compiling it. I was unable to create a binary from it

lispyclouds06:06:48

How did you try to compile it @U016GSFNDNY?

lispyclouds06:06:07

I just cloned it and ran go build, and it seems to compile for me

Sathiya06:06:42

Did the same actually

lispyclouds06:06:51

what errors do you see?

lispyclouds06:06:34

also what's your go version?

Sathiya06:06:42

# 
In file included from ..\..\go\pkg\mod\\confluentinc\[email protected]\kafka\00version.go:24:
./librdkafka/rdkafka.h:83:10: fatal error: sys/socket.h: No such file or directory
   83 | #include <sys/socket.h> /* for sockaddr, .. */
      |          ^~~~~~~~~~~~~~
compilation terminated.

Sathiya06:06:46

go version go1.16.5 windows/amd64

lispyclouds06:06:37

so it seems librdkafka, the C lib powering the Confluent connector, isn't supported on Windows

lispyclouds06:06:52

basically you need to have MinGW gcc installed

lispyclouds06:06:04

if that's there, the compile may work

lispyclouds06:06:33

alternatively you could try using Windows Subsystem for Linux

tzzh13:06:01

Hey guys, yeah, sorry, there is no release because it's using CGO (i.e. C extensions for Go), which makes it harder to cross-compile (I didn't know about the issues with confluent-kafka-go on Windows, but I guess that makes it even harder)

borkdude15:06:12

@UJADF897F We have a pod which is using CGO here: https://github.com/babashka/pod-babashka-go-sqlite3 If you want, you can copy the configs :)

tzzh15:06:21

nice, thanks 🙂 I'll try to take a look when I have a bit more time

stijn08:06:33

is this intended behavior?

{:tasks {repro {:doc (str "This tasks repro is: " "showing a doc string")
                :task (shell "ls")}}}

stijn08:06:55

❯ bb tasks
The following tasks are available:

repro (str "This tasks repro is: " "showing a doc string")

stijn08:06:50

I'm trying to build a doc string from some default values, but it interprets the value of :doc literally

borkdude08:06:19

Currently the docstring isn't evaluated, except those on functions, if you refer to them with :task foo.bar/baz

borkdude08:06:35

I mean, then the docstring is picked from the function
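
A sketch of that function-based variant, assuming a script directory on the classpath and a namespace called tasks (both names are made up for illustration). The docstring lives on the function, and the task entry only names the var.

```clojure
;; script/tasks.clj (hypothetical path; bb.edn would need :paths ["script"])
(ns tasks
  (:require [babashka.tasks :refer [shell]]))

(defn repro
  "This tasks repro is: showing a doc string"
  [& _args]
  (shell "ls"))

;; bb.edn would then refer to the function instead of an inline expression:
;; {:paths ["script"]
;;  :tasks {repro {:task tasks/repro}}}
```

The docstring here is still a literal, so this does not by itself build the text from default values; it only moves it to the place where it is picked up.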

borkdude08:06:00

We could change that, this is just currently how it works

borkdude08:06:41

it's a bit difficult to decide how to evaluate those string args though

stijn08:06:57

yeah, it would come in handy, but looking at most command-line tools that support 'actions', the overview doesn't give a lot of detail; they only show it when you run something like bb repro --help

borkdude08:06:58

because it may refer to things depending on previous tasks etc

stijn08:06:49

ok, no worries. I can work with this, was just wondering if this was intended behavior

stijn08:06:22

thanks for babashka tasks by the way, it's super awesome for writing dev tooling

❤️ 3
👍 3
bocaj17:06:24

I’m looking at this library to do consistent hashing https://github.com/replikativ/hasch. The lib has a platform ns: (:import java.io.ByteArrayOutputStream java.nio.ByteBuffer java.security.MessageDigest) . Are these dependencies under consideration?

bocaj17:06:18

Or, any more bash-like idea for consistent hashing? I’m simply trying to cache which rows/objects/maps have been already processed in a data pipeline.
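
One lighter-weight option, sketched here under assumptions: this is not hasch, just a SHA-256 over the printed form of a flat map with sorted keys. It assumes rows are flat maps with comparable keys and printable values; all names below are hypothetical, and the cache lives only in memory.

```clojure
(import 'java.security.MessageDigest)

(defn sha256-hex
  "Hex-encoded SHA-256 digest of a string."
  [^String s]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes s "UTF-8"))]
    (apply str (map #(format "%02x" %) digest))))

(defn row-key
  "Cache key for a flat map. Sorting the keys makes the printed form, and so
  the hash, independent of insertion order. Nested maps or sets would need
  the same treatment, which is the part a library like hasch takes care of."
  [row]
  (sha256-hex (pr-str (into (sorted-map) row))))

;; In-memory set of keys already handled (assumption: one process, no persistence).
(def processed (atom #{}))

(defn process-once!
  "Calls f on row unless an equal row was seen before. Returns true if f ran."
  [f row]
  (let [k (row-key row)]
    (if (contains? @processed k)
      false
      (do (f row)
          (swap! processed conj k)
          true))))
```

To survive restarts, the set of keys could be written to a file and read back at startup.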

bocaj17:06:44

I like hasch b/c it’s very thorough.

borkdude17:06:08

I think these classes might already be in bb

bocaj17:06:58

digging into the code, the lib uses protocols as well: that’s the larger hurdle, right?

borkdude17:06:10

protocols are also supported in bb

bocaj17:06:14

oh! great

borkdude17:06:22

just try out the library, you'll likely run into something else ;)

bocaj17:06:20

If I hunt in here: https://github.com/babashka/babashka/blob/master/src/babashka/impl/classes.clj and don’t find e.g. ByteArrayOutputStream, then it’s not in bb yet. Correct?
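
Besides reading classes.clj, a quick runtime probe is possible. This is a sketch with a hypothetical helper name: it asks the running interpreter to resolve a fully qualified class symbol and reports whether that worked.

```clojure
(defn class-available?
  "True if the fully qualified class symbol resolves in the running runtime."
  [sym]
  (try
    (class? (eval sym))
    (catch Exception _ false)))

(comment
  ;; Run these with bb to see what the current binary knows about.
  (class-available? 'java.io.ByteArrayOutputStream)
  (class-available? 'java.io.FileInputStream))
```

A class that resolves is not a guarantee that every method on it works in bb, so trying the library itself remains the real test.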

borkdude17:06:09

Let's talk in a thread.

borkdude17:06:21

borkdude@MBP2019 /tmp $ export BABASHKA_CLASSPATH=$(clojure -Spath -Sdeps '{:deps {io.replikativ/hasch {:mvn/version "0.3.7"}}}')
borkdude@MBP2019 /tmp $ bb -e "(use 'hasch.core)"
----- Error --------------------------------------------------------------------
Type:     clojure.lang.ExceptionInfo
Message:  Unable to resolve classname: java.io.FileInputStream
Location: hasch/platform.clj:162:24
Phase:    analysis

----- Context ------------------------------------------------------------------
158:   java.io.File
159:   (-coerce [f md-create-fn write-handlers]
160:     (let [^MessageDigest md (md-create-fn)
161:           len (.length f)]
162:       (with-open [fis (java.io.FileInputStream. f)]
                            ^--- Unable to resolve classname: java.io.FileInputStream
163:         (encode (:binary magics)
164:                 ;; support default split-size behaviour transparently
165:                 (if (< len split-size)
166:                   (let [ba (with-open [out (java.io.ByteArrayOutputStream.)]
167:                              ( fis out)

borkdude17:06:43

I think we can just add that class.

bocaj17:06:27

Is there a way for me to quickly add, and iterate through all the dependencies?

borkdude17:06:08

yeah, you can clone babashka and invoke it using clojure -M:babashka/dev ... when you make an alias for it, that's how I do it

🙏 3
borkdude17:06:46

After adding that class, it appears to work

borkdude17:06:59

$ clojure -M:babashka/dev -e "(use 'hasch.core) (uuid5 (edn-hash \"hello world\"))"
#uuid "1227fe0a-471b-5329-88db-875fb82737a8"

borkdude17:06:17

I will commit to master. New binary will appear in #babashka-circleci-builds in a few minutes

borkdude17:06:37

@U068BQFJ9 ok, it's there now and it seems to work.

$ bb -e "(use 'hasch.core) (edn-hash [\"hello world\" {:a 3.14} #{42} '(if true nil \f)])"
(245 139 93 212 32 95 155 217 230 42 204 224 210 124 22 156 241 230 65 199 21 108 160 143 225 185 228 16 141 66 80 96 35 189 202 198 252 99 12 214 23 106 43 113 138 129 16 131 7 110 102 1 55 15 4 148 118 187 201 28 144 90 192 21)

bocaj17:06:49

Thanks, again! Do you do windows builds each release?

bocaj17:06:36

wow, again

bocaj18:06:04

So glad this is all in the open, I have lots to learn.

borkdude18:06:10

@U068BQFJ9 To be clear, you can just download the binary from appveyor and test it out, right now

borkdude18:06:25

You don't have to wait for a release and you can do some testing pre-release if you want

bocaj18:06:48

Yes, I assumed that. It’s working on my mac.

bocaj18:06:26

I have a windows server, so I’ll check that later. Safe to assume windows binary always works out?

borkdude18:06:01

Just to be sure, I'll add hasch's unit tests to babashka's CI later this week. Those tests also run on Windows CI

borkdude18:06:12

but usually there aren't any Windows-specific problems

👍 3
borkdude19:06:39

I added all tests to CI now. They are all passing, except around hashing records like (defrecord Foo []). This is because records are implemented differently in bb.