#onyx
2017-04-02
lmergen14:04:06

is there something like a ‘no-op’ output for onyx ? i’ve seen leaf functions, is there an alternative approach ?

Travis14:04:16

We just use Clojure’s identity function for that

lmergen14:04:18

yeah but it appears as if onyx always wants to have a workflow end in an :output type

Travis14:04:08

use identity as your function and mark it as an output
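
A minimal sketch of the kind of catalog entry this advice points at, assuming the built-in function plugin (`:onyx.peer.function/function`) is available in the Onyx version in use. The task name `:no-op` and the batch size are hypothetical; only the use of `identity` and the `:output` type come from the discussion.

```clojure
;; Hypothetical catalog entry: a leaf task that passes segments through
;; unchanged and ends the workflow. Assumes :onyx.peer.function/function
;; exists in your Onyx version; check the docs for your release.
{:onyx/name :no-op                           ; hypothetical task name
 :onyx/plugin :onyx.peer.function/function   ; assumption: built-in function plugin
 :onyx/fn :clojure.core/identity             ; identity, as suggested above
 :onyx/type :output                          ; lets the workflow end on this task
 :onyx/medium :function
 :onyx/batch-size 20}                        ; illustrative value
```

A workflow edge such as `[:some-task :no-op]` (where `:some-task` is a placeholder) would then end on this task without writing to an external medium.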

lmergen15:04:36

hmmm, i’m having some trouble getting onyx-kafka to work properly — specifically, getting errors about unable to match a protocol for kafka.Cluster.Endpoint: https://www.refheap.com/708d9229757d311059aca3472

lmergen15:04:44

(paste above contains stack trace + catalog)

lmergen15:04:52

i’m not sure where to look for the cause of the error — misconfiguration of catalog, incompatible library versions, or something else ?

gardnervickers16:04:25

@lmergen How are you running Kafka?

lmergen16:04:46

@gardnervickers in a docker container

gardnervickers16:04:31

Are you setting your KAFKA_ADVERTISED_HOSTNAME?

drewverlee17:04:53

I'm considering whether I could create a wrapper around Pyroclast and make it our analytics engine. This would mean letting our data scientists submit onyx jobs. I’m trying to understand the possible pros and cons of such an approach.
* Would Pyroclast be able to handle a large number of chaotic job requests, e.g. lots of changes to jobs, deletes, etc.?
* What is the pricing model? Is it based on data used, or does it depend on the number of jobs?
* Could we build a wrapper around the simulator to give it our brand?

drewverlee17:04:16

should i ask this question somewhere else 🙂

gardnervickers17:04:12

@drewverlee
1. Yes, frequently updating jobs is something we expect to happen, and we’re actively working on tooling to help users stop/start jobs at the right offset in their event stream.
2. Mike or Lucas would have to speak to that. We’ve discussed this internally quite a bit, but I don’t think we have anything public to share yet.
3. We’ve had a lot of interest in on-prem Pyroclast, which has some nice advantages but complicates distribution/support a bit.

lmergen17:04:24

@gardnervickers so that’s likely to be the root cause ? i’m a bit of a kafka noob, and was just using some run-of-the-mill kafka docker container

gardnervickers17:04:48

Possibly, can you share how you’re running Kafka?

gardnervickers17:04:54

And ZK for that matter?

lmergen17:04:03

well, zookeeper is even just the standard zookeeper:latest docker image, and kafka (as i have just discovered) does not have advertised.host.name or advertised.port set

lmergen17:04:24

so it appears you’re on to something 🙂

gardnervickers17:04:50

Ahh ok, you’ll want to advertise a hostname and port that you can reach from outside of docker.

gardnervickers17:04:12

Depends on where docker is running on your machine
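
One way to sanity-check the "reachable from outside of docker" part is a plain TCP connect from the host JVM. This is a rough sketch, not part of Onyx or onyx-kafka; the function name `reachable?` and the host/port values in the usage note are assumptions for illustration.

```clojure
(import '(java.net InetSocketAddress Socket)
        '(java.io IOException))

(defn reachable?
  "Returns true if a TCP connection to host:port succeeds within
  timeout-ms, false on any I/O failure (refused, timed out, unknown host)."
  [host port timeout-ms]
  (try
    (with-open [s (Socket.)]
      (.connect s (InetSocketAddress. ^String host (int port)) (int timeout-ms))
      true)
    (catch IOException _
      false)))

(comment
  ;; Hypothetical values: the address and port the broker advertises.
  (reachable? "127.0.0.1" 9092 1000))
```

A successful connect only shows that the port mapping works. Clients still reconnect to whatever address the broker advertises, so that address has to pass the same check from wherever the Onyx peers run.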

gardnervickers17:04:45

In fact we do use that 😄

lmergen17:04:14

i’ll replace it with that config as well, and see whether that helps

lmergen18:04:48

@gardnervickers well, that doesn’t appear to be it. anything else i can do to figure this out ?

lmergen18:04:14

it really feels as if it is a configuration issue somewhere, with onyx-kafka not being able to figure out what the kafka brokers are

gardnervickers18:04:00

Where is docker running for you? 192.168.99.100 or 127.0.0.1?

lmergen18:04:04

OSX, so it’s hidden somewhere in a virtual machine

lmergen18:04:14

however, i’m mapping the ports on the host machine

gardnervickers18:04:14

So docker machine then?

lmergen18:04:21

so i’m advertising localhost

lmergen18:04:41

similar to that

gardnervickers18:04:58

So you’re using docker for mac, not docker-machine?

lmergen18:04:41

i’m not using docker-machine, i’m using the “noob” docker for osx

gardnervickers18:04:24

That’s odd. In your container logs for Kafka, does everything look ok?

gardnervickers18:04:48

I use this sometimes to work with Kafka in dev https://github.com/edenhill/kafkacat

lmergen18:04:56

yep, nothing wrong there

lmergen18:04:07

well, it’s 9pm here, i’ll take a fresh look at this in the morning

lmergen18:04:29

i’m a total kafka noob, so i’m probably doing something silly

gardnervickers18:04:53

We’ll get it figured out tomorrow, catch ya later!

lmergen18:04:22

fwiw, the kafka logs clearly state
> Registered broker 1001 at path /brokers/ids/1001 with addresses: PLAINTEXT -> EndPoint(127.0.0.1,9092,PLAINTEXT) (kafka.utils.ZkUtils)

lmergen18:04:28

which is correct

lmergen19:04:59

could it have something to do with my serializer-fn ?

lmergen19:04:25

could it, hypothetically, be that the latest kafka version is not compatible ? i’m looking at this code https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/cluster/EndPoint.scala and it looks like protocolType has been removed just a few months ago in this commit: https://github.com/apache/kafka/commit/d25671884bbbdf7843ada3e7797573a00ac7cd56#diff-bab4de8ac88a63d2a69ea54e08d286dcL58

lmergen19:04:35

(which triggers an exception in this library, which hasn’t been updated in well over a year) : https://github.com/ymilky/franzy-admin/blob/master/src/franzy/admin/codec.clj#L200
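
To test the version-mismatch hypothesis, it can help to see which jar a class was actually loaded from. A small sketch using standard JVM reflection; the helper name `jar-of` is made up for illustration, and the usage line assumes `kafka.cluster.EndPoint` is on the classpath.

```clojure
(defn jar-of
  "Returns the location (as a string) of the jar or directory a class was
  loaded from, or nil when the JVM does not expose a code source for it
  (for example, core JDK classes)."
  [^Class c]
  (some-> c
          .getProtectionDomain
          .getCodeSource
          .getLocation
          str))

(comment
  ;; Assumes Kafka is on the classpath; the jar file name typically
  ;; includes the Kafka version that was resolved.
  (jar-of kafka.cluster.EndPoint))
```

If the reported jar is newer than what franzy-admin was written against, that would be consistent with the removed `protocolType` field being the cause, though it does not prove it.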