This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-04-29
Channels
- # admin-announcements (1)
- # aws (10)
- # beginners (76)
- # boot (53)
- # braid-chat (1)
- # cider (80)
- # cljs-edn (3)
- # clojure (65)
- # clojure-belgium (2)
- # clojure-gamedev (2)
- # clojure-nl (3)
- # clojure-poland (1)
- # clojure-russia (39)
- # clojure-uk (14)
- # clojurescript (91)
- # cursive (62)
- # datascript (1)
- # datomic (9)
- # dirac (34)
- # emacs (25)
- # error-message-catalog (8)
- # events (1)
- # hoplon (88)
- # instaparse (1)
- # jobs (2)
- # jobs-discuss (6)
- # lein-figwheel (7)
- # luminus (43)
- # mount (5)
- # off-topic (7)
- # om (28)
- # onyx (61)
- # planck (4)
- # re-frame (27)
- # reagent (3)
- # remote-jobs (2)
- # spacemacs (3)
- # untangled (136)
What is the recommended way to set the number of peers for the docker-compose cluster? (https://github.com/onyx-platform/onyx-template/blob/0.9.x/src/leiningen/new/onyx_app/script/run_peers.sh#L8)
This is done via a docker environment variable, through NPEERS, which is supplied in https://github.com/onyx-platform/onyx-template/blob/0.9.x/src/leiningen/new/onyx_app/docker-compose.yml
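As a rough sketch of what that looks like (service and image names here are assumptions, not copied from the template; check the linked docker-compose.yml for the real values):

```yaml
# Hypothetical excerpt of a docker-compose.yml for the Onyx template.
# NPEERS is read by script/run_peers.sh to decide how many virtual
# peers to start inside the container.
peer:
  image: onyx-app
  environment:
    NPEERS: 6
```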
Ah thanks, I missed it completely
No worries
What is the best way to find the underlying issue when you continuously see “Not enough virtual peers have warmed up to start the task yet, backing off and trying again…”?
I (think) I have enough peers to complete the task, and I have upped the ZooKeeper allowed client connections to 1000. Not sure what else it could be
Ah, I think it is a hidden exception:
peer_1 | ...
peer_1 | org.apache.zookeeper.ZooKeeper.getData ZooKeeper.java: 1184
peer_1 | org.apache.zookeeper.ZooKeeper.getData ZooKeeper.java: 1155
peer_1 | org.apache.zookeeper.KeeperException.create KeeperException.java: 51
peer_1 | org.apache.zookeeper.KeeperException.create KeeperException.java: 111
peer_1 | org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/topics/onyx-data
peer_1 | code: -101
peer_1 | path: "/brokers/topics/onyx-data"
peer_1 | clojure.lang.ExceptionInfo: Caught exception inside task lifecycle. Rebooting the task. -> Exception type: org.apache.zookeeper.KeeperException$NoNodeException. Exception message: KeeperErrorCode = NoNode for /brokers/topics/onyx-data
Ok, fixing the exception helped. I guess the “not warmed up” message is a bit misleading (for me)
Yeah, I think there’s an interaction there. One of the peers fails while starting, which means that peer hasn’t warmed up and the job is killed. Maybe we could change “warmed” to “started” or something like that
or maybe “signalled ready”
Yeah, it would be nice if something told me that something went wrong during startup, instead of the generic everything-is-good-just-wait message
Another thing, I made a mistake in the configuration of the kafka deserialization function, so there was no output. I assume there was an error somewhere given this line https://github.com/onyx-platform/onyx-kafka/blob/4e9a9da8804677b447645d872c437aa7d6619692/src/onyx/tasks/kafka.clj#L11 Where can this error be found?
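For context, a Kafka input task in the catalog is configured roughly like this. This is a hedged sketch from memory of the 0.9.x onyx-kafka README (task name, topic, and namespace of the deserializer are hypothetical; verify the exact keyword names against the linked source):

```clojure
;; Sketch of an onyx-kafka input task map. :kafka/deserializer-fn points
;; at a fully qualified function that turns raw bytes into a segment;
;; a mistake here (wrong namespace or arity) silently yields no output,
;; which is the situation described above.
{:onyx/name :read-messages
 :onyx/plugin :onyx.plugin.kafka/read-messages
 :onyx/type :input
 :onyx/medium :kafka
 :kafka/topic "onyx-data"
 :kafka/zookeeper "zk:2181"
 :kafka/deserializer-fn :my.app/deserialize-message
 :onyx/batch-size 20}
```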
btw, I was wrong yesterday about the incompatibility between Kafka clients (0.8.1 vs 0.8.2). It seems that an old client can write to a newer cluster, and be read by a new client just fine. (Still need to test whether a newer client can read from an old cluster though)
I think I ran into this issue while trying to consume Kafka messages from my OS X host: http://stackoverflow.com/questions/28664456/kafka-unable-to-connect-to-zookeeper#answer-36841719 Not sure how to solve this. I’m running everything inside the docker-compose cluster now
You should find that exception in the logs, or you can read the exception that killed the job via the job id. I'll point you to the function to do that in a second
For the Kafka connection issue, I assume Kafka is outside of your docker compose setup?
@jeroenvandijk: you can use this function which is part of onyx.test-helper to wait until a job is completed, and if it’s killed instead read back the exception that caused it to be killed
https://github.com/onyx-platform/onyx/blob/0.9.x/src/onyx/test_helper.clj#L12
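From memory, the function at that line is `feedback-exception!`; a minimal usage sketch, assuming that name and a two-argument arity (verify against the linked source):

```clojure
;; Blocks until the job completes. If the job was killed instead,
;; this rethrows the exception that killed it, so the real cause
;; surfaces instead of the generic "warming up" message.
(require '[onyx.test-helper :refer [feedback-exception!]])

(feedback-exception! peer-config job-id)
```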
The dashboard also works to view the exceptions
@lucasbradstreet: regarding the Kafka connection issue, I was trying to consume from a Clojure process on my OS X host, connecting to the Kafka process running inside the boot2docker cluster. Note that producing did work, and connecting to ZooKeeper also; just consuming requires something different (similar to telnet’s requirements, I suppose, as that didn’t work either from the host machine)
Regarding the exception, I think the deserializer function was eating the exception via the try/catch and emitting {:error "something"}. I was wondering where I would find these “error” messages
Oh right, I didn’t notice that it didn’t rethrow
I think the idea there is that the deserializer shouldn’t necessarily take down the job, but you will have to use flow conditions to catch segments that have the error key in them, and pass them to an error handling task
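A minimal sketch of that pattern, assuming hypothetical task names (`:read-messages`, `:process`, `:handle-error`) and the standard four-argument flow predicate shape:

```clojure
;; Route segments carrying an :error key (emitted by the deserializer's
;; try/catch) to a dedicated error-handling task; clean segments continue
;; to the normal downstream task.
(defn error? [event old-segment new-segment all-new]
  (contains? new-segment :error))

(defn ok? [event old-segment new-segment all-new]
  (not (contains? new-segment :error)))

(def flow-conditions
  [{:flow/from :read-messages
    :flow/to   [:handle-error]
    :flow/predicate ::error?}
   {:flow/from :read-messages
    :flow/to   [:process]
    :flow/predicate ::ok?}])
```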
Ah yeah, I see. I guess I have to be more careful
with the kafka connection issue, I think you may need to expose the ports under the kafka and zookeeper settings in docker-compose.yml, and then use your boot2docker machine’s ip in the local consumer
Yeah, I think that’s what I did. I can actually produce to Kafka from the host machine, and I can read ZooKeeper from the host machine. Just consuming the Kafka stream gives a ZooKeeper error. I think this requires something else than just making sure the default ports (2181 and 9092) are properly forwarded
Ah I think what is happening is that kafka is advertising itself to ZooKeeper with an IP that you can’t connect to, since it’s internal
maybe, but note that telnet doesn’t work either from the host and that’s just zookeeper
I think that requires setting advertised.host.name in your kafka server
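For reference, a sketch of the relevant broker settings in Kafka 0.8's server.properties (the IP here is an example boot2docker address, not a value from the conversation):

```properties
# Advertise an address the host machine can actually reach, instead of
# the container-internal IP that Kafka would otherwise register in
# ZooKeeper. Consumers discover brokers via this advertised address.
advertised.host.name=192.168.99.100
advertised.port=9092
```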
ah that is interesting
I had to change the advertised hostname already to produce messages
so you can’t telnet to your boot2docker ip at port 9092?
from my docker-compose.yml
KAFKA_ADVERTISED_HOST_NAME: ${DOCKER_HOSTNAME}
When you docker ps, is it forwarding 9092 to 9092?
Oh man, I think I screwed up. Telnet works. I copy-pasted the wrong port
Then I don’t know what goes wrong
But consuming still doesn’t work from the host
816b2e35bcb7 wurstmeister/kafka "start-kafka.sh" About an hour ago Up About an hour 0.0.0.0:9092->9092/tcp adgojietlb
I assume telnetting to 2181 from the host works?
yeah both 2181 and 9092
It is not a major issue, as I can rebuild my job jar and run it inside the cluster, but it would be more convenient
Any idea what DOCKER_HOSTNAME resolves to? Could you try manually setting it to the boot2docker ip and see if that works?
Ah sorry, that’s my own creation: DOCKER_HOSTNAME=$(echo $DOCKER_HOST|cut -d ':' -f 2|sed "s/\/\///g") docker-compose up
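Unpacking that one-liner (the DOCKER_HOST value below is an example boot2docker address, not from the conversation):

```shell
# DOCKER_HOST looks like tcp://192.168.99.100:2376.
# cut -d ':' -f 2 takes the middle field ("//192.168.99.100"),
# and sed strips the leading slashes, leaving a bare IP suitable
# for KAFKA_ADVERTISED_HOST_NAME.
DOCKER_HOST="tcp://192.168.99.100:2376"
DOCKER_HOSTNAME=$(echo "$DOCKER_HOST" | cut -d ':' -f 2 | sed "s/\/\///g")
echo "$DOCKER_HOSTNAME"   # → 192.168.99.100
```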
I double checked, it’s the same
Yeah, that looks right
It seems to be a zookeeper connection error anyway. Hmm
The fact that you can telnet to it but your clojure process on your host can’t connect to it is weird
Yeah I’ll leave it for a while and try again later. Maybe I’m just trying too many things at a time
Wireshark has Kafka protocol support. I've found it useful when diagnosing Kafka connection issues.
I was fiddling with advertised.host.name yesterday, too. That can be frustrating to get right.
+1500 lines for the new static job analysis patch. Once again proving that providing good error messages is really, really hard.
Also proving that line count isn’t always the best judge of code quality
I will finally have some time next week, so I'm going to get the experimental other-language support stuff moved under onyx-platform
It's in my personal github now: https://github.com/bridgethillyer/onyx-ruby
The cleanup is already partially done anyway. Just need to (remember where I was and) finish up and push the changes.
Great, thanks!
Upgrade to [org.onyxplatform/onyx "0.9.5-20160429.201738-5"] to try out the new static analyzer. The official release will be out mid next week.