This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2015-11-04
@michaeldrogalis Okay, so I have verified that if I start two instances of the app with the same :onyx/id, none of the tasks in the workflow actually start
If I go back to each instance having its own :onyx/id
then one of them executes tasks as normal, but the other one never gets any tasks
So I wonder if there is something I need to do to trigger the second peer joining the cluster?
So specifically, segments are pushed onto the channel but never read off the channel
@spangler: You said channel - are you trying to use core.async in a distributed environment?
The core.async plugin is for dev only and shouldn't be used in a multi-machine environment since core.async is a local runtime thing in general
I was trying to get this to work before going over to kafka, but maybe I need to switch over to kafka before this will work?
I believe the production profile of the template reads from an HTTP endpoint and dumps its output to a core.async channel - which is fine but kind of useless since no one can read that.
It's reading from a channel, which doesn't really make sense
You could do that. It's hard to say without seeing your logs; you might have a more basic connectivity problem
Like, if the peers aren't receiving tasks at all, period, it's the latter
onyx.log
Sure, I can check it out
So from the log it looks like it tries to start all the tasks, backs off for a while, then starts them
Is this from 1 machine?
Seems like all the tasks started fine, nothing is wrong there.
Do you see similar messages in the other log?
Right, except I have println's in all the tasks right when they start, and none of them fire
When you say "start", where do you mean?
Yeah - okay, so my guess is that Aeron's addressing is misconfigured, and the two instances can't send messages to each other. Basic network configuration hiccup
I'd investigate with standard netstat usage to make sure ports are open, and Netcat or whatever to see if data is coming through. Same as you'd do for any distributed system's pieces that can't talk
Hmm, I am using the aeron setup from the template
(defn -main [& args]
  (let [ctx (doto (MediaDriver$Context.))
        media-driver (MediaDriver/launch ctx)]
    (println "Launched the Media Driver. Blocking forever...")
    (<!! (chan))))
In your peer config, you specify a hostname and port
http://www.onyxplatform.org/cheat-sheet.html#:onyx.messaging/bind-addr && http://www.onyxplatform.org/cheat-sheet.html#:onyx.messaging/peer-port
The media driver supports the Aeron process; those params configure how the peers talk, specifically in terms of network protocols
Ah, sorry those links are broken
Those params are under "peer configuration".
It doesn't matter, as long as it's free. I think bind-addr is burning you. Again, just a guess. But that's the hostname the peer advertises for how to contact it
So if you're using localhost, obviously that can't work
If I have two instances on the same machine, I need to give them different bind-addrs?
We probably shouldn't have that in the prod template, but there's not really a good default
No - you need to supply a hostname that any other node in your cluster can use to talk to it.
It's like the "This is my IP, use it to talk to me" param
So, that makes sense for when they are deployed on different machines... but if I want to test it on my development box, how can I do that?
Use localhost for that, but use different peer ports so they don't collide.
Our bad on this one. See the first note in the upcoming 0.8.0 release: https://github.com/onyx-platform/onyx/blob/master/changes.md#080
Keep using peer-port-range for now, switch it to peer-port when we release. Tiny change, big efficiency gains
Not in any 0.7.x version. It's added in 0.8.0-SNAPSHOT though.
Or do I need to set them each to different peer-port-ranges to make it work currently?
Use non-overlapping ranges. I can't recall if it's smart enough to not collide
It might be, give it a shot I suppose
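For anyone following along, a minimal sketch of what the advice above might look like for two instances on one dev box under 0.7.x. The id, the ZooKeeper address, the scheduler choice and the port numbers are all illustrative assumptions, not values from this conversation. As far as I know, peers only join the same cluster when they share an :onyx/id, so the sketch reuses one id and varies only the port range.

```clojure
;; Sketch only: every concrete value here is an assumption for illustration.
(def base-peer-config
  {:onyx/id "dev-cluster"                       ; hypothetical id, shared by both instances
   :zookeeper/address "127.0.0.1:2181"          ; assumed local ZooKeeper
   :onyx.peer/job-scheduler :onyx.job-scheduler/greedy
   :onyx.messaging/impl :aeron
   :onyx.messaging/bind-addr "localhost"})      ; only reasonable while everything runs on one box

;; Each local JVM gets its own non-overlapping range so the peers cannot collide.
(def instance-a-config
  (assoc base-peer-config :onyx.messaging/peer-port-range [40200 40299]))

(def instance-b-config
  (assoc base-peer-config :onyx.messaging/peer-port-range [40300 40399]))
```

Once 0.8.0 is out, the range key would presumably be swapped for a single :onyx.messaging/peer-port per instance, per the change notes linked above.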
Alright I gotta run, sounds like you're in good shape. Catch ya tomorrow!
Anytime!
@spangler: if you're going to test locally with two JVM instances on the same machine, using different Aeron ports, look up how to turn off short circuiting via the peer config. If you don't it'll lead to lost messages.
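A hedged sketch of that setting: as I recall from the cheat sheet, the peer config key is :onyx.messaging/allow-short-circuit?, but double-check the name against the version you are running before relying on it.

```clojure
;; Fragment to merge into the peer config of each local JVM.
;; Key name is from memory of the Onyx cheat sheet; verify it for your version.
{:onyx.messaging/allow-short-circuit? false}
```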
You need to get the IP address of the interface you’re binding to. This is a good discussion of it http://stackoverflow.com/questions/9481865/getting-the-ip-address-of-the-current-machine-using-java
Alternately you could do some ifconfig shell magic and pass that in via an environment variable or command line arg
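Roughly, both options could be combined like this. It is a sketch, not a recommendation: BIND_ADDR is a hypothetical environment variable name, and InetAddress/getLocalHost can hand back a loopback address on some machines, which is the case the Stack Overflow thread above deals with by enumerating network interfaces instead.

```clojure
(import '[java.net InetAddress])

(defn resolve-bind-addr
  "Returns the BIND_ADDR environment variable when set (hypothetical name),
  otherwise the address the JVM reports for the local host."
  []
  (or (System/getenv "BIND_ADDR")
      (.getHostAddress (InetAddress/getLocalHost))))

;; Merge this into the rest of the peer config at startup.
(def peer-config-overrides
  {:onyx.messaging/bind-addr (resolve-bind-addr)})
```

Passing the value in from outside keeps the choice of interface with whoever deploys the node, which tends to be safer on hosts with several network interfaces.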
Not really. We’d probably need something durable like Kafka set up so that we could actually try pushing some data between nodes and out to somewhere we could check
I’ll put it on my list of things to consider doing though
Ah. You may want to shoehorn your code onto the onyx-template
it’s all set up for multiple dev/prod modes
Even if you don’t try to port it over, it’d be worth creating a new project with it and having a look at how it does things
I'm about to write a "going to production" check list. I'll paste it in here when it's done.
Hi all, we finally created a production ready (or multi-node) checklist that you can run through before going to production https://github.com/onyx-platform/onyx/blob/develop/doc/user-guide/environment.md#multi-node--production-checklist
@devll see above
@lucasbradstreet: it seems to me like nearly all of these points could be checked automatically by a linter?
That’s a good point. Maybe half of them could be
Probably a bit less than that. In addition, understanding of what’s actually going on is important
We’re having a lot of users go to production recently and it’s better to get a doc up before we consider that
It’s made especially hard because we need to check a number of conditions from multiple nodes
You’re right though, there are several settings that should be configured a certain way when used in production/multi-node. I think it’d require an extra peer-config/env-config setting to be enabled before it gets linted.
I'll end up asking this a few times in the next few weeks, but can you speak up here/PM me if you're using Onyx in production or are using it internally? Compiling my Conj slides.
i reckon you’re aware by now that we’re using it
@robert-stuttaford: Indeed, got'cha 😉
@michaeldrogalis: we're in test... customer pilots in the next few weeks, full production sometime in december or january probably
@mccraigmccraig: I can stick your company's logo on the slide if you'd like.
@erichmond: Hahah thanks man!