#onyx
2016-08-05
gardnervickers00:08:35

Definitely write your workflow as a test using mock data and inputs/outputs first

gardnervickers00:08:43

Or if you have access to your live data source, you can just read it in through the appropriate Onyx plugin locally while you write your test.

Drew Verlee03:08:01

sorry, feel like I asked this before but forgot to write it down (that and some other bits). Does it make more sense to start a project that intends to end up on Kubernetes with onyx-template, onyx-starter, or something else? I'm reading over the project files now to get a sense of how things are set up.

gardnervickers03:08:06

I would definitely go with the Onyx Template. It gets you started with a Dockerfile and docker-compose.

gardnervickers03:08:04

On top of that, I am currently working on the docs to go along with a full Kubernetes deployment. Things have been rather busy lately so it’s slow going. It’s being developed in the open here https://github.com/onyx-platform/onyx-twitter-sample/tree/master/kubernetes Most everything works as far as K8s manifests go, it’s just that to use it requires a fairly comprehensive understanding of both Onyx and Kubernetes at this point.

gardnervickers03:08:57

It mostly just needs a final good push to tidy up some loose ends and piece together the disparate documentation. Expect something more concrete in the very near future.

gardnervickers03:08:20

@drewverlee: I sense you’re asking that question to try and be as future-proof as possible, if that’s the case then going with the Onyx Template as it stands now, and following the task bundle pattern will get you there 😄

Drew Verlee04:08:08

@gardnervickers: Thanks a ton for the insight. Future-proofing was one aspect of it. My question was mainly because of my naivety with the to-production story. I suppose that story depends greatly on your needs, though. It's been helpful to read someone's specific example and thought process: https://dataissexy.wordpress.com/2016/07/31/using-onyx-template-to-craft-a-kafka-streaming-application-part-1-clojure-onyx-kafka-streaming-data/

gardnervickers04:08:38

Ah great good to hear!

dominicm11:08:48

Is there a strong reason that :onyx/params uses keywords that point into the catalog entry? It's an interesting idea.

lucasbradstreet12:08:16

As opposed to a vector of args?

lucasbradstreet12:08:28

I think you'd have to ask Mike, but my guess would be to make the data in the catalog entries more composable, e.g. you can just merge task map opts in to your task map, without having to check whether you have to mess with an args vector

lucasbradstreet12:08:49

It also allows tasks to be supplied with or without a fn, where the args are named, which is a kind of documentation

dominicm12:08:01

That seems to make sense. I guess it makes it easier to add arguments in varying orders too.
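To make the discussion above concrete, here is a sketch of a catalog entry using :onyx/params. The :my/add task name, :add/x key, and :my.ns/add-x function are hypothetical, not from the conversation:

```clojure
;; A hypothetical catalog entry. :onyx/params lists the keys in this
;; map whose values are passed as leading arguments to :onyx/fn.
(def add-task
  {:onyx/name :my/add
   :onyx/fn :my.ns/add-x
   :onyx/type :function
   :add/x 10
   :onyx/params [:add/x]
   :onyx/batch-size 20})

;; Because parameters live in the map under named keys, composing
;; options is a plain merge rather than surgery on a positional
;; args vector:
(def tuned-task (merge add-task {:add/x 42}))
```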

dominicm12:08:00

Do you know where the idea for the catalogue came from? I'm interested in implementing something similar in a library.

dominicm12:08:18

For a very different domain. But it's just such a nice idea.

lucasbradstreet12:08:49

You'd have to ask Mike :)

dominicm12:08:58

Haha, okay. Thanks 🙂

michaeldrogalis14:08:35

@dominicm: Once you separate structure from the equation (the workflow), you have to do something with all the parameterization that it used to be joined with. Thus born, the catalog.
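As a sketch of that split (task names here are hypothetical), the workflow is pure structure and the catalog carries all the parameterization it used to be joined with:

```clojure
;; The "equation": pure structure, no parameters.
(def workflow
  [[:read-input :process]
   [:process :write-output]])

;; The catalog: the parameterization, keyed by task name.
(def catalog
  [{:onyx/name :process
    :onyx/fn :my.ns/process
    :onyx/type :function
    :onyx/batch-size 20}])
```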

dominicm14:08:25

@michaeldrogalis: It's an awesome structure. It's something I've been grasping at mentally. Seeing it realized in Onyx made it finally click for me.

dominicm14:08:38

I'm currently building a validation library built on a similar split.

dominicm14:08:38

I just think it's such a good way to bring declarative programming to Clojure.

michaeldrogalis14:08:31

@dominicm: One of the upshots is that, since our entire API is described in a big Clojure map, we were able to generate a Datomic schema pretty quickly, then load in tasks exactly as they are.

dominicm14:08:14

@michaeldrogalis: Not dissimilar from the benefits I'm seeing in this domain. I hadn't even considered its application to Datomic schema generation yet.

michaeldrogalis14:08:52

Clojure.spec generation, too, for client & server.

dominicm14:08:34

Yep. That's my favourite one. Especially as the equation is serializable, the client can take the equation, and figure out what it needs from it.

michaeldrogalis14:08:14

It's been good stuff. :thumbsup:

michaeldrogalis16:08:34

@vijaykiran: Upgrading the docs may be a weekend activity. Lots to do today. Haven't forgotten about it. 🙂

aengelberg17:08:06

It looks like Onyx yells at me if I specify a :onyx/type :function task with no outbound edge in the graph. Is there a way to specify a function task with no out edge, whose only purpose is some side effect with no result to pass downstream?

lucasbradstreet17:08:07

you can just set :onyx/plugin :onyx.peer.function/function

lucasbradstreet17:08:25

it’s not fantastically documented unfortunately. We should make a better error message

aengelberg17:08:19

ah awesome, thanks guys!

Travis17:08:23

Any ideas on why onyx test env would hang? Also no onyx.log is being produced

aengelberg17:08:53

hmm, tried putting in that plugin but still getting:

------ Onyx Job Error -----
There was a validation error in your workflow for key :handle-lifecycle
[
   [:read-input :dispatch-events]
   [:dispatch-events :handle-lifecycle]
                     ^-- Intermediate task :handle-lifecycle requires both incoming and outgoing edges.
...

michaeldrogalis17:08:49

@camechis: No idea, sorry. Doesn't sound like it actually got off the ground if the log isn't writing

aengelberg17:08:55

oh I see, it has to be a :onyx/type :output.
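Piecing the answer together, a sketch of a side-effect-only leaf task (the task and function names are hypothetical): it is declared :onyx/type :output and backed by the built-in function plugin, so no outgoing edge is required:

```clojure
;; A leaf task that exists only for its side effects. Typing it as
;; :output with the built-in function plugin means Onyx does not
;; demand an outgoing edge for it.
{:onyx/name :handle-lifecycle
 :onyx/plugin :onyx.peer.function/function
 :onyx/fn :my.ns/handle-lifecycle!   ; hypothetical side-effecting fn
 :onyx/type :output
 :onyx/medium :function
 :onyx/batch-size 20}
```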

Travis17:08:39

@michaeldrogalis: Yeah, that's my thought. I know I didn't give you a lot to go on, lol

Travis18:08:42

So my onyx job with the onyx test env is running ( although having issues somewhere in the middle ) but no onyx.log is being created. Any ideas on what would cause the log to not be produced at all?

gardnervickers18:08:07

What does your config look like?

Travis18:08:14

which part?

Travis18:08:19

peer or env or both?

gardnervickers18:08:40

onyx.log/config

Travis18:08:57

ah, it's set to warn? Is there any warn output?

gardnervickers18:08:18

Try not setting it

gardnervickers18:08:22

use the defaults

Travis18:08:49

just switched it to trace in the config. So looks like this

Travis18:08:14

:onyx.log/config #profile {:default {:level :trace}
                           :docker {:level :info}}

Travis18:08:34

for both peer and env

gardnervickers18:08:48

That forces the default timbre appenders, writing to stdout

gardnervickers18:08:56

set :default nil

gardnervickers18:08:04

and then you’ll get logging out to onyx.log

gardnervickers18:08:23

we set {:level :info} there so when using docker we write to stdout as is customary

Travis18:08:43

that makes sense

Travis18:08:18

hmm set it to nil but still nothing in the log

Travis18:08:02

:onyx.log/config #profile {:default {:level nil}
                             :docker {:level :info}}

Travis18:08:37

there we go

mccraigmccraig19:08:43

just upgraded to 0.9.9.0 and i'm seeing this from onyx-kafka on startup - https://www.refheap.com/121943 - does that look familiar ?

lucasbradstreet19:08:40

I’m waiting for it to load

lucasbradstreet19:08:51

but is it possible you’re using kafka 0.8

lucasbradstreet19:08:00

you need to use plugin onyx-kafka-0.8
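For reference, the Kafka 0.8-compatible plugin is a separate artifact; a project.clj dependency sketch (the exact version string is an assumption, match it to your Onyx release):

```clojure
;; project.clj dependency for the Kafka 0.8-compatible Onyx plugin.
;; Version shown is an assumption; align it with your Onyx version.
:dependencies [[org.onyxplatform/onyx-kafka-0.8 "0.9.9.0"]]
```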

mccraigmccraig19:08:45

i've upgraded production, just haven't upgraded my laptop

lucasbradstreet19:08:46

We probably should’ve done a better job making sure that the upgrade threw an error there

lucasbradstreet19:08:03

I don’t know if we even mentioned it in changes.md. Sorry

mccraigmccraig19:08:18

it is a bit non-obvious 🙂 - thanks for the help @lucasbradstreet

Travis20:08:40

Is there an easy way to tell if multiple peers are properly communicating?

Travis20:08:15

I just scaled up on Mesos but I am not sure if the peers are connecting to each other or if they each think they're running standalone

gardnervickers20:08:14

Have your nodes each spawn one or two peers and run a job that requires more than one or two peers.

gardnervickers20:08:49

But tbh, as long as your bind_addr is routeable from all other nodes in the cluster you should be fine

Travis20:08:52

I scaled up 3 nodes with 5 virtual peers each

Travis20:08:08

I see that in the dashboard it is showing 15 peers

gardnervickers20:08:24

What are you setting BIND_ADDR to?

gardnervickers20:08:38

or :onyx.messaging/bind-addr

Travis20:08:53

hmm, i am using the template i need to check how that is getting set

gardnervickers20:08:33

It defaults to hostname

Travis20:08:56

yeah, i just looked at my scripts

Travis20:08:13

so I am running inside Docker via marathon

Travis20:08:45

trying to think of what that should be

Travis20:08:46

6215de8a6d98        peerimage:1.0   "/init opt/run_peer.s"   24 minutes ago      Up 24 minutes       3196-3198/tcp, 40200/udp, 0.0.0.0:7250->40200/tcp

gardnervickers20:08:22

Does Marathon have any tools for service discovery? Container identity?

Travis20:08:23

yeah, it has mesos-dns which gives you an SRV record. There is also VIP, which allows you to set an IP:PORT combo that can be used (it will be load balanced with this)

Travis20:08:49

potentially could also set the networking to Host instead of Bridged

gardnervickers20:08:24

Kubernetes provides an IP address through the “Downward API”, which is what we use. They also allow creating services that provide SRV records to containers based on a selector system. You’ll need something like this for Marathon.

Travis20:08:40

right, just so I make sure i understand if these were physical boxes how do the peers discover each other and then communicate?

gardnervickers20:08:29

bind_addr needs to be set to something that other peers in the cluster can talk to. So for bare metal that would be an IP address or a hostname
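A sketch of the relevant peer-config entries for a bare-metal deployment. The address and port values are placeholders, and the key names follow the Onyx peer-config of that era; treat this as a sketch rather than a definitive config:

```clojure
;; Peer config: bind-addr must be reachable from every other peer
;; in the cluster, since it is the address peers advertise.
{:onyx.messaging/impl :aeron
 :onyx.messaging/bind-addr "10.0.0.5"  ; placeholder routable IP or hostname
 :onyx.messaging/peer-port 40200}      ; the advertised Aeron port
```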

Travis20:08:32

trying to get all the networking straight in my head

Travis20:08:51

do they talk on the aeron port ?

michaeldrogalis20:08:16

@camechis: Peers advertise their address through a log entry, the address specified by bind_addr & the Aeron port. The host and port of each peer then gets stored in the replica.

Travis20:08:42

ok thats kind of what i was thinking

Travis20:08:35

is there any easy way to see what the peers advertised? Then i might be able to figure out how to fix it

Travis20:08:09

{:tags nil, :group-id #uuid "edf849e8-1ede-4890-a096-a40629e6c8d2", :id #uuid "5a9c313e-87c8-494d-8646-79570da5f95c", :peer-site {:aeron/port 40200, :aeron/external-addr "localhost"}}

gardnervickers20:08:10

If you’re not setting BIND_ADDR when launching your containers, it is whatever hostname is in your container.

Travis20:08:12

from the dashboard

Travis20:08:10

my guess is

:aeron/external-addr "localhost"
is not right, lol

gardnervickers20:08:16

Yea, I’m not sure what your exact setup is

Travis20:08:51

you know what, I can probably query the mesos-dns to get what its srv record is

Travis20:08:58

inside of the docker container

Travis20:08:06

and then set the BIND_ADDR

michaeldrogalis20:08:05

@camechis: If you use localhost, all peers will try to talk to themselves since they're advertising their own address.

michaeldrogalis20:08:43

@mccraigmccraig: You may have the title for longest running Onyx production deployment. Not sure if @robert-stuttaford got off the ground before you. But it's been a while. 🙂

mccraigmccraig22:08:23

it has indeed been a while @michaeldrogalis - i started cutting code july 15 and iirc we hit our first production version in dec 15. it's been a pretty straightforward experience too - onyx pretty much always does what it says on the tin :D

michaeldrogalis22:08:08

@mccraigmccraig: Glad to hear its been smooth 🙂

Travis22:08:40

So we have 3 peers deployed with 6 virtual peers and we can see a log where it connected to kafka however nothing happens. If we launch the job with 1 peer and enough virtual peers it starts processing data. Anyone have any ideas on what to look at here? Still wondering if our aeron communication is working right with some new changes

michaeldrogalis23:08:06

@camechis: Almost certainly sounds like an Aeron problem. Is the UDP port open?

michaeldrogalis23:08:18

If it works for 1 but not for > 1, there's a network communication issue.

Travis23:08:53

@michaeldrogalis: thnx, will look into it more. Think we are having issues getting the ephemeral IPs and ports for Docker mapped right through Mesos