#onyx
2016-09-02
vijayakkineni05:09:50

does onyx play nice with tolitius/mount?

lucasbradstreet05:09:32

@vijayakkineni there’s no reason it shouldn’t. We normally build our onyx peer systems with component, but you can use mount just fine for this since they’re pretty compatible ideas

asolovyov05:09:18

@lucasbradstreet so for ordering, the easiest way seems to be to :onyx.windowing.aggregation/conj everything for some time window and sort it before writing to kafka, right?

lucasbradstreet05:09:15

That's basically correct, yes. You may gain some benefit by writing your own aggregation and refinement, depending on your requirements. But I think you will need to use an aggregation
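
A minimal sketch of the window being discussed, assuming a fixed 10-second window keyed on a hypothetical :event-time field. The window id, task name, and timestamp field are illustrative assumptions, and the key names follow the Onyx 0.9 windowing docs as best recalled, so check them against your version.

```clojure
;; Hypothetical window entry: buffers every segment seen in a 10-second
;; fixed window so the trigger's sync function can sort them before writing.
{:window/id :ordered-events                ; assumed window id
 :window/task :collect-events              ; assumed task name
 :window/type :fixed
 :window/aggregation :onyx.windowing.aggregation/conj
 :window/window-key :event-time            ; assumed timestamp field on each segment
 :window/range [10 :seconds]}
```

The sync function attached to the trigger would then sort the accumulated vector, for example with (sort-by :event-time state), before producing to Kafka.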

asolovyov05:09:52

right, I even think some short one (like 10 secs) should be ok

asolovyov05:09:14

I guess 10 seconds should be enough to prevent ordering collisions? 🙂

asolovyov05:09:19

or maybe a minute

lucasbradstreet05:09:57

That might be okay, except that retries will kill you.

lucasbradstreet05:09:34

We'll be doing retries differently in the future, which should help you get the number down later

asolovyov05:09:19

Heh :) ok, I'll try living like that :)

mariusz_jachimowicz09:09:40

I was using mouse wheel for zooming

lucasbradstreet09:09:11

Thanks for that PR. It's in onyx-dashboard master.

lucasbradstreet09:09:19

Helps just to be able to pan

mariusz_jachimowicz09:09:55

@aaelony I was developing using only onyx-visualization, so I need to check whether it works correctly when it is picked up by onyx-dashboard

lucasbradstreet09:09:23

If you pull master on onyx-dashboard it will have the viz release with your changes

mariusz_jachimowicz10:09:54

It seems that zooming is not working in the dashboard, and I am not sure why. I see an exception in the console: Uncaught TypeError: undefined is not a function

lucasbradstreet10:09:23

Ah, hmm. Let me give it a try

mariusz_jachimowicz11:09:48

@lucasbradstreet running dashboard via dev mode or prod mode?

lucasbradstreet11:09:11

I can try building a jar and seeing if it works though

mariusz_jachimowicz11:09:41

I was testing via java -server -jar target/onyx-dashboard.jar 127.0.0.1:2188

lucasbradstreet11:09:54

Ah, let me give that a go

lucasbradstreet11:09:50

Ah yes, it’s breaking in advanced mode "core.cljs:90Uncaught TypeError: d3.behavior.zoom(...).Tp is not a function"

lucasbradstreet11:09:14

Probably about time I get it using d3 with externs, rather than use the +++ hack everywhere

mariusz_jachimowicz11:09:14

I could try adding externs for d3

lucasbradstreet11:09:10

It’s worth a try. I remember I had some issues getting it working reliably, but I probably didn’t try hard enough

mariusz_jachimowicz11:09:21

need to check how it is done in Circle CI frontend

lucasbradstreet11:09:33

I worked around it with the +++ macro, but please feel free to investigate the externs stuff further, since we need a proper fix for it eventually
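
For context, under advanced optimizations the Closure compiler renames any property it does not know to be external, which is how a d3 method call can end up as something like .Tp. A rough sketch of the compiler options that would point an advanced build at an externs file follows; both paths are hypothetical and this is not the actual onyx-dashboard build configuration.

```clojure
;; Hypothetical ClojureScript :compiler options for an advanced build.
{:output-to "resources/public/js/app.js"   ; assumed output path
 :optimizations :advanced
 ;; Externs declare d3's names so they are not renamed.
 :externs ["externs/d3_externs.js"]}       ; assumed path to a d3 externs file
```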

lucasbradstreet11:09:14

master should work now

lucasbradstreet13:09:39

@mariusz_jachimowicz thanks, I added comments to the PR. It doesn’t seem to be helping in onyx-dashboard

Travis13:09:58

@lucasbradstreet More Windowing issues, lol. I am about to confirm if this is the case but we are using a custom aggregation as we talked about before ( fixed with timer trigger ) and Kafka loaded with a dataset. What we are having issues with is that with each run of this job we are getting different results in our aggregation. One of the fields in our Aggregation is a count so if we see a segment that has the same group by key we update the data structure in the window and update the count. At the end of the run the count is always different. Any ideas on what we might be able to look at?

Travis13:09:50

let me state it a little differently: the sum of all the counts in our data source is always different ( it should equal what's in kafka )

lucasbradstreet14:09:56

@camechis a few questions/considerations

lucasbradstreet14:09:17

1. Are you seeing any retries?

lucasbradstreet14:09:28

2. If so, are you using deduplication on your windowed task?

Travis14:09:36

we are getting ready to do another run but I don’t think so

lucasbradstreet14:09:55

3. How much are the counts off?
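
On question 2: in Onyx 0.9, deduplication on a windowed task is switched on by naming a key that uniquely identifies each segment in the task's catalog entry. A minimal sketch, where the task name, function, and :event-id field are all assumptions:

```clojure
;; Hypothetical catalog entry for the windowed task.
{:onyx/name :aggregate-events              ; assumed task name
 :onyx/fn :my.app.tasks/prepare            ; assumed function
 :onyx/type :function
 ;; Segments already seen with the same :event-id are skipped on retry,
 ;; so a replayed segment is not counted twice.
 :onyx/uniqueness-key :event-id            ; assumed unique field
 :onyx/batch-size 20}
```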

Travis14:09:11

So we never see more, usually a ton less

Travis14:09:22

thousands off

lucasbradstreet14:09:14

Thousands, out of millions? Obviously any discrepancy is bad. I’m just asking because it might point to an issue.

Travis14:09:03

so roughly 90K in Kafka, and we only got about 35K

lucasbradstreet14:09:24

Right. Big discrepancy then

lucasbradstreet14:09:36

Also, yeah, if it’s undercounting then it’s not a retry issue

Travis14:09:58

yeah, i would feel better if it was over and I saw retries

lucasbradstreet14:09:12

What refinement are you using?

Travis14:09:27

we pulled elastic out of the picture from the trigger and replaced with a simple cassandra writer because we were wondering if we were losing data writing to elastic ( it was causing crazy slow downs ), Cassandra is screaming.

Travis14:09:32

discarding I believe

lucasbradstreet14:09:58

If you’re using discarding, is it possible you’re writing out the window state to a particular key in cassandra, discarding the window state, accumulating some more, and then overwriting that same key?

Travis14:09:21

that was a thought but we are using a UUID as the key

lucasbradstreet14:09:23

as in, a new UUID each time you write to cassandra, distinct from the grouping key? Hmm

Travis14:09:49

we used a key from the data in the trigger but maybe it would be a good test to just generate a new key on each flush

lucasbradstreet14:09:10

K, yeah, or you can try by switching to an accumulating refinement to test

lucasbradstreet14:09:19

Either would work
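
The reasoning behind both tests: with a discarding refinement the window state is reset after each trigger fire, so each write carries only what arrived since the last fire, and overwriting the same Cassandra key would drop the earlier counts. With an accumulating refinement each write carries the running total, so an overwrite is harmless. A sketch of the second test, assuming the Onyx 0.9 trigger keys; the window id, period, and sync function name are hypothetical:

```clojure
;; Hypothetical trigger entry for the accumulating test.
{:trigger/window-id :event-counts                      ; assumed window id
 ;; Swapped in for :onyx.refinements/discarding.
 :trigger/refinement :onyx.refinements/accumulating
 :trigger/on :onyx.triggers/timer
 :trigger/period [5 :seconds]                          ; assumed period
 :trigger/sync :my.app.triggers/write-to-cassandra}    ; assumed sync fn
```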

Travis14:09:09

yeah, this issue is just driving us nuts, lol

Travis14:09:05

will let you know what we find with these few changes

mariusz_jachimowicz14:09:53

@lucasbradstreet is dashboard not working in prod and dev mode after my PR ?

mariusz_jachimowicz14:09:09

are there any msgs in console?

Travis14:09:44

@lucasbradstreet What does the

:trigger/fire-all-extents? true
actually mean?

lucasbradstreet14:09:18

@camechis It means it will call a trigger for each window. This is always the case for timed triggers, and you use a fixed window anyway.
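
Since timer triggers already behave this way, the flag only changes anything for other trigger types. A sketch of where it would sit on a segment-count trigger, assuming the Onyx 0.9 trigger keys; the window id, threshold, and sync function are hypothetical:

```clojure
;; Hypothetical segment-count trigger.
{:trigger/window-id :event-counts                      ; assumed window id
 :trigger/refinement :onyx.refinements/accumulating
 :trigger/on :onyx.triggers/segment
 :trigger/threshold [500 :elements]                    ; assumed threshold
 ;; When the threshold is reached, fire for every window extent,
 ;; not only the extent the latest segment fell into.
 :trigger/fire-all-extents? true
 :trigger/sync :my.app.triggers/write-to-cassandra}    ; assumed sync fn
```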

lucasbradstreet14:09:39

@mariusz_jachimowicz I saw the same d3 error as before

lucasbradstreet14:09:01

@mariusz_jachimowicz Did onyx-dashboard work for you after you lein installed your version of onyx-visualization?

mariusz_jachimowicz14:09:30

I was testing only in onyx-visualization. How to pick up fresh code in onyx-dashboard?

lucasbradstreet14:09:41

I just tested by bumping onyx-viz’s version number, “lein install”ing, and then making sure that it’s the same version in onyx-dashboard

Travis15:09:28

@lucasbradstreet We changed the trigger to generate a UUID for the key into cassandra to guarantee no overwrites. This had no effect. What we are noticing is the numbers are different but within a couple hundred each time

Travis15:09:09

so actual numbers: 38K ( +- a couple hundred ) out of 115K

Travis15:09:54

We are going to attach a lifecycle to our OUT task to see if we are even processing all the data by writing the un windowed segments out to cassandra

Travis15:09:56

@lucasbradstreet So we are definitely processing all the data. We are losing all the data in the window. The out task write showed the exact number in kafka ( 115K )

lucasbradstreet16:09:43

@camechis sorry, not much more I can really do to help you debug this one. I think you’ll have to look at exactly what is going on in the trigger. Try with less data, run some sanity tests, etc

Travis16:09:01

yeah, starting that now

Travis18:09:30

@lucasbradstreet We figured it out. It was our own fault, lol. We had a little bug in the count we were writing out. Man, that was something to track down, but I am so happy it's working as expected.

gardnervickers18:09:01

Ah nice, glad you were able to figure it out.

Travis18:09:51

Yeah, feel like throwing a party after getting that one figured out, lol

lucasbradstreet19:09:07

Phew. Yeah, after we got to that point I thought there couldn’t be much that would be our fault 😄. Glad you found it

Travis19:09:06

I was pretty sure it was ours , lol

aaelony19:09:20

just noticed that http://www.onyxplatform.org/docs/user-guide/0.9.10-beta1/#windowing-and-aggregation says

... Windows are intimately related to the Triggers feature. When you’re finished reading this section, head over to the Triggers chapter next. 
but the Triggers section precedes the Windows section.

lucasbradstreet19:09:51

Yeah, I think that’s the wrong order then

lucasbradstreet19:09:03

Our docs recently got redone. I’ll fix that

lucasbradstreet19:09:05

Fix in develop. It’ll be up on the website when we do the next beta. Thanks!

aaelony21:09:45

Found another section with issues...

Coordination within Plugins
Often virtual peers allocated to a task may need to coordinate with respect to allocating work. For example, a Kafka reader task may need to assign partitions to different peers on the same topic. The Onyx mechanism for coordinating peers is the [log](\{\{ "/architecture-low-level-design.html" | prepend: page.dir | prepend: site.baseurl }}) The Onyx log is extensible by plugins, by implementing several extensions defmethods.
http://www.onyxplatform.org/docs/user-guide/0.9.10-beta1/#plugins

vladclj22:09:11

Hi, how can I send a lot of different requests to another server with Onyx? [:in :send :send1 :send2 :send3 :send... :out] The number of requests is variable