#onyx
2016-08-10
Kira Sotnikov07:08:05

Hi guys 🙂

Kira Sotnikov07:08:43

So, a question: does Onyx use ZK only as a consistent DB?

lucasbradstreet08:08:28

And for consensus about when peers go offline, via watches on ephemeral nodes
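The mechanism described here (each peer holds an ephemeral node, and watchers are notified when that node disappears) can be sketched with a toy in-memory model. This is not the ZooKeeper client API or Onyx's code; `ToyZk`, `create_ephemeral`, `watch`, and `close_session` are hypothetical names used only for illustration.

```python
from collections import defaultdict


class ToyZk:
    """Toy in-memory stand-in for a ZooKeeper-like store (hypothetical, not a real client)."""

    def __init__(self):
        self.nodes = {}                    # path -> owning session id
        self.watches = defaultdict(list)   # path -> one-shot callbacks

    def create_ephemeral(self, session_id, path):
        # An ephemeral node lives only as long as the session that created it.
        self.nodes[path] = session_id

    def watch(self, path, callback):
        # Watches in this model are one-shot: they fire once and are then dropped.
        self.watches[path].append(callback)

    def close_session(self, session_id):
        # Ending a session (crash, timeout, clean shutdown) removes its
        # ephemeral nodes and fires any watches set on them.
        gone = [p for p, s in self.nodes.items() if s == session_id]
        for path in gone:
            del self.nodes[path]
            for callback in self.watches.pop(path, []):
                callback(path)
        return gone


zk = ToyZk()
offline = []
zk.create_ephemeral("session-a", "/peers/peer-a")
zk.create_ephemeral("session-b", "/peers/peer-b")
zk.watch("/peers/peer-a", offline.append)

zk.close_session("session-a")  # peer-a goes offline
print(offline)                 # ['/peers/peer-a']
```

The watcher never polls: it learns that peer-a is gone only because the node's removal triggers the callback, which is the property that makes ephemeral nodes useful for liveness tracking.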

Kira Sotnikov08:08:40

How much data does Onyx store in the DB, and what kind? :)

lucasbradstreet08:08:22

It writes log messages for peer coordination, and data chunks that contain the data defining the job. These vary in size depending on what your job looks like. If this gets too big, it's possible to pull it out of the job definition and put it on S3 to be pulled down when tasks start up.

Kira Sotnikov08:08:18

Now I'm looking into how to increase the performance of the ZK cluster

Kira Sotnikov08:08:03

In other words, the cluster is more for fault tolerance than for performance

Kira Sotnikov08:08:05

thank you, looking

Kira Sotnikov08:08:16

lucasbradstreet: I suppose that case is better suited to a queue?

Kira Sotnikov08:08:27

I'm not sure ZK is good for this

Kira Sotnikov08:08:29

i mean the case of data chunks

lucasbradstreet08:08:21

We’ll likely implement a second chunk backend that would allow you to read and write that data to S3 instead

Kira Sotnikov08:08:44

lucasbradstreet: you rock, much appreciate it

Travis13:08:50

@lucasbradstreet: Is there anything special that needs to be done in order to make the Dashboard display Job information and other related info? Right now the only things I see in the dashboard are the peer events and when a Job gets submitted, but nothing else. We are attempting to debug a job and it's very tricky when it's spread across peers and I am not seeing any errors, lol.

jmv14:08:12

after more digging around, i can see that tweets are coming through by adding logging on the tweet callback handler at https://github.com/onyx-platform/onyx-twitter/blob/master/src/onyx/plugin/twitter.clj#L62. but it seems like next-state is never run at https://github.com/onyx-platform/onyx-twitter/blob/master/src/onyx/plugin/twitter.clj#L76. i've added logging there as well but it never seems to run. update: i left this running and the process ran out of memory which seems to further indicate that data is coming in but not being processed further

lucasbradstreet15:08:12

Datomic log API supports in mem now. Nice! http://blog.datomic.com/2016/08/log-api-for-memory-databases.html?m=1. At some point we can remove that restriction from onyx-datomic

Travis20:08:22

Any ideas on what this means?

WARN [onyx.messaging.aeron] - 
                                  java.lang.Thread.run           Thread.java: 745
    uk.co.real_logic.agrona.concurrent.AgentRunner.run      AgentRunner.java: 105
         uk.co.real_logic.aeron.ClientConductor.doWork  ClientConductor.java: 113
         uk.co.real_logic.aeron.ClientConductor.doWork  ClientConductor.java: 293
uk.co.real_logic.aeron.ClientConductor.onCheckTimeouts  ClientConductor.java: 338
uk.co.real_logic.aeron.exceptions.ConductorServiceTimeoutException: Timeout between service calls over 5000000000ns

16-Aug-10 20:04:44  WARN [onyx.messaging.aeron.publication-manager] - Aeron messaging publication error: uk.co.real_logic.aeron.exceptions.ConductorServiceTimeoutException: Timeout between service calls over 5000000000ns
16-Aug-10 20:04:44  WARN [onyx.messaging.aeron.publication-manager] - Aeron messaging publication error: uk.co.real_logic.aeron.exceptions.ConductorServiceTimeoutException: Timeout between service calls over 5000000000ns
16-Aug-10 20:04:44  WARN [onyx.messaging.aeron.publication-manager] - Aeron messaging publication error: uk.co.real_logic.aeron.exceptions.ConductorServiceTimeoutException: Timeout between service calls over 5000000000ns
16-Aug-10 20:04:44  WARN [onyx.messaging.aeron] - 

Travis20:08:31

this is spewing over and over on one of my peers

Travis20:08:39

after several hours

lucasbradstreet20:08:35

Generally this happens only when you start getting memory pressure issues, and are spending a lot of your time garbage collecting

lucasbradstreet20:08:59

You can increase the timeout, which can help things, but you probably need to solve your underlying issue.
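For reference, the 5000000000ns in the warning is 5 seconds. A toy model of the check, assuming the timeout fires whenever the gap between two consecutive service calls exceeds the limit (`find_timeouts` and the timestamps are hypothetical; this is not Aeron's implementation):

```python
TIMEOUT_NS = 5_000_000_000  # value from the warning in the log: 5 seconds


def find_timeouts(service_call_times_ns, timeout_ns=TIMEOUT_NS):
    """Return (previous, current) timestamp pairs whose gap exceeds the timeout."""
    pairs = zip(service_call_times_ns, service_call_times_ns[1:])
    return [(prev, cur) for prev, cur in pairs if cur - prev > timeout_ns]


# Hypothetical timestamps: regular 1 s ticks, then a 7 s stall
# (for example a long garbage-collection pause), then ticks resume.
second = 1_000_000_000
calls = [0, 1 * second, 2 * second, 9 * second, 10 * second]

print(TIMEOUT_NS / second)   # 5.0
print(find_timeouts(calls))  # [(2000000000, 9000000000)]
```

Raising the limit only widens the gap that is tolerated; a stall longer than the new limit would still trip it, which is why fixing the memory pressure is the real remedy.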

Travis20:08:08

no worries