#onyx
2017-01-04
len10:01:03

I am using the datomic read-log plugin and getting this exception

Handling uncaught exception thrown inside task lifecycle - killing this job. -> Exception type: clojure.lang.ExceptionInfo. Exception message: Unfreezable type: class clojure.lang.Delay

len10:01:20

any ideas where I could look ?

len10:01:20

Also this might be related, running the dashboard, this is all in local dev mode with an embedded zookeeper, I get these messages on the dashboard console

org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /onyx/2/log-parameters/log-parameters
    code: -101
    path: "/onyx/2/log-parameters/log-parameters"

17-01-04 10:31:29 mung WARN [onyx.log.zookeeper:207] - Log parameters couldn't be discovered. Backing off 500ms and trying again...

isaac11:01:10

How does Onyx coordinate between multiple input peers?

michaeldrogalis15:01:00

@len I believe we saw the Delay thing pop up before. @robert-stuttaford Do you recall what was happening there?

len15:01:44

It gets 2 segments from datomic and then blows up

michaeldrogalis15:01:01

@len Re: dashboard. The dashboard needs a ZooKeeper connection to hold onto permanently. Local-dev uses embedded ZooKeeper and trashes the state and connection every time.

michaeldrogalis15:01:15

Yeah, that’s familiar. Let me try searching the Slack logs.

michaeldrogalis15:01:34

What version of onyx-datomic and Onyx are you using?

len15:01:42

yeah i start the dashboard after the local peers, says connected

len15:01:05

[org.onyxplatform/onyx-datomic "0.9.15.0"]

michaeldrogalis15:01:21

I’m not exactly sure where we landed with this one to be honest: https://clojurians-log.clojureverse.org/datomic/2016-10-04.html

len16:01:01

thanks, that’s interesting - I have the latest datomic and will wait for @robert-stuttaford to achieve enlightenment 🙂

robert-stuttaford16:01:21

terribly sorry, but i believe we never got to the bottom of that

len16:01:24

I think the dashboard log error is because no jobs have completed yet

len16:01:47

is there a workaround @robert-stuttaford ?

robert-stuttaford16:01:24

so, it usually only happened because we were mucking about somehow - restarting things in the wrong order, that sort of thing

robert-stuttaford16:01:32

we haven’t had it in a long while

michaeldrogalis16:01:35

@len The dashboard message you’re seeing is unrelated. It’s having trouble connecting to a valid ZK tenancy.

michaeldrogalis16:01:57

@robert-stuttaford Are you still rebooting the peers nightly?

robert-stuttaford16:01:12

it sounds like you have a fairly reproducible case, @len. i’d put something together for Lucas to chew on

robert-stuttaford16:01:20

no, we stopped that a while ago 🙂

michaeldrogalis16:01:06

Right, I remember now. We couldn’t figure out exactly where the Delay was coming from because we can’t see Datomic’s source. 😛

robert-stuttaford16:01:35

yep. it was something to do with trying to read outside of the available range, or something

robert-stuttaford16:01:54

i believe we filed a bug with Datomic actually

michaeldrogalis16:01:10

Strange how it only comes up sometimes

robert-stuttaford16:01:20

by ‘we’ i mean we told them about it over in #datomic and one of the guys said ‘oh, right, cool, we’ll look into that’

len16:01:39

I was thinking that it’s because I am starting at the beginning of time

len16:01:00

I just got back to it now so will try some things and let you know

robert-stuttaford16:01:02

i do know that the 3 txes in an ‘empty’ db don’t show up in tx-range

robert-stuttaford16:01:12

the ones at the epoch

robert-stuttaford16:01:20

but nil as a start-t is documented as valid

len17:01:20

ok so if I start the datomic log with a recent t then all is fine and I don’t get the Delay

michaeldrogalis17:01:58

@len What is “recent” in your case?

len17:01:53

I started with t from (last (d/tx-range (d/log conn) nil nil))

len17:01:34

now working with t from the 1st jan

len17:01:38

it looks like the initial txns in datomic, from when we started in August last year, have the issue. Not sure which ones. I will only need the t from when we start in production, so it looks ok for me
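
For reference, the workaround described above (starting the log read at a later t) would be set on the read-log input task in the job's catalog. This is a minimal sketch, assuming the `:datomic/log-start-tx` key documented for the onyx-datomic read-log plugin. The URI, start value, and batch size are placeholders and do not come from this conversation.

```clojure
;; Hypothetical catalog entry for the onyx-datomic read-log input task.
;; Every concrete value below is a placeholder for illustration.
(def read-log-task
  {:onyx/name :read-log
   :onyx/plugin :onyx.plugin.datomic/read-log
   :onyx/type :input
   :onyx/medium :datomic
   :datomic/uri "datomic:dev://localhost:4334/my-db" ; placeholder URI
   ;; Begin at a known-good point instead of the start of the log.
   ;; Assumed to take a value accepted by d/tx-range as its start argument.
   :datomic/log-start-tx 1000 ; placeholder t
   :onyx/max-peers 1
   :onyx/batch-size 20
   :onyx/doc "Reads the Datomic transaction log from a chosen start point"})
```

The start value itself could come from a `d/tx-range` call like the one len mentions above.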

michaeldrogalis17:01:12

Okie doke. I really wish we had some visibility into that one slice of code. 😕

len17:01:09

yeah that would help

len17:01:47

ok so still getting those log errors from the dashboard - any pointers where to look ?

michaeldrogalis17:01:47

Can you describe your set up once more? You said you’re running embedded ZooKeeper. Is that through with-test-env?

len17:01:52

this is a simple app from the lein template

len17:01:03

just running in dev mode

len17:01:18

so configs as per the template

len17:01:42

starting the dashboard and connecting to 127.0.0.1:2188

len17:01:23

dashboard start looks good

java -jar onyx-dashboard-0.9.15.0.jar 127.0.0.1:2188
=================================
Starting Dashboard components ...
Starting Sente
Starting Channels
Starting ZKClient
Trying connect ZK 5s ...
Starting Deployments
ZK connection state: CONNECTED
Starting HTTP Server
Http-kit server is running at 

michaeldrogalis17:01:41

Okay, that’s helpful. So you’re leaving that running and not taking it down, right?

michaeldrogalis17:01:00

I’m only asking because doing that would also stop the ZooKeeper server, which is required for the dashboard.

len17:01:24

i restart it after the restart of the dev peers yeah

len17:01:14

its only when I select the tenancy that I see the errors in the console

michaeldrogalis17:01:30

Are you sure the tenancies match?

len17:01:13

yeah the dropdown is populated in the UI

len17:01:22

I just select from that list

michaeldrogalis17:01:01

Sorry, a few more questions. Which version of the dashboard are you running?

michaeldrogalis17:01:11

Everything is 0.9.15.x I presume?

len17:01:17

0.9.15 yes

len17:01:52

although the template brings 0.9.10 ?

len17:01:04

for the peers - should I bump that ?

michaeldrogalis17:01:20

Yeah those versions need to be upgraded. The template project.clj was too difficult to get into our auto-release process.

len17:01:36

will do and will let you know thanks

len17:01:51

lib-onyx to 0.9.15 as well ?

michaeldrogalis17:01:09

Yeah. Keep me posted, we’ll get you up and running.

len17:01:58

ok upgraded all to 0.9.15 and still the same scenario with the dashboard

len17:01:56

mmm ok switched to the old tenancy id and i get data

michaeldrogalis17:01:58

Okay. So the next thing you can do is pop open a ZooKeeper console and verify that the error is valid. It’s saying it can’t find a znode at /onyx/2/log-parameters. You can check whether that exists by opening zkCli and running ls /onyx/2

michaeldrogalis17:01:19

If you were hopping between versions, you might want to start with a brand new tenancy ID.

len17:01:26

how can i clear everything out ?

len17:01:48

I am new to zookeeper as well 🙂

michaeldrogalis17:01:05

rmr /onyx in zkCli

len17:01:18

ok cleaned out and restarted everything and found an issue on my side - had different tenancy ids for the env and peer configs !
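
The mismatch described here can be sketched as follows. This assumes the `:onyx/tenancy-id` key that Onyx 0.9.x uses in both the env config and the peer config. The id value is hypothetical, and the remaining peer-config keys are omitted. The ZooKeeper path in the earlier error (/onyx/2/...) suggests the tenancy id is the path segment after /onyx, so the dashboard has to select the same id the peers write under.

```clojure
;; Sketch only: the tenancy id value is hypothetical.
(def tenancy-id "dev-tenancy-1")

(def env-config
  {:zookeeper/address "127.0.0.1:2188"
   :zookeeper/server? true          ; embedded ZooKeeper for local dev
   :zookeeper.server/port 2188
   :onyx/tenancy-id tenancy-id})

(def peer-config
  {:zookeeper/address "127.0.0.1:2188"
   :onyx/tenancy-id tenancy-id})    ; other peer-config keys omitted

;; Guard against the mismatch before starting the env and the peers.
(assert (= (:onyx/tenancy-id env-config)
           (:onyx/tenancy-id peer-config))
        "env-config and peer-config must share one tenancy id")
```

Defining the id once and reusing it in both maps makes the two configs harder to drift apart.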

len17:01:30

all looks good now thanks for the assists