#onyx
2018-08-15
lucasbradstreet02:08:41

That’s definitely my overall impression of onyx-sql

lucasbradstreet05:08:36

onyx-kafka, onyx-seq, and onyx-core-async (for testing), onyx-amazon-s3, onyx-amazon-sqs, and onyx-http are the most popular plugins overall

lucasbradstreet05:08:32

Ranking between those is tricky

lucasbradstreet05:08:00

onyx-kafka is definitely at the top, followed by sqs and s3 probably

lmergen05:08:42

@rustam.gilaztdinov not sure whether it fits in your architecture, but perhaps it's a better idea to use Kafka Connect or something like that to source data from PostgreSQL into Kafka, and then read from Kafka directly in Onyx

lucasbradstreet05:08:57

I’ve heard a lot of good things about debezium for CDC. Then read from Kafka with onyx-kafka

lucasbradstreet05:08:43

The CDC case is pretty big and hasn’t gotten enough attention imo

lmergen05:08:42

nice, didn't know about debezium

lmergen05:08:43

that's pretty sweet

lucasbradstreet05:08:29

Going from CDC to streaming is a pretty low-risk way to start integrating systems into the streaming model

lmergen06:08:20

yes, and it makes a lot of sense -- i see that debezium uses actual database xlogs to capture changes

lmergen06:08:38

i still remember the insane old days of PostgreSQL + Slony

jasonbell12:08:17

I should really dust off the twitter plugin work that I did.

rustam.gilaztdinov12:08:43

this is very frustrating behavior in the sql plugin. It’s still not solved

lmergen12:08:28

@jasonbell actually i've brought it up to date with 0.12.7, i think it's compatible with 0.14 as well

lmergen12:08:25

do you know which line of code is triggering the NPE? that would help

rustam.gilaztdinov12:08:57

Handling uncaught exception thrown inside task lifecycle :lifecycle/read-batch. Killing the job.


java.lang.Thread.run              Thread.java:  748
java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  624
 java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1149
                                               ...
                 clojure.core.async/thread-call/fn                async.clj:  441
 onyx.peer.task-lifecycle/start-task-lifecycle!/fn       task_lifecycle.clj: 1155
      onyx.peer.task-lifecycle/run-task-lifecycle!       task_lifecycle.clj:  550
    onyx.peer.task-lifecycle.TaskStateMachine/exec       task_lifecycle.clj: 1070
onyx.peer.task-lifecycle/wrap-lifecycle-metrics/fn       task_lifecycle.clj: 1097
      onyx.peer.task-lifecycle/build-read-batch/fn       task_lifecycle.clj:  651
             onyx.peer.read-batch/read-input-batch           read_batch.clj:   49
          onyx.peer.read-batch/read-input-batch/fn           read_batch.clj:   54
              onyx.plugin.sql.SqlPartitioner/poll!                  sql.clj:  114
                                clojure.core/first                 core.clj:   55
                                               ...
                          clojure.core/take-nth/fn                 core.clj: 4271
                                  clojure.core/seq                 core.clj:  137
                                               ...
                              clojure.core/drop/fn                 core.clj: 2924
                            clojure.core/drop/step                 core.clj: 2921

lmergen12:08:34

great, that helps, let me see

lmergen12:08:53

ohhh, i see what's going on already

lmergen13:08:42

https://github.com/solatis/onyx-sql/tree/master i've just pushed a fix for this issue there, could you try it out ?

lmergen13:08:55

are you sure you're using the correct version? is the error still exactly the same ?

rustam.gilaztdinov13:08:40

yep, I’m sure, and the error is the same

rustam.gilaztdinov13:08:29

this is weird, lein clean doesn’t help

rustam.gilaztdinov13:08:16

error on onyx.plugin.sql.SqlPartitioner/poll! sql.clj: 115

lmergen13:08:18

ok, it could be that partition-table returns nil

lmergen14:08:19

could you try to change the if-let at line 115 to this:

(if-let [part (and rst
                   (first @rst))]

rustam.gilaztdinov14:08:24

no, doesn’t help 😞

lmergen14:08:12

this makes no sense at all

lmergen14:08:38

wait! 🤔

lmergen14:08:52

could it be that there is a sequence with lazy side-effects here

lmergen14:08:59

which is triggered by the first...

lmergen14:08:45

ok i don't have the time to debug this issue atm, but i'm fairly sure the issue is with the drop function inside the partition-table function

lmergen14:08:20

there's probably an incorrect slot-id, or empty ranges, or something like that
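
One way to check the lazy-sequence suspicion at a REPL is the sketch below. It is a hypothetical reproduction, not the onyx-sql source: the `ranges` value and the nil count are assumptions chosen to mirror the `take-nth` / `drop` frames in the stack trace. It shows that building the sequence with a nil count is silent, that the NullPointerException only appears when `first` realizes it, and that an empty collection alone does not throw.

```clojure
;; Hypothetical input; the real ranges come from the plugin's table partitioning.
(def ranges [[0 99] [100 199] [200 299]])

;; Assumption: the count passed to drop is nil. Building the lazy seq does
;; not throw, because neither drop nor take-nth has been realized yet.
(def parts (take-nth 3 (drop nil ranges)))

(defn realize-first
  "Realizes the first element of `parts`; returns :npe if that throws
  a NullPointerException (drop calls (pos? n) on the nil count)."
  []
  (try
    (first parts)
    (catch NullPointerException _
      :npe)))

(realize-first)

;; An empty collection with a valid count just yields nil, no exception.
(first (take-nth 3 (drop 1 [])))
```

If the first call returns `:npe`, that would be consistent with a nil slot-id rather than empty ranges, and it would also explain why guarding `rst` with `and` changes nothing: the exception is raised while `first` forces the sequence.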

lmergen14:08:24

what happens at that function is that onyx takes your total input table, and divides it over the number of peers you have assigned to the input task

lmergen14:08:33

that way, you get automatic partitioning

lmergen14:08:20

it does this by creating N partitions, and then dropping the first M partitions (where M == our own peer id)

lmergen14:08:47

there is probably something off in one of those things

lmergen14:08:20

if you could identify what the inputs of that function are, specifically slot-id and the ranges, that would be very useful

rustam.gilaztdinov13:08:11

@lmergen hello, sorry for the delay, can we pls try to fix this? how can I find out the slot-id? the ranges look right, I guess -- [0 99] [100 199] [200 299]...

rustam.gilaztdinov13:08:17

(take-nth n-peers (drop slot-id ...)) -- in my case, n-peers = 3, so I take every 3rd value in the sequence
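
The partitioning scheme discussed here can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: `partition-ranges` and `ranges-for-slot` are hypothetical names, and the table is assumed to have ids 0 through 599 split into chunks of 100.

```clojure
(defn partition-ranges
  "Splits the inclusive id range [lo hi] into consecutive [start end]
  chunks covering at most `size` ids each. Hypothetical helper."
  [lo hi size]
  (for [start (range lo (inc hi) size)]
    [start (min hi (+ start size -1))]))

(defn ranges-for-slot
  "Returns the ranges one peer would read: skip `slot-id` ranges,
  then take every `n-peers`-th range from what remains."
  [ranges n-peers slot-id]
  (take-nth n-peers (drop slot-id ranges)))

(def ranges (partition-ranges 0 599 100))
;; ([0 99] [100 199] [200 299] [300 399] [400 499] [500 599])

(ranges-for-slot ranges 3 0) ;; => ([0 99] [300 399])
(ranges-for-slot ranges 3 1) ;; => ([100 199] [400 499])
(ranges-for-slot ranges 3 2) ;; => ([200 299] [500 599])
```

With three peers, each slot gets a disjoint interleaved subset and together they cover every range exactly once, which only holds if each peer's slot-id is a distinct integer from 0 to n-peers minus 1.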

jasonbell13:08:24

Excellent news @lmergen thanks for letting me know