#onyx
2016-09-16
vladclj07:09:21

I use the onyx-http plugin; why is there no retry? Log:

ERROR [onyx.plugin.http-output] - Request {:url "", :args {:body "a=11", :as :json}, :response #qbits.jet.client.http.Response{:status 404, :headers {"content-type" "text/html; charset=UTF-8", "content-length" "1564", "connection" "close"}, :body nil}}
INFO [onyx.peer.virtual-peer] - Stopping Virtual Peer 2a6e2b9c-223e-4728-aa73-24c385ad71b4
...
(defn success? [{:keys [status] :as response}]
  (contains? #{200 202} status))

lucasbradstreet07:09:34

Hi everyone! 0.9.10 has been released. For Onyx this is mostly a bug-fix release, but it also includes support for an embedded http endpoint https://github.com/onyx-platform/onyx-peer-http-query that allows you to query the state of the cluster, e.g. what jobs are running, how many peers are on each job, etc. This can save you a little effort starting up the dashboard, and is useful to see what each node thinks about the cluster. We’ve also made massive performance improvements to onyx-kafka.
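
For context, a minimal sketch of turning the embedded endpoint on, assuming the peer-config keys described in the onyx-peer-http-query README; the port and the /replica/jobs path below are illustrative only:

(def peer-config
  {;; ... your existing peer-config (ZooKeeper address, tenancy id, etc.)
   :onyx.query/server? true          ; assumed key: start the embedded HTTP server
   :onyx.query.server/ip "0.0.0.0"   ; assumed key: bind address
   :onyx.query.server/port 8080})    ; assumed key: port to listen on

;; Then ask a live node what it thinks the cluster looks like, e.g.
;;   curl http://localhost:8080/replica/jobs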

lucasbradstreet08:09:06

@vladclj batch request, or regular request?

vladclj08:09:35

:onyx/plugin onyx.plugin.http-output/output

lucasbradstreet08:09:36

@vladclj the message will be retried after :onyx/pending-timeout is hit on the input source
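
For reference, a minimal sketch of where that knob lives, using the stock core.async input plugin; the task name and values are illustrative:

{:onyx/name :in
 :onyx/plugin :onyx.plugin.core-async/input
 :onyx/type :input
 :onyx/medium :core.async
 :onyx/max-pending 10000
 :onyx/pending-timeout 60000   ; ms a segment may stay un-acked before the input retries it
 :onyx/batch-size 20
 :onyx/doc "Reads segments from a core.async channel"}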

lucasbradstreet08:09:17

The peer http query would have come in handy during some of our past debugging 😄

lucasbradstreet08:09:19

Awesome! Thanks. I’ll review when I get a chance. I’m going to be a bit busy over the next couple of days

michaeldrogalis15:09:35

Thanks for all the user contributions to 0.9.10 everyone. There's a lot of good stuff in there.

Travis15:09:09

Is it official now?

Travis15:09:48

Awesome, Great work guys!

aengelberg16:09:30

@lucasbradstreet How does the HTTP health endpoint compare to the Onyx dashboard? Is it a different or same set of functionality?

aengelberg16:09:50

Correct me if I'm mistaken, but it sounds like this is similar to the Onyx dashboard, except it's something you attach to a peer, as opposed to the dashboard, which is its own peer that monitors the whole cluster state (so with the dashboard you can't see the current replica from the perspective of an individual peer).

michaeldrogalis17:09:28

@aengelberg Mostly accurate. The dashboard provides a view of the cluster from an observer-only peer. The HTTP health check gives you a view from a specific live member of the cluster. It's nice because of how lightweight it is. The dashboard is a separate application entirely, whereas the HTTP health check server is a library that you add to your app and only exposes a JSON API.

aengelberg17:09:15

@michaeldrogalis cool! Sounds like I should expect the http health check to be more stable, since it just piggybacks onto the peer's replica rather than having to update and keep track of its own replica?

michaeldrogalis17:09:44

Correct, yeah. It doesn't have any ClojureScript component. It's pretty thin.

Travis17:09:29

Does this release include the new eventing system? I am assuming no from the description

michaeldrogalis17:09:04

It doesn't, nope. That'll be out in 0.10.0.

aaelony18:09:36

What's the best way to increase the amount of data written to each file via the s3-output plugin ?

michaeldrogalis18:09:54

@aaelony Increase the batch-size of the task.

aaelony18:09:26

I've been increasing the batch-size, but haven't seen larger file sizes yet

aaelony18:09:04

for example I have batch-size set to 100000, but I'm seeing files with 1000-2000 lines max

michaeldrogalis18:09:54

@aaelony Batch sizes are an upper limit on the number of segments that will be processed during a single iteration of a lifecycle. Try increasing the batch sizes of the tasks upstream from it as well.
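
To illustrate the point, a sketch of a catalog where the batch size is raised along the whole path, not just on the output task; the task names, the :my.ns/transform function, and the plugin symbols are assumptions, so check the plugin READMEs for the exact keys your versions require:

(def catalog
  [{:onyx/name :in                              ; hypothetical input task
    :onyx/plugin :onyx.plugin.core-async/input
    :onyx/type :input
    :onyx/medium :core.async
    :onyx/batch-size 10000}
   {:onyx/name :transform                       ; hypothetical function task
    :onyx/fn :my.ns/transform
    :onyx/type :function
    :onyx/batch-size 10000}
   {:onyx/name :write-to-s3                     ; output task; plugin symbol assumed,
    :onyx/plugin :onyx.plugin.s3-output/output  ; required :s3/* keys omitted
    :onyx/type :output
    :onyx/medium :s3
    :onyx/batch-size 10000}])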

aaelony18:09:12

I've done that too. Just reviewed the entire catalog I'm using. Is there a way to set a minimum size prior to writing to file?

michaeldrogalis18:09:51

There's not, no; that would make the fault-handling model too complicated. The writing is already happening asynchronously. A few things you might be able to do: roll actual segments into larger, logical segments for bigger writes, or do the write from a window.
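
As a sketch of the first idea, a plain helper that rolls many small segments into fewer, larger logical segments before they reach the output task; the name, chunk size, and :rows key are hypothetical:

(defn roll-segments
  "Groups segments into chunks of n and wraps each chunk in one larger segment."
  [n segments]
  (map (fn [chunk] {:rows (vec chunk)})
       (partition-all n segments)))

;; (roll-segments 1000 segments)
;; => a seq of {:rows [...]} segments, each carrying up to 1000 of the original
;;    rows, so each S3 write has more data behind it.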

michaeldrogalis18:09:16

Might want to check with @aspra or @lucasbradstreet when he's back in a few days though, they're the ones who mainly worked on the S3 plugin

aaelony18:09:54

cool ideas. no rush, I'm still testing... thanks

vijayakkineni21:09:22

How can I customize the onyx-kafka consumer to pass additional configuration parameters? Looks like the consumer config is limited to only a few options https://github.com/onyx-platform/onyx-kafka/blob/master/src/onyx/plugin/kafka.clj#L156

michaeldrogalis22:09:22

@vijayakkineni We were just discussing this the other day. We're likely going to allow an arbitrary opts map to be merged in. Is there something in particular we're missing right now that you need?

michaeldrogalis22:09:40

It's a trade-off between having an abstract consumer interface through Onyx vs. exposing the entire underlying Kafka library. Sometimes you just need the latter.

vijayakkineni22:09:45

Yes, I wanted to use certain connection settings, connections.max.idle.ms and linger.ms, and also SSL configuration parameters.

michaeldrogalis22:09:25

@vijayakkineni Give me an hour, I'll take a look at doing it now.

michaeldrogalis22:09:27

Long overdue change.

Travis22:09:49

@michaeldrogalis what were some of the performance enhancements in the Kafka plugin for 0.9.10?

michaeldrogalis22:09:01

Heh, the build failed because I didn't document the new options in the correct places. This is how we keep our act together.

michaeldrogalis22:09:14

@camechis Mostly by removing a thread that was doing reads and going at it directly.

michaeldrogalis22:09:24

I didn't work on that though, would need to check out the patch.

michaeldrogalis22:09:20

@vijayakkineni Try [org.onyxplatform/onyx-kafka "0.9.10.1-20160916.224342-3"]

michaeldrogalis22:09:03

:kafka/consumer-opts & :kafka/producer-opts; both take precedence during the merge.
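
For reference, a sketch of how those keys might be used; the surrounding task maps are abridged (required :kafka/* keys omitted), and whether the option keys end up as strings or keywords should be checked against the plugin README for that snapshot:

;; input task: extra consumer settings passed straight through and merged
;; over the plugin's defaults
{:onyx/name :read-messages
 :onyx/plugin :onyx.plugin.kafka/read-messages
 :onyx/type :input
 :onyx/medium :kafka
 :kafka/consumer-opts {"connections.max.idle.ms" "540000"}
 :onyx/batch-size 100}

;; output task: producer-side settings such as linger.ms go here
{:onyx/name :write-messages
 :onyx/plugin :onyx.plugin.kafka/write-messages
 :onyx/type :output
 :onyx/medium :kafka
 :kafka/producer-opts {"linger.ms" "5"}
 :onyx/batch-size 100}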