2017-10-16
been having another go at upgrading to 0.11.0 ... my onyx-kafka source task is failing with this exception:
and i'm not sure how to proceed - the same task works fine on onyx-0.9.15
any clues on where to go next will be greatly appreciated 😬
ohh - looks like 0.11 won't work with kafka 0.9... upgrading my dev machine to kafka 0.10 seems to make it work again. is there an error in the onyx-kafka README (which indicates compatibility with kafka 0.9) ?
Hi All, I've been working with @camechis on a little project to get checkpointing to work on Google Cloud Storage like it does with S3. I'm hoping this will be useful not only for us but for others who may be running on Google Container Engine on GCP. I think I'm starting to get a good idea of how to put the code together, but a couple of questions did pop up. In the S3 implementation, the AWS uploader spawns threads to do an asynchronous upload of the checkpoint bytes to the bucket. Do you have any code that limits the number of uploaders involved, to prevent those threads from chewing up CPU? Or is it that, since the upload process is finite and the number of upload requests is small enough, it's something you don't have to worry about?
@mccraigmccraig Doh, yeah. It should read 0.10+
I'll get that fixed up. Sorry, that was probably a giant rabbit hole.
I think 0.10 -> 0.11 is the first time there's been compatibility between Kafka versions.
np @michaeldrogalis fortunately my random-walk of stuff to change was quite short before i hit on a correct solution 😬
Everything else has breaking changes in between 😕
Hah, cool. I'm glad.
@jholmberg I'll circle up with @lucasbradstreet on this one.
Awesome, thanks @michaeldrogalis!
Is anyone here currently using Kafka who'd be interested in using it in a hosted setting with fewer APIs? The shtick is that you don't have to do any of the ops work to keep a distributed, replicated Kafka cluster running, and only the most critical APIs are exposed. (produce, consume, seek, partition scan, etc)
@michaeldrogalis has anything changed about the aeron setup between 0.9.15 and 0.11.0 ? when i deploy 0.11.0 to our production environment i'm seeing a bunch of errors like this https://gist.github.com/a42b53c7c226c97f96fec34cc82cc799 , after what looks like a fairly normal startup which gets as far as logging "Enough peers are active, starting the task"
@michaeldrogalis That could be interesting
@mccraigmccraig We upgraded Aeron several versions.
That error doesn't look to be of the sort you'd get for version mismatches though.
Looks like it never connected?
possibly - i'm deploying to docker/marathon, with a dockerfile which looks pretty much in line with https://github.com/onyx-platform/onyx-template/blob/0.11.x/src/leiningen/new/onyx_app/Dockerfile
@camechis Cool. I'll PM you in a bit. Thanks 🙂
hmm. lib-onyx is currently at 0.11.0.0 rather than the latest 0.11.1.0-SNAPSHOT
Bah, we just added lib-onyx to our auto-release suite. Must have messed that bit up
Thanks, we'll get that one patched up.
Trying to think of how else that error could creep in. Does it display as soon as you boot up?
no, it's quite late on - there are some "enough peers are active, starting the task" messages and then it borks
oh, hold on... i've got some warning like this Warning: space is running low in /dev/shm (shm) threshold=167,772,160 usable=159,305,728
Same job -- no large increases in peers or changes in hardware set up?
Ah, there it is.
yes, same job - hardware and peer config is unchanged
do i need to reconfigure share mem settings ?
@mccraigmccraig I've seen that before when peers restart too quickly, not giving Aeron a chance to cleanup.
Approximately how many peers are on each node, and how many peers in total for the job?
Based on the error, it looks like your /dev/shm is only 160MB, which will be used up pretty quickly in 0.10+. You may need to increase it and/or decrease the aeron.term.buffer.length property https://github.com/real-logic/aeron/wiki/Configuration-Options
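For concreteness, the two knobs being discussed can be set roughly like this (the values are illustrative, not recommendations):

```sh
# 1. Give the container a larger /dev/shm (on docker versions that support it):
#      docker run --shm-size=512m ...
# 2. Shrink Aeron's per-connection term buffers via a JVM system property.
#    The value must be a power of two; see the Aeron configuration wiki linked above.
JAVA_OPTS="$JAVA_OPTS -Daeron.term.buffer.length=2097152"   # 2MB, illustrative
```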
there are 100 peers on each node, 3 nodes, and somewhere between ~10 (min-peers) and 100 (max-peers) peers required for the job
OK. Yeah, you’re going to run out of shm space very very fast.
i'm currently experimenting with tuning different knobs to improve throughput... i can certainly reduce the number of peers and start using batch-fn? true
for some of my tasks - the ones with the greatest :onyx/max-peers
The messaging model was rewritten in 0.10, and peer to peer connections now use their own buffers more often, especially for task grouping.
or i could bump shm... if it's plausible to run with as many peers as i've got i'd like to start with that, so i can continue to play
OK, yeah, generally you won’t gain much by running that many peers on single nodes if the peers are generally busy
It’s plausible, but it’s not a scenario we’ve really tested enough, so there may be fixes required on our end.
so i'd be better with batches ? i've got a fan-out of 30,000 for some messages (1 segment in, 30,000 segments out)
The main use for batch-fn? true is when you can optimise operating over more than one segment at a time, e.g. asynchronously sending requests somewhere and waiting for all of the results to come back for the batch. Are you doing anything like that?
yes, the 30k segments are all routing effects - records to be persisted, notifications to be sent etc, and all my backend ops are async
Right, ok, batch-fn will definitely help there.
It’ll definitely be much less overhead than using that many peers.
ok, well i'll verify the problem is too many peers first, by dropping my max-peers, and if that's the case then i'll switch to batching to improve throughput
Sounds good
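For anyone following along, batching is switched on per task in the catalog entry; with :onyx/batch-fn? true the task function receives the whole batch of segments at once. A sketch for a hypothetical persistence task (the function name and values are made up):

```clojure
;; Catalog entry sketch: batch-level function for async persistence.
;; :my.app/persist-batch! is hypothetical; it would receive a vector of
;; segments, fire off async writes, and wait for all results before returning.
{:onyx/name       :persist-records
 :onyx/fn         :my.app/persist-batch!
 :onyx/type       :function
 :onyx/batch-fn?  true
 :onyx/batch-size 100
 :onyx/max-peers  4}
```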
I’d still up your --shm-size
@mccraigmccraig the headroom helps, depends on the message size.
Yeah, I agree with that too. 160MB is not a lot.
We have some changes planned to allow peers to only message a subset of peers as the number of peers grows, since the connection growth is n^2, however it still needs to be implemented.
what sort of number should i be looking at for shm size ?
It totally depends on the number of peers on each node, and the term buffer size.
@mccraigmccraig things that I learned along the way. aeron.term.buffer.length = 3 x (onyx/batch-size x segment size x total number of edge connections to tasks)
Then you can figure out system overheads, JVM overheads and tune the --shm-size accordingly.
^ that’s the minimum term buffer length you’d want to use, just to fit your messages. Throughput may be affected if you use the min
@lucasbradstreet agreed. it’s a starting point to planning though.
It’s great info, just pointing out the catches :)
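The rule of thumb above is easy to turn into a quick sizing calculation. A small sketch with purely illustrative numbers (note that Aeron requires term buffer lengths to be a power of two):

```python
# Rough sizing for Aeron term buffers, based on the rule of thumb quoted
# above (a heuristic starting point, not an official Aeron formula):
#   term.buffer.length ≈ 3 * batch-size * segment size * edge connections

def min_term_buffer_length(batch_size, segment_bytes, edge_connections):
    """Heuristic minimum aeron.term.buffer.length in bytes."""
    return 3 * batch_size * segment_bytes * edge_connections

def round_up_pow2(n):
    """Aeron term buffer lengths must be a power of two, so round up."""
    p = 1
    while p < n:
        p *= 2
    return p

if __name__ == "__main__":
    # Hypothetical job: batches of 20 segments, ~1 KB each, 8 task edges.
    need = min_term_buffer_length(20, 1024, 8)
    print(need, round_up_pow2(need))   # 491520 524288
```

Total /dev/shm demand scales with this length times the number of active connections on the node, so add headroom on top before picking a --shm-size.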
cool - thanks @jasonbell @lucasbradstreet that's given me something to play with
I need to dig back into this stuff again as I’ll be talking about it a bit more at clojurex in December
@jasonbell Great! Let me know if you have any questions as you prepare.
I will certainly do that. It’s more a war story of the past but I want to get the tuning things in.
@mccraigmccraig let me know how you go, as I’d like to survey how big it ends up, so I have more datapoints
@mccraigmccraig there’s definitely some things we could do to make this simpler, but I don’t want to choose parameters / designs that don’t solve real problems. You will almost definitely want to reduce the job size beforehand, though.
hmm. i just came across this in my run_media_driver.sh
```sh
# aeron messaging needs a bigger /dev/shm than the docker default of 64MB
# this requires --privileged option to the docker container (from marathon)
mount -t tmpfs -o remount,rw,nosuid,nodev,noexec,relatime,size=1024M tmpfs /dev/shm
```
so that should be giving me a 1GB /dev/shm
Yeah, that should be. Weird that it seems to be reporting a smaller size.
perhaps that remount isn't working for some reason
In any case, it won’t be big enough, considering the size of your job.
I’d look into that first though.
sorry, @michaeldrogalis, it appears that metamorphic might be used in credit card fraud detection, as demonstrated with cortex. Mind sharing an example? https://www.youtube.com/watch?v=0m6wz2vClQI
Check kiting is the primary example used in one of the papers we based the impl off: https://web.cs.wpi.edu/~chuanlei/papers/sigmod17.pdf
Thank you for sharing, @michaeldrogalis 🙂
@lovuikeng Surely. 🙂
@mccraigmccraig Are you running on K8s/Mesos?
@gardnervickers mesos+marathon
Ahh yea, Mesos/K8s have their own way of setting /dev/shm, as does docker now. That snippet is out of date.
I cannot remember but when I did it on mesos in the past I used a setting in the marathon config to tell it
i'm currently telling marathon to run the container privileged, then remounting /dev/shm...
--shm-size
the version of docker in use doesn't have the --shm-size option
i've got a cluster rebuild hanging over me, but until that's done i'm stuck on this version of docker, without --shm-size
If Mesos has the ability to mount a Memory volume at /dev/shm, that would be functionally equivalent. That's actually what we use for our production Kubernetes cluster. I believe that Mesos/Marathon count /dev/shm usage against your total container memory allocation.
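For reference, the Kubernetes version of that trick is a memory-backed emptyDir mounted over /dev/shm; a pod spec fragment might look like this (names and sizes are illustrative):

```yaml
# Pod spec fragment: memory-backed emptyDir mounted over /dev/shm.
# Note: its usage counts against the container's memory limit.
volumes:
  - name: dshm
    emptyDir:
      medium: Memory
      sizeLimit: 1Gi
containers:
  - name: onyx-peer
    image: my-onyx-peer-image   # illustrative
    volumeMounts:
      - name: dshm
        mountPath: /dev/shm
```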
Does your version of docker have the tmpfs arg?
apparently not
ok, i think i'll take the easy way out and downgrade to onyx 0.9 until i've done a cluster rebuild with a more recent version of docker
Getting stuck with an old version of Docker really stinks.
Been there. 😕
i'm scheduled to rebuild the cluster straight after we've gotten this next release out, so it shouldn't be for too long
Fortunate circumstances then
thank you for the assistance y'all 🙂
Anytime
Hi all. Noob here with quick question re:GC. I have set up a proof-of-concept single machine streaming cluster using 0.10.0. Currently has one job with 8 tasks. Input is via Kafka & output is to Cassandra. Kafka & Zookeeper are running as separate processes. I have a custom task aggregation over fixed windows of 1 minute, 10 minutes, 1 hour & 6 hours with associated triggers on watermark. There are also flow conditions, but nothing too crazy. Anyway, everything seems to work in that data enters the cluster, gets processed and then saved to storage. The problem arises when the stream processing has been going on for a few hours. Each of the processes for the job and kafka seem well behaved. For example, after running for over 3 hours, processing 12K inputs that generate 1.2M records, they both continue to GC back down to levels near where they started. The same cannot be said for Zookeeper. It started out collecting to around 300K. Not so now. From the zookeeper GC log:
[Eden: 256.0M(256.0M)->0.0B(272.0M) Survivors: 48.0M->48.0M Heap: 2104.0M(6144.0M)->1924.9M(6144.0M)]
In the user guide, it mentions that periodically, onyx.api/gc needs to be called to clean up (which I actually do every time it launches). Unfortunately, issuing a GC to the running cluster does not seem to result in any cleanup (log-cleaner.log has no entry corresponding to when the GC was issued, though I see a new session id in server.log). Any ideas? Is that supported? If not, what is the recommended way of cleaning up a continuously running stream processor such that zookeeper won't run out of memory? Thanks!
@brianh I think I know what’s going on. I’m guessing you need to switch over to s3 checkpointing, as I believe you are probably ending up checkpointing all of your window state to zookeeper. I have been meaning to throw an error when window state is checkpointed, as the zookeeper checkpoint implementation is not appropriate for window checkpoints.
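Switching checkpoint storage is a peer-config change. Roughly, it looks like the following; the key names here are recalled from Onyx's information model, so verify them against the docs before relying on this:

```clojure
;; Peer-config fragment: checkpoint to S3 instead of ZooKeeper.
;; Bucket/region values are illustrative; confirm the exact key names
;; in the Onyx information model.
{:onyx.peer/storage           :s3
 :onyx.peer/storage.s3.bucket "my-onyx-checkpoints"
 :onyx.peer/storage.s3.region "us-east-1"}
```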
We've been debating whether to make the default S3 for a while. On one hand, it makes it more troublesome for newcomers. On the other hand, unless you know to switch over, you run into things like this.
@michaeldrogalis I’ve considered just throwing an error when it’s used to checkpoint a window.
It possibly shouldn’t even be used for input / output plugins though, as it’ll keep creating nodes.
Yeah, it's a tricky one.
@lucasbradstreet: I've been working with @camechis on an implementation of the checkpoint interface that would support Google Cloud Storage in addition to S3. (We're running managed Kubernetes on GCP.) I've got a fork with the implementation you're welcome to take a look at: https://github.com/tenaciousjzh/onyx/blob/0.11.x/src/onyx/storage/gcs.clj https://github.com/tenaciousjzh/onyx/blob/0.11.x/src/onyx/information_model.cljc#L1190 Do you have an integration test you ran on S3 that I could maybe modify to try out for GCS?
It's a start. Pretty raw right now. Trying to match up implementation with S3 so it's familiar.
@jholmberg I would suggest switching over the test suite to use your implementation and then run the whole thing https://github.com/onyx-platform/onyx/blob/0.11.x/test-resources/test-config.edn#L9
@jholmberg Whoops, dropped the ball on that. Thanks for pinging again
no worries! Been busy getting this thing off the ground. @lucasbradstreet, I'll take a look at the test suite.
Looks like a pretty straight port. Should work fine overall, I think. Once it’s ready I might break some helpers out to re-use code a bit more, but don’t worry about that for now.
Awesome, this matches up with what i'm trying out. I've got my own edn that uses gcs instead of zookeeper. Getting all the auth stuff plugged in before testing it. @lucasbradstreet, good feedback. I want this to be usable for you so if you'd like me to make changes or you want to tweak it feel free.
👍
Hmm. Thanks for the info. Unfortunately, S3 isn't an option (at least in the near term) where this thing is going to be running. Are there any other local storage options or would I need to roll my own?
@brianh some options include:
* Implement gc’ing for checkpointed storage
* Test whether fakes3 works with Onyx’s s3 checkpointing implementation (it didn’t used to), and stand up a fakes3 endpoint
* Implement your own checkpoint storage that writes to HDFS or some other object store that you can stand up.
If the performance / window size characteristics of ZK are good enough, the first option might be a good way to go
@brianh minio can also potentially work. It's a small storage server written in Go. Full S3 API compatible I believe
Good idea
We will have to implement some garbage collection soon. We’ve gotten away with it ourselves because s3 is so cheap and effectively infinite size
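If minio (mentioned above) is the route taken, standing one up is a single binary; a sketch with purely illustrative credentials and paths (the env var names are from minio releases of roughly this era, so double-check against the version used):

```sh
# Run a local S3-compatible server to use for Onyx checkpoints.
export MINIO_ACCESS_KEY=onyx-access       # illustrative
export MINIO_SECRET_KEY=onyx-secret-key   # illustrative
minio server /data/onyx-checkpoints       # serves an S3 API on port 9000 by default
```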