#onyx
2016-06-29
v.solovyov09:06:39

Hello. I'm starting a new project, and I plan to make Onyx the heart of all data processing here. However, historically a lot of things were done in Python here (a lot of domain-specific natural language processing), so I can't just throw it out, and if we're going to phase it out someday and reimplement it in Clojure, it will be done gradually. As I understand it, you have some plans to make Onyx available for other languages. I've taken a look at the onyx-ruby PoC, but it's not there yet, and it's not really an urgent priority.

v.solovyov09:06:10

Basically, what I need is foreign function calls, and right now I think I can get away with calling my Python code from an Onyx task via an HTTP API, or something like that

v.solovyov09:06:26

I've taken a look at the approach that pyspark uses, and I think I'll be better off using something simpler

lucasbradstreet09:06:29

That sounds reasonable. I think to do so you will want something like :onyx/batch-fn, which is something I’ve been considering adding. That would let you send a whole batch of segments to your endpoint

lucasbradstreet09:06:56

Or allow you to asynchronously make batch-size calls, and return the whole batch when you’re done
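
As a rough illustration of the batch idea discussed here, the Python side could expose one HTTP endpoint that accepts a JSON array of segments and returns the processed array in the same order. This is only a sketch using the standard library: `process_segment`, the `text` and `token-count` keys, and the default host and port are hypothetical placeholders, not part of Onyx or of the project being described.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def process_segment(segment):
    """Hypothetical stand-in for the domain-specific NLP step."""
    text = segment.get("text", "")
    # Assumed output key; a real task would attach its own results.
    return {**segment, "token-count": len(text.split())}


def process_batch(segments):
    """Process a whole batch of segments, preserving input order."""
    return [process_segment(s) for s in segments]


class BatchHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        try:
            segments = json.loads(self.rfile.read(length))
            if not isinstance(segments, list):
                raise ValueError("expected a JSON array of segments")
            status, payload = 200, process_batch(segments)
        except (ValueError, AttributeError) as exc:
            # Malformed JSON or non-object segments end up here.
            status, payload = 400, {"error": str(exc)}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) the HTTP server for the batch endpoint."""
    return HTTPServer((host, port), BatchHandler)
```

Calling `make_server().serve_forever()` would start it. The calling task would then POST one request per batch instead of one per segment, which is where the saving comes from when per-segment work is cheap compared with a network round trip.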

v.solovyov09:06:08

Well, it depends on the time that's needed to process one segment. If it takes a couple of seconds, I can just do it one-by-one

lucasbradstreet09:06:09

That would improve performance a lot. Ideally you’d use something like urania to handle the requests over the batch http://funcool.github.io/urania/latest/

lucasbradstreet09:06:17

Yes, that is very true

lucasbradstreet09:06:27

It totally depends on what you’re doing

v.solovyov09:06:32

cool, I'll take a look at urania

agile_geek13:06:19

Is there an example of using the new task bundles approach with Kafka? I can't figure out how you would add a Kafka task using task bundles.

agile_geek13:06:34

gardnervickers: is that code published as a jar in Clojars/Maven Central or will I need to build from that branch?

gardnervickers13:06:11

All our plugins are tested and published against both official releases and our snapshot builds https://github.com/onyx-platform/onyx-kafka

gardnervickers13:06:22

[org.onyxplatform/onyx-kafka "0.9.6.0"] is the current lein coordinate

michaeldrogalis14:06:07

@agile_geek: For reference, our build matrix links out to every dependent project, and each project's coordinates are at the top of its README, if they exist: https://github.com/onyx-platform/onyx#build-status