#data-science
2018-08-22
ben 15:08:10

I can see quite a few random forest implementations in clojure. Are there any in particular people here have used before and liked/think are production-worthy?

aaelony 17:08:29

I recommend checking out h2o

ben 14:08:45

Thanks! Does this mean you have to go through Java (http://docs.h2o.ai/#languages)?

ben 14:08:20

And have you used h2o with clojure yourself before?

aaelony 14:08:35

I typically use it from R, but it supports python, java, an interface called “Flow” from a web browser and more. Under the hood all these language bindings function the same way by emitting JSON commands (which could come from clojure) to the REST API. https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/api/REST/h2o_3_rest_api_overview.md
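(A minimal sketch of that idea from Clojure, assuming clj-http, with cheshire for JSON, is on the classpath and an H2O instance is running on the default port 54321; the endpoint paths and parameter names below follow the linked REST API overview and should be verified against your H2O version.)

```clojure
;; Sketch: drive H2O-3 over its REST API from Clojure with clj-http.
;; Endpoint paths and parameter names are assumptions to check against
;; the REST API overview linked above for your H2O version.
(require '[clj-http.client :as http])

(def h2o "http://localhost:54321")

;; register a data file with the cluster (the parse step that turns it
;; into a frame is omitted here)
(http/get (str h2o "/3/ImportFiles")
          {:query-params {:path "data/train.csv"}    ; placeholder path
           :as :json})

;; kick off a distributed random forest (DRF) build
(http/post (str h2o "/3/ModelBuilders/drf")
           {:form-params {:training_frame "train.hex"   ; frame id from the parse step
                          :response_column "label"      ; placeholder column name
                          :ntrees 100}
            :as :json})
```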

aaelony 14:08:18

At some point, I wanted to write a clojure library to do this but never got around to it…

aaelony 14:08:50

You could also interop directly with H2O's Java code base, but it might be better to use the REST API.

ben 15:08:10

Interesting - thank you very much

rustam.gilaztdinov 20:08:11

Or xgboost, which has a Java binding. Of course, xgboost is about boosting weak, low-variance classifiers, as opposed to RF, but the results are very promising and it's widely useful.
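(For anyone curious about that route, here is a rough sketch of calling the xgboost4j Java binding from Clojure; the file path and hyperparameter values are placeholders, and the DMatrix constructors available vary between xgboost4j versions.)

```clojure
;; Rough sketch of XGBoost via its Java binding (xgboost4j) from Clojure.
;; The LibSVM path and the hyperparameters below are placeholders.
(import '[ml.dmlc.xgboost4j.java XGBoost DMatrix])

(let [dtrain  (DMatrix. "data/train.libsvm")      ; LibSVM-format training file
      params  {"objective" "binary:logistic"
               "eta"       0.1
               "max_depth" 6}
      ;; 50 boosting rounds, watching only the training set
      booster (XGBoost/train dtrain params 50 {"train" dtrain} nil nil)]
  ;; returns a float[][] of predicted probabilities, one row per instance
  (.predict booster dtrain))
```

A plain Clojure map can be passed wherever xgboost4j expects a java.util.Map of parameters or watches.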

ben 15:08:23

Yeah, xgboost would probably be better in pure predictive terms and would be a fine approach for me. Random forest has a couple of practical conveniences though that make it attractive. That said, the ease of just using xgboost’s java bindings might be preferable. Thanks!

genmeblog 15:08:15

It's easy to call Smile's random forest implementation from Clojure via Java interop.

genmeblog 15:08:31

I mean the API is quite simple.
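(For reference, a minimal sketch of that interop, assuming Smile 1.5.x, where the RandomForest constructor takes the feature matrix, the labels, and the number of trees directly; later Smile releases changed this API.)

```clojure
;; Minimal sketch of Smile's random forest via Clojure/Java interop.
;; Assumes Smile 1.5.x, where the constructor signature is (x, y, ntrees);
;; the toy data below is purely illustrative.
(import 'smile.classification.RandomForest)

(let [x (into-array [(double-array [5.1 3.5 1.4])
                     (double-array [6.2 2.9 4.3])
                     (double-array [5.9 3.0 5.1])])
      y (int-array [0 1 1])
      forest (RandomForest. x y 100)]        ; 100 trees
  (.predict forest (double-array [6.0 3.0 4.8])))
```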

ben 17:08:22

Ah yeah this looks great

aaelony 23:08:40

you can use xgboost from h2o
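(Presumably through the same REST workflow as the DRF sketch above; the /3/ModelBuilders/xgboost endpoint name is an assumption to check against the schema listing of your H2O version.)

```clojure
;; Sketch: H2O's XGBoost builder via the same REST workflow as above,
;; reusing the h2o base URL and clj-http require from the earlier sketch.
;; The endpoint name is an assumption; verify it for your H2O version.
(http/post (str h2o "/3/ModelBuilders/xgboost")
           {:form-params {:training_frame "train.hex"
                          :response_column "label"
                          :ntrees 100}
            :as :json})
```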