#data-science
2020-02-11
Frederik14:02:30

With deep-diamond not released yet, what's the easiest neural-net framework to get started with in Clojure? I just need something lightweight to train a reasonably shallow CNN. Cortex seemed to fit the bill, but it hasn't been maintained in two years, and unfortunately I can't get one of their basic examples to work: https://github.com/originrose/cortex/blob/master/examples/xor-mlp/src/xor_mlp/core.clj Trying to run

(train-xor)
I just get:
ExceptionInfo Network does not appear to contain a graph; keys should contain :compute-graph  
Any better options? Or some links to more complete Cortex documentation so that I can get it to work? Thanks!

gigasquid14:02:05

Depends on how "lightweight" you need

gigasquid14:02:27

There are Clojure bindings to MXNet - I'll get you an example of a CNN

Frederik14:02:35

Thanks! 🙂 I'll have a look.

gigasquid14:02:17

There are other examples in that repo too

Frederik14:02:17

Great, thanks! The easier it is to get started with, the better; I'm willing to sacrifice bells and whistles for it. 🙂 E.g. in Python I'd choose a basic Keras layers interface over a full TensorFlow Estimator API.

gigasquid14:02:38

You can also do Keras through Python interop

gigasquid14:02:44

I don't have a straight up Keras example yet - but I'm sure you could follow the examples and figure it out 🙂
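For orientation, here is a minimal sketch of what that Python interop could look like using libpython-clj, a library not mentioned in the thread. The dependency, the exact call forms, and having Keras installed in the local Python environment are all assumptions; the MXNet repo examples mentioned above remain the reference for the Clojure-native route.

;; Hedged sketch: driving Keras from Clojure via libpython-clj (assumed
;; to be on the classpath, with keras installed in the Python env).
(require '[libpython-clj.require :refer [require-python]]
         '[libpython-clj.python :as py])

(require-python '[keras.models :as models]
                '[keras.layers :as layers])

;; Build a tiny dense network with the Keras layers API; keyword args
;; are passed through as Python kwargs.
(def model
  (models/Sequential
    [(layers/Dense 8 :activation "relu" :input_dim 2)
     (layers/Dense 1 :activation "sigmoid")]))

(py/py. model compile :loss "binary_crossentropy" :optimizer "adam")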

Frederik14:02:10

I was planning to stay away from Python interop as I'm using this project to learn Clojure, but will keep it as a backup option 🙂

👍 4
Frederik14:02:25

Thanks for all the help!

Frederik16:02:43

Probably quite a basic Scala question: I can't get the MXNet installation to work, and get the following warning and error when starting a Leiningen REPL:

INFO  MXNetJVM: Try loading mxnet-scala from native path.
WARN  MXNetJVM: MXNet Scala native library not found in path. Copying native library from the archive. Consider installing the library somewhere in the path (for Windows: PATH, for Linux: LD_LIBRARY_PATH), or specifying by Java cmd option -Djava.library.path=[lib path].
WARN  MXNetJVM: LD_LIBRARY_PATH=/opt/intel/mkl/lib/intel64_lin
WARN  MXNetJVM: java.library.path=/opt/intel/mkl/lib/intel64_lin:/usr/java/packages/lib:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
ERROR: Unhandled REPL handler exception processing message {:op stacktrace, :id 352, :session cf4e20e7-57b8-43ce-8980-05728d1df8a3}
I changed LD_LIBRARY_PATH to get Neanderthal to work, which I assume broke the standard MXNet installation (?). I want to add the install location of the Scala native library to my library path, but have no idea where to find it.

gigasquid16:02:39

you don't need to set the LD_LIBRARY_PATH - the native libs are extracted from the jar and put into a temp directory to load

gigasquid16:02:33

There are a few options to get going - I would read through this https://github.com/apache/incubator-mxnet/tree/master/contrib/clojure-package
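As a rough orientation, the setup in that README boils down to adding a platform-specific MXNet artifact to project.clj. The coordinates and version below are assumptions from memory; check the README for the current ones and pick the jar matching your OS/CPU (linux-cpu, linux-gpu, osx-cpu).

;; Hedged sketch of a minimal project.clj for the Clojure MXNet package.
;; Artifact name and version are assumptions; see the linked README.
(defproject mxnet-sandbox "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.1"]
                 [org.apache.mxnet.contrib.clojure/clojure-mxnet-linux-cpu "1.5.1"]])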

Frederik17:02:33

Started from a clean Leiningen project and now it works; it must be an interaction with some other dependency in my project.clj. 🙂 Sorry for all the questions, but I'm trying to get a basic example to work and can't find any example that doesn't load data from disk, while I want to train on data I already have in memory. mx-io/ndarray-iter seems like the right approach (?), but I can't find any documentation or examples on how to use it. Is there any general documentation on setting up small end-to-end examples?

gigasquid17:02:55

glad it's working!

gigasquid17:02:27

There are some other examples of using it in the BERT sentence classification example too

Frederik11:02:24

Thanks for all the links and your patience! I played around with it more, starting with the simple XOR problem, and made a small gist for it: https://gist.github.com/Toekan/0d180f129c3bd3036a041149f87ac85e Tbh, being reasonably new to Clojure and completely new to MXNet, a trimmed-down example like this would have helped me get started. The majority of the examples start with a specialized iterator like the mnist-iterator, which hides the exact shape your training data and labels need to be in. It took me several errors, e.g., to figure out that I had to wrap my ndarray in a vector before passing it to ndarray-iter. In any case, I'm up and running to attack my CNN problem now. 🙂 Very happy this MXNet port to Clojure exists!
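For readers skimming the thread, here is a stripped-down sketch of the in-memory iterator idea described above. The :label option key is an assumption; the gist is the working reference, and the wrapping of each NDArray in a vector is the detail mentioned in the message.

;; Hedged sketch: an in-memory iterator for the XOR problem.
;; Note the data and label NDArrays are each wrapped in a vector.
(def xor-data   (ndarray/->ndarray [[0.0 0.0] [0.0 1.0] [1.0 0.0] [1.0 1.0]]))
(def xor-labels (ndarray/->ndarray [0.0 1.0 1.0 0.0]))

(def train-iter
  (mx-io/ndarray-iter [xor-data]
                      {:label [xor-labels]          ;; option key is an assumption
                       :label-name "output_label"
                       :data-batch-size 4}))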

gigasquid14:02:50

nice example @UT58RLHEE! If you are interested in contributing it as an example to the repo - it would be welcome 🙂

Frederik14:02:36

That would be great! 🙂 Anything I need to change? Things that aren't done in the best way, code or formatting that isn't very Clojure-y, more documentation needed, etc.? Let me know and I'll put in a PR in the coming days

gigasquid14:02:52

Any comments that you want to add or clarify to help beginners would be great. Other than that, it just needs the Apache license at the top in a comment like this https://github.com/apache/incubator-mxnet/blob/master/contrib/clojure-package/examples/tutorial/src/tutorial/module.clj
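For reference, that header is the standard ASF license block written as a Clojure comment; the linked module.clj shows the exact form used in the repo.

;; Licensed to the Apache Software Foundation (ASF) under one or more
;; contributor license agreements.  See the NOTICE file distributed with
;; this work for additional information regarding copyright ownership.
;; The ASF licenses this file to You under the Apache License, Version 2.0
;; (the "License"); you may not use this file except in compliance with
;; the License.  You may obtain a copy of the License at
;;
;;    http://www.apache.org/licenses/LICENSE-2.0
;;
;; Unless required by applicable law or agreed to in writing, software
;; distributed under the License is distributed on an "AS IS" BASIS,
;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
;; See the License for the specific language governing permissions and
;; limitations under the License.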

gigasquid14:02:07

Just tag me on the PR - thanks!

gigasquid14:02:05

oh there is a test for the tutorial that just loads the namespace - if you could add yours there too, would be great https://github.com/apache/incubator-mxnet/blob/master/contrib/clojure-package/examples/tutorial/test/tutorial/core_test.clj
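A hypothetical sketch of what such an entry might look like; the namespace names below are made up, and the linked core_test.clj is the authoritative shape.

(ns tutorial.core-test
  (:require [clojure.test :refer [deftest is]]))

;; The tutorial test just requires each example namespace so the build
;; catches compile errors; `tutorial.xor-example` is a hypothetical name.
(deftest load-example-namespaces
  (is (nil? (require 'tutorial.xor-example))))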

Frederik14:02:57

Ok, will tag you when I make a PR 🙂 Thanks!

Frederik15:02:47

Hi, hope you don't mind me posting another question here, but Google doesn't solve my problem. I'm trying to run m/predict on a different batch size than I trained on. I've been playing around with rebinding the model, but nothing seems to work:

(def X-test (mx-io/ndarray-iter [(ndarray/->ndarray [[0.0 1.0]])]
                                {:label-name "output_label"
                                 :data-batch-size 1}))

; model was trained like in the gist linked earlier.
(def binded-model (m/bind trained-model
                          {:for-training false 
                           :data-shapes (mx-io/provide-data X-test)}))

; Trying to predict
(m/predict binded-model {:eval-data X-test :num-batch -1})
But whatever I try, I always get something like:
[14:59:12] src/operator/tensor/./matrix_op-inl.h:697: Check failed: e <= len (4 vs. 1) : slicing with end[0]=4 exceeds limit of input dimension[0]=1
How do I get rid of the expected batch size of 4?
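One hedged workaround sketch, based only on the error above (which suggests the network was bound with batch size 4 during training): pad the test data up to that batch size and discard the extra prediction rows afterwards. Whether re-binding with a new data shape can avoid this entirely is not settled in this thread, so this is a workaround, not a confirmed fix.

;; Hedged sketch: pad the single test example to the training batch size
;; (4, per the error message) so the bound shapes match, then keep only
;; the first prediction row.
(def X-test-padded
  (mx-io/ndarray-iter [(ndarray/->ndarray [[0.0 1.0]
                                           [0.0 1.0]
                                           [0.0 1.0]
                                           [0.0 1.0]])]
                      {:label-name "output_label"
                       :data-batch-size 4}))

(def preds (m/predict trained-model {:eval-data X-test-padded :num-batch -1}))
;; preds should be a seq of output NDArrays (an assumption about the return
;; shape); the prediction for the real example is the first row of the first one.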

jsa-aerial19:02:21

Major new release of Saite
* lib 0.19.15 on clojars
* uberjar 0.5.0 : wget http://bioinformatics.bc.edu/~jsa/aerial.aerosaite-0.5.0-standalone.jar

Release summary
* Added full editor panel support to picture frame elements
  - Add any number / combination to :left, :right, :up, and/or :down
  - May be either live or static. Latter is for typical code markdown
  - Live editors are fully functional with code execution and may also explicitly update any associated frame visualization
  - Static can be neither focused nor editable
  - Theming works for all these editors
  - Add per-tab capable user-defined defaults for editor options (sizing, etc)
* Added automated code 'starter' inserts for
  - Text only (empty) frames (for straight markdown and/or editor panels)
  - CodeMirror editor elements for picture frames
  - Visualization frames with starting default template and data source
  - These also include automatic and automated frame (fid), visualization (vid) and editor (eid) ids.
* Added bulk static image saves
  - Will automatically save all images in a document
  - Saved per tab as session>docname>tabname>vid(s).png
  - Supports bulk creation by server or client.
  - Simple fast implementation - no 'headless browsers' or other extras required
  - New default 'chart' option in config.edn for where to save
* Added new example documents:
  - cm-example-picframes.clj, showing editor support in picture frames
  - bulk-vis-save.clj, showing bulk visualization creation and saving
* Added (fwd) slurping and barfing to strings
* Fix several issues with strings in paredit.
* Added main editor panel default sizing to config.edn
* Added main doc (scroller) area default max size for width and height

WRT the 'bulk saving' change: It was always possible to create visualization frames (actually frames of any sort) in bulk - either from the server or from the client. What wasn't available was saving all the visualizations in all the frames in all the tabs as one single simple operation. That is what is now available. As noted when previously discussed, this required very little code and no additional required addons or extras like node.js, graalvm, headless browsers or any other such stuff that seems to have come up when discussing saving VG/VGL in bulk. Very clean, simple and fast.

jsa-aerial19:02:02

Here's a screen shot of an example of the new CodeMirror editor in picture frames capability:

parrot 12
chrisn20:02:30

Very very impressive.

jsa-aerial05:02:28

Thanks! Coming from you with the impressive stuff you have done, that means a lot!

chrisn16:02:08

You deserve it; hanami and saite are I think one of a kind. I can't think of anything like them; sort of like a science/math exploration IDE.

David Pham22:02:34

Anyone know if there is a way to connect over SSH to Nextjournal? Google Colab just released a pro version (10 USD/month) where you get a P100/T4 or a TPU at your disposal for playing around. There are some ways to SSH into these machines and use them as VMs. For this use case I think the price is fairly good. I wondered if Nextjournal has the same deal. My goal is to explore what RL can do and also to strengthen my skills in the process.

David Pham22:02:20

@jsa-aerial is Saite/Hanami the topic of your PhD? :)

jsa-aerial22:02:42

@neo2551 🙂 First, while they have proven quite nice and very useful for various work in the labs, I would not consider either or both in combination worthy of a PhD-level thesis. There isn't enough true innovation in them for that. Second, I'm too old for that at this point 😔. Third, when I wasn't, that was in mathematics.