#data-science
2019-05-18
Chase15:05:49

hello! are any of you folks exploring Neanderthal? I'm trying to get set up but I'm not quite there yet. Not sure where to ask for help.

Chase15:05:03

I would like to explore MXNet too, but one of the tutorials had me trying to set up an Amazon EC2 instance or something. I'm not entirely opposed to that, but is it necessary? I have a fear I'm going to forget to shut something down and wake up to a crazy AWS bill because I have no clue what I'm doing.

gigasquid15:05:16

I do most of my stuff on my laptop CPU with MXNet

gigasquid15:05:39

What system do you have, OS X or Ubuntu?

Chase15:05:22

oh cool, you can just do it on your laptop huh. I'm on Debian

gigasquid16:05:34

There might be some extra dependency you need for Debian, I'm not sure. But check out the install instructions in the README and ping me if you need help

Chase16:05:55

will do! Thank you

jsa-aerial16:05:39

@chase-lambert I have never had an issue setting Neanderthal up, but I have always been on Linux. The only 'extra' part beyond just including [uncomplicate/neanderthal "0.23.1"] in your project is - per the instructions - to have the MKL libraries available. I generally just put them in /usr/local/lib as that is already on the lib path. After that it is just lein repl and you're ready to go

jsa-aerial16:05:32

There's some 'extra thing' you need on the Mac for GPU, but I have never run that on my Mac

jsa-aerial16:05:50

And that 'Mac extra thing' is in the documentation

Chase16:05:53

yeah, I have MKL set up and used export to put the relevant directories on my path. Maybe I need to put the whole thing in /usr/local/lib. I'm on Linux too. Right now the MKL library sits in ~/intel, with one of the .so dependencies in ~/intel/lib/intel64/ and the others in ~/intel/mkl/lib/intel64/

Chase16:05:47

but I'm using the lein-with-env-vars plugin he recommended and set up my project.clj like the example he has, so it should be pointing at them directly.
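
For reference, a project.clj along those lines might look roughly like this (a sketch only: the project name is made up, and the plugin usage and MKL paths are assumptions based on the messages above):

(defproject mkl-test "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [uncomplicate/neanderthal "0.23.1"]]
  ;; lein-with-env-vars sets these env vars before the JVM starts
  :plugins [[lein-with-env-vars "0.2.0"]]
  :hooks [leiningen.with-env-vars/auto-inject]
  ;; absolute paths; ~ is not expanded here
  :env-vars {:LD_LIBRARY_PATH "/home/you/intel/lib/intel64:/home/you/intel/mkl/lib/intel64"})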

Chase16:05:16

but then I get this error message when evaluating his simple examples:

Syntax error (NoClassDefFoundError) compiling at (core.clj:8:1).                 
Could not initialize class uncomplicate.neanderthal.internal.host.CBLAS

Chase16:05:40

my REPL also shows me this error:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details

jsa-aerial16:05:36

You only need these:

jsa-aerial16:05:45

-rwxr-xr-x root/root   1935740 2017-12-18 15:54 libiomp5.so
-rwxr-xr-x root/root  53961017 2017-12-18 15:54 libmkl_avx2.so
-rwxr-xr-x root/root  65389379 2017-12-18 15:55 libmkl_avx512.so
-rwxr-xr-x root/root  35043637 2017-12-18 15:55 libmkl_core.so
-rwxr-xr-x root/root  31912181 2017-12-18 15:55 libmkl_def.so
-rwxr-xr-x root/root  10567885 2017-12-18 15:56 libmkl_intel_lp64.so
-rwxr-xr-x root/root  37230319 2017-12-18 15:56 libmkl_intel_thread.so
-rwxr-xr-x root/root   5993307 2017-12-18 15:56 libmkl_rt.so
-rwxr-xr-x root/root   6240447 2017-12-18 15:56 libmkl_vml_def.so

jsa-aerial16:05:37

I just copy them directly to /usr/local/lib, but you could symlink them instead
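
In shell terms that amounts to something like this (assuming the MKL libs were unpacked under ~/intel, per the messages above):

sudo cp ~/intel/lib/intel64/libiomp5.so /usr/local/lib/
sudo cp ~/intel/mkl/lib/intel64/libmkl_*.so /usr/local/lib/
# or, instead of copying, symlink them in place:
# sudo ln -s ~/intel/mkl/lib/intel64/libmkl_*.so /usr/local/lib/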

Chase16:05:42

yeah, so I have libiomp5.so in ~/intel/lib/intel64/ and all the rest in ~/intel/mkl/lib/intel64/

Chase16:05:07

so maybe I just copy all those individual files to /usr/local/lib?

jsa-aerial16:05:12

Well, if that is not on your default lib path, it won't find them

Chase16:05:47

maybe I'm confusing the default lib path with my $PATH. I have the directories on my $PATH

jsa-aerial16:05:58

You can set the LD_LIBRARY_PATH env var to include them, but really the best thing is to just copy or symlink them
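
The env-var route he mentions would look roughly like this, run in the shell that launches the REPL (paths assumed as above):

export LD_LIBRARY_PATH="$HOME/intel/lib/intel64:$HOME/intel/mkl/lib/intel64:$LD_LIBRARY_PATH"
lein repl   # the JVM started from this shell can now locate the MKL .so files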

Chase16:05:08

ok. I'll try that. ty!

jsa-aerial16:05:11

lib path has nothing to do with PATH. PATH is only where the shell looks for executables; shared libraries are found through the dynamic linker's search path

jsa-aerial16:05:31

This is just normal Linux stuff

Chase16:05:59

I never said I was good at normal Linux stuff yet. lol!

jsa-aerial16:05:07

Actually, this information is in the documentation, though it may not be readily visible. Also, I believe it is in an issue in the hello world example. Oh, that also includes the fix for the Mac Nvidia issue and how to get around the Java 9+ problem (which causes problems for all sorts of other things as well)

jsa-aerial16:05:50

Both of those latter fixes are in the project.clj of that example

Chase16:05:40

I copied all those files to my /usr/local/lib/ and still get the same error. darn

jsa-aerial16:05:37

what does your project.clj look like? Maybe you should just clone the hello world example

jsa-aerial16:05:02

Something as simple as this just works fine for me:

jsa-aerial16:05:14

(defproject nnwn "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url ""
  :license {:name "Eclipse Public License"
            :url ""}
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [uncomplicate/neanderthal "0.22.0"]
                 [criterium "0.4.4"]])
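
With a project like that, a quick smoke test in the REPL confirms the native backend actually loads (dot and dv are from the Neanderthal API; the CBLAS class that failed above is exactly what this exercises):

user=> (require '[uncomplicate.neanderthal.core :refer [dot]]
                '[uncomplicate.neanderthal.native :refer [dv]])
nil
user=> (dot (dv 1 2 3) (dv 3 2 1))
10.0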

jsa-aerial16:05:52

Hmmmm, I guess I should move that to "0.23.1"

Chase16:05:08

yeah I get the same two errors using the cloned hello world example. frustrating

jsa-aerial16:05:58

Well, I just cloned that and it works fine for me.

jsa-aerial16:05:54

I've set this up on several machines from desktop to rack servers and never had an issue.

Chase16:05:20

do you need GPU acceleration for that to work? My Linux setup currently doesn't have that capability

jsa-aerial16:05:26

A colleague did set it up on a Linux VM on Windows and it worked for him as well

jsa-aerial16:05:27

You do not need a GPU if you just use native code. For GPU you will need to install the OpenCL and/or CUDA stuff, but you don't need those unless you are going to run on the GPU
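
Concretely, CPU-only code only ever touches the native namespace; the GPU namespaces are separate, and only they need the CUDA/OpenCL toolkits (the namespace and factory names here are my best recollection of the Neanderthal API, so treat them as approximate):

;; CPU-only: needs nothing beyond MKL on the lib path
(require '[uncomplicate.neanderthal.native :refer [dv dge]])

;; GPU: these would additionally require the CUDA or OpenCL toolkits to be installed
;; (require '[uncomplicate.neanderthal.cuda :refer [cuv cuge]])
;; (require '[uncomplicate.neanderthal.opencl :refer [clv clge]])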

Chase16:05:50

ahh, ok. well then I'm at a loss. Thank you for trying to help though!

jsa-aerial16:05:30

So, what exactly is your setup?

Chase16:05:38

Debian stable. My /usr/local/lib/ directory now looks like this:

λ /usr/local/lib : ls
clojure         libmkl_avx512.so  libmkl_intel_lp64.so    libmkl_vml_def.so
libiomp5.so     libmkl_core.so    libmkl_intel_thread.so  python2.7
libmkl_avx2.so  libmkl_def.so     libmkl_rt.so            python3.5

jsa-aerial16:05:34

can you post an ls -l of that?

Chase16:05:41

λ /usr/local/lib : ls -l
total 297616
drwxr-sr-x 1 root staff       62 May 16 10:56 clojure
-rwxr-xr-x 1 root staff  2226679 May 18 11:20 libiomp5.so
-rwxr-xr-x 1 root staff 56865848 May 18 11:22 libmkl_avx2.so
-rwxr-xr-x 1 root staff 70221251 May 18 11:22 libmkl_avx512.so
-rwxr-xr-x 1 root staff 70211598 May 18 11:23 libmkl_core.so
-rwxr-xr-x 1 root staff 40846834 May 18 11:23 libmkl_def.so
-rwxr-xr-x 1 root staff 11161292 May 18 11:23 libmkl_intel_lp64.so
-rwxr-xr-x 1 root staff 39834096 May 18 11:24 libmkl_intel_thread.so
-rwxr-xr-x 1 root staff  6628244 May 18 11:24 libmkl_rt.so
-rwxr-xr-x 1 root staff  6745690 May 18 11:25 libmkl_vml_def.so
drwxrwsr-x 1 root staff       52 May 16 11:07 python2.7
drwxrwsr-x 1 root staff       26 Feb 14 08:35 python3.5

jsa-aerial17:05:25

hmmm those are all rather different from what I have.

Chase17:05:32

are they? Anyway, I solved it! I had to run sudo ldconfig after putting those in the /usr/local/lib directory
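
For anyone else hitting this, the fix boils down to refreshing the dynamic linker's cache and then checking that the libraries are visible (libmkl_rt is just one of the files listed above):

sudo ldconfig                  # rebuild the shared-library cache after adding files to /usr/local/lib
ldconfig -p | grep libmkl_rt   # should now print /usr/local/lib/libmkl_rt.so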

Chase17:05:27

thanks again for all your help

jsa-aerial17:05:58

Good catch on the ldconfig - a reboot would have worked 😁 as well. Probably just a different version. You just need Intel MKL 2018+ I believe

Chase17:05:38

now all I have to do is learn what data science and these libraries actually are. hahaha

Chase17:05:06

how is Neanderthal different from MXNet? From what I've gathered so far, Neanderthal is basically a pure Clojure way of doing the kind of computationally intensive calculations you do for data science? And MXNet is a big C++ library doing the same, except it has Scala bindings and we can tap into that through Clojure -> Scala -> JVM -> MXNet bindings?

jsa-aerial18:05:32

@chase-lambert Neanderthal is a very well crafted interface to BLAS and LAPACK, and as such is basically a high performance numerical and linear algebra (matrix) computation library. Its overhead (beyond what you would have using C/C++) is nearly zero, and it also helps by supporting you in writing operations naively (i.e., as mathematically stated) without having to worry about how to best order them for performance. That latter point can make an enormous difference (10-100x). You can also use it interactively (via the REPL) to do exploratory GPU work. Kind of amazing really. I believe (and have advocated) that it is the best base on which to build the Clojure version of NumPy. To that end, I've recently begun noodling along those lines. I need to find the cycles to do a proper job of it... 😣
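
For a flavor of that API (a tiny example of my own, using mm from the core namespace and the native double-precision dge matrices):

(require '[uncomplicate.neanderthal.core :refer [mm]]
         '[uncomplicate.neanderthal.native :refer [dge]])

;; multiply a 2x3 matrix by a 3x2 matrix; the actual work is done by MKL
(mm (dge 2 3 [1 2 3 4 5 6])
    (dge 3 2 [1 3 5 7 9 11]))
;; => a 2x2 result matrix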

jsa-aerial18:05:13

MXNet is an Apache project that is comparable to (and in many ways superior to) TensorFlow. It is much higher level than Neanderthal and focused on various NN work / applications. @gigasquid Carin Meier has done a great job of providing Clojure access to these capabilities - see #mxnet for more information.

Chase18:05:58

awesome summary thank you!

Chase18:05:36

so Neanderthal would be the foundation for building something to do deep learning, or does it already have that capability? And MXNet does currently have that capability, right? I saw gigasquid's talk on "is it flan" and really enjoyed it. I think I like both approaches: Neanderthal to learn data science from first principles and get a good foundation (with the Clojure way of doing it), while MXNet is the more mature, batteries-included approach to start doing some actual deep learning projects today.

jsa-aerial19:05:09

Well, yes, it could be the base of a 'Deep Learning' (the new buzzword for what was long known as multi-layer NNs...) capability, which is sort of what the entire series https://dragan.rocks/articles/19/Deep-Learning-in-Clojure-From-Scratch-to-GPU-0-Why-Bother is about (a mostly educational series, but also quite impressive for simple feed-forward networks). There is a ton of stuff that is / would be needed to get to something like MXNet (or PyTorch for that matter). It could be done (see flare https://github.com/aria42/flare for an example using Neanderthal to build/provide an LSTM similar to what you can do in PyTorch), but it may not make any sense given things like MXNet - certainly not in the short term.

jsa-aerial19:05:51

I wouldn't say these things are 'different approaches' as that seems to imply they are about the same thing. They are actually very different in what they are about.

jsa-aerial19:05:40

For example, Neanderthal is also the basis of Bayadera https://github.com/uncomplicate/bayadera, which doesn't have anything to do with NNs - multilayer, recurrent, convolutional, or whatever.