#data-science
2018-07-16
alan06:07:29

https://gitlab.com/alanmarazzi/numpy-vs-neanderthal @blueberry update, now we are basically crushing Numpy. A few things I noted:
1. For very simple stuff like matrix multiplication, times are similar: the language has basically nothing to do, it is just a call to MKL, and this is what I would expect.
2. With SVD I removed the U and V calculation from the Neanderthal code and it got much faster than Numpy (I don't feel like it is cheating, they should give me the option to calculate them or not).
3. PCA: either I did something stupid in the Neanderthal code, or here we are seeing a huge difference in "real world" usage 😄
4. I used 0.19.0. I tried to build 0.20.0-SNAPSHOT but it was complaining about the neanderthal-native version (can you help me with that?). If SVD and PCA are getting even faster, that's it, I'm writing a statistical library in Neanderthal, at least for myself 😏
5. I tried running Numpy with linalg.eig instead of linalg.eigh and it couldn't finish running, the 4096x4096 PCA would take more than 40 seconds!!! So I reverted to linalg.eigh, but we're still beating it anyway.
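
For reference, the NumPy side of points 2 and 5 might look roughly like the sketch below. It is not taken from the linked benchmark repository: the toy matrix size, the seed, and all variable names are assumptions made for illustration. It relies on two documented NumPy behaviours: `np.linalg.svd` accepts `compute_uv=False` to return singular values only, and `np.linalg.eigh` is the solver for symmetric matrices such as a covariance matrix, whereas `np.linalg.eig` is the general solver.

```python
import numpy as np

# Toy data (assumed shape): 200 observations, 16 features.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 16))
n = x.shape[0]

# Center the columns so SVD and covariance describe the same thing.
xc = x - x.mean(axis=0)

# Point 2: singular values only, skipping U and V^T.
s = np.linalg.svd(xc, compute_uv=False)

# Point 5: PCA through the covariance matrix. The matrix is symmetric,
# so eigh applies; it returns eigenvalues in ascending order.
cov = np.cov(x, rowvar=False)
evals, evecs = np.linalg.eigh(cov)

# Reorder to descending variance, the usual PCA convention.
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

# Project the centered data onto the two leading components.
scores = xc @ evecs[:, :2]
```

The squared singular values of the centered data, divided by `n - 1`, match the covariance eigenvalues, so the two routes are interchangeable when only the variances are needed.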

alan06:07:33

Edit: I did something stupid in the Neanderthal code: I was benchmarking only the mm! call 😓

alan06:07:45

I'll rerun everything, but the other results are ok

alan06:07:42

I guess I am pretty tired lately... 🤣

blueberry08:07:14

you'd have to build the new version of neanderthal-native, clojurecl, and clojurecuda, of course. or you can wait a couple of days until I release new versions.

👍 4
alan08:07:48

Would it make sense to test the GPU as well? I mean I know Numpy can't do it, but it might be interesting to show that Neanderthal can do that while the other can't

blueberry08:07:50

It does, of course, and I guess that should be compared to Numba or something similar. However, note that the GPU engine does not (yet) support SVD or eigendecomposition, so you won't be able to compute PCA (unless you add bindings to that part of cuSolver or write the kernels yourself).

alan08:07:41

I'm not great at C++, so I guess I'll have to wait

blueberry08:07:52

No worries - there is zero C++ programming required. Only (Cuda or OpenCL) C.

blueberry08:07:36

The numerical algorithms and their optimization, on the other hand...

alan08:07:27

Yeah that's the issue (C or C++ doesn't really make much difference)

genmeblog09:07:50

If anyone is interested: in my math library https://github.com/generateme/fastmath you can find (among other namespaces) descriptive statistics, distributions, interpolations and clustering. Almost all functions are backed by SMILE or Apache Commons Math and work with native Clojure sequences.

👏 4
alan09:07:21

@tsulej is it comparable to Numpy?

genmeblog09:07:02

After a quick check (I don't know Numpy/Scipy): fastmath's random numbers and distributions can be compared to numpy.random. Plus the rest of math and statistics.

genmeblog09:07:06

scipy.interpolate <-> fastmath.interpolation

genmeblog09:07:50

scipy.cluster -> fastmath.clustering (however scipy is very limited here)

genmeblog09:07:09

maybe one more: fastmath.transform includes DFT (+ various wavelets, DCT, DST and Hadamard)

alan17:07:12

If I use mm! inside the cov code it fails, while with mm everything works

blueberry17:07:19

It probably crashed because you have released the result and later tried to use it outside that scope when it was already released. Why would you release the result? You'd want to keep the result, while releasing temporary objects...

blueberry17:07:50

Although it would be great if you could share a GitHub project with minimal crashing code that I could try and debug. Even if it's due to misuse, maybe I can add a layer of protection against a JVM crash...

alan18:07:15

OK, this is what I meant by "I didn't understand well with-release and let-release", now it's clearer

alan18:07:09

If you want, you can take the code from the repo, that's the failing code. Its SHA is d70b011be3c6dd4bf35fdfd348f0939d3dbf49a9

alan18:07:02

But why was it failing only with the 1024x1024 version?

blueberry19:07:23

@justalanm These two functions have documentation 🙂

blueberry19:07:23

It would be easier for me to reproduce if you made a mini Leiningen project with one test file containing examples that fail and examples that don't.