Fork me on GitHub

I am also not sure whether you have to use GE matrices. If some of those are symmetric, use SY and speed things up for certain calculation...


It's late. I guess you have more than enough stuff to try out now, so I'll leave it at that for now 🙂


@justalanm just a quick update: I don't know what exactly python's svd computes, but Neanderthal's svd that you called with true true, also computes v and vt, which you probably don't need. If you need singular values, simply call svd (pure version) or svd! if you want to optimize even further. Even svd is much faster. Numbers from my machine (an old i7 4790k) for (svd a): 128x128: 773 microseconds (instead 3.1 ms) vs Numpy's 1.98 ms, and 1024x1024: 70 ms (instead 772 ms) vs Numpy's 288 ms. Now, it's for the case when you only need singular values without u and vt, and I don't know what Numpy's svd computes...


and 4096x4096: 1.86 sec (instead 23 sec) vs Numpy's 23 sec...


@blueberry thank you very much for all the advice! Anyway I'm starting to understand and like Neanderthal! I'll try to adjust evrything asap


Numpy computes them as well that's the reason I called it with true true


And a general matter: Numpy doesn't destroy inputs, so would it be fair to compare it with destroying functions such as mm! and svd!?


@justalanm I've checked the link to numpy svd function, and the reason for the difference is that they use the gesdd function for computing svd, and not the gesvd. gesdd is faster, while requiring more workspace and gesvd is slower but more accurate for small singular values. I'll see to add the sdd function in the next version of Neanderthal.


Maybe then it is better to compute just singular values (svd a) in both numpy and neanderhtal, if you only need them for computing pca?


The same is probably true for eigenvalues: numpy by default uses some other routine than the most general geev, which is faster but have other less desirable properties (which might not be needed for pca). See which one it is. Maybe neanderthal already offers it. If not, I can add it in near future. Anyway, compute only the part that pca need.


I can try computing only singular values (I'm not using them for PCA though, @aaelony suggested to benchmark SVD as well so I did)


There is no way to get only singular values with svd! right?


As for the destruction: Numpy doesn't destroy inputs, but it does destroy working space. In the changes I advised, Neanderthal would also not destroy the input, just many of intermediate throwaway workspace objects.


Yes there is. Check the docs, but I think you should provide nil for the arguments you don't want to be computed


@justalanm Anyway, good catch about the default routines that Numpy uses. I should at least provide the counterparts in Neanderthal (such as sdd in addition to svd).


Numpy is a very good library after all, and is somewhat become the standard in the field, but of course it has been developed over many years and with a lot of funding (only lately officially)


And another thing in your implementation: Your n-inc method, for example, in addition to probably returning the wrong result, does this: receives a number (4096), creates neanderthal vector dim 4096, turns it into clojure's seq by calling map, traverses the lazy seq and increments elements, then creates a new neanderthal vector, traverses the lazy seq again to copy it to the vector.


So, when I pass nil to a function such as svd! Neanderthal doesn't compute that part at all, or simply doesn't store it?


In this case, doesn't compute it. But the documentation is what's relevant in each case.


Another question: when I compute ev! do I always get a sorted result?


If the documentation says yes, then yes 🙂


do you need it to not be sorted?


No, no I like 'em sorted 😄


With the documentation you mean Intel's right?


'cause I don't see anything about them being sorted or not:

(ev! a w vl vr)
(ev! a w)
(ev! a vl vr)
(ev! a)

Computes the eigenvalues and left and right eigenvectors of a matrix a.

On exit, a is overwritten with QR factors. The first 2 columns of a column-oriented GE matrix w are overwritten with eigenvalues of a. If vl and vr GE matrices are provided, they will be overwritten with left and right eigenvectors.

If the QR algorithm failed to compute all the eigenvalues, throws ExceptionInfo, with the information on the index of the first eigenvalue that converged. If w is not column-oriented, or does not have 2 columns, throws ExceptionInfo. If vl or vr dimensions do not fit with a’s dimensions, throws ExceptionInfo. If some value in the native call is illegal, throws ExceptionInfo.


First you read Neanderthal's doc, and if you need more details about the routine's internals, then Intel is the ultimate reference, since they have one of the most detailed LAPACK docs out there (but it's still rather spotty for more obscure routines).


@justalanm I've improved svd! and svd to also support gesdd. It's in 0.20.0-SNAPSHOT. New results seem to be faster than Numpy (my processor is older but more powerful, so it might be that the speed is the same, but the difference between our timings was smaller with plain svd, so Neanderthal seems to be somewhat faster). Anyway, the slower variant, with u and vt computing, with new svd (sdd) on Neanderthal: 128x128: 1.68 ms, 1024x1024: 191 ms, 4096x4096: 12.4 s. The faster variant, only singular values, does not differ much from plain svd. Somethimes it is faster, sometimes slower than svd...


@justalanm i have no experience with python benchmarking yet, why have you not used the "timeit" module, it seems to be more popular


i guess it is not as critical because python has a dumb interpreter which does not require things like JIT warmup


@whilo perf is a toolkit built on top of timeit, the latter is very dumb, while the first one has a little more control


Anyway, thanks a lot guys! Clojure community is really one of the main reasons I'm getting to love this language!!!

🍹 28