This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-07-03
Has anyone looked at what the fastest PCA method available to us from Clojure is?
I guess that depends on the size, dimensionality and sparseness of the data as well?
clojure.core.matrix has an SVD method in it
We use (and expose) smile in tech.ml.dataset; it is using netlib blas under the covers. It would be interesting to time that against neanderthal but I imagine if you install mkl as your system blas then those timings aren't interesting.
Thanks for the feedback folks! I'm using
, and I didn't time it, but it was at least dozens of minutes on a thousands by thousands matrix.
@U05100J3V From my book (1,000 x 100,000 on 7 year old CPU i7-4790k):
(with-release [a (rand-normal! (fge 1000 100000))]
  (time (pca (center! a))))
=> "Elapsed time: 355.167051 msecs"
@U086AG324 Epic 🙂 Thanks!
from which lib @U086AG324?
No lib. The handful-of-lines-implementation of PCA explained in the book. Uses Neanderthal for linear algebra.
ah it's you dragan, cool
@U05100J3V - Most likely the netlib is falling back on the Java implementation and not picking up system blas libraries. Regardless, you can transform your dataset to a tensor and from there you can copy it into neanderthal in a fairly straightforward manner and then get subsecond 🙂.
@chrisn Thanks for pointing that out. Realized that I haven't set up blas on this computer yet, so that would explain it. Other than timing things, is there a good way to check whether it's finding the blas routines?
Honestly, I do not know of any aside from timings. The netlib documentation may have more info; perhaps a verbose mode enabled by a Java system property.
I would imagine intel mkl is an option and its installation process may have an option to set it as the system blas.
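On the question of checking which BLAS backend gets picked up without timing anything: a minimal sketch, assuming the binding in play is netlib-java (the `com.github.fommil.netlib` package), which the thread does not confirm. The helper name `netlib-blas-impl` is made up for illustration. It asks netlib-java which implementation class it loaded, and uses reflection so that it still evaluates (returning nil) when netlib-java is not on the classpath.

```clojure
(defn netlib-blas-impl
  "Returns the class name of the BLAS implementation netlib-java selected,
  or nil when netlib-java is not on the classpath.
  Hypothetical helper; assumes the com.github.fommil.netlib binding."
  []
  (try
    (let [blas-class   (Class/forName "com.github.fommil.netlib.BLAS")
          ;; BLAS/getInstance is a static, zero-argument factory method
          get-instance (.getMethod blas-class "getInstance" (into-array Class []))
          instance     (.invoke get-instance nil (object-array 0))]
      (.getName (class instance)))
    (catch ClassNotFoundException _
      nil)))

;; Print whatever was found; nil means netlib-java itself is missing.
(println (netlib-blas-impl))
```

With netlib-java, a name ending in `F2jBLAS` indicates the pure-Java fallback, while `NativeSystemBLAS` or `NativeRefBLAS` indicates a native library was loaded. If the dataset library uses a different binding, this check says nothing about it.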
(Full decomposition; That is, not just power-iteration, etc)