#data-science
2024-07-21
Daniel Slutsky 09:07:34

Hi. It is not too late to join the Scicloj real-world-data group. https://clojureverse.org/t/real-world-data-meeting-10/

phronmophobic 22:07:47

I'm prototyping a wrapper for https://github.com/ggerganov/ggml, the tensor library for machine learning that powers llama.cpp, whisper.cpp, and about a dozen other ML libraries. ggml has multiple backends, including CPU, NVIDIA GPUs, and Apple Silicon. It seems like it's possible to get the boilerplate down to a pretty reasonable API:

;; schedulers pick which backend executes a graph
(def cpu-sched (cpu-scheduler))
(def gpu-sched (gpu-scheduler))

;; a graph is just a function of a ggml context and input tensors
(def my-graph
  (fn my-graph [ctx a b]
    (let [out (raw/ggml_scale ctx (raw/ggml_add ctx a b) -1)]
      ;; multiple outputs
      [out
       (raw/ggml_sum_rows ctx out)])))

(def n 10000)
(def a (float-array (repeatedly n rand)))
(def b (float-array (repeatedly n rand)))

;; run the same graph on each backend
(def results-cpu (time (compute cpu-sched my-graph a b)))
(def results-gpu (time (compute gpu-sched my-graph a b)))

;; compare the second output (the row sums) across backends
(prn (-> results-cpu second seq)
     (-> results-gpu second seq))
This is definitely not a final API, but seems promising.

💪 4
👍 4
gratitude 2
Rupert (Sevva/All Street) 14:07:36

Do you find it better to go direct to ggml vs through llama.cpp?

phronmophobic 15:07:52

Depends on the use case. For generating tokens from an LLM, the llama.cpp API is much higher level. It would probably be a lot of work to reimplement the models llama.cpp supports using just ggml.
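As a rough illustration of the difference (a hypothetical sketch; load-model, tokenize, and decode-next are made-up wrapper names, not llama.cpp's actual API), at the llama.cpp level you work with models and tokens, not tensor graphs:

;; hypothetical high-level wrapper over llama.cpp
;; the transformer graph, KV cache, and sampling are all handled internally
(def model (load-model "llama-7b.gguf"))
(def prompt-tokens (tokenize model "Why is the sky blue?"))
(def next-token (decode-next model prompt-tokens))

With raw ggml you'd have to build that whole transformer graph yourself, op by op.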

phronmophobic 15:07:15

Theoretically, it doesn't seem like it would be that bad to implement models directly, but I'm not sure there's a better way to learn a model's architecture than reading a bunch of papers, C++, or Python.
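For a sense of what implementing a layer directly could look like with the prototype above (a sketch only: ggml_mul_mat, ggml_add, and ggml_relu are real ggml ops, but the weight shapes and the wrapper's handling of them are assumptions), a dense layer is just another graph function:

;; relu(W·x + b) as a ggml graph, in the same style as my-graph above
(def dense-layer
  (fn dense-layer [ctx W x b]
    (raw/ggml_relu ctx
                   (raw/ggml_add ctx
                                 (raw/ggml_mul_mat ctx W x)
                                 b))))

A full model is these graphs chained together, so the hard part really is learning each architecture well enough to transcribe it.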