ai

2023-04-14T14:31:45.867189Z

Hello. I’m considering writing a CUDA project in Clojure. I’d like to talk with anyone who’s done something like that and hear about your experience, good or bad. Please PM me.

Rupert (Sevva/All Street) 2023-04-14T14:53:28.036349Z

You have a few options for accessing CUDA from Clojure.
• Via Clojure Library
  ◦ https://clojurecuda.uncomplicate.org/
  ◦ https://github.com/uncomplicate/neanderthal/blob/master/src/clojure/uncomplicate/neanderthal/cuda.clj
• Via JNI Bindings
  ◦ https://forums.developer.nvidia.com/t/cuda-library-embedding-in-java-with-jni/11146
• Via Java Library
  ◦ https://yarenty.gitbook.io/cuda-for-java-developers/
  ◦ http://javagl.de/jcuda.org/
• Via Python library (via libpython-clj)
  ◦ transformers
  ◦ tensorflow etc

Rupert (Sevva/All Street) 2023-04-14T14:54:47.591619Z

Are you intending to start a new algorithm from scratch or are you porting an algorithm/AI/Model from a different programming language?

Rupert (Sevva/All Street) 2023-04-14T14:56:33.984259Z

CUDA is very low level - most people are writing AI code at least one or two abstractions higher up from CUDA. So CUDA is good if you are doing something new or intentionally very low level, but for general AI stuff you can use something higher level than it.

2023-04-14T15:13:13.312169Z

I’d like to build an inferencing engine for LLM models similar to LLaMA. There are a lot of these circulating now, almost all run in Python or C++. It’d be nice to have a Clojure runtime to use, if practical.

Rupert (Sevva/All Street) 2023-04-14T15:17:30.138079Z

It's a great idea. There's a Rust implementation out there too, and it would be nice to have a Clojure version. The original C++ one doesn't use CUDA, so maybe it's best to write a pure Clojure version first and then convert it to Clojure + CUDA. You can probably use a Clojure CUDA wrapper like neanderthal. It's a very doable project: the C++ version was done, mostly livestreamed, in only a day or two, and the implementations in other languages are not that many lines of code. It is not a completely trivial task, though. If you just want to use an LLM from Clojure, you can do this by: (A) shelling out to Python, (B) using libpython-clj, or (C) calling Python over HTTP.
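
For illustration, the Python side of option (C) could look roughly like the sketch below. It uses only the Python standard library. The `generate` function is a hypothetical stub standing in for a real model call, and the `/generate` path and the JSON field names `prompt` and `completion` are assumptions made for this example, not part of any existing API.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate(prompt: str) -> str:
    """Hypothetical stub: a real service would run the model here."""
    return prompt + " ..."


class GenerateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body of the (assumed) form {"prompt": "..."}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        reply = {"completion": generate(payload.get("prompt", ""))}
        body = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging for the example


def start_server(port: int = 0) -> HTTPServer:
    """Serve on a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), GenerateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_prompt(port: int, prompt: str) -> dict:
    """Client side of the round trip; any HTTP client can send the same request."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/generate",
            body=json.dumps({"prompt": prompt}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

From Clojure, the equivalent of `post_prompt` would be a plain JSON POST with whatever HTTP client you already use, so the Clojure side needs no Python interop at all.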

2023-04-14T17:43:15.812609Z

I've been wondering the same thing, but about Stable Diffusion (text-to-image generation): how far away are the existing Clojure tools from implementing this without Python? My understanding is that these Python libraries would need Clojure implementations:
• https://github.com/huggingface/transformers
• https://github.com/huggingface/diffusers
But I don't have a sense yet for how difficult this is, or if it's a total non-starter for a single person just starting to get into deep learning. (Don't want to hijack your thread, but this may be related enough to be useful, and there might be overlap in functionality.)

2023-04-14T17:46:16.890049Z

(well, a person with say ~3-6 months of time to devote to it, I mean)

Rupert (Sevva/All Street) 2023-04-14T17:47:06.895289Z

The Python transformers library contains lots of code, but Stable Diffusion only uses a fraction of it (probably less than 1%). Stable Diffusion has several steps, e.g. it uses CLIP to move between words and concepts, which is a distinct step from the one that does the diffusion. I think the best way would be to look at some of the smaller implementations, e.g.
• https://github.com/tanelp/tiny-diffusion
• https://github.com/karpathy/nanoGPT
and work from there, and also to look at other versions which are just ports (e.g. ports to JavaScript, Rust etc).

👍 1
2023-04-14T17:48:51.402849Z

Awesome, thanks!

2023-04-14T17:49:27.287309Z

Yeah, I am using a Python/CPP version from Clojure via API now. That works fine, but I’m not a Python or C++ coder, so I can’t really make non-trivial alterations to the inferencing code (without a lot of work)

2023-04-14T17:49:45.670629Z

So I’m really looking for a Clojure-based CUDA version. CPU inferencing is OK, but massively slower.

Rupert (Sevva/All Street) 2023-04-14T17:50:43.065299Z

Inference code is much simpler than training code for all neural networks (including LLMs and Stable Diffusion). I think it is a manageable task for a motivated developer. The motivation is important because if you only have 90% of the solution, you basically have nothing (no useful output). You have to code a lot of the solution before you get any idea whether it is working or not.

2023-04-14T17:50:52.100509Z

The guts of the reference Python code are <300 lines of Python. Someone already familiar with CUDA could port it in a day or two. https://github.com/facebookresearch/llama/blob/main/llama/model.py

2023-04-14T17:51:59.862529Z

The current leading CPU engine has grown a lot of other features now, but also comes in at ~300 lines of code for the actual inferencing. https://github.com/ggerganov/llama.cpp

2023-04-14T17:54:10.463529Z

It might be worth checking out neanderthal: https://neanderthal.uncomplicate.org/articles/guides.html There seems to be a goal there to offer a common interface across CPU/GPU.

👆 1
2023-04-14T17:54:20.625829Z

That’d be ideal

2023-04-14T17:56:06.009779Z

The author also has a book for sale https://aiprobook.com/deep-learning-for-programmers (I just started on it)

2023-04-14T17:56:45.004219Z

I unfortunately don’t have the bandwidth to go learn how to build the inferencing engine myself. I’d be happy to contribute to a pool to fund the work by someone who knows how to do this already.

2023-04-14T17:58:03.251739Z

Yeah I hear ya, I can check back in if I ever advance enough to be productive in this area. But for the record I would also contribute to such a pool to have this exist

2023-04-14T17:59:47.191559Z

I reached out to Dragan, the author of that book. He doesn’t have any bandwidth for new projects right now.

2023-04-14T18:07:23.989169Z

I just put a note in #remote-jobs.

Daniel Slutsky 2023-04-14T19:23:50.460949Z

Nice discussion 🙏 One other pathway for using CUDA from Clojure is https://github.com/scicloj/clj-djl by @kimi.im, which is a wrapper of https://djl.ai/. It seems to have some integration with HuggingFace models, but I have never tried it.

👀 1
jsa-aerial 2023-04-14T20:58:27.300259Z

Too bad Dragan doesn't have the bandwidth - he is hands down the guy for the job. I was actually talking with Daniel yesterday about how I have thought about implementing a transformer architecture in Neanderthal and DeepDiamond. Those are the correct path for this - they cover both GPU and CPU. Having looked at the original transformer (the "Attention Is All You Need" paper) and a couple of others, I don't think any direct GPU programming would be required.
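
To make that last point concrete, here is a minimal sketch of scaled dot-product attention from that paper, softmax(Q·Kᵀ / √d_k)·V, written in plain Python with nested lists. It is only an illustration: the helper names are made up for this example, and there is no masking, batching, or multi-head logic. It shows that the core of the layer is two matrix multiplies and a row-wise softmax, which are the kind of operations a matrix library supplies, so no hand-written kernel is needed for this part.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d_k)) v."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]  # each row sums to 1
    return matmul(weights, v)
```

With an all-zero query every key scores the same, so the output is the average of the value rows: `attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])` gives a single row close to `[2.0, 3.0]`.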