ai

john 2023-04-23T23:05:41.977739Z

You still probably need a few GPUs

john 2023-04-23T23:06:02.238969Z

But not like thousands

john 2023-04-23T23:06:12.064379Z

Though that would probably help

john 2023-04-23T23:07:22.206279Z

And if we ever wanted to do a large, expensive training run on a clojure code assistant (though I doubt enough training data out there exists to make it that expensive), we could probably crowdfund it within the clojure community

john 2023-04-23T23:09:20.109079Z

It may well be that there are efficiency gains in data prep too, for both initial training data and fine-tuning data, where we can use clojure to do the data manipulation

john 2023-04-23T23:14:06.597509Z

So to replicate any given python project like https://github.com/haotian-liu/LLaVA I think you'd want to just use libpython-clj for the core model stuff and do as much of the data stuff in clojure as possible?

john 2023-04-23T23:23:13.299959Z

Though if anyone is working on a Deep Diamond/Neanderthal transformer impl that'd be amazing! 🤞

john 2023-04-23T23:38:38.873979Z

On the question of using clojure to build out training data and fine-tuning data, I'm curious if we could just use spec generators to generate billions of functions and record their output, letting the model develop an intuition of what the compiler will do
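
To make that idea a bit more concrete, here's a rough sketch of the generate-evaluate-record loop. It's in Python purely as a stand-in: a real version would presumably use clojure.spec generators and the actual Clojure evaluator. Everything here (`OPS`, `gen_expr`, `evaluate`, `to_sexpr`, `make_records`, the `input`/`output` record shape) is made up for illustration and only covers integer arithmetic.

```python
import json
import operator
import random

# Hypothetical mini-vocabulary standing in for spec-generated Clojure calls.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def gen_expr(rng, depth=2):
    """Return a random prefix expression as a nested list, or an int leaf."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(-9, 9)
    op = rng.choice(sorted(OPS))
    return [op, gen_expr(rng, depth - 1), gen_expr(rng, depth - 1)]


def evaluate(expr):
    """Evaluate a nested-list expression (toy stand-in for the real evaluator)."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left), evaluate(right))


def to_sexpr(expr):
    """Render the nested list as Clojure-style source text."""
    if isinstance(expr, int):
        return str(expr)
    return "(" + " ".join([expr[0]] + [to_sexpr(e) for e in expr[1:]]) + ")"


def make_records(n, seed=0):
    """Generate n (source text, result) pairs; seeded so runs are repeatable."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        expr = gen_expr(rng)
        records.append({"input": to_sexpr(expr), "output": evaluate(expr)})
    return records


if __name__ == "__main__":
    # One JSON object per line, a common shape for fine-tuning datasets.
    for rec in make_records(5):
        print(json.dumps(rec))
```

The seeded generator is there so a dataset can be regenerated exactly; scaling the same loop to billions of examples would mostly be a question of generator coverage and storage.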

john 2023-04-23T23:41:46.912049Z

And then train it on a few other GitHub exports, for large scale programming understanding. When playing with ChatGPT, it's clear that it's sometimes transferring knowledge about programming solutions in other languages when giving a clojure solution. It just doesn't yet have a good intuition about some basic things that would obviously fail to compile, like syntax errors.

john 2023-04-23T23:44:22.438379Z

I mentioned this in the off-topic channel, but it seems like we may end up in a situation where having a language-specific coding assistant is as common as having a code linter or an LSP server, so we may want to think about how to make sure we have a clojure assistant story