Is there any way to create embeddings locally from Clojure? I currently have a small Python script using the Hugging Face bge-large model that responds to local HTTP requests to provide this, and of course dealing with Python versions/dependencies on an old VPS was the most difficult part of the task.
Local embedding models, such as All-MiniLM-L6-v2, are not suitable for most cases. But this is a continuously changing landscape, so please check the MTEB leaderboard on Hugging Face. For my last project, I used the Gemini Embedding API from Google. It was significantly better than running local models on Ollama in terms of accuracy and speed, and it is free.
LM Studio is super convenient for experimenting with local LLMs and embedding models :^)
Dropping the link: https://lmstudio.ai/ It implements the OpenAI API for embeddings and completions so you can use it with https://github.com/wkok/openai-clojure or whatever
Datalevin is ~10x faster than langchain4j for my 1024 dimension vector search across 1088 embeddings
(let [q "patrol or camry"]
  (encore/qb 5
    (l4j-query q 10)                                        ; langchain4j
    (top-k (user-query->embed q) 10)                        ; naive implementation
    (d/search-vec index (user-query->embed q) {:top 10})))  ; datalevin
;; => [8.41 56.79 0.74]
Yup can avoid the network barrier by using libpython-clj at least
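The naive `top-k` in that benchmark isn't shown in the thread. A minimal sketch of what a brute-force version could look like, assuming cosine similarity and a corpus of maps with `:id` and `:embedding` keys (the name `naive-top-k`, its argument order, and the corpus shape are all hypothetical, not the code that was benchmarked):

```clojure
;; Hypothetical brute-force search: score every stored embedding against the
;; query and keep the k best. Not the implementation measured above.

(defn dot
  "Dot product of two equal-length numeric sequences."
  [a b]
  (reduce + (map * a b)))

(defn norm
  "Euclidean length of a vector."
  [a]
  (Math/sqrt (dot a a)))

(defn cosine-similarity
  "Cosine of the angle between a and b; 0.0 if either vector has zero length."
  [a b]
  (let [denom (* (norm a) (norm b))]
    (if (zero? denom)
      0.0
      (/ (dot a b) denom))))

(defn naive-top-k
  "Returns the k corpus entries most similar to query-vec, best first.
  corpus is assumed to be a seq of {:id ... :embedding [...]} maps."
  [corpus query-vec k]
  (->> corpus
       (map (fn [{:keys [id embedding]}]
              {:id id :score (cosine-similarity query-vec embedding)}))
       (sort-by :score >)
       (take k)))
```

This is O(n) per query with a full sort, which is fine at ~1000 embeddings but is the kind of thing an indexed store is meant to beat.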
Hi. Here are a few resources demonstrating the use of LangChain4J to create embeddings in Clojure.
A tutorial by @carsten.behring: https://scicloj.github.io/clojure-data-tutorials/projects/ml/llm/vectorstore.html
A session we had at the Scicloj AI group: https://www.youtube.com/watch?v=fvcnCxFHyos
The recent talk (and detailed notes) by @eoincarney0 at the SciNoj Light #1 conference: https://scicloj.github.io/scinoj-light-1/sessions.html#parliamentary-questions-and-answers-using-noj-to-explore-basic-rag-techniques
I think LangChain4J supports different kinds of embeddings, and I haven't explored their diversity.
Oh very cool! Yesterday I implemented a RAG for car search on a classified site without exploring this space previously, and it's working very well despite having no knowledge. Thanks Daniel for all the material to pore over!
Have you tried the Datalevin similarity search features?
It's pretty sweet for doing RAG in Clojure 🤓
Haven't heard of Datalevin before, I'm pretty slow at picking up new things, like to let them mature a bit. Only thing I ever adopted extremely early was malli.
Just typing some comments here:
I tried the latest langchain4j; its API has changed quite a bit since the meetup video, in the most awkward ways. The embedding store is faster than my naive Clojure implementation, bringing search down from 25ms to 5ms for 1088 embeddings of 1024 dimensions.
The 384-dimension embeddings from AllMiniLmL6V2EmbeddingModel are significantly worse than those from my Python script's model, BAAI/bge-large-en-v1.5: they return flat-out wrong makes/models where the bigger model correctly infers things.
Will try an embedded datalevin vector search next and libpython-clj so I can easily use the latest models.
Oh, that is so helpful to know.
This seems to be the list of embedding models supported: https://github.com/langchain4j/langchain4j-embeddings
Yeah I saw that, I asked gpt4.1 about them and they all seem worse than the bge-large model. The bge-large model takes about 100ms to make an embedding via Python, which isn't bad; I was getting 1-2 seconds using OpenAI's API.
I didn't notice any drop of quality of results either
In case you're interested, you can try it out here by prefixing the search query with "i want ": https://sayartii.com/ Nearly all the latency comes from calling gpt4.1mini to process the results and return JSON.
I saw somewhere that azure has 4.1mini and is faster.
Thanks
Planning on making a chatbot RAG at some point soon to ask probing questions and get extra debug information from people when they post repair requests like this https://motorsaif.com/requests/tct-892 then augment the description with that summary.
There seems to be a Java wrapper for various embedding services: https://howtodoinjava.com/spring-ai/vector-embedding-example/ https://docs.spring.io/spring-ai/reference/api/embeddings.html I've never tried it, and don't know whether it is easier or more stable than using the bridge to Python.
It looks like those just call the API which is trivial to do with clojure http.
(defn embed-text [text]
  (let [res (http/post ""
                       {:headers {"Authorization" (str "Bearer " open-api-key)
                                  "Content-Type"  "application/json"}
                        :body    (json/encode {:model           "text-embedding-3-small"
                                               :encoding_format "base64"
                                               :input           text})})
        body (-> res
                 :body
                 json/decode)
        base64-embedding (get-in body ["data" 0 "embedding"])]
    (decode-b64-embedding base64-embedding)))
The python one I'm using runs locally, provides 1024 dimension embeddings fairly quickly.
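The `decode-b64-embedding` helper called in `embed-text` isn't shown in the thread. A rough sketch of what it might do, assuming the base64 payload is a packed array of little-endian 32-bit floats (which is my understanding of what `encoding_format "base64"` returns, so worth verifying against a known embedding before relying on it):

```clojure
(import '(java.util Base64)
        '(java.nio ByteBuffer ByteOrder))

(defn decode-b64-embedding
  "Decodes a base64 string into a vector of floats.
  Assumes the bytes are packed little-endian float32 values (an assumption,
  not something confirmed in this thread)."
  [^String b64]
  (let [raw (.decode (Base64/getDecoder) b64)
        buf (-> (ByteBuffer/wrap raw)
                (.order ByteOrder/LITTLE_ENDIAN)
                (.asFloatBuffer))
        out (float-array (.remaining buf))]
    ;; Bulk-copy every float from the buffer view into the array.
    (.get buf out)
    (vec out)))
```

Only JDK classes are used, so there is nothing extra to add to the project for this step.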
Yes. Some of those APIs, such as Ollama, can be run locally, and may have good embedding models in their collection (and indeed should be easy to access directly from Clojure). Anyway, the Python way sounds good.
Yup just got libpython-clj working, shaves ~200ms off 20 consecutive runs compared with the Flask HTTP version. Really cool!
you can also use onnx models directly in langchain4j, and at the very least any python model can be exported to onnx
I am doing something like the following