2024-08-21 data-science | Clojure Slack Archive

data-science 2024-08-21

oλv 2024-08-21T22:53:34.270469Z

I’d like to do some local sentiment analysis on my humble M1. Is there an easy way to do this?

Rupert (Sevva/All Street) 2024-08-22T08:27:04.312989Z

It's possible to prompt a language model to do sentiment analysis. Keep the prompt simple, like "Given the above text decide if the sentiment is positive, negative or neutral. Don't provide any other text." There are small open-source models (e.g. 1.1B, 2.5B, 4B, 8B); if the tiny ones don't work you can upgrade to a bigger one.
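A minimal sketch of the prompting idea above, in Python. The helper names (`build_prompt`, `parse_label`) are made up for illustration, and the call to the model itself is left out because it depends on which local runtime you use. The parser is deliberately strict, since small models often ignore the "no other text" instruction.

```python
import re
from typing import Optional

# Instruction quoted from the message above; appended after the text to classify.
PROMPT_SUFFIX = (
    "Given the above text decide if the sentiment is positive, "
    "negative or neutral. Don't provide any other text."
)
LABELS = ("positive", "negative", "neutral")


def build_prompt(text: str) -> str:
    """Put the text first, then the instruction (hypothetical helper)."""
    return f"{text}\n\n{PROMPT_SUFFIX}"


def parse_label(reply: str) -> Optional[str]:
    """Return the label only if the reply mentions exactly one known label."""
    words = re.findall(r"[a-z]+", reply.lower())
    found = {w for w in words if w in LABELS}
    return found.pop() if len(found) == 1 else None


if __name__ == "__main__":
    print(build_prompt("The service was slow but the food was great."))
    print(parse_label(" Positive.\n"))
```

Returning `None` for ambiguous replies makes it easy to spot when a tiny model is not following the instruction and a bigger one is worth trying.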

👍 1

oλv 2024-08-21T23:18:55.922579Z

Sparring a bit with ChatGPT. I think basilisp and 🤗transformers is the way to go. Seems pretty easy to use :^)

oλv 2024-08-23T15:30:48.806879Z

I used 🤗’s (.pipeline transformer "sentiment-analysis") which was very easy to use, thanks #basilisp! All it does is classify strings as positive or negative with a confidence number, which tends to be approximately zero or one and as such is not a very good proxy for how positive or negative a string is. E.g. “i’m happy” gives ~1 and “this is the best day of my life” also gives ~1.
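For reference, a rough Python sketch of what that pipeline result looks like and how to fold it into one signed number. The `signed_score` helper and the sample values are illustrative, not real model output. The pipeline call is shown only in comments so the snippet runs without downloading a model. The score is the classifier's confidence in its label, which is why both example sentences land near 1.

```python
from typing import Dict, Union

# With transformers installed, the Python equivalent of the call above is roughly:
#   from transformers import pipeline
#   clf = pipeline("sentiment-analysis")
#   clf("i'm happy")  # a list with one dict holding 'label' and 'score'

Result = Dict[str, Union[str, float]]


def signed_score(result: Result) -> float:
    """Map a {'label', 'score'} result to [-1, 1] (hypothetical helper).

    Assumes the labels are 'POSITIVE' / 'NEGATIVE'; other models may
    use different label names.
    """
    sign = 1.0 if str(result["label"]).upper() == "POSITIVE" else -1.0
    return sign * float(result["score"])


if __name__ == "__main__":
    # Made-up values that only mimic the shape of a pipeline result.
    print(signed_score({"label": "POSITIVE", "score": 0.99}))
    print(signed_score({"label": "NEGATIVE", "score": 0.75}))
```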

oλv 2024-08-23T15:34:07.284649Z

I managed to get a somewhat decent heuristic by partitioning my input and averaging the sentiment of the partitions, but the result is so-so. Useful enough to be interesting in my case :^).
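The partition-and-average heuristic could look something like this in Python. The message doesn't say how the input was partitioned, so fixed-size word chunks are an assumption here. `toy_classify` is a stand-in so the snippet runs without a model; in practice `classify` would wrap the pipeline and return a signed score per chunk.

```python
from statistics import fmean
from typing import Callable, List


def partition_words(text: str, size: int) -> List[str]:
    """Split text into chunks of at most `size` words (assumed strategy)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def mean_sentiment(text: str, classify: Callable[[str], float], size: int = 8) -> float:
    """Average the per-chunk signed scores; 0.0 for empty input."""
    chunks = partition_words(text, size)
    if not chunks:
        return 0.0
    return fmean([classify(chunk) for chunk in chunks])


def toy_classify(chunk: str) -> float:
    """Placeholder classifier: +1 for 'good', -1 for 'bad', else 0."""
    words = chunk.lower().split()
    if "good" in words:
        return 1.0
    if "bad" in words:
        return -1.0
    return 0.0


if __name__ == "__main__":
    print(mean_sentiment("good good bad meh", toy_classify, size=1))
```

Because each chunk still gets a near-binary score, the average mostly measures what fraction of the chunks are positive, which may be why the result feels so-so.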

oλv 2024-08-23T15:39:06.112129Z

I’m curious if there’s a good way to get a measure of how positive a string is using a local model. I suppose I could just ask an LLM directly.

phronmophobic 2024-08-23T15:43:08.820709Z

I’ve done that with llama.clj with moderate success. I wrote a bit about the underlying approach at https://phronmophobic.github.io/llama.clj/notebooks/intro.html#classifiers. I think the method could still be improved.
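One way to get a graded number out of a local LLM is to compare the model's next-token scores for two candidate answers instead of reading generated text. The Python sketch below shows only the arithmetic. It assumes you can already get a logit per candidate label from your runtime, the logit values are invented, and it is not necessarily the exact method in the linked notebook.

```python
import math
from typing import Dict


def softmax_over(logits: Dict[str, float]) -> Dict[str, float]:
    """Softmax restricted to the candidate labels (numerically stable)."""
    peak = max(logits.values())
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def positivity(label_logits: Dict[str, float],
               positive: str = "positive",
               negative: str = "negative") -> float:
    """P(positive) - P(negative) among the two labels, in [-1, 1]."""
    probs = softmax_over({
        positive: label_logits[positive],
        negative: label_logits[negative],
    })
    return probs[positive] - probs[negative]


if __name__ == "__main__":
    # Invented logits standing in for what a local model might return.
    print(positivity({"positive": 2.0, "negative": 0.0}))
    print(positivity({"positive": 0.5, "negative": 0.5}))
```

Whether this spreads out better than the pipeline's confidence depends on the model and the prompt, so treat it as something to experiment with.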

oλv 2024-08-23T20:16:52.743079Z

Cool, will read!
