This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-09-16
Anyone know of any plug and play libs that can extract keywords or topic tags from text?
Here ya go: https://gist.github.com/jacobobryant/f19b451af55a9541a1ac016a24e32981
Do you have a list of topics that you are interested in matching? You could start with clojure.string/includes?
or regex.
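A rough sketch of both suggestions, assuming you already have a fixed topic list. The `topics` vector and the function names `match-topics` and `match-topics-regex` are made up for illustration; only `clojure.string` and the core regex functions are real.

```clojure
(require '[clojure.string :as str])

;; Hypothetical topic list; swap in whatever you want to match.
(def topics ["clojure" "datomic" "machine learning"])

(defn match-topics
  "Returns the topics that occur as substrings of text, ignoring case."
  [topics text]
  (let [lowered (str/lower-case text)]
    (filterv #(str/includes? lowered (str/lower-case %)) topics)))

(defn match-topics-regex
  "Same idea with a regex: case-insensitive, whole words only.
  Returns the distinct matches, lower-cased, in order of appearance."
  [topics text]
  (let [quoted  (map (fn [^String t] (java.util.regex.Pattern/quote t)) topics)
        pattern (re-pattern (str "(?i)\\b(?:" (str/join "|" quoted) ")\\b"))]
    (->> (re-seq pattern text)
         (map str/lower-case)
         distinct
         vec)))
```

The substring version will also fire on partial words (for example "clojurescript" matches "clojure"), which is why the regex variant adds word boundaries.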
Going down the data science route:
• If you know the categories, you could create a supervised classifier for each tag.
• If you don't know the categories, then you could detect rare terms with TF/IDF or use clustering like K-means.
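For the TF/IDF idea, here is a minimal sketch in plain Clojure with no library. It assumes a very naive tokenizer (lowercase ASCII letters only) and the basic unsmoothed weighting tf × log(N / df); the names `tokenize`, `tf-idf` and `top-terms` are made up for illustration, and a real setup would probably want stop-word removal and stemming on top.

```clojure
(require '[clojure.string :as str])

(defn tokenize
  "Naive tokenizer: lower-cases and keeps runs of ASCII letters."
  [text]
  (re-seq #"[a-z]+" (str/lower-case text)))

(defn tf-idf
  "Given a seq of documents (strings), returns a vector with one map per
  document of term -> score, where score = tf * log(N / df)."
  [docs]
  (let [tokenized (map tokenize docs)
        n         (count docs)
        ;; document frequency: number of documents containing each term
        df        (frequencies (mapcat distinct tokenized))]
    (mapv (fn [tokens]
            (let [total (count tokens)]
              (into {}
                    (map (fn [[term cnt]]
                           [term (* (/ cnt (double total))
                                    (Math/log (/ n (double (df term)))))]))
                    (frequencies tokens))))
          tokenized)))

(defn top-terms
  "Returns the k highest-scoring terms from one document's score map."
  [k scores]
  (->> scores (sort-by val >) (take k) (mapv key)))
```

Something like `(map #(top-terms 3 %) (tf-idf docs))` would then give candidate tags per document. Terms that appear in every document get a score of zero, so they drop out on their own.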
Finally, large language models like GPT-J are good one/few-shot learners, so you could probably prompt one to tell you the tags (no fine-tuning required). A notable downside is that they are slow to run, especially if you don't have a GPU handy.
maybe this LDA implementation still works... https://github.com/davidandrzej/chisel
Do you have some more details on what you mean by "extract topics"?