#data-science
2020-05-07
val_waeselynck12:05:24

Not sure if this has been pointed out already, but it occurs to me that ClojureScript might be a killer feature for ML in Clojure. I'm constructing a classification pipeline for Reddit comments, and to do so I need to build a large dataset by labelling sample comments. I've quickly made a specialized UI in ClojureScript for that. Now I'm able to label a few thousand examples per day. I doubt that generic annotation tools would have made such a high throughput possible, and I have a hard time imagining making such a UI in so little time in Python.

👍 8
vlaaad12:05:38

can you show an example?

val_waeselynck12:05:25

(Sorry the content's in French, that's the data I'm working on).

val_waeselynck13:05:17

I've got keyboard shortcuts for labelling actions. I can usually label at that speed because there's not much reading required.

vlaaad13:05:04

very interesting!

vlaaad13:05:30

but your main data processing is still in clojure?

vlaaad13:05:10

so you need to have a “server” as a “feedback receiver”, and this is a “client”?

val_waeselynck14:05:20

And in my case, it's not like I lose much by requiring a client-server communication, because the data processing often occurs on a remote machine anyway.

vlaaad14:05:41

Just wanted to point out that cljfx exists 🙂 It has a declarative UI in the Java process, so no client-server communication is necessary if data processing happens on your machine

val_waeselynck15:05:02

I'm well aware 🙂 in my use case, another argument in favour of a browser-based UI is that Reddit content is designed to be viewed in the browser, with hyperlinks etc.

vlaaad06:05:29

There is a web-view in cljfx :)