This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-11-14
Most of my time as a data scientist these days is not spent thinking about which model to use, or which packages or which language, etc. It’s mostly “how can I turn this into a DAG to make everything reproducible”. I think immutable data and a functional style of programming are ideal for creating not only the actual DAG pipeline but also the individual steps, because reproducibility is a first-class concept 🙂 Clojure AFAIK doesn’t have something like this, and the dagli library that you posted, @U050CT4HR, seems to go in a nice direction of being a single library to build and contain every step of an ML pipeline
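To make the "pipeline as a DAG of functional steps" idea concrete, here is a minimal sketch in Python. It is not how dagli or any other library mentioned here works; the step names, the toy data, and the `PIPELINE` and `run` helpers are all hypothetical. Each step is a pure function over immutable tuples, and the standard library's `graphlib.TopologicalSorter` (Python 3.9+) picks an order in which every step runs after its dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical pure steps: each takes immutable inputs and returns a new value.
def get_data():
    return (3, 1, 2, 2)

def clean(raw):
    # Drop duplicates and sort; the input tuple is never mutated.
    return tuple(sorted(set(raw)))

def transform(cleaned):
    return tuple(x * 10 for x in cleaned)

def summarize(cleaned, transformed):
    return {"n": len(cleaned), "total": sum(transformed)}

# Hypothetical pipeline description: step name -> (function, upstream step names).
PIPELINE = {
    "get_data": (get_data, ()),
    "clean": (clean, ("get_data",)),
    "transform": (transform, ("clean",)),
    "summarize": (summarize, ("clean", "transform")),
}

def run(pipeline):
    # Map each step to the set of steps it depends on.
    graph = {name: set(deps) for name, (_, deps) in pipeline.items()}
    results = {}
    # static_order() yields every step after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        fn, deps = pipeline[name]
        results[name] = fn(*(results[d] for d in deps))
    return results

results = run(PIPELINE)
```

Because the steps are pure and the data is immutable, calling `run(PIPELINE)` again gives the same `results`, which is the reproducibility property the message is after.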
we use http://dvc.org at work and I’m personally in love because it’s language agnostic. I have a DAG that gets data, cleans, transforms, splits, trains models, and saves artefacts (including plots and metrics), with everything stored in S3. we have steps written in babashka, R and will probably add a python deployment script, all in one workflow
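A rough illustration of why a staged workflow like this pays off: a step only needs to rerun when its inputs change. The sketch below is an in-memory analogy in Python, not DVC's implementation or API; `fingerprint`, `run_cached`, `run_pipeline`, and the toy steps are hypothetical names. Each step's result is stored under a hash of the step name and its inputs, so unchanged steps are skipped.

```python
import hashlib
import json

def fingerprint(step_name, inputs):
    # Hash the step name together with its JSON-serialized inputs.
    payload = json.dumps([step_name, inputs], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def run_cached(step_name, fn, inputs, cache, log):
    # Run fn only if this exact (step, inputs) pair has not been seen before.
    key = fingerprint(step_name, inputs)
    if key not in cache:
        log.append(step_name)  # record that the step actually executed
        cache[key] = fn(*inputs)
    return cache[key]

# Hypothetical toy steps standing in for "clean" and "split".
def clean(raw):
    return sorted(set(raw))

def split(cleaned):
    mid = len(cleaned) // 2
    return [cleaned[:mid], cleaned[mid:]]

def run_pipeline(raw, cache, log):
    cleaned = run_cached("clean", clean, [raw], cache, log)
    return run_cached("split", split, [cleaned], cache, log)

cache, log = {}, []
first = run_pipeline([3, 1, 2, 2], cache, log)   # both steps run
second = run_pipeline([3, 1, 2, 2], cache, log)  # nothing reruns
third = run_pipeline([2, 1, 3], cache, log)      # only "clean" reruns
```

In the third call the raw data differs, so `clean` executes again, but its output matches the earlier run and `split` is served from the cache. `log` ends up as `["clean", "split", "clean"]`.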
Thanks for the response, @UFPEDL1LY! http://dvc.org was new to me, and believe it or not I’ve been looking around for something like it!
Other candidates:
• https://github.com/Factual/drake (written in Clojure(!), deprecated)
• https://www.digdag.io/
• https://airflow.apache.org/
• make
give it a try, you won’t regret it 🙂 the team behind it is also super approachable. they also have another tool called https://cml.dev which you use as a github action. I’m not sponsored by them btw, I wish haha
Will do, and thanks for the pointer to https://cml.dev!