This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-10-15
I'd like to move some of my Python data cleaning code into Clojure, for example things like https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html. Is there any library you can recommend that has data cleaning functions in it?
I believe the go-to data wrangling libraries nowadays are the ones based on https://github.com/techascent/tech.ml.dataset. See for instance https://github.com/scicloj/tablecloth. I actually wrote https://github.com/zero-one-group/geni to move my data cleaning code from Python to Clojure. The cookbook linked at https://github.com/zero-one-group/geni#resources is based on an existing Pandas cookbook. However, I would not recommend Geni if you have no experience with Spark.
@U6T7M9DBR You might want to join #data-science if you're not already in there!
@U6T7M9DBR You are. My guess is @U050CT4HR thought we were someone else.
There is also a zulip data-science stream which is very active.
Thought this thread was happening in a different channel. Sorry! :face_palm::skin-tone-2:
You want to go here: https://clojurians.zulipchat.com/#narrow/stream/151924-data-science And for TMD/TC development stuff you want this: https://clojurians.zulipchat.com/#narrow/stream/236259-tech.2Eml.2Edataset.2Edev Clj datascience stuff is on Zulip - Slack is a deadzone
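For anyone porting the `get_dummies` call mentioned at the top of this thread, it may help to pin down what the operation does before looking for a library equivalent. The following is a minimal sketch in plain Python, not the pandas implementation: it one-hot encodes a single categorical column in a list of row dicts. The `one_hot` helper and the sample rows are hypothetical, and the `column_category` naming and 0/1 values are assumptions chosen to resemble the pandas defaults.

```python
def one_hot(rows, column):
    """Replace `column` in each row dict with one 0/1 indicator per category.

    Hypothetical helper for illustration; not part of pandas or any library.
    """
    # Collect the distinct categories, sorted so the output columns are stable.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the one being encoded.
        new_row = {k: v for k, v in row.items() if k != column}
        # Add one indicator field per category, named "<column>_<category>".
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Made-up sample data.
rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]
encoded = one_hot(rows, "color")
```

The same shape (collect the distinct values, then map each row to indicator fields) should carry over to a Clojure port over a sequence of maps, whichever dataset library ends up being used.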
I'm trying to port some python code into clojure, is there anything I can use to translate a call to https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.boxplot.html?
I'd also be fine with rolling out the quartiles calculations manually but any starting pointer would be nice
@skuro I'd suggest taking a look at Oz (https://github.com/metasoarous/oz) (and vega-lite, more generally; Oz is among other things a tool for using vega-lite from Clojure).
Vega-Lite is a really wonderful (interactive) data visualization framework, where you specify the visualization as a data structure, so it's very compatible with the Clojure philosophy (code as data, data as code and all that). That also makes it possible to create specs in any language and pass them on to a web page for rendering.
You can find out more here: https://vega.github.io/vega-lite/
If Oz doesn't fit your taste for whatever reason, there are a bunch of other Clojure tools using Vega and Vega-Lite, including hanami, saite, notespace, clerk, etc. So whichever tool you use, you kind of can't go wrong, since you can easily take the specifications and move them around between tools.
looking at the code I'm translating, it actually doesn't do any visualization per se, it's just using the default boxplot settings for outliers filtering by removing anything that stands outside of the boxplot whiskers
Good to hear
Ah; I see. I think there might be some functions in the apache commons math standard lib that do this. There's quite a bit there actually.
There's also the fastmath clojure lib, and a few others.
Sure thing! FWIW, I vastly underappreciated for quite a while just how much is baked into the apache commons, but it's now often the first place I look since it's always right at hand as part of the standard lib. One of the really nice things about running on the JVM!
Obviously, the APIs aren't always super idiomatic, so it's nice to have the Clojure libs as well.
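For reference when rolling the quartile calculation by hand, here is a minimal sketch of the whisker-based outlier filter described earlier in this thread, using only the Python standard library. It assumes the usual boxplot defaults: whiskers at 1.5 times the interquartile range beyond the first and third quartiles, with quartiles computed by linear interpolation (the `inclusive` method of `statistics.quantiles`). The function names and sample data are hypothetical, and the original code being ported may use different settings.

```python
from statistics import quantiles


def whisker_bounds(values, whis=1.5):
    """Return the (low, high) limits beyond which a value counts as an outlier."""
    # n=4 yields the three quartile cut points; "inclusive" interpolates
    # linearly between data points.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whis * iqr, q3 + whis * iqr


def drop_outliers(values, whis=1.5):
    """Keep only the values that fall within the whisker limits."""
    low, high = whisker_bounds(values, whis)
    return [v for v in values if low <= v <= high]


# Made-up sample data with one obvious outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
bounds = whisker_bounds(data)
cleaned = drop_outliers(data)
```

With this sample, the quartiles are 3.25 and 7.75, so the limits are -3.5 and 14.5 and only the value 100 is dropped. A Clojure port needs the same three steps: sort, interpolate the quartiles, filter on the limits.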