This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-03-27
Hey folks, over at Applied Science we're playing around with accessing RData files from Clojure with a minimum of interop. We built a single-purpose library: https://github.com/appliedsciencestudio/rdata/ Before we polish it up for release, we'd like to get your feedback. 🕵️ 👷 We'd appreciate it if you try it out and let us know what you think.
For those interested in R (especially from Clojure), there is a free remote conference tomorrow: http://dc2020.netlify.com/
Our friend Danjela is interested in Clojure and data science and is looking for a teammate for the Rails Girls Summer of Code project. https://www.reddit.com/r/Clojure/comments/fpkz98/rails_girls_summer_of_code_teammate/ Do you happen to know anyone who may like to join Danjela?
Hi, do you have any Clojure data science talk suggestions? I have seen the two listed in Oz's README (one on Oz, one on Vega Lite). I know Clojure. Want to learn data science.
I recommend:
1. The Book of Why for an introduction to causal inference
2. Richard McElreath's lectures and his Statistical Rethinking book
3. Anything Tensorflow (books, documentation, examples)
4. Dragan Djuric's books and software
@hindol.adhya If you just want some basic intros, I have done one on Dragan's baby steps in data science and another on the basics of Oz https://www.youtube.com/playlist?list=PLpr9V-R8ZxiDUXIR2z8Y8wvhpoPyl0t_D
Do you have any math background @hindol.adhya?
How much math? If you mean probability and statistics, I know the basics like mean, mode and median. A little shaky on k-means and SVM, and totally green on neural networks etc.
OK, that's a good start. IMHO (not that my math background biases me or anything), math/prob/stats skills are really the bedrock of data science. So the more you can learn on that front the better.
I studied probability in school, and tutored basic stats (among other things), but this was my first introduction to higher level statistics and machine learning. What I really like about this book is that it endeavors to teach them together, seeing them as opposite sides of the same coin, which is 100% in my book: https://web.stanford.edu/~hastie/ElemStatLearn/printings/ESLII_print12.pdf
I wouldn't say this book is the easiest to get, but take a look and see how you take to it. If you are having a hard time getting through, you can pull at the bits that are giving you trouble from other resources.
One book I studied as part of coursework is https://nlp.stanford.edu/IR-book/information-retrieval-book.html But this does not go too much in depth. I have a CS major.
This book only touched upon various clustering, supervised/unsupervised learning techniques.
The IR book does a surprisingly good job at introducing and motivating ML techniques, more so than many ML-specific books!
(Also, despite the hype, there is more to modern ML than just deep learning/NNs, e.g. graphical models / Gaussian Processes / TDA to name just a few, and non-modern ML often works quite well 😉 )
^ 100% this! NNs can do certain things really well, but it's often difficult to figure out what they're doing or why they're doing it. Good advice is to choose a model and approach based on the details of the situation, and not just grab the latest fad.
Using a NN when you have a reasonable and principled probabilistic model, tailored to the situation at hand, that can be interpreted, etc, is always the way to go if you have a choice.
I should mention, I am not trying to change careers or anything. I am interested, and will spend my own time learning.
One part I love is visualization. This I enjoy much more than exploratory data analysis.
Visualization is huge; a picture is worth a thousand words, right?
Good luck! Interested to see what other recommendations folks have!
@hindol.adhya I find it hard to answer because it seems to me a lot of different things are meant by "Data Science". On everything related to probabilistic modeling and inference (which arguably are a core component of Data Science), MacKay's ITILA (http://www.inference.org.uk/mackay/itila/book.html) is the most insightful introduction I've found, and probably the best-written scientific textbook I've read. Self-studying it has been a joy.
For probability, it is pretty hard to beat "The Probability Tutoring Book" https://smile.amazon.com/Probability-Tutoring-Book-Revised-Printing/dp/0780310519/ref=smi_www_rco2_go_smi_g8217842112?_encoding=UTF8&%2AVersion%2A=1&%2Aentries%2A=0&ie=UTF8