This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-12-27
Does anyone know whether the data frames in Geni (Spark) are typed or untyped?
Hi, David, author of Geni here. It uses Spark Datasets (see https://www.baeldung.com/java-spark-dataframe-dataset-rdd for a discussion), so it’s a typed view of DataFrames. However, the type information only comes in when you load the schema, so you’ll get type errors at runtime.
Does it have an impact when you use datasets? Do you feel the burden of types in comparison to handling a collection of open Clojure maps?
Thanks for your answer and the library!
> Do you feel the burden of types in comparison to handling a collection of open Clojure maps?
Not really. To me, it still feels like a dynamic language (or library, in this case), because it all happens at runtime. But, just like Clojure, it’s strongly typed, so you get type errors at runtime. Also, I wouldn’t compare it to handling Clojure maps: Geni is for a different use case. If your data is small enough, using a collection of maps is probably better, because the reader of your code doesn’t have to learn Spark. But once you’re dealing with millions or billions of rows, you’d want to use Spark or a similar library.
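For anyone reading along, here is a rough sketch of the "typed, but only checked at runtime" idea from the answer above. It is plain Python, not Geni or Spark code, and every name in it (`TinyFrame`, `load`, `total`) is hypothetical and made up for illustration. The schema arrives together with the data, and a mismatch only shows up when the offending call actually runs.

```python
# Hypothetical illustration only: this is NOT the Geni or Spark API.
from dataclasses import dataclass


@dataclass(frozen=True)
class TinyFrame:
    schema: dict  # column name -> Python type, known only once data is loaded
    rows: list    # list of tuples, in schema order

    def column(self, name):
        # An unknown column is detected only when this call runs.
        if name not in self.schema:
            raise KeyError(f"no such column: {name}")
        idx = list(self.schema).index(name)
        return [row[idx] for row in self.rows]

    def total(self, name):
        values = self.column(name)
        # The type check happens at runtime, against the loaded schema.
        if self.schema[name] not in (int, float):
            raise TypeError(f"cannot sum column {name!r} of type {self.schema[name].__name__}")
        return sum(values)


def load(schema, rows):
    # Loading is the point where type information "comes in".
    for row in rows:
        for (name, typ), value in zip(schema.items(), row):
            if not isinstance(value, typ):
                raise TypeError(f"column {name!r} expects {typ.__name__}, got {value!r}")
    return TinyFrame(schema, rows)


frame = load({"name": str, "age": int}, [("Ada", 36), ("Alan", 41)])
print(frame.total("age"))

try:
    frame.total("name")  # nothing flags this until it executes
except TypeError as err:
    print("runtime type error:", err)
```

Nothing rejects `frame.total("name")` ahead of time. The error appears only when that line executes, which is why it still feels dynamic even though the columns are strongly typed.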