#flambo
2016-05-20
jrotenberg 19:05:55

@ccann: how are you creating the schema?

jrotenberg 19:05:46

seems like the DataFrame API is way better for this on both ends

ccann 19:05:03

(:import [org.apache.spark.sql.types DataTypes Metadata StructField StructType])
;; build a StructType by hand via interop; each StructField takes a
;; name, a DataType, a nullable flag, and a Metadata instance
(defn my-schema
  []
  (let [coll   [(StructField. "id" DataTypes/FloatType true (Metadata/empty))
                (StructField. "field_a" DataTypes/FloatType true (Metadata/empty))
                (StructField. "field_b" DataTypes/StringType true (Metadata/empty))]
        fields (into-array StructField coll)]
    (StructType. fields)))
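
A schema built this way is typically handed to createDataFrame. A minimal sketch of that usage, assuming a SQLContext bound to sql-ctx and a JavaRDD of Rows bound to row-rdd (both hypothetical names, not from the conversation):

;; sql-ctx (SQLContext) and row-rdd (JavaRDD<Row>) are assumed to exist
(def df (.createDataFrame sql-ctx row-rdd (my-schema)))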

jrotenberg 19:05:27

i think i figured out a (really hacky) way to create it dynamically

jrotenberg 19:05:23

using the JSON reading stuff
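
A minimal sketch of that trick, assuming a SQLContext bound to sql-ctx and a hypothetical sample file: read a few representative JSON records and keep the schema Spark infers from them.

;; sql-ctx (SQLContext) and the sample path are assumptions
(defn inferred-schema
  [sql-ctx]
  (-> (.read sql-ctx)                ; DataFrameReader
      (.json "sample-records.json")  ; schema inference happens here
      .schema))                      ; returns a StructType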

ccann 20:05:22

working with Spark DataFrames from Clojure has been a nightmare, for what it’s worth 🙂

ccann 20:05:36

I’m being dramatic, but it’s not very pleasant

jrotenberg 20:05:03

basically all of our code that gets touched by anyone else is in Scala right now

jrotenberg 20:05:16

so there are other nightmares to be had

jrotenberg 20:05:36

a nightmare for every season

jrotenberg 20:05:30

but i just inherited a codebase from a guy who is bailing to do something more interesting

jrotenberg 20:05:35

lucky bastard

sorenmacbeth 23:05:09

@ccann: nightmare because it requires a ton of Java interop?