#data-science
2020-09-20
Daniel Tan12:09:17

anyone use flare?

Daniel Tan12:09:35

IIRC Clojure ML has way better perf than Python, so I'm looking into it

Aviv Kotek16:09:26

Exploring some basics of tech.ml.dataset, I can't figure out how to do simple arithmetic between two datasets of the same shape:

(let [cols (ds/column-names ds)
      units-total (ds/select-columns ds (filter #(s/starts-with? % "units_total") cols)) ;ds
      units-dwell (ds/select-columns ds (filter #(s/starts-with? % "units_dwell") cols))] ;ds
  (dfn/- units-total units-dwell))
=> #object[tech.v2.datatype.binary_op$fn$reify__10844 0xf2ea2e8 "tech.v2.datatype.binary_op$fn$reify__10844@f2ea2e8"]
I'd expect a new dataset of the same shape, with each cell containing the result of the calculation.

genmeblog18:09:51

@aviv dfn operates on something called a reader and returns a reader. A column is a reader; a dataset is not. So you have to select two columns, call the function on them, and add the resulting column to the dataset (which gives you a new dataset).
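
[Editor's note] A minimal sketch of the column-by-column approach described above. It assumes the same namespace aliases used in this thread (`tech.ml.dataset` as `ds`, `tech.v2.datatype.functional` as `dfn`) and relies only on calls that already appear in the thread's snippets. The dataset, its column names, and its values are made up for illustration.

```clojure
(require '[tech.ml.dataset :as ds]
         '[tech.v2.datatype.functional :as dfn])

;; Hypothetical two-column dataset; names and numbers are illustrative only.
(def example-ds
  (ds/name-values-seq->dataset
   {"units_total_2010"    [10 20 30]
    "units_dwelling_2010" [4 5 6]}))

;; Look up the two columns (each column is a reader), subtract them
;; element-wise with dfn/-, and assoc the result in as a new column.
;; The original dataset is left unchanged; with-biz is a new dataset.
(def with-biz
  (assoc example-ds
         "units_biz_2010"
         (dfn/- (example-ds "units_total_2010")
                (example-ds "units_dwelling_2010"))))
```

Passing the whole dataset to `dfn/-` instead is what produced the opaque `binary_op` object in the question, since a dataset is not a reader.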

Aviv Kotek18:09:54

Is there any way to operate on whole datasets? I'd rather not assoc a new column each time with reduce:

(reduce (fn [m name]
          (let [name (str name)]
            (assoc m (str "units_biz_" name)
                     (dfn/- (ds (str "units_total_" name))
                            (ds (str "units_dwelling_" name))))))
        ds (range 2010 2018))

genmeblog18:09:38

You can treat columns as vectors and dfn operations as vectorized functions that return a new vector.
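
[Editor's note] As a rough analogy only, here is the same idea in plain Clojure with no library involved: two "columns" held as ordinary vectors, and an element-wise subtraction that yields a new vector. The names and values are hypothetical, and `mapv` merely stands in for what a vectorized `dfn` operation does to readers.

```clojure
;; Two hypothetical "columns" as plain vectors.
(def units-total    [10 20 30])
(def units-dwelling [4 5 6])

;; Element-wise subtraction; the inputs are untouched and a new vector comes back.
(def units-biz (mapv - units-total units-dwelling))
;; units-biz => [6 15 24]
```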

Aviv Kotek18:09:50

You mean something like this:

(let [cols (ds/column-names ds)
      units-total (ds/value-reader (ds/select-columns ds (filter #(s/starts-with? % "units_total") cols)))
      units-dwell (ds/value-reader (ds/select-columns ds (filter #(s/starts-with? % "units_dwell") cols)))]
  (ds/name-values-seq->dataset
   (into {} (map #(hash-map (str %3) (dfn/- %1 %2))
                 units-total units-dwell (range 2010 2018)))))
Isn't it common to operate on whole datasets? This isn't neat at all.

Santiago10:09:21

curious about this too