#data-science
2020-02-05
agigao 16:02:39

Hey there, I’m kinda having a hard time using kixi.stats core functions (median, summary, etc.) without transducers. Can someone help me figure out why?

...
(:require [kixi.stats.core :refer [median]])

(median (repeatedly 100 rand))
REPL:
1. Unhandled java.lang.ClassCastException
   class clojure.lang.LazySeq cannot be cast to class
   com.tdunning.math.stats.TDigest (clojure.lang.LazySeq and
   com.tdunning.math.stats.TDigest are in unnamed module of loader 'app')

otfrom 17:02:41

@chokheli I think you do need to use them with transducers. This works:

(transduce (map identity) kixi.stats.core/median (repeatedly 100 rand))

otfrom 17:02:10

and if you really hate transducers this works (but is ugly)

(kixi.stats.core/median (reduce kixi.stats.core/median (kixi.stats.core/median) (repeatedly 100 rand)))

otfrom 17:02:41

as you need an initial value and the completion step to get the median out rather than the t-digest
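To make the initial-value and completion-step point concrete, here is a minimal sketch that needs only clojure.core. `mean-rf` is a hypothetical reducing function, not part of kixi.stats; it only assumes the same three-arity shape (init, completion, step) that the snippets above rely on. Calling such a function directly on a collection hits the one-argument completion arity, which expects the accumulator rather than the data, and that is presumably what the ClassCastException above is reporting.

```clojure
;; Hypothetical reducing function (not from kixi.stats) with three arities.
(defn mean-rf
  ([] [0.0 0])                            ; init: running sum and count
  ([[sum n]] (when (pos? n) (/ sum n)))   ; completion: accumulator -> mean
  ([[sum n] x] [(+ sum x) (inc n)]))      ; step: fold one value into the accumulator

;; transduce calls the init, step, and completion arities for you.
(def via-transduce
  (transduce identity mean-rf [1 2 3 4]))

;; reduce only calls the step arity, so init and completion are spelled out.
(def via-reduce
  (mean-rf (reduce mean-rf (mean-rf) [1 2 3 4])))
```

Both forms give the same result. The reduce version is just the transduce version with the init and completion calls written out by hand, which is why it looks ugly.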

bocaj 23:02:38

I'm hunting for a standard way or a library to select out of one database (e.g. Oracle) into Avro, and from Avro into another db (e.g. SQL Server).

bocaj 23:02:38

posting here, b/c of the Avro case. Maybe others would find a library like this useful? Or, are there other reliable intermediary formats that keep types around?

bocaj 00:02:54

is that a library?

bocaj 00:02:42

Maybe I'm pushing a social problem into tech, but I won't be asking for direct access to a database, only asking a client to use a library to export a table from their database into this shared format.

jsa-aerial 00:02:09

It's a Spark module. Probably pretty heavyweight for what you are asking.

bocaj 00:02:54

...yes but...reading up and it's looking good, just don't know about the runtime