#data-science
2017-10-23
pragsmike01:10:54

@gigasquid thanks for doing that legwork. graal seems more real now.

pragsmike01:10:28

I'm trying a somewhat different approach, using a boot task to connect to a long-running process so a Jupyter notebook is more like a REPL. I want to see if it's a useful workflow, and boot offers a mechanism (pods) that could allow notebooks to have their own classpaths.

didiercrunch01:10:26

I have never looked at boot, but I assume it can use the same process. As far as I am concerned, the important thing is to settle on the clojupyter kernel. That way, any improvement to it will benefit the entire community. I have added a chart (trying to) explaining what happens in lein-clojure; it should not be complicated. https://docs.google.com/document/d/1XnY0lsG_-Y2FC-P65udaBS2bzkcSQKs_6bIBdET3E5E/edit#

bpiel13:10:36

so, is "Graal" pronounced like "grail"?

bbss13:10:40

I don't know, but the Dutch word for grail is graal. And it sounds quite different. More like Grahl.

gigasquid14:10:46

I asked in their Gitter room and they said, "we pronounce it gral with a long a"

gigasquid14:10:19

but I’ve heard people pronounce it with a short a too 🙂

sparkofreason14:10:17

Master's thesis on implementing Clojure in Truffle. Outlines some of the key challenges and solutions. http://epub.jku.at/obvulihs/download/pdf/501665

sparkofreason14:10:16

I also don't see any evidence of a graal/truffle kernel for Jupyter, which seems like it would be potentially useful.

elise_huard15:10:12

just played with this, and ran into a scary-looking error:

Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared object '/home/elise/bin/graalvm-0.28.2/jre/languages/R/library/tseries/libs/tseries.so'
  /home/elise/bin/graalvm-0.28.2/jre/languages/R/library/tseries/libs/tseries.so: undefined symbol: d1mach_
Error: loading failed

bbss15:10:17

that's really cool that someone did their master's on it 🙂

elise_huard15:10:32

looks like some things are unimplemented or implemented differently (can see that file in Rmath)

elise_huard15:10:55

(I was trying some R timeseries stuff)

gigasquid15:10:28

@elise_huard yeah, I’m not sure how compatible the implementation is. I’m sure there are some gaps. There is a bit about the status and limitations here: https://github.com/graalvm/fastr

gigasquid15:10:46

FastR is intended eventually to be a drop-in replacement for GNU R. Currently, however, the implementation is incomplete. Notable limitations are:

Graphics support: FastR supports only grid and grid-based packages, graphics package is not supported. The FastR grid package implementation is purely Java based, see its documentation for more details and limitations.
Many packages either do not install, particularly those containing native (C/C++) code, or fail tests due to bugs and limitations in FastR. In particular popular packages such as data.table and Rcpp currently do not work with FastR.

elise_huard15:10:17

@gigasquid: thanks, and obviously no criticism on either Graal or Truffle or anyone else, I appreciate it's a difficult proposition, R having quite a sprawling ecosystem

rustam.gilaztdinov17:10:44

Has anyone worked with Spark and sparkling? I’m trying to implement CountVectorizer with an sql-dataset and I'm getting this error:

NoSuchMethodError org.apache.spark.sql.catalyst.expressions.GenericInternalRow.setByte(IB)V org.apache.spark.ml.linalg.VectorUDT.serialize (VectorUDT.scala:46)
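Code:
;; Assumed import, not shown in the original snippet
(import '[org.apache.spark.ml.feature CountVectorizer])

(defn cv [table]
  (-> (CountVectorizer.)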
    (.setInputCol "bigram")
    (.setOutputCol "cv")
    (.setVocabSize 1000)
    (.setMinDF 2)
    (.fit table)))

(def cv-model (cv table))

(def table-cv (-> cv-model (.transform table)))

;; And after calling
(sql/show table-cv)
;; I'm getting this error

rustam.gilaztdinov12:10:55

One thing I learned today:

lein pom

mvn dependency:tree -Dverbose=true