#data-science
2017-10-18
didiercrunch01:10:11

I have opened two threads regarding jupyter kernels: one asking about the handling of many projects with the same kernel and the other regarding the best way to distribute the kernels. Personally, I do not believe there is a "nice" way to handle many projects with the same kernel; we will need to abuse environment variables here. Tomorrow I'll try to hack my way with environment variables and bash scripts. https://groups.google.com/forum/#!topic/jupyter/X7Fhs0C9vLs https://groups.google.com/forum/#!topic/jupyter/PvcNxwAcbLw
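
One possible shape for the environment-variable approach, as a rough sketch only: a Jupyter kernel spec (`kernel.json`) accepts an `env` map, so the same kernel could be registered once per project with a different variable each time. The launch command `my-clojure-kernel`, the variable name `PROJECT_DIR`, and the `clojure-<name>` directory naming are hypothetical placeholders, not clojupyter's actual setup.

```python
import json
from pathlib import Path


def kernel_spec(project_name, project_dir):
    """Build a kernel.json-style dict for one project (all values hypothetical)."""
    return {
        # Placeholder launch command; Jupyter substitutes {connection_file}.
        "argv": ["my-clojure-kernel", "{connection_file}"],
        "display_name": f"Clojure ({project_name})",
        "language": "clojure",
        # Jupyter sets these variables in the kernel process's environment.
        "env": {"PROJECT_DIR": str(project_dir)},
    }


def write_kernel_specs(kernels_root, projects):
    """Write one <kernels_root>/clojure-<name>/kernel.json per project."""
    written = []
    for name, project_dir in projects.items():
        spec_dir = Path(kernels_root) / f"clojure-{name}"
        spec_dir.mkdir(parents=True, exist_ok=True)
        spec_file = spec_dir / "kernel.json"
        spec_file.write_text(json.dumps(kernel_spec(name, project_dir), indent=2))
        written.append(spec_file)
    return written
```

Each project then shows up as its own entry in the notebook's kernel list, and the kernel process can read `PROJECT_DIR` at startup to decide which project to load.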

gigasquid10:10:07

thanks @didiercrunch - I especially appreciate the documentation and information sharing with the group 😸

blueberry16:10:34

For those interested, a CUDA GPU implementation of 50 recently introduced vectorized math functions has just landed in Neanderthal. Some are thousands of times faster than their java.lang.Math equivalents. https://github.com/uncomplicate/neanderthal/blob/master/src/clojure/uncomplicate/neanderthal/vect_math.clj

jsa-aerial21:10:21

Why is it always 'easy', and never simple?🙂

didiercrunch22:10:26

@blueberry you are a rock star! All your libraries are amazing. Keep it up.

pragsmike23:10:36

@didiercrunch yes, thanks for working on clojupyter. I'm still digging into the code and trying to understand the Jupyter protocol, how graphics are rendered, and especially how interactive graphics are accomplished. I'm keen to integrate ClojureScript in the client somehow, but Jupyter supports multiple clients (possibly simultaneously on the same kernel) and some of them probably won't support JavaScript or even a DOM. Those are the questions I'm looking to answer first.

didiercrunch23:10:36

it is used as a middleware.

didiercrunch23:10:49

If the return value of a cell is a specific "record", then clojupyter sends a message that Jupyter Notebook understands as an HTML/image blob

didiercrunch23:10:56

(latex too actually)
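
A minimal sketch of that idea, not clojupyter's actual code: the kernel checks the type of the cell's return value and builds the MIME bundle that goes in the `data` field of a Jupyter `execute_result` message. The `Html`, `Latex`, and `Png` wrapper classes are hypothetical stand-ins for the "records" mentioned above.

```python
import base64
from dataclasses import dataclass


@dataclass
class Html:
    markup: str


@dataclass
class Latex:
    source: str


@dataclass
class Png:
    raw: bytes


def to_mime_bundle(value):
    """Map a cell's return value to a MIME-type -> payload dict."""
    if isinstance(value, Html):
        return {"text/html": value.markup}
    if isinstance(value, Latex):
        return {"text/latex": value.source}
    if isinstance(value, Png):
        # Binary payloads travel as base64 text inside the JSON message.
        return {"image/png": base64.b64encode(value.raw).decode("ascii")}
    # Anything else falls back to its plain-text representation.
    return {"text/plain": repr(value)}


def execute_result_content(value, execution_count):
    """Build the content section of an execute_result message."""
    return {
        "execution_count": execution_count,
        "data": to_mime_bundle(value),
        "metadata": {},
    }
```

The notebook front end then picks a renderer based on the MIME type key, which is why an ordinary value and a wrapped HTML string display differently.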

didiercrunch23:10:34

If you can use http://plot.ly, I suggest you use it.

didiercrunch23:10:39

From what I understand of http://plot.ly, you give it data and it returns a URL that represents the chart you want to plot. You can embed it in the Jupyter notebook as HTML

didiercrunch23:10:05

I imagine it uses iframe for interactive charts

didiercrunch23:10:54

As for adding javascript in the jupyter notebook, that will be needed to use parinfer

jsa-aerial23:10:41

I will again make a suggestion as to where you might better expend the cycles you have to spend. I think a more fruitful approach is to start with the reagent version of gorilla (that will give you a state of the art SPA starting point). Replace the plotting and charting stuff with Vega-Lite - which 1) is totally data driven, 2) is written by the same people who did Vega (the Interactive Data Lab at University of Washington), who clearly know what they are doing and have a lot of resources behind this, 3) is simple and declarative, so you can use nothing more than CLJ/CLJS maps to drive it, 4) is already packaged on CLJSJS, and 5) would fit nicely into gorilla from what I recall from hacking on gorilla plotting before. Then look into how Proto-Repl would fit into the client side (it is all JS, so should in theory work). Keep all the HTML, LaTeX, image, etc. bits of gorilla.

jsa-aerial23:10:42

Proto-Repl would give a real editor. The result could be one of the most potent notebook type things available. Of course this would not be python oriented or directly usable by python so, if that is a goal this might not be what you want.

jsa-aerial23:10:04

The thing about Vega-Lite that is particularly apropos here, is that it is designed specifically for data science and statistical dynamic visualization and interaction

jsa-aerial23:10:33

It is indeed a 'dumb'/misleading name, but it really is basically what is needed

didiercrunch23:10:48

vega-lite looks very good

jsa-aerial23:10:28

Having been using it now for about a month, I can say it is extremely good and only going to get better based on comments from that group

jsa-aerial23:10:19

You can even take their examples, run them through json->clj and use the resulting map to send back to client for display/testing/hacking. It just works
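
To illustrate the "nothing more than maps" point, here is a small sketch in Python rather than Clojure: a Vega-Lite bar chart spec is just nested maps and vectors, so it survives a JSON round trip unchanged and can be tweaked with ordinary map operations. The data values and the `with_mark` helper are made up for illustration.

```python
import json

# A minimal Vega-Lite bar chart spec written as a plain dict (illustrative data).
spec = {
    "data": {
        "values": [
            {"lang": "clojure", "count": 3},
            {"lang": "python", "count": 5},
        ]
    },
    "mark": "bar",
    "encoding": {
        "x": {"field": "lang", "type": "nominal"},
        "y": {"field": "count", "type": "quantitative"},
    },
}


def with_mark(spec, mark):
    """Return a copy of the spec with a different mark type."""
    return {**spec, "mark": mark}


# Serialize a variant of the spec for sending to the client.
wire = json.dumps(with_mark(spec, "point"))
```

Because the spec is plain data, switching from bars to points is a one-key change and the original map is left untouched.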

didiercrunch23:10:34

nevertheless, I like Jupyter better than gorilla

didiercrunch23:10:44

It is more professional.

jsa-aerial23:10:09

Sure, gorilla is nowhere near as developed. The idea above would be to make it really impressive

didiercrunch23:10:30

moreover, there is now the JupyterLab project, which is Jupyter on steroids

pragsmike23:10:31

@jsa-aerial I'll take your word for the choice of VL. What if we could use it in Jupyter client?

jsa-aerial23:10:36

Don't take my word - have a look at what they have done and are continuing to do

didiercrunch23:10:58

I am sure we could work out a way to make Vega work on Jupyter

jsa-aerial23:10:04

You need Jupyter client to work with JS

jsa-aerial23:10:38

If you can do that, then sure, you can have VL in Jupyter.

pragsmike23:10:41

that'd take some hacking, but also consensus in the jupyter community

jsa-aerial23:10:47

But basing something on a truly reactive client would yield all sorts of extra benefits.

pragsmike23:10:09

I was hoping it would be straightforward to add some sort of rendering hook for a mime type in the jupyter client, where we could just run arbitrary javascript on a div

didiercrunch23:10:33

maybe it is in the new jupyterlab notebook

pragsmike23:10:45

my knowledge of how jupyter works is near zero as yet

didiercrunch23:10:26

I propose to hack our way around with the current Jupyter notebook architecture. By that I mean having easy installation support for the kernel and some http://plot.ly capabilities. It costs nothing and we will learn things. Then we will be able to make an informed decision about where to go. I believe it is important to leverage existing code and to deliver results rapidly

jsa-aerial23:10:24

The last sentence is spot on. But that is why I suggested you start with what someone already did - reagent-ify gorilla

pragsmike23:10:46

anyway @jsa-aerial you've answered one of my questions. now we just have to figure out how to carve out a space for it to live. I'm going to need to know how to work Jupyter for my job now anyway. Other folks at my company use Jupyter, and gorilla would not be well-received there.

pragsmike23:10:47

I think it's likely that any work done to integrate VL in gorilla would easily drop into Jupyter, if we can carve out a space for JS to run in it.

jsa-aerial23:10:17

Since gorilla already is a JS client, integrating VL with gorilla would be quite simple

jsa-aerial23:10:36

Well, if you have to use Jupyter, you have to use Jupyter.

sparkofreason23:10:06

The BeakerX project (https://github.com/twosigma/beakerx) has similar goals to the ones you are discussing; if nothing else, it may serve as a good JVM-based example. They have a pretty decent visualization library in Java and a working Clojure kernel, as well as the beginnings of a way to move data between kernels.

pragsmike23:10:24

How about two parallel efforts: one to make gorilla awesome, and one to carve out a space inside Jupyter client where some of that awesomeness can run. A complication is that there isn't just one "jupyter client", but I'm willing to let the others get not-so-awesome rendering (like static PNGs) where the more capable one would get interactivity.
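
The fallback idea can be sketched like this, under stated assumptions: the kernel publishes several representations of the same result in one MIME bundle, and each client takes the richest one it knows how to render. The `pick_representation` function and the client preference lists are hypothetical illustrations, not Jupyter's real selection code, and the Vega-Lite MIME type name is an assumption.

```python
# Assumed MIME type name for Vega-Lite output; check what the front end registers.
VEGALITE = "application/vnd.vegalite.v1+json"


def pick_representation(bundle, supported):
    """Return (mime_type, payload) for the first supported type present in the bundle."""
    for mime_type in supported:
        if mime_type in bundle:
            return mime_type, bundle[mime_type]
    raise LookupError("no supported representation in bundle")


# One result, three representations, from richest to plainest (placeholder payloads).
bundle = {
    VEGALITE: {"mark": "bar"},
    "image/png": "<base64 PNG data>",
    "text/plain": "<chart>",
}

# Hypothetical preference lists for three kinds of client.
rich_client = [VEGALITE, "image/png", "text/plain"]
static_client = ["image/png", "text/plain"]
terminal_client = ["text/plain"]
```

With this arrangement the capable client gets the interactive chart, while the others still see a static PNG or a text placeholder from the same kernel output.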

jsa-aerial23:10:52

The issue with that is we end up splitting resources and getting less for more...

jsa-aerial23:10:38

I wish I had the cycles. I really think the reagent gorilla + VL + Proto-Repl would be quite amazing

sparkofreason23:10:14

"Make gorilla awesome" is attractive. On the other hand, it may be duplicating a lot of work. I'd love to see a Jupyter kernel that 1) leveraged nREPL a la gorilla, and 2) achieved the BeakerX vision of sharing data between kernels, thus having cake and eating it too. Some of the Python viz libs in particular are fantastic and interactive.

jsa-aerial23:10:35

Since you would be leveraging a lot of stuff already done, I'm not sure about the duplication. And VL is actually rather unique. I get the feeling people really haven't explored what it can do.

sparkofreason23:10:07

There are at least a couple of Jupyter-based plugins for atom. I gave a brief (unsuccessful) try to these when I was playing with BeakerX: https://atom.io/packages/jupyter-notebook and https://atom.io/packages/hydrogen

sparkofreason23:10:10

Another interesting project, sort of a "meta-kernel" that allows data sharing: https://vatlab.github.io/SoS/

pragsmike23:10:05

ok, now I have to go learn about BeakerX. @jsa-aerial I hear you about splitting effort, but reaching the Jupyter community has a huge payoff.

jsa-aerial23:10:39

Possibly - but don't expect that to make python folks want to try clojure. I thought the idea was to focus on clojurists first to 'keep' them from going elsewhere because they had no choice...