#nextjournal
2021-11-05
Daniel Slutsky00:11:09

@mkvlr congrats for the Clerk release! ❤️ Is there any current way to serve local files alongside the html (e.g., images, data files for vega plots)? I imagine I could somehow extend the webserver to handle some routes as requests for local files, on the user side.

mkvlr10:11:33

thank you! Clerk will already serve results separately from the main page. Do you have an example where this doesn’t do what you need?

Daniel Slutsky11:11:03

@mkvlr thanks. One example is being able to include an image that lives in a given path relative to the main page. Another example is allowing a vega spec like the following to work:

{ "data": {"url": "data/dataset.csv"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"field": "b", "type": "quantitative"}}} 
Here, dataset.csv is a file in the local environment that contains the data for the plot. Does it make sense?

mkvlr11:11:52

hmm, wondering what relative to the main page means here, since Clerk serves the notebook on root. Would that csv live on the classpath or not?

mkvlr12:11:43

it would be easy to let clerk serve files from both but will need to think about the consequences and if there's not a simple way to already do this in user space.

Daniel Slutsky12:11:01

from the user space -- do you mean the user extending the clerk webserver so that it handles the appropriate routes by serving the files?

Daniel Slutsky12:11:18

that might be a good idea.

Daniel Slutsky12:11:28

@mkvlr good question whether files should live on the classpath. what do you think?

Daniel Slutsky12:11:42

One thing we have been wondering about is whether different tools and libraries could agree on a convention regarding that question, of where files should be placed. This would allow a data visualization library (e.g., the upcoming Viz.clj) to place the data files in a path that would make sense for various tools (e.g., Clerk). One very simplistic attempt to suggest some agreement is this scicloj/tempfiles library: https://github.com/scicloj/tempfiles

mkvlr13:11:29

hmm, I think it might be better to avoid baking decisions like this into Clerk. Folks can easily run their own webservers alongside Clerk and do whatever they want today.

mkvlr13:11:14

and embedding Clerk on a path by calling some library functions directly should also not be hard, there's probably ways to make it easier down the road.

Daniel Slutsky13:11:26

thanks. the challenge, I think, is making the page behave the same during the interactive REPL session and when served statically, with no server. for that, we may probably need to hook some behaviour into the routes of the Clerk server (through some additional middleware maybe?) does it make sense?

mkvlr13:11:44

that already works with Clerk (if you serve Vega specs without linking to relative paths)

mkvlr13:11:43

these static build concerns are also why I think it might be simpler to inline data into the Vega specs on the Clojure side
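
A minimal sketch of the inlining approach, under stated assumptions: this is plain Clojure rather than Clerk API, `csv->rows` and `bar-spec` are hypothetical helper names, and the CSV parsing assumes a header row and no quoted fields. The rows are read on the Clojure side and placed under `:values`, so the spec carries no relative URL:

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: parses simple CSV text (header row, no quoted
;; fields) into a seq of maps keyed by the header names.
(defn csv->rows [csv-text]
  (let [[header & lines] (str/split-lines csv-text)
        ks (map keyword (str/split header #","))]
    (map #(zipmap ks (str/split % #",")) lines)))

;; Hypothetical helper: builds the bar chart spec with the rows inlined
;; under :values instead of referenced through a :url.
(defn bar-spec [rows]
  {:data {:values rows}
   :mark "bar"
   :encoding {:x {:field "a" :type "nominal"}
              :y {:field "b" :type "quantitative"}}})

;; The literal string stands in for (slurp "data/dataset.csv").
(def spec
  (->> (csv->rows "a,b\nA,28\nB,55\n")
       (map #(update % :b (fn [s] (Double/parseDouble s))))
       bar-spec))
```

Since the data travels inside the spec, the same value should work both in the live session and in a static build, at the cost of a larger payload.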

Daniel Slutsky14:11:50

isn't that slower in your experience, without serving the file?

Daniel Slutsky14:11:32

> without linking to relative paths
do you mean including the data in the spec?

👍 1
mkvlr14:11:39

re being slower: if the file doesn't change you can also upload it somewhere and embed it via a full url?

Daniel Slutsky14:11:52

but typically, we want a good dynamic behaviour in live data explorations, while we are trying different variations of the data. the file is just a way to let vega consume it. Here is a test case:

(defn data []
  (for [i (range 99999)]
    {:x i
     :y i}))

(v/vl
 {:data {:values (data)}
  :mark :point
  :encoding {:x {:field :x :type :quantitative}
             :y {:field :y :type :quantitative}}})

Daniel Slutsky14:11:33

In my experience, serving this as a file gives a quicker experience.

mkvlr14:11:23

Clerk will serve this via a separate request

Daniel Slutsky14:11:53

What does it mean?

mkvlr14:11:50

that it will be a separate file. What we can do to improve performance is doing the clj to json conversion on the server instead of in the client. We're working on a small extension to the viewer api that will enable this.

Daniel Slutsky14:11:51

So the vega spec is edited to replace the :data part with a :url part?

Daniel Slutsky14:11:21

(BTW, a CSV format would be smaller and possibly faster to parse than JSON. An even faster option is Arrow. https://github.com/vega/vega-loader-arrow )
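
For illustration, a rough sketch of turning rows like the ones in the test case above into CSV text. `rows->csv` is a hypothetical helper, not part of Clerk or Vega, and it assumes the values contain no commas, quotes, or newlines:

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: serializes a seq of maps to CSV text.
;; No escaping is done, so values must not contain commas, quotes,
;; or newlines.
(defn rows->csv [ks rows]
  (str/join "\n"
            (cons (str/join "," (map name ks))
                  (for [row rows]
                    (str/join "," (map row ks))))))

;; Three rows shaped like the {:x i :y i} maps in the test case.
(def csv-text
  (rows->csv [:x :y] (for [i (range 3)] {:x i :y i})))
```

The resulting string could then be written to a file or served from an endpoint and referenced from the spec by URL.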

mkvlr14:11:08

I'll play with this, thanks. Our goal is definitely to keep fetching and serving data open to let folks change it.

🙏 1
Carsten Behring14:11:00

I think the general need for "data exploration" is visualising more formats than SVG:
• dynamically generate PNGs and visualize them
• visualize java.awt.image.BufferedImage in the browser (by converting on the fly to PNG)
• other raster graphic formats

Carsten Behring14:11:41

Maybe that is the "key requirement": handle raster graphics in some form (this can happen, for example, by converting them to files and serving those).
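
The on-the-fly conversion could look roughly like this, using only the JDK's `javax.imageio.ImageIO`. `image->png-bytes` is a hypothetical helper name; how the bytes then reach the browser (a file, an endpoint, a viewer) is left open:

```clojure
(import '(java.awt.image BufferedImage)
        '(java.io ByteArrayOutputStream)
        '(javax.imageio ImageIO))

;; Hypothetical helper: encodes a BufferedImage as PNG and returns the
;; encoded bytes.
(defn image->png-bytes [^BufferedImage image]
  (with-open [out (ByteArrayOutputStream.)]
    (ImageIO/write image "png" out)
    (.toByteArray out)))

;; A tiny blank image, just to exercise the conversion.
(def png-bytes
  (image->png-bytes (BufferedImage. 2 2 BufferedImage/TYPE_INT_RGB)))
```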

mkvlr14:11:36

as Clerk is running on localhost @U07SQTAEM mentioned that you can serve local files via file://

👍 1
Carsten Behring14:11:02

See the last cell here: https://nextjournal.com/a/PLr1zAamgqN5maQxNygW2/edit Nextjournal supports this by serving anything in "results".

mkvlr14:11:07

the image stuff we’re also talking about, is coming soon

mkvlr14:11:36

the notebook is private, can you publish it or create a share url?

Carsten Behring14:11:36

Done, public now

Carsten Behring14:11:42

Ideally, the "image linking" keeps working when "rendering as a static page", as @U066L8B18 says. We also use notespace for generating static HTML pages and host them somewhere.

Carsten Behring14:11:21

What could maybe work is: • rely on file:// URLs while developing the notebook

Carsten Behring14:11:53

and have a "hook" into the build-static-app! which allows to tell it all the static pages and it copies the files and changes the links while building the static app

Carsten Behring15:11:45

Maybe this could be supported with a special viewer method that marks these links. In general we would:
1. Dynamically generate the file on disk: <file://tmp/xxxxx.png>
2. Include it in the page:
   a. either as hiccup: [:a {:ref= "}
   b. or, better, with a specific "viewer": (view-local-file "<file://tmp/xxxx.png>"), which would generate the hiccup, remember to copy the files during build-static-app!, and change the link to become "relative".
view-local-file could even be smart and use a UUID as the file name, so uniqueness is guaranteed and a flat structure works. This would be general enough to work for all cases, I think.
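
A sketch of the copy step in this proposal, assuming a flat `assets` directory next to the generated HTML. Nothing here exists in Clerk: `copy-for-static-build!` and the `assets` directory name are hypothetical, and hooking this into build-static-app! is not shown:

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper: copies a local file into <out-dir>/assets under a
;; UUID-based name (keeping the extension) and returns the relative link
;; to use in the static page.
(defn copy-for-static-build! [source-path out-dir]
  (let [source      (io/file source-path)
        file-name   (.getName source)
        dot         (.lastIndexOf file-name ".")
        ext         (if (neg? dot) "" (subs file-name dot))
        target-name (str (java.util.UUID/randomUUID) ext)
        target      (io/file out-dir "assets" target-name)]
    (io/make-parents target)
    (io/copy source target)
    (str "assets/" target-name)))
```

Because every copy gets a fresh UUID, files from different source directories cannot collide in the flat target directory.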

Daniel Slutsky16:11:10

I see the method of serving data as "blobs" through /_blob/* URIs, that access the document metadata at the backend. https://github.com/nextjournal/clerk/blob/25b815c4879c4d81c407cd13a96b28d8309414f2/src/nextjournal/clerk/webserver.clj#L68 It looks like a small variation of this could serve the need of passing images, etc.: creating an endpoint, let us call it data, /_data that is very similar to /_blob. Instead of using serve-blob, it passes the raw data directly, without wrapping it with viewers. Let us call this functionality serve-data. Something like that does work for me in my current experiments:

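(defn serve-data [{:keys [uri]}]
  ;; Look up the raw data stored under :_data in the current document's metadata.
  (let [data-id->result (-> @!doc meta :_data)
        data-id         (str/replace uri "/_data/" "")]
    ;; Debug output: the available data and the requested id.
    (println (pr-str [data-id->result data-id]))
    (if (contains? data-id->result data-id)
      {:status 200
       :body   (data-id->result data-id)}
      {:status 404})))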
The benefit of serving resources (e.g. images) this way is that the route for the resource at the interactive stage is identical to the one when serving the notebook statically, assuming we provide it as a file alongside the main html page. I'm wondering what you'd think about that. Would it make sense to add something like that to Clerk? Or maybe make the routing extensible, so that an endpoint like that can be added on the user side?

🙌 1
Carsten Behring16:11:14

I think this could be made even more universal by allowing a custom content-type in the response, so it could serve "more than images". It is true that "serving of dynamically generated images" would be our main use case. So the public interface to this could be either a "_data" endpoint, as @U066L8B18 said, or something even more abstract, like a "viewer" for BufferedImage (and then the user doesn't care at all how this is served or saved). But it should "keep working" in static export.
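
One way the content-type idea could be sketched, as a standalone Ring-style handler (request map in, response map out). This is not Clerk's API: the `!resources` registry and `register-resource!` are hypothetical, and `serve-data` only borrows its name and the `/_data/` prefix from the earlier message:

```clojure
(require '[clojure.string :as str])

;; Hypothetical registry: resource id -> {:content-type ... :body ...}.
(defonce !resources (atom {}))

;; Registers a resource and returns the path it will be served under.
(defn register-resource! [id content-type body]
  (swap! !resources assoc id {:content-type content-type :body body})
  (str "/_data/" id))

;; Ring-style handler: serves a registered resource as-is, with its own
;; content type, or answers 404 for unknown ids.
(defn serve-data [{:keys [uri]}]
  (let [id (str/replace-first uri "/_data/" "")]
    (if-let [{:keys [content-type body]} (get @!resources id)]
      {:status  200
       :headers {"Content-Type" content-type}
       :body    body}
      {:status 404
       :body   "not found"})))
```

For example, `(register-resource! "dataset.csv" "text/csv" "a,b\nA,28\n")` returns `"/_data/dataset.csv"`, which could then be used as the data URL in a Vega spec.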

👍 1
mkvlr17:11:34

but there’s no Clerk server to serve anything in the static builds. So I think it might be better to keep these things as separate concerns. Clerk is also not in the business of uploading the static build to some url. So if you’d like to make some static files available I think it’s most flexible if we keep this out of Clerk and let Clerk work with the resulting urls.

Daniel Slutsky18:11:45

If I want to serve some data through the blob path, but pass it as-is, without wrapping it with viewer information, would it be easy to do with the current API (that is, without patching the webserver as I did above with serve-data)?

mkvlr18:11:01

I’ll try this soon

mkvlr18:11:36

there might be a tiny change needed, one goal with the custom fetch-fn as used in the table viewer is to enable serving things without wrapping and with different content types, see https://github.com/nextjournal/clerk/blob/139388304590e6acb9127b12ba06736e0df15c52/src/nextjournal/clerk/viewer.cljc#L162

Daniel Slutsky18:11:33

Thanks, I'll look!

mkvlr18:11:01

I also have this on my list of things to look at tomorrow

mkvlr19:11:17

ok looking at http kit this should be a tiny change, I'll try this in a sec

🙏 1
Carsten Behring21:11:13

I still think that Clerk should have a feature that allows serving dynamically rendered raster images. There are some plotting libraries / functions in the data science field (Clojure, Java, Python, R) which are pre-web or otherwise desktop-oriented, and they have visualisation functions which produce raster images only. There should be a way to visualize them with Clerk IMHO, other than external hosting, as that is "not dynamic".

Carsten Behring21:11:17

And I can as well imagine use cases for dynamically created "sound files", or creation of animated gifs in clojure code.

mkvlr15:11:53

I have this wip, still will need some tweaking to make the api nicer though but the idea is there

Daniel Slutsky16:11:26

Thank you so much. I'm still trying to understand what is happening here (probably since I haven't understood the describe function yet). Will this nest nicely inside another structure (e.g., some Hiccup or Vega spec)?

mkvlr18:11:24

it’s just a sketch at this point, needs some more thinking to make it nice. But should work with nesting then, yes. Though it will depend on what’s nested in what I think.

🙏 1
👍 1