2022-02-20
I wrote a little function which "removes a symbol" from Clerk's cache (both in-memory and on disk).
It could be added as a one-arity version of clear-cache!
It can be a useful alternative to timestamping a symbol whose value comes from IO (disk, database, web service).
If you find it useful, I can open a PR.
I noticed an issue with the caching when we have 2 notebook files in the same folder. If these files use the same variable names, the caching can get confused, as the hashes of 2 code blocks in the 2 files might be the same.
An easy fix could be to include the namespace (or the file name) in the hash calculation.
Do you have a repro? Hashes should intentionally be the same if both forms are the same and have the same dependencies.
I have a case of two notebooks where a Vega image appears in the "wrong" output. And indeed I copy/pasted the code from one notebook file to the other, so they have identical text (referring to the same variable name), but the variable gets different values (in normal, non-Clerk evaluation).
(clerk/vl
 {:$schema ""
  :config {:axis {:grid true :tickBand "extent"}}
  :width 600
  :height 600
  :data {:values (vec pps-scores)}
  :encoding {:x {:field "x" :type "ordinal"}
             :y {:field "y" :type "ordinal"}}
  :layer [{:encoding {:color {:field "pps"
                              :legend {:orient "top"
                                       :direction "horizontal"
                                       :gradientLength 120}
                              :title "PPS"
                              :type "quantitative"}}
           :mark "rect"}
          {:encoding {:text {:field "pps" :type "quantitative"}}
           :mark "text"}]})
The same snippet is present in both notebook files.
But in one, `pps-scores` refers to `kaggle/pps-score`, and in the other to `predict-heart-attack/pps-score`.
But a hash based on the text is identical for the code block in both files.
Changing the hashing to include the ns has fixed it for my concrete case (I think ...). This results in a kind of "partition" of the hash space when there are several notebook files: the common disk-based cache then holds 2 different hashes and results, even if the text of the code block is identical. I'm not sure how this affects the in-memory cache, though.
(defn hash-codeblock [->hash {:keys [hash form deps]}]
  (let [hashed-deps (into #{} (map ->hash) deps)]
    (sha1-base58 (pr-str *ns* (conj hashed-deps (if form form hash))))))
I tried again, running both notebooks via:
rm -rf .clerk/
clojure -X:nextjournal/clerk
with and without the ns included in hash-codeblock, and including it fixed the issue.
This screenshot of both notebooks shows the "wrong" plot (it's identical to the one from the other notebook), although it should be based on completely different variables. The "text" of the code above the plot is identical...
`kaggle/pps-score` and `predict-heart-attack/pps-score` should get different hashes if they're different, and that should lead to different hashes for the forms depending on them.
should... look at this:
The data of the plot and the plot do not match at all.
The plot is "from the other notebook" ...
So it goes wrong in the code block which produces the vega lite svg. The data before is ok.
The code is here:
https://github.com/behrica/ds-notebooks/blob/main/notebooks
Running `clojure -X:nextjournal/clerk` will produce the 2 HTML files, and kaggle.html has a wrong plot (the axes are already wrong, and it is a "copy" of the plot from predict_heart_attack.html).
OK, seems like a good idea.
I think it is a dependency analysis problem (not a hashing problem as such). For the code block:
(clerk/vl
 {:$schema ""
  :config {:axis {:grid true :tickBand "extent"}}
  :width 600
  :height 600
  :data {:values (vec pps-scores)}
  :encoding {:x {:field "x" :type "ordinal"}
             :y {:field "y" :type "ordinal"}}
  :layer [{:encoding {:color {:field "pps"
                              :legend {:orient "top"
                                       :direction "horizontal"
                                       :gradientLength 120}
                              :title "PPS"
                              :type "quantitative"}}
           :mark "rect"}
          {:encoding {:text {:field "pps" :type "quantitative"}}
           :mark "text"}]})
the analysis does not detect the dependency on the var `pps-scores`:
;; => [(clerk/vl
;; {:$schema "",
;; :config {:axis {:grid true, :tickBand "extent"}},
;; :width 600,
;; :height 600,
;; :data {:values (vec pps-scores)},
;; :encoding
;; {:x {:field "x", :type "ordinal"}, :y {:field "y", :type "ordinal"}},
;; :layer
;; [{:encoding
;; {:color
;; {:field "pps",
;; :legend {:orient "top", :direction "horizontal", :gradientLength 120},
;; :title "PPS",
;; :type "quantitative"}},
;; :mark "rect"}
;; {:encoding {:text {:field "pps", :type "quantitative"}}, :mark "text"}]})
;; {:form
;; (clerk/vl
;; {:$schema "",
;; :config {:axis {:grid true, :tickBand "extent"}},
;; :width 600,
;; :height 600,
;; :data {:values (vec pps-scores)},
;; :encoding
;; {:x {:field "x", :type "ordinal"}, :y {:field "y", :type "ordinal"}},
;; :layer
;; [{:encoding
;; {:color
;; {:field "pps",
;; :legend {:orient "top", :direction "horizontal", :gradientLength 120},
;; :title "PPS",
;; :type "quantitative"}},
;; :mark "rect"}
;; {:encoding {:text {:field "pps", :type "quantitative"}}, :mark "text"}]}),
;; :ns-effect? false,
;; :deps #{nextjournal.clerk/vl},
;; :file "notebooks/kaggle.clj"}]
`:deps` only contains `nextjournal.clerk/vl`.
Therefore the hash input does not include a hash for `pps-scores`, which results in the hash of the whole code block being the same in both files (so the cached result is reused across files).
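To illustrate the point above outside of Clerk, here is a minimal sketch. Everything in it is an assumption for illustration: `sha1-hex`, the simplified `hash-codeblock`, the `dep-hashes` map and its placeholder hash strings are hypothetical stand-ins, not Clerk's actual implementation. It only shows that two identical forms get the same hash when the analysis misses the var they depend on, and different hashes once that var is part of the deps.

```clojure
;; Hypothetical helper: hex-encoded SHA-1 of a string (stand-in for Clerk's sha1-base58).
(defn sha1-hex [s]
  (let [digest (.digest (java.security.MessageDigest/getInstance "SHA-1")
                        (.getBytes ^String s "UTF-8"))]
    (apply str (map #(format "%02x" %) digest))))

;; Simplified stand-in for hash-codeblock: hashes the form together with
;; the hashes of its deps. A sorted set keeps the printed order stable.
(defn hash-codeblock [->hash {:keys [form deps]}]
  (let [hashed-deps (into (sorted-set) (map ->hash) deps)]
    (sha1-hex (pr-str [hashed-deps form]))))

;; The same (shortened) form text, as copy/pasted into both notebooks.
(def vl-form '(clerk/vl {:data {:values (vec pps-scores)}}))

;; Placeholder hashes for the vars involved (made-up values).
(def dep-hashes
  {'nextjournal.clerk/vl            "hash-of-vl"
   'kaggle/pps-scores               "hash-of-kaggle-scores"
   'predict-heart-attack/pps-scores "hash-of-heart-attack-scores"})

;; Deps as reported in the analysis output above: pps-scores is missing.
(def incomplete-deps #{'nextjournal.clerk/vl})

(def buggy-kaggle (hash-codeblock dep-hashes {:form vl-form :deps incomplete-deps}))
(def buggy-heart  (hash-codeblock dep-hashes {:form vl-form :deps incomplete-deps}))

;; Deps as they should be: each notebook's own pps-scores var is included.
(def fixed-kaggle
  (hash-codeblock dep-hashes
                  {:form vl-form
                   :deps (conj incomplete-deps 'kaggle/pps-scores)}))
(def fixed-heart
  (hash-codeblock dep-hashes
                  {:form vl-form
                   :deps (conj incomplete-deps 'predict-heart-attack/pps-scores)}))

(println "missing dep, same hash?" (= buggy-kaggle buggy-heart))
(println "full deps, same hash?  " (= fixed-kaggle fixed-heart))
```

With the incomplete deps both notebooks end up with one shared cache key, which matches the "plot from the other notebook" symptom; with the var included, the keys diverge without needing the ns in the hash.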
This explains why adding the ns to the hashing fixes it (by coincidence).
Let's continue here: https://github.com/nextjournal/clerk/issues/94
OK, very good. I just put together a minimal reproduction, though it may not be needed:
(-> "(clerk/vl {:data pps-scores})"
    read-string
    h/analyze)
But maybe it can confirm your fix as well.