#nextjournal
2022-02-14
Carsten Behring08:02:55

Could it be that Clojure fns are never cached by Clerk? So in code like:

(defn my-fn ....)

(def a-result (a-function-using-my-fn my-fn))
a-result would be re-evaluated every time, even though the body of my-fn is unchanged?

Carsten Behring08:02:34

The concrete code is this:

(def pipe-fn
  (ml/pipeline
   (mm/replace-missing [:BsmtCond :PoolQC] :value :NA)
   (mm/select-columns [:OverallQual :GarageCars :BsmtCond
                       :GrLivArea :1stFlrSF :2ndFlrSF :TotalBsmtSF :GarageArea :Neighborhood :YearBuilt
                       :SalePrice])
   (fn [ctx]
     (assoc ctx : (load-hp-data "train.csv.gz")))
   (mm/transform-one-hot [:OverallQual :GarageCars :Neighborhood :BsmtCond :PoolQC] :full)

   (mm/set-inference-target :SalePrice)
   {:metamorph/id :model}
   (mm/model {:model-type :smile.regression/gradient-tree-boost
              :max-depth 50
              :max-nodes 10
              :node-size 8
              :trees 2000})))



(def result
  (ml/evaluate-pipelines [pipe-fn] splits ml/rmse :loss))

Carsten Behring08:02:40

pipe-fn is a function and it gets passed to evaluate-pipelines. I see that evaluate-pipelines is re-run on nextjournal.clerk/show! even without any change in the file.

Carsten Behring09:02:29

After enabling the exceptions, I do indeed get this. The type of pipe-fn is "unfreezable", so Clerk does not cache it:

:freeze-error #error {
 :cause "Unfreezable type: class scicloj.metamorph.core$pipeline$local_pipeline__41340"
 :data {:type scicloj.metamorph.core$pipeline$local_pipeline__41340, :as-str "#function[scicloj.metamorph.core/pipeline/local-pipeline--41340]"}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "Unfreezable type: class scicloj.metamorph.core$pipeline$local_pipeline__41340"
   :data {:type scicloj.metamorph.core$pipeline$local_pipeline__41340, :as-str "#function[scicloj.metamorph.core/pipeline/local-pipeline--41340]"}
   :at [taoensso.nippy$throw_unfreezable invokeStatic "nippy.clj" 1003]}]

Carsten Behring20:02:53

Ok, I went into more detail and understood the reason for the issue. Clerk fails to cache any result which contains a fn, so the expression gets evaluated repeatedly. The simplest showcase:

(defn my-fn [] nil)

(def b {:fn my-fn
        :y (do (println "slow") (Thread/sleep 10000) :a)})

Carsten Behring20:02:11

'b' does not get cached, so it is evaluated on every call to show!, even without a code change.

mkvlr07:02:37

it seems we can also improve this situation by falling back to an in-memory cache when the nippy cache (which is also persistent across JVM restarts) fails. This should make the caching work as long as you’re looking at the same notebook even if nippy cannot freeze & thaw it.
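
A minimal sketch of the fallback described above, not Clerk's actual implementation. All names here (`in-memory-cache`, `disk-cache`, `persist!`, `cached-eval`, `compute-b`) are hypothetical, and `persist!` is only a stand-in for nippy's freeze step: it rejects any value that contains a fn, mimicking the "Unfreezable type" error shown earlier.

```clojure
;; Hypothetical in-memory cache, keyed by some identifier of the form.
;; It only lives as long as the JVM process.
(defonce in-memory-cache (atom {}))

;; Hypothetical stand-in for the persistent (nippy-based) on-disk cache.
(defonce disk-cache (atom {}))

(defn persist!
  "Stand-in for freezing to disk: throws when the value contains a fn."
  [k v]
  (when (some fn? (tree-seq coll? seq v))
    (throw (ex-info "Unfreezable type" {:key k})))
  (swap! disk-cache assoc k v)
  v)

(defn cached-eval
  "Returns the cached value for k, or calls thunk and caches its result.
  Tries the persistent cache first and falls back to the in-memory
  cache when the value cannot be frozen."
  [k thunk]
  (cond
    (contains? @disk-cache k) (get @disk-cache k)
    (contains? @in-memory-cache k) (get @in-memory-cache k)
    :else
    (let [v (thunk)]
      (try
        (persist! k v)
        (catch clojure.lang.ExceptionInfo e
          ;; Report the failure so a silent cache miss becomes visible.
          (binding [*out* *err*]
            (println "could not freeze" k "-" (ex-message e)))
          (swap! in-memory-cache assoc k v)
          v)))))

;; Usage: a result that contains a fn is computed only once per JVM.
(def evaluations (atom 0))
(defn my-fn [] nil)
(defn compute-b []
  (swap! evaluations inc)
  {:fn my-fn :y :a})

(cached-eval 'b compute-b)
(cached-eval 'b compute-b)
@evaluations ;; => 1
```

The trade-off is the one raised in the thread: values that land only in the in-memory cache are lost on JVM restart, while values that can be frozen keep the persistent behaviour.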

Carsten Behring09:02:31

yes. I was thinking about that. It is true that for me the use case of persistent caching was not super important, as none of the other notebooks has it.

Carsten Behring09:02:01

for sure "only nippy" seems too restrictive to me.

Carsten Behring09:02:01

I had the issue with fns and nippy before. Functions cannot be serialized, as far as I remember.

mkvlr09:02:08

yep, I’m aware of that. So far not caching functions hasn’t been a problem since I’ve not yet run into code where the result is a function but evaluation of it takes a long time. Does that apply to your code above?

mkvlr09:02:43

though that you’re seeing the unfreezable error makes me think it’s a different problem and can be solved by tweaking the allow list

Carsten Behring09:02:30

Yes. The "result" above contains functions. It is the result of a ML model training, so takes long.

Carsten Behring09:02:52

I see this as a very frequent situation. It is idiomatic Clojure to pass maps around which contain fns, as fns are "first class".

Carsten Behring09:02:30

> though that you’re seeing the unfreezable error makes me think it’s a different problem and can be solved by tweaking the allow list

Carsten Behring09:02:54

Why different? Fns are "unfreezable", no?

Carsten Behring09:02:23

I uncommented the "error printing" in clerk to see them.

Carsten Behring09:02:48

They are indeed hidden in current Clerk code.

Carsten Behring09:02:45

I just noticed that the training was done repeatedly (from the logging of the training process itself and the slowness of notebook evaluation). It all worked, just slowly due to repeated execution (as the cache was not working for the fns inside my result).

mkvlr09:02:22

yeah, we should certainly warn when we can’t freeze values

mkvlr09:02:09

but you can see Clerk does not reevaluate these:

(ns test)

(defn my-fn [x]
  (prn :my-fn x)
  (:hello x))


(defn a-function-using-my-fn [f x]
  (prn :a-function-using-my-fn)
  (f x))

(def a-result
  (do
    (prn :a-result)
    (a-function-using-my-fn my-fn {:hello :world})))

Carsten Behring09:02:43

Indeed, in your code it works. The issue is if a var contains a fn.

(defn my-fn [] nil)

(def b {:fn my-fn
        :y (do (println "slow") (Thread/sleep 10000) :a)})

Carsten Behring09:02:25

In this case, b will be re-evaluated every time we do clerk/show!

Carsten Behring10:02:24

So my initial comment was wrong.

Carsten Behring10:02:40

So it does indeed allow serializing fns with nippy. I can make a PR to add it.

Carsten Behring10:02:45

The "drawback" of freezing fns is the need for "identical code" at freeze and un-freeze time. But "cleaning the cache" can guarantee this in Clerk. Something to document, maybe.

mkvlr11:02:51

looking into this now

Carsten Behring12:02:25

I opened this PR, https://github.com/nextjournal/clerk/pull/81, but it might need more testing. Within a single JVM run it seems to solve the freezing of fns. It solved my issue, at least.

mkvlr12:02:11

yep, saw it. I’m afraid this might also create a bunch of other problems

mkvlr12:02:25

> All JVM instances that could end up deserializing a fn instance are required to be launched from the same precompiled jar.
(from https://tech.redplanetlabs.com/2020/01/06/serializing-and-deserializing-clojure-fns-with-nippy/)

Carsten Behring12:02:13

Not sure if this is a big problem in Clerk. It requires that "freeze" and "un-freeze" are called by the "same code" (at least the same code that the serialized functions come from). In the typical usage of the Clerk cache (I freeze "now" and un-freeze 2 minutes later), this is given. The Clerk cache is not typically used for long-term storage, is it? (That would come with a high chance of code changes in between.) And "clean cache" will fix it in any case.

Carsten Behring12:02:46

But indeed "my issue" could be solved by an in-memory cache as well. (so not using nippy at all)

mkvlr12:02:48

yes the intention of Clerk’s cache is to be used for long term storage

mkvlr12:02:04

which is why we persist to disk

Carsten Behring12:02:20

We seem to have three options to "get a result" for a form:
• re-compute
• in-memory cache without nippy
• current persistent nippy-based cache
All have pros and cons...

mkvlr12:02:22

haven’t showcased or talked about this a lot but this is coming soon

Carsten Behring12:02:23

Interesting, that was not on my radar so far.

Carsten Behring12:02:36

"try nippy, else re-compute", as it works currently, rules Clerk out for analysis with long-running operations. It seems to me that nippy has quite a few more gaps in type coverage. "General serialisation of all types" is a hard problem. But maybe we can start by logging, so we see when "try nippy" failed and a re-compute was triggered.

Carsten Behring12:02:41

I don't want to be forced to restrict my data, only to make Clerk caching work (or accept re-compute)

mkvlr12:02:07

yep, understood and should be able to make this work

mkvlr12:02:19

the in-memory cache doesn’t have the same restrictions as nippy ofc

mkvlr12:02:26

I have a working fix

mkvlr12:02:37

will clean it up and push after lunch

👍 1
Carsten Behring13:02:26

I will test it, I have a good test case.

Carsten Behring13:02:14

Maybe it can even be something that is later configurable per form:
• cache in-memory
• persistent cache
• no cache
with a per-namespace default.

mkvlr13:02:21

can you take this for a spin please?

Carsten Behring15:02:59

#82 fixes my problem (at least while in the same JVM process). So both #80 and #82 fix my issue with the fns, in two different ways (while in the same JVM process). I cannot see a speed difference (in-memory cache vs. nippy on-disk cache); both seem immediate. #80 should make the cache work even across JVM runs, but I cannot test this due to an issue with freeze / unfreeze of tech datasets.

mkvlr15:02:46

@U7CAHM72M excellent, thanks! I’ll go with #82 for now, as that fixes an obvious bug. Feel free to play with the nippy fns approach in userspace, see https://github.com/nextjournal/clerk/pull/81#issuecomment-1040408571.

Carsten Behring19:02:39

@U5H74UNSF Maybe there is another bug lurking... It seems to me that the caching only works "once": on the evaluation after next, it is evaluated again. It seems that the in-memory cache only works "once".

Carsten Behring19:02:19

Not true... I need to test more. It seems to work. Is there a way I can make my own install of Clerk? "bb release:jar" is not working.

mkvlr20:02:35

can’t you use :local/root or :git/sha?

mkvlr20:02:15

and clojure -T:build jar doesn’t work?

Carsten Behring20:02:13

No:

clojure -T:build jar
Cloning: git@github.com:nextjournal/cas
Downloading: org/slf4j/slf4j-nop/maven-metadata.xml from central
Error building classpath. Unable to clone /home/carsten/.gitlibs/_repos/ssh/github.com/nextjournal/cas
ERROR: Repository not found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

mkvlr20:02:07

it’s private

mkvlr20:02:37

why do you need a jar as opposed to using it via :local/root or :git/sha?

Carsten Behring21:02:00

yes, good idea 👍
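
For reference, a deps.edn sketch of the two alternatives mkvlr mentions. The alias names, the local path and the sha are placeholders, not values from this conversation. It assumes a Clojure CLI version recent enough to infer the GitHub URL from the `io.github.nextjournal/clerk` library name, so only a full commit sha is needed for the git variant.

```clojure
;; deps.edn (sketch; alias names, path and sha are placeholders)
{:aliases
 {;; Use a local checkout of Clerk.
  :clerk-local
  {:extra-deps {io.github.nextjournal/clerk {:local/root "../clerk"}}}

  ;; Or pin a specific commit; replace the placeholder with a full commit sha.
  :clerk-git
  {:extra-deps {io.github.nextjournal/clerk {:git/sha "<full-commit-sha>"}}}}}
```

Either alias can then be selected when starting the REPL, for example with `clojure -M:clerk-local`, which avoids building a jar altogether.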