nextjournal

paulbutcher 2021-11-28T12:13:55.140800Z

I think I might have found an issue with Clerk’s caching (but I’m still only getting my feet wet with it, so it’s possible that I may be missing something). This only seems to happen when dealing with “large” datasets (every attempt I’ve made at creating a small reproduction fails), so I’ve created a cut-down version of the project I’m working on here: https://github.com/paulbutcher/clerk-cache-issue When this runs the first time, everything behaves as expected. But if I make a change to the definition of tx (e.g. by commenting out the (map #(map toDouble %)) ) the result of the last line doesn’t change as expected (see screenshot below; note that the values are still Doubles, even though the line which converts them is commented out above).

paulbutcher 2021-11-28T12:15:07.141300Z

It’s quite possible that I’m misunderstanding something though, so please let me know if I am!

mkvlr 2021-11-28T13:04:09.142Z

which version of Clerk is this?

mkvlr 2021-11-28T13:05:17.142900Z

oh I see it in the repo, I'll take a look

👍 1
mkvlr 2021-11-28T15:14:35.143500Z

ok, thanks, I was able to reproduce it and pushed a fix. Thanks for reporting this. {:git/sha "ba194851dd21559351b5af29f168aeaa80c436ff"} should fix the issue.
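
For readers following along, a fix referenced by :git/sha is typically picked up by pinning the dependency in deps.edn. The fragment below is only a sketch: the coordinate name io.github.nextjournal/clerk is an assumption (it is not stated in this thread), and only the sha comes from the message above.

```clojure
;; deps.edn (sketch). The library coordinate is an assumption;
;; the sha is the one quoted in the message above.
{:deps {io.github.nextjournal/clerk
        {:git/sha "ba194851dd21559351b5af29f168aeaa80c436ff"}}}
```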

mkvlr 2021-11-28T15:15:06.143800Z

btw, there’s no need to hide all those results; Clerk will only render a preview of the first 20 elements anyway

paulbutcher 2021-11-28T15:26:09.144Z

Thanks for the quick turnaround, I’ll give it a go.

paulbutcher 2021-11-28T15:26:34.144200Z

Regarding hiding the results, I was doing that to speed things up (things get very slow if I don’t)?

paulbutcher 2021-11-28T15:30:39.144400Z

So with results hidden:

Clerk evaluated 'notebooks/cache-issue.clj' in 3533.945033ms.
Without results hidden:
Clerk evaluated 'notebooks/cache-issue.clj' in 41643.440872ms.

👀 1
paulbutcher 2021-11-28T15:32:19.144700Z

Fix looks good 👍🍾

👍 1
mkvlr 2021-11-28T16:21:59.145400Z

I'll look into the performance

paulbutcher 2021-11-28T16:27:59.145900Z

Thank you 👍

mkvlr 2021-11-28T17:37:00.148700Z

@paulbutcher I can’t reproduce the times you’re seeing. I think what you’ve measured above might be the difference between cached and non-cached results. I’m seeing the following on my MacBook M1 after clearing the cache, with all results shown:

Clerk evaluated '/Users/mk/dev/clerk/notebooks/cache_issues.clj' in 2566.223875ms.
And then when it’s all cached:
Clerk evaluated '/Users/mk/dev/clerk/notebooks/cache_issues.clj' in 143.052666ms.

paulbutcher 2021-11-28T17:55:23.150200Z

@mkvlr Thanks for looking. I’m pretty sure that this isn’t what’s going on though. I’ve updated the GitHub repository with two versions of the same notebook, one called results-hidden.clj and one called results-shown.clj. This is what I see on my 2019 MacBook Pro:

paulbutcher@Pauls-16in-MBP clerk-cache-issue % clj
Clerk webserver started on 7777...
Starting new watcher for paths ["notebooks"]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See  for further details.
Clojure 1.10.3
user=> (clerk/clear-cache!)
:cache-dir/deleted ".cache"
nil
user=> (clerk/show! "notebooks/results-hidden.clj")
Clerk evaluated 'notebooks/results-hidden.clj' in 3692.638682ms.
nil
hello=> (clerk/clear-cache!)
:cache-dir/deleted ".cache"
nil
hello=> (clerk/show! "notebooks/results-shown.clj")
Clerk evaluated 'notebooks/results-shown.clj' in 42053.739691ms.
nil
hello=> 

mkvlr 2021-12-01T09:38:18.153600Z

@paulbutcher {:git/sha "95a34b87881f6b329db2300e78d961dfc73f2993"} should fix the issue

paulbutcher 2021-12-01T10:36:43.153800Z

Looks good. Thanks @mkvlr 👍

🙌 1
mkvlr 2021-11-28T18:02:55.150400Z

hmm, can you check whether running (alter-var-root #'clerk/worth-caching? (fn [_] (fn [_] true))) helps?

paulbutcher 2021-11-28T18:05:33.150600Z

Ah! Yes, that does make a significant difference:

paulbutcher@Pauls-16in-MBP clerk-cache-issue % clj
Clerk webserver started on 7777...
Starting new watcher for paths ["notebooks"]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See  for further details.
Clojure 1.10.3
user=> (alter-var-root #'clerk/worth-caching? (fn [_] (fn [_] true)))
#object[user$eval12748$fn__12749$fn__12750 0x1e5e0e75 "user$eval12748$fn__12749$fn__12750@1e5e0e75"]
user=> (clerk/clear-cache!)
:cache-dir/deleted ".cache"
nil
user=> (clerk/show! "notebooks/results-hidden.clj")
Clerk evaluated 'notebooks/results-hidden.clj' in 3844.171938ms.
nil
hello=> (clerk/clear-cache!)
:cache-dir/deleted ".cache"
nil
hello=> (clerk/show! "notebooks/results-shown.clj")
Clerk evaluated 'notebooks/results-shown.clj' in 3252.519161ms.
nil
hello=> 
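
A note on what the alter-var-root call in this session does: alter-var-root passes the var's current value to the given function and installs whatever that function returns as the new root value. So (fn [_] (fn [_] true)) discards the old predicate and replaces it with one that always returns true. The sketch below shows the same pattern on a hypothetical local predicate; the name, argument and threshold are made up for illustration and are not Clerk's actual implementation.

```clojure
;; Hypothetical stand-in predicate, NOT Clerk's real worth-caching?.
;; Here it pretends that only slow computations are worth caching.
(defn worth-caching? [elapsed-ms]
  (> elapsed-ms 100))

(worth-caching? 5) ; false with the original definition

;; The outer fn receives the old root value and ignores it;
;; the inner fn becomes the new predicate and always says yes.
(alter-var-root #'worth-caching? (fn [_old] (fn [_elapsed-ms] true)))

(worth-caching? 5) ; true after the override
```

The override lasts for the lifetime of the JVM session, or until the namespace defining the var is reloaded.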

mkvlr 2021-11-28T18:07:10.150800Z

great thanks. I think I know where to look then

paulbutcher 2021-11-28T18:07:23.151Z

👍

mkvlr 2021-11-28T18:36:32.151200Z

ok, so what’s happening is that when Clerk doesn’t consider a value worth caching (the meaning of this should be improved), we compute the valuehash of that object instead of writing it to disk via nippy. We use this valuehash to signal the frontend when a value has changed. This is currently much, much slower than the nippy path, but it shouldn’t be.

mkvlr 2021-11-28T18:40:41.151500Z

I think we can just use nippy for computing the hash as well to make it fast
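
The general idea here, hashing a value by digesting its serialized bytes rather than walking the object structurally, can be sketched as follows. This is a minimal illustration, not Clerk's code: serialized-hash and bytes->hex are hypothetical names, SHA-256 is an arbitrary choice, and pr-str is only a stand-in serializer so the example runs without extra dependencies. A binary serializer such as nippy's freeze could be passed as the second argument instead.

```clojure
(ns hash-sketch
  (:import (java.security MessageDigest)))

(defn bytes->hex
  "Render a byte array as a lowercase hex string."
  [^bytes bs]
  (apply str (map #(format "%02x" %) bs)))

(defn serialized-hash
  "Hash `value` by digesting its serialized bytes.
  `serialize` is any fn from a value to a byte array. The default
  (pr-str encoded as UTF-8) is only a stand-in for a real binary
  serializer."
  ([value]
   (serialized-hash value (fn [v] (.getBytes ^String (pr-str v) "UTF-8"))))
  ([value serialize]
   (let [digest (MessageDigest/getInstance "SHA-256")]
     (bytes->hex (.digest digest ^bytes (serialize value))))))

;; Equal values give equal hashes; a change in the data
;; (e.g. longs vs. doubles) gives a different hash.
(= (serialized-hash [1 2 3]) (serialized-hash [1 2 3]))
(= (serialized-hash [1 2 3]) (serialized-hash [1.0 2.0 3.0]))
```

A changed hash is what would tell a frontend that a result needs to be refreshed, which is the role the valuehash plays in the explanation above.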

paulbutcher 2021-11-28T19:17:57.152800Z

Glad I was able to help uncover it 👍

🙏 1
mkvlr 2021-11-28T19:33:33.153300Z

me too, thank you! 🙏