Hi! I've found some fascinating behavior with a datalevin collection: I fetched a list of 5 small-ish objects out of a kv store and ran clojure.walk/stringify-keys on it. That burned through 16GB of RAM and spent over 5 minutes processing. I caught this stack trace many times when running kill -3 on the process:
"nREPL-session-cd302eab-88bc-41f7-9ff3-9f66423bc460" #66 [974563] daemon prio=5 os_prio=0 cpu=200025.86ms elapsed=220.03s tid=0x000076310c0063f0 nid=974563 runnable [0x00007631179fc000]
java.lang.Thread.State: RUNNABLE
at clojure.core$deref.invokeStatic(core.clj:2337)
at clojure.core$deref.invoke(core.clj:2323)
at datalevin.spill.SpillableVector.cons(spill.clj:115)
at datalevin.spill.SpillableVector.cons(spill.clj)
at clojure.lang.RT.conj(RT.java:697)
at clojure.core$conj__5474.invokeStatic(core.clj:87)
at clojure.core$conj__5474.invoke(core.clj:84)
at clojure.core.protocols$fn__8275.invokeStatic(protocols.clj:167)
at clojure.core.protocols$fn__8275.invoke(protocols.clj:123)
at clojure.core.protocols$fn__8229$G__8224__8238.invoke(protocols.clj:19)
at clojure.core.protocols$seq_reduce.invokeStatic(protocols.clj:31)
at clojure.core.protocols$fn__8262.invokeStatic(protocols.clj:74)
at clojure.core.protocols$fn__8262.invoke(protocols.clj:74)
at clojure.core.protocols$fn__8203$G__8198__8216.invoke(protocols.clj:13)
at clojure.core$reduce.invokeStatic(core.clj:6965)
at clojure.core$into.invokeStatic(core.clj:7038)
at clojure.walk$walk.invokeStatic(walk.clj:50)
at clojure.walk$postwalk.invokeStatic(walk.clj:53)
at clojure.walk$postwalk.invoke(walk.clj:53)
at clojure.core$partial$fn__5927.invoke(core.clj:2641)
at clojure.walk$walk.invokeStatic(walk.clj:46)
at clojure.walk$postwalk.invokeStatic(walk.clj:53)
at clojure.walk$postwalk.invoke(walk.clj:53)
at clojure.core$partial$fn__5927.invoke(core.clj:2641)
at clojure.core$map$fn__5954.invoke(core.clj:2772)
at clojure.lang.LazySeq.force(LazySeq.java:50)
at clojure.lang.LazySeq.realize(LazySeq.java:89)
at clojure.lang.LazySeq.seq(LazySeq.java:106)
at clojure.lang.Cons.next(Cons.java:41)
at clojure.lang.RT.next(RT.java:733)
at clojure.core$next__5470.invokeStatic(core.clj:64)
at clojure.core.protocols$fn__8275.invokeStatic(protocols.clj:168)
at clojure.core.protocols$fn__8275.invoke(protocols.clj:123)
at clojure.core.protocols$fn__8229$G__8224__8238.invoke(protocols.clj:19)
at clojure.core.protocols$seq_reduce.invokeStatic(protocols.clj:31)
at clojure.core.protocols$fn__8262.invokeStatic(protocols.clj:74)
at clojure.core.protocols$fn__8262.invoke(protocols.clj:74)
at clojure.core.protocols$fn__8203$G__8198__8216.invoke(protocols.clj:13)
at clojure.core$reduce.invokeStatic(core.clj:6965)
at clojure.core$into.invokeStatic(core.clj:7038)
at clojure.walk$walk.invokeStatic(walk.clj:50)
at clojure.walk$postwalk.invokeStatic(walk.clj:53)
at clojure.walk$stringify_keys.invokeStatic(walk.clj:102)
at clojure.walk$stringify_keys.invoke(walk.clj:102)
If I do an encode/decode round trip using CBOR or something (presumably converting all vectors into standard Clojure vectors), then the stringify operation comes back in 215 ms and doesn't use up all my RAM. Has anyone seen anything like this before? If anybody wants to reproduce this, I'm happy to send a nippy-thawed file and some sample code.
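For reference, the effect of that round trip can probably be had without a codec by copying the fetched data into plain Clojure collections before walking it. The trace above shows the time going into SpillableVector.cons, which clojure.walk reaches because it rebuilds each collection with (into (empty form) ...). A minimal sketch follows; plain-data and stringify-keys-plain are hypothetical names, it assumes the datalevin collections report as sequential?, and it has not been run against datalevin itself:

```clojure
(require '[clojure.walk :as walk])

(defn plain-data
  "Hypothetical helper: recursively copies nested data into plain Clojure
  maps, sets and vectors. Anything sequential (lists and lazy seqs included)
  comes back as a plain vector; records come back as plain maps."
  [x]
  (cond
    (map? x)        (into {} (map (fn [[k v]] [(plain-data k) (plain-data v)])) x)
    (set? x)        (into #{} (map plain-data) x)
    ;; Assumption: the spillable vector type satisfies sequential?.
    (sequential? x) (into [] (map plain-data) x)
    :else           x))

(defn stringify-keys-plain
  "Copies x into plain collections first, then stringifies map keys."
  [x]
  (walk/stringify-keys (plain-data x)))
```

Calling (stringify-keys-plain objs) on the fetched list would then walk only standard vectors and maps. Whether this actually avoids the slowdown depends on reads from the spillable type being cheap, which I haven't verified.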
we do not support storing arbitrary objects
it's on the roadmap
so you don't want to put random objects in it
Ahh fair. So best if I encode them before storing them?
Correct. Even after https://github.com/juji-io/datalevin/issues/234 is implemented, you will still need to do some work.