2024-10-09 datalevin | Clojure Slack Archive

datalevin

2024-10-09T08:36:16.430399Z

Just casually add offset and limit, this is brilliant! 🤯

2024-10-09T10:17:02.117509Z

Just changed to 0.9.12 using order-by and limit, it’s so fast! This was the last area where I couldn’t get SQLite/PostgreSQL-level performance. Thank you!

🙂 1

2024-10-09T16:52:12.337919Z

Hello! I have to do a big transaction (let's say 100-200K datoms). Is there any significant difference between using vectors in EAV format and entity maps? And more importantly, if there is a lot of repeated data, like identical vectors [123 :entity/attribute "value-1"], will it slow down the transaction, and would it be better to clean the data beforehand?

Huahai 2024-10-09T18:50:52.685749Z

A few hundred K datoms are not big. Our default batch threshold is 1 million datoms.

Huahai 2024-10-09T18:54:23.062819Z

Cleaning up repeated datoms is going to be helpful: fewer datoms to transact. In fact, if you have repeated datoms, the old ones will be deleted and new ones added, so it's a lot of wasted work.

Huahai 2024-10-09T18:55:32.987929Z

So yes, it will slow down and it's better to clean up first.

👌 1

2024-10-09T20:45:19.761149Z

When we were doing batches on a large data sync, we would run distinct on the batch first, but it depends on how much duplicate data you have. The important part is that distinct maintains the order of items, which matters for new entities that have relations between each other.
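
A minimal sketch of that cleanup step, using only clojure.core and the plain EAV vector shape from the question. The entity ids, attribute names, and values are made up for illustration, and nothing here calls Datalevin itself:

```clojure
;; Hypothetical batch of EAV vectors; ids and attributes are illustrative only.
(def batch
  [[1 :person/name "Ann"]
   [2 :person/name "Bob"]
   [2 :person/friend 1]
   [1 :person/name "Ann"]    ; exact duplicate of the first vector
   [2 :person/friend 1]])    ; exact duplicate of the third vector

;; distinct is lazy and keeps the first occurrence of each item in its
;; original position, so entities still appear before the vectors that
;; refer to them.
(def cleaned (vec (distinct batch)))

(println (count batch) "->" (count cleaned))
```

This only drops vectors that are exactly equal. Two vectors with the same entity and attribute but different values are both kept, so any further cleanup of those would need its own rule.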

👍 1

2024-10-09T21:30:24.034159Z

Oh okay! Thank you for the answer. I thought maybe the cleaning happened somewhere deep inside the code that I never saw. But yes, it is better to leave such things to the user side.

Huahai 2024-10-09T06:17:55.384379Z

0.9.12 is released https://github.com/juji-io/datalevin/blob/master/CHANGELOG.md#0912-2024-10-08

🎉 8
