Just casually add offset and limit, this is brilliant! 🤯
Just changed to 0.9.12 using order-by and limit, it's so fast! This was the last area where I couldn't get sqlite/postgresql level performance. Thank you!
Hello! I have to do a big transaction (let's say 100-200K datoms). Is there any significant difference between using vectors in EAV format and using entity maps? And more importantly, if there is a lot of repeated data, like identical vectors [123 :entity/attribute "value-1"]..., will it slow down the transaction, and would it be better to clean the data beforehand?
A few hundred K datoms are not big. Our default batch threshold is 1 million datoms.
Cleaning up repeated datoms is going to be helpful: fewer datoms to transact. In fact, if you have repeated datoms, what happens is that the old ones will be deleted and new ones added. So it's a lot of wasted work.
So yes, it will slow down and it's better to clean up first.
When we were doing batches on a large data sync, we would run distinct on the batch first. But it depends on how much duplicate data you have. The important part with distinct is that it maintains the order of items, which matters for new entities that have relations between each other.
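For illustration, the "run distinct on the batch first" step could look roughly like the sketch below. The chat is presumably referring to Clojure's `distinct`; this is the same idea written in Python, not Datalevin code. The helper name `distinct_in_order`, the tuple layout, and the sample attributes are all hypothetical.

```python
from typing import Hashable, Iterable, List, Tuple

# A datom here is just an (entity, attribute, value) tuple; hypothetical layout.
Datom = Tuple[Hashable, ...]


def distinct_in_order(tx_data: Iterable[Datom]) -> List[Datom]:
    """Drop repeated datoms, keeping the first occurrence in its original position."""
    # dict keys keep insertion order (Python 3.7+), so the result is in
    # first-seen order, which preserves the relative order of new entities.
    return list(dict.fromkeys(tx_data))


# Hypothetical batch: negative ids stand in for new entities that refer to each other.
batch = [
    (-1, ":person/name", "Ann"),
    (-2, ":person/friend", -1),
    (-1, ":person/name", "Ann"),            # repeat
    (123, ":entity/attribute", "value-1"),
    (123, ":entity/attribute", "value-1"),  # repeat
]

cleaned = distinct_in_order(batch)
print(len(batch), "->", len(cleaned))
```

The cleaned batch keeps each datom once and in the order it first appeared, so an entity is still introduced before the datoms that refer to it.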
Oh okay! Thank you for the answer. I thought maybe the cleaning happens somewhere deep inside the code that I never saw. But yes, it is better to leave such things to the user side.
0.9.12 is released https://github.com/juji-io/datalevin/blob/master/CHANGELOG.md#0912-2024-10-08