Hi everyone, I'm hoping to use Datalevin as the backend for durable event streaming. Is this possible? My use case peaks at around 5k msg/s. I was thinking of using a list-dbi, but the constraint of 512 bytes per event is tough.
Why would you want to use list dbi?
Event streaming probably doesn't need durability, so you can use non-durable env flags, such as :nosync or :writemap + :mapasync; these can handle a lot of load
you can also add the :append flag when writing, so it's a bit faster
you can also use transact-kv-async if you care about durability; its performance is on par with the non-durable flags, maybe slightly slower.
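A minimal sketch of the non-durable option mentioned above. It assumes that `open-kv` accepts a `:flags` set in its options map and that `datalevin.constants/default-env-flags` holds the default env flags; both should be checked against the Datalevin version in use. The temp directory and the "stream" table name are made up for illustration.

```clojure
(require '[datalevin.core :as d]
         '[datalevin.constants :as c])

;; Hypothetical throwaway directory for the demo.
(def dir (str (System/getProperty "java.io.tmpdir")
              "/datalevin-nosync-demo-" (java.util.UUID/randomUUID)))

;; Assumption: :flags is an env option and c/default-env-flags is the default set.
;; :nosync trades durability for write throughput.
(def db (d/open-kv dir {:flags (conj c/default-env-flags :nosync)}))
(d/open-dbi db "stream")

;; Plain synchronous write; with :nosync the commit is not flushed to disk.
(d/transact-kv db [[:put "stream" 1 "event-1" :long :string]])

(def first-event (d/get-value db "stream" 1 :long :string))
(d/close-kv db)
```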
I considered list dbi cause I was looking for something that allows appending to the same key.
If I were to :put data with :appenddup, how can I get back all associated values for a key?
get-list
I have not used :appenddup, don't know if it works
why do you want to append to the same key?
list dbi is limited, as the value has the same size limit as the key, so it's not really suitable for a lot of cases
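For reference, a small sketch of the list dbi approach discussed above, using `open-list-dbi`, `put-list-items` and `get-list`. The directory, table name and sample values are hypothetical, and each value must stay within the key size limit mentioned in the conversation.

```clojure
(require '[datalevin.core :as d])

;; Hypothetical throwaway directory for the demo.
(def dir (str (System/getProperty "java.io.tmpdir")
              "/datalevin-list-demo-" (java.util.UUID/randomUUID)))

(def db (d/open-kv dir))
(d/open-list-dbi db "events")

;; Store several small values under the same key.
(d/put-list-items db "events" "market-id-1" ["e1" "e2" "e3"] :string :string)

;; Read back every value stored under that key.
(def events (vec (d/get-list db "events" "market-id-1" :string :string)))
(d/close-kv db)
```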
using tuples as keys is better
I see. I want to be able to quickly get all the events (about 200k) for a particular market, that's why
no difference
using tuples as keys can do the same, and there's no limit on the value size
or you can use a separate table for the bigger values, like we do in the Datalog store; it's a bit complicated, but it works
If I were to use tuples, what would be the right approach to getting all events for a market? Is it something like:
(d/get-range
db "stream"
[:closed ["producer-17-02-2025" "market-id"]])
are these all strings?
yup. each event has an integer id too, but the range is unknown
ok, it seems adding the int id to the key, with 0 for the start and a huge number for the end of the range, does it
since there's date in there, won't it be problematic when you try to do range on dates?
For this particular use case, I'm only ever interested in all events so far for a market. The date is just part of the producer name, which could be a uid in the future
I would put the market-id in the first position of the tuple then
(d/get-range db "stream" [:closed ["market-id-1" :db.value/sysMin] ["market-id-1" :db.value/sysMax]] [:string] :string)
assuming your values are strings
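A sketch of the tuple-key layout suggested above, combined with the integer event id mentioned earlier. It assumes a heterogeneous tuple key type `[:string :long]` of the form `[market-id event-id]`, and bounds the range with 0 and `Long/MAX_VALUE` instead of the sysMin/sysMax markers. The directory, table name and payloads are made up.

```clojure
(require '[datalevin.core :as d])

;; Hypothetical throwaway directory for the demo.
(def dir (str (System/getProperty "java.io.tmpdir")
              "/datalevin-tuple-demo-" (java.util.UUID/randomUUID)))

(def db (d/open-kv dir))
(d/open-dbi db "stream")

;; Key is [market-id event-id]; the value is the event payload,
;; which is not subject to the key size limit.
(d/transact-kv db
  [[:put "stream" ["market-id-1" 1] "payload a" [:string :long] :string]
   [:put "stream" ["market-id-1" 2] "payload b" [:string :long] :string]
   [:put "stream" ["market-id-2" 1] "payload c" [:string :long] :string]])

;; All events of one market: fix the first tuple element and
;; span the whole range of the second.
(def market-1-events
  (vec (d/get-range db "stream"
                    [:closed ["market-id-1" 0] ["market-id-1" Long/MAX_VALUE]]
                    [:string :long] :string)))
(d/close-kv db)
```

Each returned item is a key/value pair, so `(map second market-1-events)` gives just the payloads in event-id order.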
Lastly, regarding async transactions
-> does calling sync finish up all async transactions too?
-> does awaiting the latest transact-kv-async guarantee that the previous calls are also done? (I assume they're batched up in order)
so this is a homogeneous tuple
no
> :db.value/sysMin great. I didn't know we could have that in a transaction
> no To both?
first -> is no, sync does not interact with async transactions
to do what you want to do, deref the last future
so second -> is yes
yes, they keep the order. Each async call returns a future, so you can deref the last one to make sure all of them are committed.
or you can put in a callback for the last call
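A sketch of the "deref the last future" pattern described above. It assumes `transact-kv-async` accepts the same `[db txs]` arguments as `transact-kv` and returns a future; the directory, table name and event payloads are hypothetical.

```clojure
(require '[datalevin.core :as d])

;; Hypothetical throwaway directory for the demo.
(def dir (str (System/getProperty "java.io.tmpdir")
              "/datalevin-async-demo-" (java.util.UUID/randomUUID)))

(def db (d/open-kv dir))
(d/open-dbi db "stream")

;; Fire off many async writes; mapv forces them eagerly and keeps their order.
(def futures
  (mapv (fn [i]
          (d/transact-kv-async
            db [[:put "stream" i (str "event-" i) :long :string]]))
        (range 1 1001)))

;; Async transactions commit in order, so waiting on the last one
;; means the earlier ones are committed as well.
(deref (peek futures))

(def committed (d/entries db "stream"))
(d/close-kv db)
```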
Awesome. Thanks for the help as always
you are welcome