#xtdb
2022-02-08
Martynas Maciulevičius11:02:24

What if I want to read multiple entries from a single database snapshot? Can I reuse db if I'd only be reading?

jarohen11:02:02

Yep - reuse of a db instance, in serial, is fine - encouraged, even 🙂

Martynas Maciulevičius11:02:01

Even if I would have multiple threads reading from it?

jarohen11:02:11

Reuse of a db snapshot isn't really applicable if you're writing as well - because it's a snapshot, it won't reflect later writes

jarohen11:02:56

Not from multiple threads at the same time, no - the queries on any one db instance should be in serial

jarohen11:02:16

If you want to query from multiple threads, it's best to take a db snapshot per thread

Martynas Maciulevičius11:02:47

Somebody else may be writing but I want to make a decision as fast as I can. So I wondered whether I could take a snapshot once and then parallelize the calculation, i.e. the write won't reach the parallelized db instance.

jarohen11:02:41

You can request that the snapshots from each thread be at the same 'basis', so that all the threads are seeing a consistent, immutable view of the database

Martynas Maciulevičius11:02:13

Ah, that could work too.

jarohen11:02:25

I'd probably ask for xt/latest-completed-tx on the node, and then pass that to each thread to use in its db call

🙌 1
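
A minimal sketch of that suggestion, assuming XTDB 1.x with `xtdb.api` aliased as `xt` and at least one completed transaction on the node. `parallel-q` is a hypothetical helper name, not part of XTDB:

```clojure
(require '[xtdb.api :as xt])

;; `parallel-q` is a hypothetical helper, not part of XTDB.
;; Assumes at least one transaction has completed on the node.
(defn parallel-q
  "Runs each query on its own thread; every thread takes its own db
  snapshot at the same transaction basis."
  [node queries]
  (let [tx (xt/latest-completed-tx node)] ; capture the basis once
    (->> queries
         ;; mapv is eager, so all futures start before any deref blocks
         (mapv (fn [q] (future (xt/q (xt/db node {::xt/tx tx}) q))))
         (mapv deref))))
```

Because every thread calls `xt/db` with the same `::xt/tx`, writes that land after the basis was captured should not be visible to any of the queries.
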
tatut12:02:57

> the queries on any one `db` instance should be in serial

Clarification: does that mean you shouldn't use the same db to do other queries while iterating over the results of open-q? I have code like that and it seems to work, is there some caveat?

tatut12:02:22

in the same thread still

refset12:02:54

> I have code like that and it seems to work, is there some caveat

The only caveat I can think of is that completely unrelated queries will be clobbering each other's hot cache entries

tatut12:02:28

thanks, it seemed obvious to use the same db handle that I opened using open-db as that docstring states it reserves shared resources to make multiple requests faster... haven't measured if there's an impact wrt hot cache vs opening multiple dbs

jarohen12:02:55

@U11SJ6Q0K the db instance (particularly ones from open-db) does reuse some state/resources between queries, as you say - so, to be on the safe side, I've usually taken out separate db instances. That said, there's been a lot of refactoring in that area since the last time I seriously analysed the implications, so it may well be sufficiently safe now

jarohen12:02:15

just not something that we guarantee 🙂
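
The cautious approach mentioned here, keeping the open-q cursor and the nested lookups on separate db instances within one thread, might look roughly like this. It is a sketch assuming XTDB 1.x; the helper name and the `:order/customer` and `:customer/name` attributes are made up for illustration:

```clojure
(require '[xtdb.api :as xt])

;; Hypothetical helper; attribute names are illustrative only.
(defn customer-names-for-orders [node]
  (let [basis {::xt/tx (xt/latest-completed-tx node)}]
    (with-open [outer-db (xt/open-db node basis) ; feeds the open-q cursor
                inner-db (xt/open-db node basis) ; used for the nested queries
                orders   (xt/open-q outer-db
                                    '{:find [c]
                                      :where [[o :order/customer c]]})]
      ;; doall realises the results before the cursor and dbs are closed
      (doall
       (for [[c] (iterator-seq orders)]
         (xt/q inner-db
               '{:find [n] :in [c] :where [[c :customer/name n]]}
               c))))))
```

Both db instances share the same basis, so the nested queries see the same data as the cursor they are driven by.
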

tatut12:02:22

perhaps there should be scarier warnings in the docstrings, I've had some crashes as I didn't understand the threading implications simply from the fn docs

Martynas Maciulevičius16:02:05

What happens when I have Kafka configured as both the transaction log and the document store? We have this config from a long time ago and I want to know if we need to change it. Is it in-memory or does it request the data from Kafka every time? How does it add data to it? Does it submit when I write into the node? Or should we instead push data to Kafka ourselves? The doc https://docs.xtdb.com/storage/1.20.0/kafka/ says that it could be due to historical reasons. But what happens if I use it this way?