
How do you evict all the documents from Crux, including the already deleted ones? Especially the already deleted ones. Thanks


Hmm, finding all deleted documents across all of time is an interesting problem! What's the use-case? Testing / dev?


Dev, only dev


Is rm'ing both logs and all indexes not an option? I.e. killing it all


I was hoping for something at the REPL…


Ah, fair enough. I think (sh ...) is your best bet for the moment, unfortunately. There might be a ns somewhere in the repo with something you could copy-paste /cc @U050V1N74
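A minimal sketch of the (sh ...) idea, for dev only — the directory paths below are assumptions, so point them at wherever your node actually keeps its event log and KV indexes, and stop the node before running it:

```clojure
;; Sketch only: wipe hypothetical local data directories from the REPL.
;; "data/event-log" and "data/indexes" are made-up paths -- adjust to
;; your own node config. Destroys everything, including deleted docs.
(require '[clojure.java.shell :refer [sh]])

(defn nuke-crux-data! []
  (doseq [dir ["data/event-log" "data/indexes"]]
    (sh "rm" "-rf" dir)))
```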


So the way I came across it is this:
- I added a document, with today’s valid-time
- deleted the document
- added it again, but with a much older valid-time
Now because the id of the document is the same, the most recent document in history is deleted, so “it’s gone”.


When using put with a start valid time and an already crowded timeline, you almost certainly want to specify an end valid time as well (which could be MAX), to avoid that kind of "it's gone" behaviour
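A sketch of what that looks like — `node` is assumed to be an already-started Crux node, and the id and dates are illustrative:

```clojure
;; Sketch: put a document with an explicit end valid time, so that a
;; put with an older start valid time isn't shadowed by a later delete.
;; The document id and the dates here are made up for illustration.
(require '[crux.api :as crux])

(crux/submit-tx node
  [[:crux.tx/put
    {:crux.db/id :example/doc, :value 42}
    #inst "2010-01-01"       ; start valid time (the "much older" one)
    #inst "9999-12-31"]])    ; end valid time, effectively MAX
```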


Yea, that’s a workaround. Thanks

👍 4

only similar thing we use is a test fixture that creates a data directory, runs its tests, then removes the directory when it's done, if that's helpful?
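A rough sketch of that fixture pattern, assuming a clojure.test setup — the directory path and helper names are mine, not from the actual repo:

```clojure
;; Sketch of the fixture idea: create a fresh data directory, run the
;; tests against it, then delete it afterwards. Path is an assumption.
(require '[clojure.test :as t]
         '[clojure.java.io :as io])

(defn delete-recursively! [^java.io.File f]
  (when (.isDirectory f)
    (doseq [child (.listFiles f)]
      (delete-recursively! child)))
  (io/delete-file f true))

(defn with-fresh-data-dir [f]
  (let [dir (io/file "target/crux-test-data")]
    (.mkdirs dir)
    (try
      (f)   ; run the tests (start your node against dir in here)
      (finally
        (delete-recursively! dir)))))

;; (t/use-fixtures :once with-fresh-data-dir)
```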


It might be in the future. My setup is Kafka in Docker, so I’m not sure how useful this would be in that scenario — just yanking the directory where Kafka keeps its offsets info… The Kafka version of your snippet might be going into the Kafka topics and purging the messages in them…


I wonder if there is a suitable way to use Crux with AWS Lambdas. My understanding is that the lambda would need to process all messages from Kafka on spinup. Is that true?


Hi @U054UD60U - unless your data set is a few MB or less, I don't think there are any useful ways to use AWS Lambdas with Crux as it stands today.
> My understanding is that the lambda would need to process all messages from Kafka on spinup. Is that true?
Essentially yes, that is the case. Specifically, all messages from the very beginning of the tx-log need to be processed each time the lambda starts up. Sorry it's not a more exciting answer... but I would be glad to continue the conversation about Crux + "serverless" patterns


Interesting. Would there be an option to share snapshots of the kv store such that a node (not necessarily a lambda) could catch up quicker?


Yep, that's absolutely possible, and a sample with all the code to do that is already in the repo (for k8s + s3)


It won't help the lambda use-case, as it would spend too much time & bandwidth backing up and restoring. The one case where it doesn't help at all is upgrading Crux to a new version of the index - then you still need to replay from the beginning of the tx-log.


you could spin up a Crux node outside the lambdas, and have the lambdas talk to it over the HTTP api
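A hedged sketch of that setup, assuming the crux-http-client dependency is on the classpath and the node's address is made up:

```clojure
;; Sketch: a lambda handler querying a long-running Crux node over HTTP
;; instead of replaying the tx-log itself. The URL is an assumption.
(require '[crux.api :as crux])

(def client (crux/new-api-client "http://crux-node.internal:3000"))

(defn handler [_event]
  (crux/q (crux/db client)
          '{:find  [e]
            :where [[e :crux.db/id _]]}))
```

The client implements the same API as an embedded node, so the lambda stays stateless and cold starts don't involve the tx-log at all.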


On the other hand, my preferred German hoster has a cloud option with 2GB of RAM for €3/month. That should be sufficient for side projects.

😎 4