datahike

octahedrion 2024-02-15T10:34:09.773179Z

is the DynamoDB storage backend for Datahike reasonably up to date?

octahedrion 2024-02-19T08:46:20.571089Z

also, DynamoDB comes by default with AWS services, doesn't it?

whilo 2024-02-19T18:35:38.648989Z

what do you mean by AWS services?

octahedrion 2024-02-20T05:44:08.719249Z

I thought you can access a DynamoDB from anywhere by default, without having to install it

octahedrion 2024-02-20T05:44:29.862549Z

(I'm not that familiar with aws)

whilo 2024-02-20T05:44:44.032389Z

yes, you can

whilo 2024-02-20T05:44:52.518919Z

it is offered as a service

timo 2024-02-15T10:47:51.054449Z

which one?

octahedrion 2024-02-15T11:03:23.278729Z

there's more than one? any of them

timo 2024-02-15T11:07:24.201039Z

I just need to know which one you are referring to so I can take a look

octahedrion 2024-02-15T11:08:44.538989Z

ok it looks like this is the only one it has https://github.com/csm/konserve-ddb

timo 2024-02-15T11:10:56.723959Z

that one hasn't seen updates for 5 years, so no. Datahike has seen a lot of development in recent years. I assume anything older than a year won't work with the latest Datahike version

whilo 2024-02-15T20:09:24.476199Z

If there is interest, it would not be hard to create one.

whilo 2024-02-15T20:10:04.315119Z

This is what was needed to support an S3 backend: https://github.com/replikativ/konserve-s3/blob/main/src/konserve_s3/core.clj

whilo 2024-02-15T20:10:39.096699Z

The main limitation this one still has is that it does not use an async client where the API would prefer one; it just wraps sync calls in go blocks. That is totally fine to get started, though.
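
A rough illustration of the "wrap sync calls" pattern described above, sketched in Python rather than Clojure, so it is only an analogy to core.async go blocks and not konserve-s3's actual code. `blocking_put` is a hypothetical stand-in for a synchronous storage client call; a real async client would avoid occupying a worker thread per request.

```python
import asyncio
import time


def blocking_put(store: dict, key: str, blob: bytes) -> None:
    """Hypothetical synchronous client call; the sleep simulates network latency."""
    time.sleep(0.05)
    store[key] = blob


async def put_async(store: dict, key: str, blob: bytes) -> None:
    # Run the blocking call on a worker thread so the event loop stays free.
    # This wraps a sync call; it is not a truly asynchronous client.
    await asyncio.to_thread(blocking_put, store, key, blob)


async def write_batch(store: dict, items: dict) -> None:
    # Issue all writes concurrently and wait until every one has finished.
    await asyncio.gather(*(put_async(store, k, v) for k, v in items.items()))


if __name__ == "__main__":
    store = {}
    asyncio.run(write_batch(store, {"a": b"1", "b": b"2"}))
    print(sorted(store))
```

The wrapped version still gets some concurrency across a batch, which is presumably why it is good enough to start with.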

whilo 2024-02-15T20:11:01.220689Z

@octo221 Would you be interested in helping with that?

whilo 2024-02-15T20:13:45.338209Z

Maybe the S3 backend would also work, it depends mostly on your latency requirements I think.

whilo 2024-02-15T20:14:05.658609Z

Would your backend run in an AWS data center?

octahedrion 2024-02-16T07:33:07.895759Z

why S3 for DynamoDB? edit: oh, you mean as an alternative. would https://github.com/taoensso/faraday be of use?

whilo 2024-02-16T08:02:36.386779Z

that would do, although direct Java API calls should also be fine. whatever is simple to pick up and reliable

whilo 2024-02-16T08:02:45.914989Z

(and has no significant overhead)

whilo 2024-02-16T08:03:12.811799Z

a backend does not require a lot of code, so we can also iterate on it if needed

octahedrion 2024-02-16T08:35:13.058169Z

what does it need to store and retrieve? is it just [e a v t op]?

whilo 2024-02-16T08:35:48.832829Z

just blobs

whilo 2024-02-16T08:36:05.689919Z

a key column, which will hold UUIDs as strings, and a blob assigned to each key

whilo 2024-02-16T08:36:30.589429Z

so there will be a blob column, but normally DynamoDB adds columns automatically as far as I understand

whilo 2024-02-16T08:36:55.380359Z

these will hold tree fragments of the indices, which then contain the datoms
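
A minimal sketch of that storage shape, in Python for illustration only. `BlobStore`, `store_fragment`, and `load_fragment` are hypothetical names, the dict stands in for a table with a key column and a blob column, and `pickle` is an assumed placeholder for whatever serializer the real backend uses.

```python
import pickle
import uuid


class BlobStore:
    """Hypothetical key/blob store: string UUID keys, opaque byte values."""

    def __init__(self):
        self._rows = {}  # key column -> blob column

    def put(self, key: str, blob: bytes) -> None:
        self._rows[key] = blob

    def get(self, key: str):
        return self._rows.get(key)


def store_fragment(store: BlobStore, fragment) -> str:
    """Serialize a tree fragment, store it under a fresh UUID, return the key."""
    key = str(uuid.uuid4())
    store.put(key, pickle.dumps(fragment))
    return key


def load_fragment(store: BlobStore, key: str):
    """Return the deserialized fragment, or None if the key is unknown."""
    blob = store.get(key)
    return None if blob is None else pickle.loads(blob)


if __name__ == "__main__":
    store = BlobStore()
    # A leaf holds datoms; a parent node refers to its children by key.
    leaf_key = store_fragment(store, {"datoms": [(1, "name", "Alice", 100, True)]})
    root_key = store_fragment(store, {"children": [leaf_key]})
    root = load_fragment(store, root_key)
    print(load_fragment(store, root["children"][0]))
```

The store itself never interprets the blobs, so the backend only has to support put and get by key.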

octahedrion 2024-02-16T08:37:51.279619Z

ohhhh yessss I remember

whilo 2024-02-16T08:49:18.184209Z

this is what it looks like for SQL, if that helps: https://github.com/replikativ/konserve-jdbc

whilo 2024-02-16T09:07:27.117349Z

@octo221 can you give some context on how you would like to use DynamoDB?

octahedrion 2024-02-16T09:27:28.041599Z

I don't have a preference really I just wondered. If there's an existing AWS backend that does the job then that's fine

octahedrion 2024-02-16T14:11:39.029349Z

(unless there's some advantage to DynamoDB)

whilo 2024-02-16T18:32:44.107009Z

latency

whilo 2024-02-16T18:32:56.621799Z

you pay more, but it will be faster

whilo 2024-02-16T18:33:27.973629Z

the first thing to improve, though, is to add proper async support to konserve-s3; that will reduce latency on S3

whilo 2024-02-16T18:33:42.676259Z

at least when you transact bigger batches of datoms