datahike

2024-11-12T10:27:28.249859Z

Did you forget to add a dependency to datahike-dynamodb or konserve-dynamodb? To make it start I have to explicitly add [software.amazon.awssdk/aws-core "2.29.10"] to my project dependencies.

whilo 2024-11-12T18:50:37.135759Z

datahike-dynamodb depends on konserve-dynamodb, so that should be enough.

2024-11-12T13:21:26.727529Z

I also created a PR for a local testing endpoint: https://github.com/replikativ/konserve-dynamodb/pull/1

2024-11-12T16:33:32.911639Z

How was konserve-dynamodb tested? It fails for me at core.clj:64:

java.lang.IllegalArgumentException: No matching method describeTable found taking 1 args for class software.amazon.awssdk.services.dynamodb.DefaultDynamoDbClient
Maybe need to add type hints? 🤔

alekcz 2024-11-12T16:36:34.349149Z

That looks like a type hint related error. Also check that the API is up to date

2024-11-12T18:07:44.392549Z

I tried to add type hints. Can you deploy it to Clojars? https://github.com/replikativ/konserve-dynamodb/pull/1

2024-11-12T18:08:27.792239Z

Because it fails with the same error with any AWS SDK version.

whilo 2024-11-12T18:48:04.947329Z

Hey, thanks! Are you already native compiling this?

whilo 2024-11-12T18:48:33.839309Z

I have not removed reflection warnings yet, they are quite a few and I first wanted to know whether it satisfies your latency requirements.

whilo 2024-11-12T18:49:27.844229Z

It would be great if you can help with removing the warnings. You can uncomment the warn-on-reflection flag at the top to see all of them in the REPL.
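
For readers following along, here is a minimal sketch of what enabling reflection warnings and adding a type hint looks like. It uses `java.lang.String` instead of the AWS SDK so that it runs without extra dependencies, and the function names are made up for illustration. The same pattern would presumably apply to the `describeTable` call (for example by hinting the client as `DynamoDbClient`), but that is an assumption, not something taken from the konserve-dynamodb source.

```clojure
;; Print a warning whenever the compiler has to fall back to reflection.
(set! *warn-on-reflection* true)

;; Hypothetical example: the type of `s` is unknown, so `.toUpperCase`
;; is looked up reflectively at runtime (the compiler warns here).
(defn shout-reflective [s]
  (.toUpperCase s))

;; The ^String hint lets the compiler emit a direct method call,
;; so no reflection is needed at runtime.
(defn shout-hinted [^String s]
  (.toUpperCase s))

(shout-hinted "datahike")
;; => "DATAHIKE"
```

Both functions return the same value on the JVM; the difference only matters for speed and for environments such as native images, where reflective lookups need extra configuration.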

whilo 2024-11-12T18:50:06.676649Z

They will have to be removed for native compilation, or additional tracing steps are needed to native compile it (I have never done those yet).

2024-11-12T19:00:33.746589Z

Yes, I am compiling it, and I get these errors only once it is already in the AWS stack.

2024-11-12T19:01:17.521939Z

Okay let's try it

whilo 2024-11-12T19:02:47.796729Z

We can work through the warnings together. konserve-s3 also still needs those removed unfortunately, then both can be run in native-images in lambdas, which I think would be great.

2024-11-12T19:09:43.628099Z

I can't understand how it works in the tests now... without Docker and with nils in the credentials environment variables...

whilo 2024-11-12T19:49:09.562819Z

In Clojure I would always first make sure to get an nrepl in the target environments for development, even if you then compile it away for production.

whilo 2024-11-12T19:49:42.322989Z

So if you can run a lambda with nrepl somehow that would be what I would do. If this doesn't work I would use an EC2 instance to simulate it.

whilo 2024-11-12T19:49:59.963369Z

That way you get a feeling for how things work (also performance wise) during development.

2024-11-12T19:53:42.512229Z

Nrepl to Lambda is impossible I think. Even with their official docker image for local testing

👍 1
whilo 2024-11-12T19:54:59.663839Z

@viesti you are an expert in this stuff, what do you think?

whilo 2024-11-12T19:55:34.932539Z

This is one reason why I don't like lambdas, they are a too restricted programming environment. But I get their value proposition.

🤷‍♂️ 1
viesti 2024-11-12T20:14:54.565719Z

Yeah, it's a tradeoff. To be able to pause processes, Lambda runs them only while handling events. So for something like nREPL you would need a transport that calls the Lambda API with the nREPL operation and translates the response back, and you would have to limit concurrency to 1 so you don't hop between processes when sending nREPL commands (although a kind of scatter/gather would be interesting :))

2024-11-12T20:40:08.027839Z

No guys, I can't fix it because I do not understand how konserve works right now... I got rid of all the reflection warnings, but now the tests are failing(( The code is here, but I do not think it is helpful: https://github.com/aldebogdanov/konserve-dynamodb/blob/main/src/konserve_dynamodb/core.clj

whilo 2024-11-12T20:53:33.158929Z

Export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY and then bin/run-unittests. They pass for me.

2024-11-12T20:59:06.059579Z

What passes? The tests with my code? Impossible.

2024-11-12T21:00:13.241339Z

I mean clj -M:test -m kaocha.runner

whilo 2024-11-12T21:10:08.437999Z

This is how I run the tests on my machine in the current konserve-dynamodb repo. I will try to pick up your code and see...

whilo 2024-11-12T21:12:29.314919Z

I just need to fix something else first.

👌 1
whilo 2024-11-12T23:28:11.768049Z

Fixed your code. It was calling assoc on an immutable map, so I converted it to a Clojure map: https://github.com/replikativ/konserve-dynamodb/commit/b52bb04c14e44d73b1450ead78766e62bbfbb2a6
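
A small sketch of the kind of problem described here, assuming the "immutable map" was a read-only `java.util.Map` returned by a Java API (the linked commit is the authoritative fix; the names below are made up for illustration):

```clojure
(import '(java.util Collections HashMap))

;; Stand-in for a read-only java.util.Map handed back by a Java API.
(def java-map
  (Collections/unmodifiableMap
   (doto (HashMap.) (.put "id" "42"))))

;; (assoc java-map "name" "x") would throw a ClassCastException,
;; because a plain java.util.Map is not a Clojure Associative.

;; Copy the entries into a persistent Clojure map first.
(def clj-map (into {} java-map))

(assoc clj-map "name" "x")
;; => {"id" "42", "name" "x"}
```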

whilo 2024-11-12T23:28:36.316149Z

You can try the code from the comment section at the bottom of the file. I often keep them around to make testing easy.

whilo 2024-11-12T23:28:49.681569Z

(assuming you have the environment variables set)

whilo 2024-11-12T23:35:05.551299Z

Just bump to datahike-dynamodb 0.1.4 now, that should do the trick hopefully.

👍 1
2024-11-13T06:26:15.855269Z

> (assuming you have the environment variables set) Anyway, I think it is better to test mostly against their official local DynamoDB Docker image.

octahedrion 2024-11-28T13:49:49.522229Z

AWS deploy measures the uncompressed binary size. I'm not keen on using UPX because how can I trust that the compressed binary will compute the same as the original? It's also a potential security hole because it could inject code or introduce bugs. I'd much rather that the original generated binary be as small as possible.

2024-11-28T13:52:47.343499Z

UPX is not like ZIP. It reduces the size of the executable file itself. It's open source with 14.6k stars. Do you really think it injects a backdoor and nobody cares? ))

👍 1
octahedrion 2024-11-28T14:11:13.692599Z

I understand what it does (I'm replying to more than one message above) but I'd still prefer to make the binary small to begin with (but it's useful to have UPX as an option)

whilo 2024-11-28T17:39:30.173979Z

Ok. I think either way making the binary smaller will help a lot. The first step towards this is to remove all resolve calls, because they make all the code on the classpath reachable at runtime, which means it cannot be removed by the GraalVM compiler. I pointed you to the chat with borkdude where he describes how to find all the resolve calls @octo221. So clone Datahike (and potential dependencies if they also call resolve) and replace the calls with stubs.
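
As a rough first pass, and not the method borkdude describes in the chat mentioned above, a plain text search can already list candidate call sites. The sketch below scans source text line by line for calls to `resolve`, `ns-resolve` and `requiring-resolve`; the pattern and function names are made up for illustration, and it will also match calls inside comments or strings.

```clojure
(require '[clojure.string :as str])

;; Matches an opening paren followed by one of the resolve variants.
(def resolve-call-pattern
  #"\((?:clojure\.core/)?(?:ns-resolve|requiring-resolve|resolve)(?:\s|$)")

(defn find-resolve-calls
  "Returns the 1-based line numbers of `source` that contain a resolve call."
  [source]
  (->> (str/split-lines source)
       (keep-indexed (fn [i line]
                       (when (re-find resolve-call-pattern line)
                         (inc i))))
       vec))

(def example-source
  (str "(ns example.core)\n"
       "(defn lookup [sym] (resolve sym))\n"
       "(defn plus-one [x] (inc x))\n"
       "(defn lookup-in [n sym] (ns-resolve n sym))\n"))

(find-resolve-calls example-source)
;; => [2 4]
```

For real files the same function could be applied to `(slurp path)` for each source file in the clone.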

octahedrion 2024-11-25T14:07:03.218129Z

My native-image compiled stack has increased in size by around 60MB since adding Datahike, which tips it over the scales for deployment to AWS. Is there anything I can do to reduce it ?

whilo 2024-11-25T17:37:37.300279Z

Babashka has also increased in size and is now 80 MiB+. When you zip it, it is much smaller though; not sure whether this will help you.

whilo 2024-11-25T17:38:15.810949Z

I think making it smaller is definitely worthwhile, I just didn't have the time yet.

2024-11-25T22:16:55.273769Z

Compress the native image with UPX (https://upx.github.io) or something similar. My executable went from ~450MB down to 120-145MB.

whilo 2024-11-26T00:58:25.797889Z

Nice 🙂

octahedrion 2024-11-29T15:13:06.190839Z

hmmm, even if my code never causes Datahike to take the code path that calls resolve? Surely if my queries only use symbols from clojure.core it would be fine, wouldn't it? I'm reading the build reports for the native-image build with and without Datahike using optimization level -O1, and it's byte[]s for embedded resources and code metadata that dominate the image heap, due to the number of extra classes, which I think come simply from dependencies. There's not much that can be done about that.

whilo 2024-11-29T18:12:35.267909Z

It is a runtime feature and the query engine as well as the transactor can interpret arbitrary incoming data structures, which would call resolve. So it cannot be compiled away AOT.

2024-11-13T09:51:44.192309Z

So it is very nice, but I added a :consistent-read? option to konserve-dynamodb in a PR: https://github.com/replikativ/konserve-dynamodb/pull/2 Please take a look. And I don't understand something about datahike-dynamodb 0.1.4: do you not need to update the spec?

2024-11-13T09:51:56.249569Z

Ah ok, looks like there are only required fields.

whilo 2024-11-13T17:40:29.011259Z

You had two syntax errors in there, but I fixed them and am releasing it now.

whilo 2024-11-13T17:40:45.191219Z

I would love to hear how it is doing, what is working well, what isn't etc.

2024-11-13T17:48:48.319189Z

It is going with problems. I also decided to try SnapStart with Java 21, but somehow encountered "Table was written with newer version of konserve"))) that is, 0.7.317 instead of the 0.7.314 in datahike. I have no idea how it happened(( but if I exclude konserve from datahike and leave it at 0.7.317 from konserve-dynamodb, my final uberjar grows by more than 150%...

whilo 2024-11-13T17:53:45.196569Z

Ok, not sure what caused the version mismatch, there must have been a downgrade somewhere.

whilo 2024-11-13T17:53:51.018199Z

Why does the native image not work?

2024-11-13T18:01:44.996109Z

> Ok, not sure what caused the version mismatch, there must have been a downgrade somewhere. Datahike itself depends on 0.7.314.

whilo 2024-11-13T18:04:01.287319Z

I see. Bumped it as well.

2024-11-13T18:04:40.451429Z

> Why does the native image not work? I am not completely sure, but it looks like inconsistency of database reads, so while the new option isn't released I decided to try this. And last but not least, it looks like container images have a terrible cold start, despite AI saying it would be around 100-200 ms for my size on subsequent starts.

whilo 2024-11-13T18:05:14.953509Z

Do you need a container for the native image?

whilo 2024-11-13T18:05:34.794159Z

I guess you do, but the native image is statically linked with very few dependencies.

whilo 2024-11-13T18:05:45.082179Z

So it should almost run on a bare operating system.

2024-11-13T18:06:48.562389Z

Do you know any different approaches? After compression via UPX the executable was around 140 megabytes...

2024-11-13T18:06:56.994459Z

Before compression 400+

whilo 2024-11-13T18:07:58.640319Z

I have not worked on decreasing the size of the binary yet, but that should not affect startup time or linking dependencies.

whilo 2024-11-13T18:08:15.541119Z

@viesti any ideas of how to get the native image to start quickly?

whilo 2024-11-13T18:08:36.901339Z

I have not done anything with Clojure in lambdas yet to be honest.

2024-11-13T18:08:38.240419Z

I think the startup time there was taken up by loading the container image from ECR...

whilo 2024-11-13T18:08:51.491299Z

Aha, I see.

whilo 2024-11-13T18:09:01.048029Z

That sucks.

😞 1
whilo 2024-11-13T18:09:31.852719Z

The problem is that Datomic allows resolving functions at runtime, and that makes all of the Clojure namespaces reachable for the native image.

whilo 2024-11-13T18:10:18.218289Z

Which means the binary gets really big. @borkdude and I were discussing this some time ago. One way to fix this would be to treat the resolve calls differently during native compilation.

viesti 2024-11-13T18:12:01.526669Z

I wonder what base image was used; something like distroless could help. Loading from the Docker registry does take some time, but Lambda has its own caching too (https://arxiv.org/abs/2305.13162), the same cache used in SnapStart IIRC. It works best for frequently invoked lambdas; if you have, say, a day between invocations, you might get evicted from those caches.

2024-11-13T18:13:41.529179Z

@viesti Hmm... I used amazonlinux:2023 as the base image, so the image was around 163 MB, of which the executable took ~145.

whilo 2024-11-13T18:16:00.242649Z

Right, at that point the executable size is a problem. It is not that I don't care, it was just not high priority yet.

viesti 2024-11-13T18:16:45.013719Z

in the paper linked, they mention that they have their own tiered cache, that comes after the OCI image is downloaded

whilo 2024-11-13T18:16:46.562429Z

There is a tradeoff between late dynamic runtime binding and ahead of time optimization there.

whilo 2024-11-13T18:18:29.657459Z

One way to address this is to force people to explicitly register functions, e.g. in a (def resolvables (atom {'my.ns/function (fn [...] ...)})). And then redirect resolve to only look up stuff in there.
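
A minimal sketch of the registry idea from this message, not Datahike's actual implementation. The names `register!` and `resolve-registered` and the example function are hypothetical; only the `resolvables` atom comes from the message.

```clojure
;; Registry of functions that may be looked up by symbol at runtime.
(def resolvables (atom {}))

(defn register!
  "Adds `f` to the registry under the fully qualified symbol `sym`."
  [sym f]
  (swap! resolvables assoc sym f)
  sym)

(defn resolve-registered
  "Looks up `sym` in the registry only; never touches the namespace system."
  [sym]
  (or (get @resolvables sym)
      (throw (ex-info "Function is not registered" {:symbol sym}))))

(register! 'my.ns/double (fn [x] (* 2 x)))

((resolve-registered 'my.ns/double) 21)
;; => 42
```

Because the lookup only consults a map that is filled explicitly, an ahead-of-time compiler can see exactly which functions are reachable, at the cost of rejecting any symbol that was not registered up front.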

viesti 2024-11-13T18:19:08.690469Z

in section > 2 Block-Level Loading

whilo 2024-11-13T18:19:16.158829Z

This will break Datomic compatibility though. Maybe that is fine, I have no clue how they solved this in Datomic cloud.

viesti 2024-11-13T18:19:43.390969Z

the other day was wondering, if native-images could be split into some base and app libraries...

2024-11-13T18:21:36.398459Z

> I have no clue how they solved this in Datomic cloud. Are you sure it is solved in the Cloud setup? The Cloud version is much more cut down.

whilo 2024-11-13T18:22:43.134419Z

@sasha_bogdanov_dev if you want to check whether the base image shrinks a lot you can replace these resolve calls with stubs and compile again. You might have to identify some more places where this is happening, e.g. the resolve-method below also does runtime reflection to load class information (which I guess is problematic).

whilo 2024-11-13T18:22:58.115369Z

Yeah, I guess they have just cut it down.

whilo 2024-11-13T18:24:26.155669Z

I don't care too much about being compatible with it to be honest. I think it is good if you can decide to use Datomic at some point if Datahike is a limitation, but Datahike doesn't have to follow all the design decisions of Datomic. We are now at a point where there is sufficient value added (with all the backends and our more decoupled distributed index space) that we can follow our own path.

2024-11-13T18:25:27.066609Z

Maybe you can use something like a macro, or another way of building depending on a flag like DATOMIC_COMPATIBLE_RESOLVES?
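
One way such a build-time switch could look, as a sketch under stated assumptions: the environment variable name comes from the message above, while `datomic-compatible?`, `registry` and `resolve-fn` are hypothetical names, and treating any value other than "false" as "compatible" is an arbitrary choice for this example.

```clojure
;; Read when this namespace is loaded, i.e. also at build time under AOT.
(def datomic-compatible?
  (not= "false" (System/getenv "DATOMIC_COMPATIBLE_RESOLVES")))

;; Hypothetical registry used when dynamic resolution is switched off.
(def registry (atom {'clojure.core/inc inc}))

(defmacro resolve-fn
  "Expands to a dynamic `resolve` call or to a registry lookup,
  decided at macroexpansion time."
  [sym-expr]
  (if datomic-compatible?
    `(resolve ~sym-expr)
    `(get (deref registry) ~sym-expr)))

((resolve-fn 'clojure.core/inc) 1)
;; => 2
```

With the flag set to "false" at build time, the expanded call sites contain no `resolve` call at all, which is the property the native-image build would need.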

whilo 2024-11-13T18:25:40.710269Z

Yes, that is possible.

whilo 2024-11-13T18:25:56.360279Z

I think we also only want to get rid of them for native images.

2024-11-13T18:27:36.959439Z

That makes sense. Anyway, in a native image you always know at compile time which functions you will need.

whilo 2024-11-13T18:27:46.431449Z

You are compiling a local clone of Datahike for AOT, right?

whilo 2024-11-13T18:28:31.222099Z

If you can replace the functions calling resolve with stubs there it might do the trick.

whilo 2024-11-13T18:29:16.221509Z

Then we can add an abstraction into datahike to make it work in the native image in general.

whilo 2024-11-13T18:34:48.315079Z

I will be back later.

2024-11-13T18:56:15.362639Z

> You are compiling a local clone of Datahike for AOT, right? No, I am not using local clones (btw I need to migrate from lein to deps, but I was cautious about the native-image plugin), but I am compiling AOT, yes. It doesn't want to work without :aot :all (( Maybe it could, but most of the libraries have to be AOT compiled.

whilo 2024-11-13T23:32:21.310629Z

I have been able to get the binary size down to 16 MiB by removing all resolve calls (and a few possibly related reflection warnings), but now it does not run anymore:

christian@dyson:~/Development/datahike$ ./dthk --help
Exception in thread "main" java.lang.ExceptionInInitializerError
	at clojure.lang.Namespace.<init>(Namespace.java:34)
	at clojure.lang.Namespace.findOrCreate(Namespace.java:176)
	at clojure.lang.Var.internPrivate(Var.java:156)
	at datahike.cli.<clinit>(Unknown Source)
Caused by: java.io.FileNotFoundException: Could not locate clojure/core__init.class, clojure/core.clj or clojure/core.cljc on classpath.
	at clojure.lang.RT.load(RT.java:462)
	at clojure.lang.RT.load(RT.java:424)
	at clojure.lang.RT.<clinit>(RT.java:338)
	... 4 more

👍 1