Did you forget to add a dependency to datahike-dynamodb or konserve-dynamodb?
To make it start I have to explicitly add [software.amazon.awssdk/aws-core "2.29.10"] to my project dependencies.
datahike-dynamodb depends on konserve-dynamodb, so that should be enough.
And I created a PR about local testing endpoint: https://github.com/replikativ/konserve-dynamodb/pull/1
How was konserve-dynamodb tested? It fails for me on core.clj:64 :
java.lang.IllegalArgumentException: No matching method describeTable found taking 1 args for class software.amazon.awssdk.services.dynamodb.DefaultDynamoDbClient
Maybe it needs type hints? 🤔
That looks like a type hint related error. Also check that the API is up to date.
I tried to add type hints. Can you deploy it to clojars? https://github.com/replikativ/konserve-dynamodb/pull/1
Because with any AWS SDK version it fails with the same error
Hey, thanks! Are you already native compiling this?
I have not removed reflection warnings yet, they are quite a few and I first wanted to know whether it satisfies your latency requirements.
It would be great if you could help with removing the warnings. You can uncomment the warn-on-reflection flag at the top to see all of them in the REPL.
They will have to be removed for native compilation, or additional tracing steps are needed to native compile it (I have never done those yet).
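For illustration, a minimal sketch of what removing a reflection warning looks like. It uses hypothetical functions on java.lang.String rather than the AWS SDK calls from konserve-dynamodb, so it only shows the general mechanism, not the actual fix.

```clojure
;; Print a warning for every interop call the compiler cannot resolve statically.
(set! *warn-on-reflection* true)

;; Hypothetical example: the type of `s` is unknown here, so `.length` is
;; looked up via reflection at runtime and the compiler emits a warning.
(defn char-count-reflective [s]
  (.length s))

;; The ^String hint lets the compiler pick the method at compile time,
;; so there is no warning and no runtime reflection.
(defn char-count-hinted [^String s]
  (.length s))
```

Both functions return the same result; only the hinted one compiles to a direct method call, which is what native-image needs without extra reflection configuration.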
Yes, I am compiling it natively, and I only get these errors once it is already running in the AWS stack
Okay let's try it
We can work through the warnings together. konserve-s3 also still needs those removed unfortunately, then both can be run in native-images in lambdas, which I think would be great.
I can't understand how it works in tests now... without docker and with nils in credentials environment variables...
In Clojure I would always first make sure to get an nrepl in the target environments for development, even if you then compile it away for production.
So if you can run a lambda with nrepl somehow that would be what I would do. If this doesn't work I would use an EC2 instance to simulate it.
That way you get a feeling for how things work (also performance wise) during development.
Nrepl to Lambda is impossible I think. Even with their official docker image for local testing
@viesti you are an expert in this stuff, what do you think?
This is one reason why I don't like lambdas, they are a too restricted programming environment. But I get their value proposition.
Yeah, it's a tradeoff: to be able to pause processes, lambda runs processes only when handling events. So for something like nrepl you would need a transport that calls the lambda API with the nrepl operation and translates the response back, and you would need to limit concurrency to 1 so you don't hop between processes when sending nrepl commands (although a kind of scatter/gather would be interesting :))
No guys, I can't fix it because I do not understand how konserve works now... I got rid of all reflection warnings, but now the tests are failing(( The code is here, but I do not think it is helpful: https://github.com/aldebogdanov/konserve-dynamodb/blob/main/src/konserve_dynamodb/core.clj
Export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY and then bin/run-unittests. They pass for me.
What passes? Tests with my code? Impossible
I mean clj -M:test -m kaocha.runner
This is how I run the tests on my machine in the current konserve-dynamodb repo. I will try to pick up your code and see...
I just need to fix something else first.
Fixed your code, it was calling assoc on an immutable map, I converted it to Clojure https://github.com/replikativ/konserve-dynamodb/commit/b52bb04c14e44d73b1450ead78766e62bbfbb2a6
You can try the code from the comment section at the bottom of the file. I often keep them around to make testing easy.
(assuming you have the environment variables set)
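A small sketch for checking up front that the credentials mentioned above are actually set before running the tests or the comment-section code. The helper names are hypothetical and not part of konserve-dynamodb.

```clojure
(require '[clojure.string :as str])

;; The two variables named in this thread.
(def required-env-vars
  ["AWS_ACCESS_KEY_ID" "AWS_SECRET_ACCESS_KEY"])

(defn missing-env-vars
  "Returns the names from `names` that are unset or blank in `env`.
   `env` defaults to the process environment."
  ([names] (missing-env-vars names (System/getenv)))
  ([names env]
   (filterv #(str/blank? (get env %)) names)))

;; Usage: an empty vector means both variables are present.
(missing-env-vars required-env-vars)
```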
Just bump to datahike-dynamodb 0.1.4 now, that should do the trick hopefully.
Anyway, I think it is better to test mostly against their official local DynamoDB Docker image
AWS deploy measures the uncompressed binary size. I'm not keen on using UPX because how can I trust that the compressed binary will compute the same as the original? It's also a potential security hole because it could inject code or introduce bugs. I'd much rather that the original generated binary be as small as possible.
UPX is not like ZIP. It reduces the uncompressed size. It's open-source with 14.6k stars. Do you really think it injects a backdoor and nobody cares? ))
I understand what it does (I'm replying to more than one message above) but I'd still prefer to make the binary small to begin with (but it's useful to have UPX as an option)
Ok. I think either way making the binary smaller will help a lot. The first step towards this is to remove all resolve calls, because they make all the code on the classpath reachable at runtime, which means it cannot be removed by the GraalVM compiler. I pointed you to the chat with borkdude where he describes how to find all the resolve calls @octo221. So clone Datahike (and potential dependencies if they also call resolve) and replace the calls with stubs.
My native-image compiled stack has increased in size by around 60MB since adding Datahike, which tips it over the limit for deployment to AWS. Is there anything I can do to reduce it?
Babashka has also increased in size and is now 80 MiB+. When you zip it is much smaller though, not sure whether this will help you.
I think making it smaller is definitely worthwhile, I just didn't have the time yet.
Compress native image with UPX (https://upx.github.io) or something similar. My executable reduced from ~450MB to 120-145MB
Nice 🙂
hmmm, even if my code never causes Datahike to use the path to resolve? Surely if my queries only use symbols from clojure.core it would be fine, wouldn't it? I'm reading the build reports for the NI build with and without datahike using optimization level -O1, and it's byte[]s for embedded resources & code metadata which dominate the image heap, due to the number of extra classes, which I think simply come from dependencies. There's not much that can be done about that.
It is a runtime feature and the query engine as well as the transactor can interpret arbitrary incoming data structures, which would call resolve. So it cannot be compiled away AOT.
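A minimal sketch of why this cannot be decided ahead of time: the symbol arrives as data and is only looked up when the call happens. The function name is hypothetical and this is not Datahike's code.

```clojure
(defn call-by-symbol
  "Looks up `sym` with `resolve` at runtime and applies the result to `args`."
  [sym & args]
  (if-let [v (resolve sym)]
    (apply v args)
    (throw (ex-info "Cannot resolve symbol" {:symbol sym}))))

;; The symbol is built from strings here to stress that it is plain data,
;; e.g. part of an incoming query. The compiler cannot know which var is meant.
(call-by-symbol (symbol "clojure.core" "inc") 41)
```

Because any loaded var could be requested this way, a closed-world compiler has to keep all of them reachable.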
So it is very nice, but I added a :consistent-read? option to konserve-dynamodb in a PR: https://github.com/replikativ/konserve-dynamodb/pull/2
Please take a look.
And I don't understand one thing about datahike-dynamodb 0.1.4: do you not need to update the spec?
Ah ok, looks like there are only required fields
You had two syntax errors in there, but I fixed them and am releasing it now.
I would love to hear how it is doing, what is working well, what isn't etc.
It is going with problems. So I also decided to try snap-start with Java 21, but somehow encountered "Table was written with newer version of konserve"))) That is 0.7.317 instead of the 0.7.314 in datahike. I have no idea how it happened(( But if I exclude konserve from datahike and leave it at 0.7.317 from konserve-dynamodb, my final uberjar grows by more than 150%...
Ok, not sure what caused the version mismatch, there must have been a downgrade somewhere.
Why does the native image not work?
> Ok, not sure what caused the version mismatch, there must have been a downgrade somewhere.
datahike itself depends on 0.7.314
I see. Bumped it as well.
> Why does the native image not work?
I am not completely sure, but it looks like inconsistency of database reads, so while the new option isn't released I decided to try snap-start. And last but not least... it looks like container images have a terrible cold start, despite AI saying it would be around 100-200ms for my size on subsequent starts
Do you need a container for the native image?
I guess you do, but the native image is statically linked with very few dependencies.
So it should almost run on a bare operating system.
Do you know any different approaches? After compression via UPX the executable was around 140 megabytes...
Before compression 400+
I have not worked on decreasing the size of the binary yet, but that should not affect startup time or linking dependencies.
@viesti any ideas of how to get the native image to start quickly?
I have not done anything with Clojure in lambdas yet to be honest.
I think startup time there was taken by loading container image from ECR...
Aha, I see.
That sucks.
The problem is that Datomic allows to resolve functions at runtime and that makes all of the Clojure namespaces reachable for the native image.
Which means the binary gets really big. @borkdude and I were discussing this some time ago. One way to fix this would be to treat the resolve calls differently during native compilation.
I wonder what base image was used; having something like distroless could help. But yes, loading from the docker registry takes some time. Lambda has its own caching too (https://arxiv.org/abs/2305.13162), the same cache used in snapstart IIRC. It works best for frequently invoked lambdas; if you have, say, a day between invocations, you might get evicted from those caches
It is used here https://github.com/replikativ/datahike/blob/12c68ee3c7796c90fddcae5117b7d9efe78b1481/src/datahike/db/transaction.cljc#L449
@viesti Hmm... I used amazonlinux:2023 as base image, so image was around 163 MB when executable took ~145 from it..
Right, at that point the executable size is a problem. It is not that I don't care, it was just not high priority yet.
Another culprit is https://github.com/replikativ/datahike/blob/553db1e6ade4dbf1fac2b62cfa6f03a9d0cb4119/src/datahike/query.cljc#L712
in the paper linked, they mention that they have their own tiered cache, that comes after the OCI image is downloaded
There is a tradeoff between late dynamic runtime binding and ahead of time optimization there.
One way to address this is to force people to explicitly register functions, e.g. in a (def resolvables (atom {'my.ns/function (fn [...] ...)})). And then redirect resolve to only lookup stuff in there.
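A minimal sketch of the explicit registry idea, assuming a plain atom keyed by fully qualified symbols. The helper names `register-resolvable!` and `lookup-resolvable` are hypothetical and not part of Datahike.

```clojure
(def resolvables
  "Hypothetical registry: fully qualified symbol -> function."
  (atom {}))

(defn register-resolvable!
  "Makes `f` available under `sym` for lookups from queries or transactions."
  [sym f]
  (swap! resolvables assoc sym f)
  sym)

(defn lookup-resolvable
  "Replacement for `resolve` that only consults the registry."
  [sym]
  (or (get @resolvables sym)
      (throw (ex-info "Function is not registered"
                      {:symbol sym :registered (keys @resolvables)}))))

;; Usage: register once at startup, then look up by symbol.
(register-resolvable! 'my.ns/double (fn [x] (* 2 x)))
((lookup-resolvable 'my.ns/double) 21)
```

Since only registered functions can be reached this way, the set of callable code is known when the image is built.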
in section > 2 Block-Level Loading
This will break Datomic compatibility though. Maybe that is fine, I have no clue how they solved this in Datomic cloud.
the other day was wondering, if native-images could be split into some base and app libraries...
> I have no clue how they solved this in Datomic cloud.
Are you sure it is solved in the Cloud setup? The Cloud version is much more cut down
@sasha_bogdanov_dev if you want to check whether the base image shrinks a lot you can replace these resolve calls with stubs and compile again. You might have to identify some more places where this is happening, e.g. the resolve-method below also does runtime reflection to load class information (which I guess is problematic).
Yeah, I guess they have just cut it down.
I don't care too much about being compatible with it to be honest. I think it is good if you can decide to use Datomic at some point if Datahike is a limitation, but Datahike doesn't have to follow all the design decisions of Datomic. We are now at a point where there is sufficient value added (with all the backends and our more decoupled distributed index space) that we can follow our own path.
Maybe you can use something like a macro, or another way of building depending on a flag like DATOMIC_COMPATIBLE_RESOLVES?
Yes, that is possible.
I think we also only want to get rid of them for native images.
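A rough sketch of how such a build-time switch could look, assuming the flag is read from an environment variable when the macro expands (which is at compile time under AOT). The names `registered-fns` and `resolve-fn` are hypothetical. Whether this alone lets native-image drop the unreachable namespaces is not something the sketch demonstrates.

```clojure
(def registered-fns
  "Hypothetical explicit registry used when dynamic resolve is compiled out."
  (atom {'clojure.core/inc inc}))

(defmacro resolve-fn
  "Expands to a real `resolve` call only when DATOMIC_COMPATIBLE_RESOLVES
   is \"true\" at macroexpansion time; otherwise to a registry lookup."
  [sym-expr]
  (if (= "true" (System/getenv "DATOMIC_COMPATIBLE_RESOLVES"))
    `(resolve ~sym-expr)
    `(get @registered-fns ~sym-expr)))

;; Usage: the call site looks the same in both build modes.
((resolve-fn 'clojure.core/inc) 1)
```

In the registry mode the emitted code contains no call to `resolve` at all, so the decision is made once per build rather than per call.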
That makes sense. Anyway, in a native image you always know at compile time which functions you will need
You are compiling a local clone of Datahike for AOT, right?
If you can replace the functions calling resolve with stubs there it might do the trick.
Then we can add an abstraction into datahike to make it work in the native image in general.
I will be back later.
> You are compiling a local clone of Datahike for AOT, right?
No.. I am not using local clones (btw I need to migrate from lein to deps, but I was careful about the native-image plugin), but I am compiling AOT, yes. It doesn't want to work without :aot :all (( Maybe it could, but most of the libraries have to be AOT compiled
I have been able to get the binary size down to 16 MiB by removing all resolve (and a few maybe related reflection warnings), but now it does not run anymore
christian@dyson:~/Development/datahike$ ./dthk --help
Exception in thread "main" java.lang.ExceptionInInitializerError
at clojure.lang.Namespace.<init>(Namespace.java:34)
at clojure.lang.Namespace.findOrCreate(Namespace.java:176)
at clojure.lang.Var.internPrivate(Var.java:156)
at datahike.cli.<clinit>(Unknown Source)
Caused by: java.io.FileNotFoundException: Could not locate clojure/core__init.class, clojure/core.clj or clojure/core.cljc on classpath.
at clojure.lang.RT.load(RT.java:462)
at clojure.lang.RT.load(RT.java:424)
at clojure.lang.RT.<clinit>(RT.java:338)
... 4 more