This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-10-01
Hey y'all - a while back I posted in here asking about using the cognitect AWS api from native-image. I wanted to share that I figured out how to do it and put together https://github.com/latacora/native-cogaws, which shows how to get S3 read and write calls working
The key insights: 1. you need to provide configuration files for reflection, resource loading, etc. 2. the cognitect aws client should not be instantiated in a global def (put it in a function or wrap it in a delay) 3. you need to require two internal namespaces (https://github.com/latacora/native-cogaws/blob/master/src/latacora/native_cogaws.clj#L4) in the file that uses the client
#3 was something that was suggested in this channel, so thanks for the tip (I think it was from @borkdude or @U0FT7SRLP?)
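A minimal sketch of insight #2, in plain Clojure with no AWS dependency. Here make-client is a hypothetical stand-in for whatever constructs the real client, and the counter exists only to show when construction happens. The assumption is that namespaces get initialized while the image is being built, so a top-level def would run the constructor at build time, whereas a delay defers it to the first deref at run time.

```clojure
;; Counts how many times the (hypothetical) constructor has actually run.
(def constructed (atom 0))

(defn make-client
  "Hypothetical stand-in for an expensive client constructor."
  []
  (swap! constructed inc)
  {:api :s3})

;; Wrapped in a delay: nothing is constructed when this file is loaded.
(def client (delay (make-client)))

(defn list-buckets-request
  "Builds a request map; the client is created on first use and cached."
  []
  {:client @client :op :ListBuckets})
```

Calling list-buckets-request several times runs make-client only once, on the first call.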
One thing I'm curious about: why is it that requiring those namespaces helps resolve this runtime error:
Exception in thread "main" Syntax error compiling fn* at (cognitect/aws/http/cognitect.clj:1:1).
... (stack trace)
Caused by: com.oracle.svm.core.jdk.UnsupportedFeatureError: Unsupported method java.lang.ClassLoader.defineClass1(ClassLoader, String, byte[], int, int, ProtectionDomain, String) is reachable
compilation memory usage goes up to 7.2GB; the binary size is around 82MB (for native-cogaws)
ok thanks. This is similar to what I got with hato: memory spikes. I think this can be heavily optimized for native.
How would you go about optimizing? All I could think of was removing entries from config files, but that seems pretty time consuming/prone to bugs
I don't know the cognitect library very well, but I think the approach taken by the pod-tzzh-aws library is more promising: a compile-time approach could work better than a runtime one, where all code paths are generated beforehand and no reflection / dynamic requires are done at runtime.
> where all code paths are generated beforehand and no reflection / dynamic requires are done at runtime
Is this different from what happens when you feed native-image the reflection config file?
Not sure, but for what it does (a couple of calls to s3) I think 7-8GB and 80MB is way too much
I think @U0FT7SRLP also looked into this once, maybe he can tell you more
yea, seems pretty excessive. We had another cli app (just s3 calls) that went up from 60MB to 100MB after I added the reflection config. I haven't tried on more involved apps, so can't say if more services will mean more CPU/binary size
@sergey923 Here's some data for pprint: https://clojure.atlassian.net/browse/CLJ-2582
If I was a heavy user of cognitect aws or AWS in general, which I am not at the moment, I would probably rewrite that thing for GraalVM specifically
@sergey923 You might have seen my fork of cognitect's library: https://github.com/AdGoji/aws-api It solves the case where you need to do something specific and you want to take that extra effort to have a standalone binary (no python or other dep). It's not suited for quick ad hoc AWS calls, but it takes care of the signing of requests (the hardest part). I'm guessing it would be possible to run the whole library with graalvm if you get rid of some of the lazy loading (mostly in the http client and credential lookup, I think). There is also some XML parsing that might bloat the binary more.
the cognitect aws-api dynamically loads namespaces based on the target service's protocol
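A likely reading of why the explicit requires help, sketched with stand-in names rather than the actual aws-api code: clojure.set plays the role of a protocol namespace and load-protocol-ns is a hypothetical loader. Calling require on a namespace that has not been loaded yet compiles its source at runtime, which defines new classes, and that is the operation native-image reports as unsupported. If the namespace was already loaded by a static require, the later dynamic require finds it in the loaded libs and does nothing.

```clojure
;; Static require: the namespace is loaded when this file is loaded,
;; which under native-image is assumed to be at image build time.
(require 'clojure.set)

(defn load-protocol-ns
  "Hypothetical loader: resolves a namespace from a runtime string.
  The \"clojure.\" prefix is illustrative only."
  [proto]
  (let [ns-sym (symbol (str "clojure." proto))]
    ;; No-op when ns-sym is already among the loaded libs; otherwise it
    ;; would load and compile the source at runtime.
    (require ns-sym)
    (the-ns ns-sym)))
```

With the static require in place, (load-protocol-ns "set") only looks up an existing namespace and never triggers compilation.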
Side note: this is a way to call AWS at native (startup) speeds using babashka: https://github.com/tzzh/pod-tzzh-aws It's a pod leveraging the Go API but the Go code is generated using babashka itself (could also be done using Clojure, detail)
I guess it's a similar approach to the cognitect AWS API, except the code generation is all done beforehand, no reflection stuff
(he also published this today: https://github.com/tzzh/pod-tzzh-mail - similar approach leveraging a Go mail library)
Ah, thanks. There may be other reasons why it's heavy on GraalVM native-image. One issue at a glance ($ ls src/cognitect/aws): I spot that it has a dynaload namespace. If that's similar to what spec does, it already explains something. I made a variant of dynaload with specific GraalVM settings: https://github.com/borkdude/dynaload
yes, but even then, having code with find-var or require calls at non-top-level (inside function bodies) can still make the image more bloated than necessary. It will work, just with a bigger image.
See e.g. https://clojure.atlassian.net/browse/CLJ-2582
If it doesn't do reflection, it is weird why such a huge reflection config is necessary in the AOT-ed lib to make it work
I don't know, I haven't analyzed it. Speaking with my maintainer hat on, aws-api does not explicitly do anything "reflective". It just delays the loading of a few namespaces.
Maybe it would already help if this part was avoided:
src/cognitect/aws/dynaload.clj
14: (or (resolve s)
is it?
(defn resolve-http-client
  [http-client-or-sym]
  (let [c (or (when (symbol? http-client-or-sym)
                (let [ctor @(dynaload/load-var http-client-or-sym)]
                  (ctor)))
              http-client-or-sym
              (let [ctor @(dynaload/load-var (configured-client))]
                (ctor)))]
    (when-not (client? c)
      (throw (ex-info "not an http client" {:provided http-client-or-sym
                                            :resolved c})))
    c))
anyway, these are probably the kinds of spots that need attention when dealing with GraalVM native-image. With pprint there were only one or two lines that needed changing and boom, 20MB less binary size
(my alter-var-root patch for pprint: https://github.com/borkdude/babashka/blob/master/src/babashka/impl/pprint.clj#L8-L49)
I am more interested in figuring out how to structure truly dynamic code to be graal-sympathetic
like how could you compile aws-api with the jetty client vs with the something-else-http client
not dynamic in the produced image, just a higher level front-end API to image generation
e.g. here is my program, it includes some multimethods, I want you to preload these namespaces which extend the multimethods, this is my entrypoint
like what is the data that is in @sergey923’s repo?
https://github.com/latacora/native-cogaws/blob/master/src/latacora/native_cogaws.clj#L4-L6
{:preload-nses [cognitect.aws.protocols.rest-xml
                cognitect.aws.protocols.query
                cognitect.aws.protocols.rest-json]
 :entrypoint latacora.foo/main}
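One way such a data map could be consumed, as a rough sketch only: a hypothetical entry-ns-form function turns the map into an ns form that statically requires every preload namespace plus the entrypoint's namespace. The generated.main name and the function itself are made up for illustration; no existing tool is implied.

```clojure
(defn entry-ns-form
  "Hypothetical helper: builds an ns form that statically requires the
  preload namespaces and the namespace of the entrypoint var."
  [{:keys [preload-nses entrypoint]}]
  (let [entry-ns (symbol (namespace entrypoint))]
    (list 'ns 'generated.main
          ;; One [ns] vector per namespace, entrypoint namespace last.
          (cons :require (map vector (conj (vec preload-nses) entry-ns)))
          '(:gen-class))))
```

The resulting form could be written to a file and compiled ahead of time, so that all the listed namespaces are already loaded when the image is built.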
@ghadi I haven't looked into it myself but Quarkus is a JVM framework which has loads of modules/extensions that work with GraalVM. It might have some clues as to how to approach what you're interested in
@ghadi fwiw, my dynaload variant has a setting for GraalVM: https://github.com/borkdude/dynaload It behaves fully dynamically on the JVM, but less dynamically in CLJS and statically in GraalVM native-image (where you're supposed to require the namespaces in a certain order; sure, that could be configured using some .edn file)