This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-05-10
I want to read in an EDN config with #inst tags but the java.util.Date instances seem pretty hard to work with. Should I just add a day to the java.util.Date instance? 😝 I want to support the #inst reader tag on date strings. Using java.time.LocalDate/parse as the #inst reader works on dates but then if someone adds a time it becomes broken. I guess I could try parsing it as a date, then as an instant. Maybe I should just add a #date reader tag or just tell my users not to put any #inst tags and read the dates in as strings...
(let [fmt (java.text.SimpleDateFormat. "yyyy-MM-dd")]
  (.format fmt #inst "2022-05-03"))
;; => "2022-05-02"
;; What?? :o
Timezones are hard.
you can just make your own tag and have it be whatever you want...
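For example, a sketch of a custom #date tag when the config is read with clojure.edn (the tag name 'date and the LocalDate target are just illustrative choices, not anything built in):

```clojure
(require '[clojure.edn :as edn])

;; The reader fn receives the form that follows the tag (here, a string)
;; and can return whatever type you like.
(edn/read-string
  {:readers {'date (fn [s] (java.time.LocalDate/parse s))}}
  "#date \"2022-05-03\"")
;; => a java.time.LocalDate for 2022-05-03
```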
Another thing you can do is convert a date to java.time objects, which are quite nice, then convert back
You mean something like this, @U050ECB92?
(let [fmt (DateTimeFormatter/ofPattern "yyyy-MM-dd")]
  (.format
    (java.time.OffsetDateTime/ofInstant (.toInstant #inst "2022-05-05")
                                        (java.time.ZoneId/of "UTC"))
    fmt))
(since java.util.Date and java.time.Instant have no associated timezone)
@U050ECB92 Oh I missed that method! Ok thanks y'all, this is very helpful. I didn't realize that timezones could be the culprit of the weirdness with java.util.Date, but that makes sense if it's switching from UTC to the local timezone implicitly. I might try to convert the java.util.Date into a java.time.Instant using the user's local timezone since the date they're specifying is specific to their experience. But first I'll just read it in as a string and error on java.util.Dates. I can add support for the conversion later.
Since I'm already using Malli for input validation I just came up with this schema to "decode" the string to a java.time.LocalDate and then validate it:
(def local-date
  [:fn {:error/message "should be a date string in format YYYY-MM-DD"
        :decode/local-date #(try (java.time.LocalDate/parse %)
                                 (catch Exception _ %))}
   #(= java.time.LocalDate (class %))])

(comment
  (m/validate local-date
              (m/decode local-date "2022-05-04"
                        (mt/transformer {:name :local-date}))))
;; => true
Wasn't there some config somewhere to change the reader for #inst ? I feel I remember that from somewhere
@U0K064KQV Yeah this might be the ticket you're thinking of https://clojure.atlassian.net/browse/CLJ-2224
Ya, that would make it default. But you can also just override it yourself, the reference is where I saw that haha: https://clojure.org/reference/reader
> Since data-readers is a dynamic var that can be bound, you can replace the default reader with a different one. For example, clojure.instant/read-instant-calendar will parse the literal into a java.util.Calendar, while clojure.instant/read-instant-timestamp will parse it into a java.sql.Timestamp
Though you'd also need to overload the printer for these, since I don't think the other types would print as an #inst by default
in general, you should think very carefully about overriding the built-in readers (as other libs may have expectations about them)
@U0K064KQV Yeah that was one of the options I considered but I'd rather not change the behavior of the EDN parsing. My use case is kind of weird in that the user is putting arbitrary data in an EDN file that they will eventually process with their own Clojure function. So I don't want to surprise them, but I do need certain fields from that EDN to be java.time.LocalDate
s eventually, so just enforcing that with a schema check seems to be the best route for ease of use.
If you're using EDN though, you don't have to touch the Clojure readers, just provide readers to EDN when you read
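As a sketch, a :readers entry passed to clojure.edn takes precedence over the built-in #inst handling for that read only (the LocalDate choice here is just an assumption about what you'd want back):

```clojure
(require '[clojure.edn :as edn])

;; The form following #inst is a string; take the date part and parse
;; it as a LocalDate instead of the default java.util.Date, without
;; touching the global *data-readers*.
(edn/read-string
  {:readers {'inst (fn [s] (java.time.LocalDate/parse (subs s 0 10)))}}
  "#inst \"2022-05-03\"")
;; => a java.time.LocalDate for 2022-05-03
```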
But if you mean that the user's function must return dates as java.time.LocalDate, and not that the EDN must parse #inst as such, then ya, just let the user know that's the expectation.
have a look at https://github.com/henryw374/time-literals. even if you don't use it the readme+code should help. certainly I would read this in as a LocalDate and only convert to Instant etc as and when required.
Is there anything related to this stacktrace that comes to anyone's mind?
java.lang.AssertionError: Assert failed: (clojure.core/not (clojure.core/nil? s__16999__auto__))
    at camel_snake_kebab.core$__GT_kebab_case_keyword.invokeStatic(core.cljc:21)
    at camel_snake_kebab.core$__GT_kebab_case_keyword.doInvoke(core.cljc:21)
    at clojure.lang.RestFn.invoke(RestFn.java:410)
    at somelib.util$safe_case_conversion$fn__17110.invoke(util.clj:93)
    at clojure.lang.AFn.applyToHelper(AFn.java:154)
    at clojure.lang.AFn.applyTo(AFn.java:144)
    at clojure.core$apply.invokeStatic(core.clj:
Sounds like you have a nil value being passed where a string or keyword is exp... yeah, what he said.
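A minimal sketch of guarding against that nil, assuming camel-snake-kebab's ->kebab-case-keyword is the converter in play (the safe->kebab wrapper name is made up):

```clojure
(require '[camel-snake-kebab.core :as csk])

;; some-> short-circuits on nil, so the conversion is simply skipped
;; instead of tripping the library's not-nil assert.
(defn safe->kebab [s]
  (some-> s csk/->kebab-case-keyword))

(safe->kebab "FooBar") ;; => :foo-bar
(safe->kebab nil)      ;; => nil
```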
thanks :))
How can I use less memory? I made a really basic web service using Pedestal, start it with /usr/local/bin/clojure -Xserver (there is an alias in the deps.edn) and it uses up a whopping 273.2MB in idle state on the production Linux server!! This is quite a lot when the droplet it runs on only has 1GB available. I have no experience at all fine-tuning JVMs. The code itself can be found here, it isn’t a lot: https://github.com/simongray/el
fwiw, my colleague wrote about memory use https://dev.solita.fi/2022/03/18/running-clojure-on-constrained-memory.html
(disclaimer: I'm no jvm memory expert) A couple of sources mentioned that the default heap size for a jvm is 25% of total system ram. The jvm also needs some memory for the thread stacks and other stuff. That might explain where that 273 megabyte value comes from. Naturally you can fine tune these limits, if necessary, and use tools like visualvm to monitor the memory usage (e.g. how does the heap utilization fluctuate, how often does GC run, ..) to decide if it's safe to decrease the limits.
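One quick way to see those heap limits from a REPL, with plain interop and no extra tooling:

```clojure
;; Reports the JVM's current heap bookkeeping in MB.
(let [rt (Runtime/getRuntime)
      mb #(quot % (* 1024 1024))]
  {:max-heap-mb (mb (.maxMemory rt))    ;; -Xmx, or the 25%-of-RAM default
   :total-mb    (mb (.totalMemory rt))  ;; currently reserved from the OS
   :free-mb     (mb (.freeMemory rt))}) ;; reserved but not yet used
```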
@U4P4NREBY 273 MB is really not much in the JVM world 🙂 How did you measure it - is it the memory consumed from the OS perspective? As @U0178V2SLAY said, most JVMs use 1/4 of the available memory as the default "max heap" (-Xmx) size. You could perhaps use a 512MB box but for a production service I would not use anything smaller than 1GB. These small machines tend to have other bad performance characteristics too (e.g. slower disks and network)
When I did (very unscientific) experiments with a minimal (desktop) app on my Macbook/macOS I got something like 130 MB RSS at minimum to be able to run it.
@U06BE1L6T It’s what systemctl status is reporting. Caddy, running on the same server, consumes ~20 MB. Not really looking to go smaller than 1GB, but I do want to run other JVM web services on the same machine.
@U0178V2SLAY Thank you, that is very useful knowledge.
I have not tried this, but from what I know, you should use JDK 17 with the SerialGC. JDK 17 really improved a lot of things. And SerialGC has the least memory overhead of all GCs. Then try running with something like:
clojure -J-XX:+UseSerialGC -J-Xmx100m -J-Xms1m -J-Xss256k -J-XX:MaxHeapFreeRatio=10 -J-XX:MinHeapFreeRatio=1 -J-XX:-ShrinkHeapInSteps
@U0K064KQV Thanks! I am already using JDK17 so will definitely check this out.
And you can increase Xmx if you ever OutOfMemory. You can also increase Xss if you get StackOverflows. From what I understand: +UseSerialGC means use the SerialGC, which has the least memory overhead for itself; it runs on a single thread and just collects all garbage. Xmx is the max heap size, we say 100m here. Xms is the min heap size, but it also might dictate the step increments when the heap grows, so we set it to 1m here to have a small minimum and small increments. Xss is the stack size per thread; each thread will consume the amount you specify here, and too small a value might cause StackOverflows if you have deep call stacks. MaxHeapFreeRatio and MinHeapFreeRatio: the GC is generational, so the heap is broken down into sections. These say to only grow a section (i.e. request more memory for it) if the section has less than 1% free space left, and to shrink the section (release memory back to the OS) if it has more than 10% free. You could try lowering that one even more to save more memory. Finally, -ShrinkHeapInSteps tells the GC to reclaim everything each time it runs, not step by step; normally it would run for a bit and stop, to minimize GC pauses, but here it'll run until all garbage is cleaned.
It's possible these options do nothing if you use another GC, I don't know which are SerialGC specific and which are not.
Except -Xmx these flags are rarely necessary.
Also there's non-heap memory - e.g. metaspace will typically be of significant size for a Clojure app.
You could perhaps try to run a few such services with a reasonably small heap (like 128 mb).
Aren't there also deployment options that share the JVM and maybe other stuff? I assumed that is what app servers do, but I never got around looking into it.
@U01EFUL1A8M No idea, but if there are that would be preferable, of course.
Adding: -J-XX:+TieredCompilation -J-XX:TieredStopAtLevel=1 also helps reduce the total memory overhead; this is equivalent to what used to be -client in older JVMs. Basically it will do less optimization, but it will still compile hot code to native machine code, so it's faster than running interpreted, yet it needs to track fewer things for it = less memory overhead
@U06BE1L6T My issue with Xmx is it doesn't release free memory back to the OS; that's where SerialGC and the other options come in. That also means you can set Xmx to the max available memory on your machine, and Java will still use only what it needs. I find that nicer than limiting Xmx and causing OutOfMemory errors
At least until this JEP: https://openjdk.java.net/jeps/8204088
It has nothing to do with Xmx - that is the feature of garbage collector that can release unused memory back to the OS. G1 has been able to do that for a long time and also Shenandoah.
Using that code from the prior linked blog with:
clojure -J-XX:+UseSerialGC -J-Xmx100m -J-Xms1m -J-Xss256k -J-XX:MaxHeapFreeRatio=10 -J-XX:MinHeapFreeRatio=1 -J-XX:-ShrinkHeapInSteps -J-XX:+TieredCompilation -J-XX:TieredStopAtLevel=1 -M -m sysinfo
I get 18mb used heap, and 115mb total used memory.
Yes I know, and SerialGC is really good at doing that. G1 not so much. I have not tried Shenandoah and ZGC yet. But SerialGC with a heap of 100mb or less will still run super fast. I didn't benchmark, but the thing is SerialGC pause times scale with the total heap size, which can be fast for a small heap, and being single-threaded lowers thread overhead, which again for small heaps may not be needed as much. But I did not benchmark
Just want to let you guys know that I really appreciate the continued discussion, @U06BE1L6T and @U0K064KQV. Lots of stuff for me to look into now 🙏 Clojurians Slack never fails to deliver some answers.
I have to strongly recommend against messing with tiered compilation flags. Don't do it unless you're willing to suffer abysmal performance hits
As always, the best thing to do is measure first. Does it cause an issue that you take 200MB? Maybe it's okay
@U4P4NREBY If you want to really go down in memory usage and also startup time, then consider graalvm native-image. It will probably require some tweaks to get it working, but if it works, it's really a drastic reduction in memory usage.
@U4P4NREBY To get an idea of what such an app would consume, try: https://github.com/kloimhardt/babashka-scittle-guestbook which should be fairly similar. Unless your app is reading a lot of stuff into memory itself of course.
Another idea might be to use some other lighter weight platform like Node.js / #nbb (both graalvm native image and this you could probably host on a raspberry pi)
@U04V15CAJ I considered it, but I wasn’t sure if graalvm would also deliver memory savings.
… and cost is really my ultimate concern, reducing memory usage is just a facet of that 😛
native-image definitely will if you can get it running with the libraries you're using - sometimes it's a can of worms to figure things out, but I can help a bit if you ask questions in #graalvm
@UK0810AQ2 that's true, everything I'm doing might come with performance degradation; we're trading performance for memory, but I think for personal projects, self-hosted things, or desktop applications, you might want to make that trade sometimes. GraalVM native compilation will have the smallest memory overhead I believe, like substantially smaller in my experience, but it's more complicated to compile and deploy and has some limits on which libs you can use. I don't think GraalVM on its own yields a smaller footprint. I'd also be curious to try IBM Semeru and various options there.
I would agree if I didn't have first hand experience measuring the performance impact of turning JIT off for Clojure, which absolutely kills it.
So recommending someone literally cut their own feet off to fit a bed they haven't measured yet is very risky; always measure first
You can use half as much RAM right off the bat if you build an uberjar out of the application instead of starting it from the command line
Tried it out with visualvm
(ns build
  (:require [clojure.tools.build.api :as b]))

(def lib 'server)
(def version (format "1.2.%s" (b/git-count-revs nil)))
(def class-dir "target/classes")
(def basis (b/create-basis {:project "deps.edn"}))
(def uber-file (format "target/%s-%s-standalone.jar" (name lib) version))

(defn clean [_]
  (b/delete {:path "target"}))

(defn uber [_]
  (clean nil)
  (b/copy-dir {:src-dirs ["src" "resources"]
               :target-dir class-dir})
  (b/compile-clj {:basis basis
                  :src-dirs ["src"]
                  :class-dir class-dir})
  (b/uber {:class-dir class-dir
           :uber-file uber-file
           :basis basis
           :compile-opts {:disable-locals-clearing false
                          :elide-meta [:doc :file :line]
                          :direct-linking true}
           :main 'dk.simongray.el.calendar}))
Interesting, so Clojure AOT would also reduce memory overhead... Though I wonder if it's just from the elided meta? Just a clarification, TieredStopAtLevel=1 doesn't turn off the JIT, it just compiles everything using the C1 JIT, granted that's not as good as the C2 JIT in terms of optimization.
Regarding JIT, what I'm saying is that unless you're running with full JIT in Clojure, you might as well not be running with any at all. I ran those tests, you only begin to see the difference at level 3, and a big jump at 4
Another thing you can do is simply inline the definition of dicts, which shaves off a bit more, ~10MB
and that's before turning off the strongest feature Clojure relies on for its performance
A great discussion! I didn't notice they weren't running it as an uberjar - that's the first thing to do; if nothing else, it improves startup time
I wonder if just limiting the code caches and turning flushing on could yield better performance. If you halved the C1 and C2 caches, for example.
I wouldn't do that because every function in Clojure is a class. you have lots of classes
Until you measure where memory is allocated this is just looking in the dark for a black cat which isn't there
Ya, that's good to know. In my case I was running Clojure on a very low memory laptop and needed the REPL to run lean on memory
And if you'll actually look at what consumes memory in this case you'll see it's the arrays backing clojure hash maps
Wouldn't that all count towards the HEAP? I see the codecaches are taking the most space left
Anyway, you can also configure Jetty's thread allocation and queue size w/ https://www.eclipse.org/jetty/javadoc/jetty-9/org/eclipse/jetty/util/thread/QueuedThreadPool.html
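A sketch with plain Jetty interop (not Pedestal-specific; the pool sizes here are made up for illustration) of handing the server a small, bounded QueuedThreadPool:

```clojure
(import '(org.eclipse.jetty.server Server)
        '(org.eclipse.jetty.util.thread QueuedThreadPool)
        '(java.util.concurrent ArrayBlockingQueue))

;; maxThreads 8, minThreads 2, 60s idle timeout, bounded job queue of 128.
;; Fewer live threads means fewer per-thread stacks sitting in memory.
(def pool (QueuedThreadPool. 8 2 60000 (ArrayBlockingQueue. 128)))
(def server (Server. pool))
```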
With -J-XX:+UnlockDiagnosticVMOptions -J-XX:NativeMemoryTracking="summary" -J-XX:+PrintNMTStatistics
The heap is only for what your code uses, and I think maybe the class metadata also counts toward the heap. But what the JVM itself uses, like the JIT's memory consumption and all that, would not count.
class metadata isn't stored on the heap. I always recommend this excellent answer to understand JVM memory pools better: https://stackoverflow.com/questions/53451103/java-using-much-more-memory-than-heap-size-or-size-correctly-docker-memory-limi/53624438#53624438
Hum, ya I think it's because older JDK did put meta into the permgen, but that was changed since JDK 8 I think. That's why I couldn't remember, cause I've read stuff that was pre-JDK8 and post-JDK8
@UK0810AQ2 Thanks for actually running my stuff and giving me very concrete advice! 😮 I haven’t had a chance to look at it myself yet, since I was at work all day and my 1-year-old kept me quite busy the rest of the time.
Another thing this got me thinking about is a sort of tracing process which pulls in only used functions during the compile phase to reduce the size of the final uberjar
Yeah, like carve but all your dependencies starting from a specified endpoint. This is useful only for packaging but it matters in some use cases
"uberjar + direct linking + elide meta save on memory" -- I understand how to make an uberjar, but how does one achieve the other two?
looks like I had a mistake there, they should go on compile-clj
https://clojure.github.io/tools.build/clojure.tools.build.api.html#var-compile-clj
Hum... a carve that goes down to your dependencies? Like a recursive carve? So it removes unused Vars in your project, and from all the namespaces you also use, etc.?
Maybe even that's the way, after copy let carve just go to town until it reaches a fixed point
Ya, I thought it was only the jar size it affected, but I guess there's also some memory savings from it.
Thanks a lot everyone who replied in this thread for opening my eyes to the many options available when it comes to saving memory. In the end, I looked at the memory usage in VisualVM, comparing the uberjar version to the clj-invoked one and a 100MB reduction was enough for me right now. Maybe I’ll revisit this topic once more when I have even more services running on the same 1GB node 😛
In my test, AOT did not affect memory size. I'm using the test http server from the linked blog post
When started with clj I get 200MB, and when starting the compiled jar I get the same 200MB memory.
But with my command line options, when starting with clojure I get: 92MB and the compiled jar with same options I get: 90MB
So maybe there's something weird with your particular app? Like it attaches a lot of metadata or something like that?
Or maybe your measurement somehow adds the memory used by the clojure command as well?
I am using VisualVM in dev and systemctl status in prod. Both show a ~100MB reduction going from /usr/local/bin/clojure to /usr/bin/java -jar.
Weird, I'll try it with your project directly. The other project I was trying it on didn't exhibit any memory reductions from AOT
BTW didibus I wanted to say that I really appreciate your input as well! I will for sure bookmark the thread.
Another thing to look at which might trade off memory for performance is extracting the hiccup to a template which you'll walk and replace when requests arrive
Here’s a sort of amusing end to all of this…. by chance, I read a post on Hacker News about DigitalOcean planning to increase their prices for the 1GB RAM/1 vCPU droplet I have. The top comment recommended a German cloud solution called Hetzner and it turns out that they offer 2GB RAM/2 vCPUs for slightly less than what I am currently paying for half that at DigitalOcean… so now I suddenly have 2GB RAM available 😆
Haha, well, I still feel I've always wanted to know what the most minimal memory configuration for Clojure on the JVM is, so this was not wasted.
Funny - "buy more ram" was, after all, the right advice! 🙂 Btw. I've been using Hetzner's VMs for a few years for my personal experiments. It's quite fast although they don't offer as many services as other cloud providers. Definitely great for supporting / non-production workloads
You know the ability to do nested destructuring, i.e., the destructuring of bar from this: [{:keys [foo] {:keys [bar]} :params :as my-funky-map}], has that always been available since Clojure day dot?
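For reference, that shape in action (the handler name and values are made up):

```clojure
;; :params is destructured into a nested map binding for bar.
(defn handler [{:keys [foo] {:keys [bar]} :params :as my-funky-map}]
  [foo bar my-funky-map])

(handler {:foo 1 :params {:bar 2}})
;; => [1 2 {:foo 1, :params {:bar 2}}]
```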
Hello all, I'm using (all-ns) to list all namespaces in my app. But I saw that some namespaces are not there. How should I load all namespaces before calling (all-ns)?
tools.namespace can help with that - there's a tools.namespace.find ns https://github.com/clojure/tools.namespace
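For example, for directory-based source (code packaged in a JAR needs the jarfile-oriented functions instead):

```clojure
(require '[clojure.tools.namespace.find :as find]
         '[clojure.java.io :as io])

;; Find every namespace declared under src and require it, so that a
;; subsequent (all-ns) sees them all.
(doseq [ns-sym (find/find-namespaces-in-dir (io/file "src"))]
  (require ns-sym))
```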
Sorry, not sure I follow - why would you want to require all namespaces in your project?
I would step back and ask: what problem are you actually trying to solve? How did you get to using (all-ns) in production code in the first place?
I use metadata like (defn ^{:resolver :usuario/atualizar_usuario} ... that I'll use to map Lacinia resolvers, but when I try to do the mapping from (all-ns), the namespace is not loaded yet so it isn't in all-ns. So I have to force it to load the namespaces.
So it sounds like the namespace that is trying to do that mapping should explicitly :require all the namespaces that contain resolvers -- that's not unlike what you have to do if you use multimethods scattered across multiple files. See also the tools.deps.alpha machinery around procurers -- it has to require all the namespaces to get the "side-effect" of the multimethod definition applied.
hummm, I understand, but it is not a multimethod. It works if I use :require but I want to do it dynamically, because if I add another resolver I'd have to include the require manually. That's what I don't want to do.
If you really want that to be automatic in production, you're going to need to use tools.namespace to find those namespaces. And it will be different for source code in directories than for source (or AOT'd) code in a JAR file.
What you are doing is similar to how multimethods work in that you need to explicitly require all the appropriate namespaces for the resolver implementations to be available. Or protocol implementations. Or anything that relies on the side-effects of loading a namespace. Either you explicitly load them all manually, i.e., in code, or you need some sort of runtime "scanner" in production to find additional namespaces to load.
Personally, I think that's a lot of work and logical overhead to add, just to avoid remembering to add a :require clause when you write a new resolver namespace. ¯\(ツ)/¯
Yes, I'll come back to it if the domain gets bigger than I can control; for a few, you are right
I have a list of lines that I want to transform into an object with the shape of [{:line-number 0, :text "foo"}…]. The following works but it feels like I’m beating around the bush. Is there a more straightforward way to achieve this?
(->> (str/split "foo\nbar\nbaz" #"\n")
     (map-indexed vector)
     (map (partial zipmap [:line-number :text])))
I mean I love the composability of all the functions but whenever I string together more than 2 clojure functions I always feel like there’s got to be another clojure function better suited 😂
i think you're on the right path, and haven't missed anything major. Reuse >>> some more specific function.
(->> (str/split-lines "foo\nbar\nbaz")
     (map-indexed #(hash-map :line-number %1 :text %2)))
;; => ({:line-number 0, :text "foo"} {:line-number 1, :text "bar"} {:line-number 2, :text "baz"})
@U01RL1YV4P7 :face_palm: thank you that’s exactly the sort of thing I felt I was missing 😂