This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-06-16
What is the break-even point for bb, where it becomes too slow etc. and I have to switch to plain Clojure?
@dennisa That's hard to say in general, but as a rule of thumb I would say, scripts that take longer than 5 seconds are probably worth running on the JVM. Are you hitting any limits?
@borkdude Btw, I was thinking about this. I think you discovered that a Graalvm process can be replaced with another process, right? Could this mean that you could start a babashka process together with a clj process and have the clj process take over after X time? This assumes of course that there are no port collisions or other side effects
"after X time", this would just be an explicit call. you could e.g. do the CLI parsing in bb and then hand over control to the clj process
Nice 🙂
Can imagine it is even nice for user feedback. E.g. prepend a spinner to a slow starting jvm process https://github.com/clj-commons/spinner
But in this case you still have to pay the startup cost of the jvm, it’s not like it’s starting up ‘in the background’, since it’s an exec call.
@UFDRD93RR it's honestly not so hard to add, I'm just more worried that people use it in a way to shoot themselves in the foot
e.g. when using this with tasks, the tasks aren't supervised anymore, e.g. when one dependency uses exec, the entire tree of tasks will suddenly become that process
It is still useful. Python has this in its stdlib, and when you need it, exec is the only option
so what's something you would use this for as opposed to just create a child process and wait for it to finish?
I needed to call a second program once where I didn't want to pay the memory consumption of my own program, so exec was the answer (also, I didn't need the result of the program, just calling it)
And I can do this with subprocess, but exec does this without needing special handling
exec is a Unix system call, so it is better fitted to a place that groups system calls
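The difference between exec and "create a child process and wait" can be sketched with Python's stdlib, which was mentioned above. This is only an illustration under assumptions: it assumes a Unix-like system, and the exec happens inside a throwaway child interpreter so that the example itself is not replaced. The names CHILD and run_exec_demo are made up for this sketch.

```python
import subprocess
import sys
import textwrap

# Code run in a throwaway interpreter, so that exec does not replace the
# process running this example. CHILD is a hypothetical name.
CHILD = textwrap.dedent("""
    import os, sys
    print("before exec", flush=True)  # flush: buffered output is lost on exec
    # Replace the current process image. On success this call never returns.
    os.execv(sys.executable, [sys.executable, "-c", "print('replacement program')"])
    print("never reached")
""")


def run_exec_demo():
    """Run CHILD and return the lines it (and its replacement) printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(run_exec_demo())
```

The line after os.execv never runs, because the original program no longer exists at that point. That is also why a supervising parent (such as a task runner) loses control once one of its steps calls exec.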
we didn't add setenv because it would be very confusing since the env is cached in the jvm
BTW, I think exec may compose badly with other parts of Babashka. Like, you can't set an environment variable 😅
so? if exec is possible in bb you would exit bb. same for python. what's the difference?
(Not saying that this doesn't in Python, it is just that Python programs generally have a good behavior on kill -9)
But maybe it is just that Java folks don't want to be too coupled with Unix either
Both setenv and exec are kind of Unix-specific (environments exist in Windows, but their behavior is different)
I just found that native programs don't need as much cleanup as a VM as big as Java
But this is just an assumption, maybe my second reasoning about Unix specific calls makes more sense
> maybe getting the entire environment map is slow
Yeah, this is the part that doesn't make sense to me. AFAIK, getenv in Linux is fast
BTW, I found a bug report about this issue of System.getenv: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8173654
Interesting that it is marked as fixed (but we either hit another issue or something else, since this issue was resolved in 2017)
Yep, but the issue is similar in that System.getenv is returning the old value (and the explanation is what the developer from GraalVM said)
Here is a more complete history of the issue: https://bugs.openjdk.java.net/browse/JDK-8173654
> But independent of that, the caching of the environment on first use (and its immutability except when creating a subprocess) was a deliberate design decision back in the 5.0 days. So no JDK bug here. OTOH ... I don't think the caching behavior was ever specified, and it might be useful to users to know the rules.
So I think they just fixed whatever part of the code broke this specifically for JNI
btw, I think changing the dir may have pretty weird side effects on relative classpaths
> can you summarize it for me? I'm doing other stuff meanwhile 🙂
Sure:
- Like you said, this issue was a regression with calling setenv in JNI. Used to work before JDK 8u60, stopped working after this version
- Martin Buchholz says that there is cache-on-first-access for System.getenv (actually, this seems to come from code in ProcessBuilder that System.getenv reuses)
- The cache is an explicit design decision, however it is not documented
- Also, changing environments using JNI is unsupported and may crash the JVM (I think this is highly unlikely unless you change some environment variable that the JVM itself uses, but well)
- The issue is fixed without explanation, and I can only assume they fixed whatever caused the regression in 8u60, but this wasn't the expected behavior anyway since the cache is an explicit design decision
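The "cached snapshot vs. real process environment" split summarized above has a close analogue in Python, which makes it easy to try out. This is not the JVM's implementation, only a sketch of the same idea: os.environ is a mapping captured at startup, and the documented behavior is that os.putenv does not update it. The variable name DEMO_ENV_SNAPSHOT and both function names are made up for this sketch.

```python
import os
import subprocess
import sys

NAME = "DEMO_ENV_SNAPSHOT"  # hypothetical variable name, used only here


def snapshot_after_putenv():
    """Change the C-level environment and read the cached mapping."""
    os.environ.pop(NAME, None)   # start from a known state
    os.putenv(NAME, "1")         # touches the process environment only
    value = os.environ.get(NAME)  # the startup snapshot does not see it
    os.unsetenv(NAME)            # clean up the C-level change
    return value


def child_sees(value):
    """Hand a variable to a child explicitly instead of mutating our own env."""
    env = dict(os.environ, **{NAME: value})
    code = f"import os; print(os.environ.get({NAME!r}))"
    out = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


if __name__ == "__main__":
    print(snapshot_after_putenv())
    print(child_sees("42"))
```

The second function mirrors the "immutable except when creating a subprocess" rule quoted from the bug report: the reliable way to change what a child sees is to pass the environment when the child is created.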
> oh so perhaps they could also fix it for the graalvm specific interop
Maybe it would be a good idea to open a similar issue in the GraalVM issue tracker and see what the GraalVM devs think
@UFDRD93RR what you could hack is an intermediate C-style function that will call setenv and then exec, to get around the setenv problem
I mean, I think our original setenv would work for exec, for passing through env variables?
Or maybe having babashka.os.setenv documented with "if you use this function and expect it to take effect in the current process, please use babashka.os.getenv instead of System/getenv"
but you could also hack a bash script that sets envs and then does exec
and exec to that bash script from bb
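The "set the environment, then exec" handover discussed here can be sketched without a setenv call at all, because the exec family can take the new environment as an argument. The sketch below uses Python's os.execve on a Unix-like system and runs the handover inside a throwaway interpreter. It is not how babashka does or would do it. DEMO_HANDOVER, CHILD and run_handover_demo are names invented for the example.

```python
import subprocess
import sys
import textwrap

# Launcher code run in a throwaway interpreter. It builds the environment
# for the next program and passes it directly to execve, so the launcher's
# own (possibly cached) view of the environment never has to change.
CHILD = textwrap.dedent("""
    import os, sys
    env = dict(os.environ, DEMO_HANDOVER="from-launcher")
    target = [sys.executable, "-c", "import os; print(os.environ['DEMO_HANDOVER'])"]
    os.execve(sys.executable, target, env)
""")


def run_handover_demo():
    """Return what the exec'ed program read from its environment."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_handover_demo())
```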
> I mean, I think our original setenv would work for exec, for passing through env variables?
Yeah, it should work, the only issue I see with setenv is with System/getenv, which we could work around with a wrapper around getenv from C
yeah. there was also the Windows incompatibility with setenv/getenv, the Windows c lib calls this differently
Yeah, kinda a pain to maintain (lots of code):
- This is the easy part, POSIX: https://github.com/python/cpython/blob/bb3e0c240bc60fe08d332ff5955d54197f79751c/Modules/posixmodule.c#L10944-L10963
- And this is the ugly part, Win32: https://github.com/python/cpython/blob/bb3e0c240bc60fe08d332ff5955d54197f79751c/Modules/posixmodule.c#L10856-L10908
But basically it is a bunch of #if
os_putenv_impl is the actual implementation, where on POSIX it compiles to a call to setenv(), while on Windows it compiles to _wputenv()
Since setenv() takes three arguments, it calls setenv(env_var, value, 1), while on Windows they do _wputenv(env_var + "=" + value) AFAIK
(The call to _wputenv() ends up doing a bunch of validation because of this concat, this is why the code is so big)
I can give it a try if you want @borkdude, I mean, I am probably the only person interested in this right now 😆
Since you already did the hard work figuring out how to call C code in the setenv branch, I think now it is mostly writing C+Java code
OK, I merged the master branch into the set-env branch. No promises that if you make it work, that I will merge the branch, but feel free to try it :)
Yeah, please review the code and draw your own conclusions. I mean, it is a pretty niche case
@borkdude I've got a lot of map/filter code that per se runs fast enough. But printing becomes super slow after a while, especially in the repl, where it takes 500 ms for one line. And printing is important for my scripting.
When you say "in the repl", are you working from Emacs by any chance? If you print a lot, it does tend to become laggy... clearing the REPL output helps in this case (usually)
I work with VSC. The repl printing becomes so slow I have to restart the Repl, and VSC at some point. I am afraid it will happen in CLI/prod and become a bottleneck
@dennisa Is it possible for you to make a minimal repro for this? I may have an idea where a performance problem with println could come from and I might be able to optimize it
It may just be an issue with your editor, so I'd like to have some kind of editor-independent repro
I need to find a way to separate the business logic from the printing code. Do you have ideas how to do it?
as a first step, you could try to run your scripts outside of the editor and see if your problem is editor related
@dennisa printing in VSC (you probably mean VS Code + Calva?) does indeed become slower and slower with more lines in the output.calva-repl "file".
Can you maybe mitigate (in editor) by shrinking some backlog setting?
btw @dennisa in case it's really the issue with Calva and its slow appending to output.calva-repl, I've already tried doing some optimizations in Calva in the past. You may check this archive: https://clojurians-log.clojureverse.org/calva/2021-03-30
Basically I just added batching into the append function inside results-doc.ts. It made quite a big difference, especially when you append many lines in one-by-one fashion.
Here's a youtube video https://www.youtube.com/watch?v=GufgU7C4n6s showing the slowness and how it might be optimized. Unfortunately I didn't have time back then to fully finish this effort. If this slowness is what you're experiencing, then we should probably continue the conversation in #calva channel instead.
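The batching idea can be sketched generically. This is not Calva's code (that lives in TypeScript, in results-doc.ts), only a minimal illustration of why collecting lines and writing them in one go reduces the number of expensive append operations. The class name BatchedWriter, its batch_size parameter and the writes counter are invented for this sketch.

```python
import io


class BatchedWriter:
    """Collect lines and hand them to the sink in batches, not one by one."""

    def __init__(self, sink, batch_size=100):
        self.sink = sink              # anything with a write(str) method
        self.batch_size = batch_size  # lines to collect before writing
        self.pending = []
        self.writes = 0               # number of sink.write calls so far

    def append(self, line):
        self.pending.append(line)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all pending lines with a single call to the sink."""
        if self.pending:
            self.sink.write("".join(line + "\n" for line in self.pending))
            self.writes += 1
            self.pending.clear()


if __name__ == "__main__":
    sink = io.StringIO()
    writer = BatchedWriter(sink, batch_size=3)
    for i in range(7):
        writer.append(f"line {i}")
    writer.flush()  # write whatever is left over
    print(writer.writes, "writes for 7 lines")
```

If each append to the sink has a fixed overhead (as appending to an editor document does), the total cost drops roughly with the batch size, at the price of output appearing slightly later.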
It's likely you are right, guys. I had 11K lines in the Calva output and once I removed them the performance was back up. :) Will give it a try and get back
Is there documentation anywhere that compares bash scripting to bb? Would like to point coworkers to this to make it easier for them to try bb. If not, was thinking of starting a wiki page
@cldwalker The wiki is open I believe.
There is also a github discussion about this. I'm also willing to incorporate this in the book at some point, but I'd be fine if someone else took initiative on this as well or maintained some page
Created https://github.com/babashka/babashka/wiki/Tasks:-Bash-and-Babashka-equivalents as a first pass. Happy to move to the book at some point. Fixes and more contributions welcome 🙂
@cldwalker Good start! Perhaps explain what shell is, since not all people might be familiar with bb.edn's tasks setup. The shell function comes from babashka.tasks, which is based on babashka.process/process
I want to ask if this is a well-known thing before opening an issue/continuing a discussion; I've done a cursory search through the issues and the book... running "one-liners" (passing forms on the command line without -e) on Windows will throw if the form contains "illegal" path characters, e.g. bb "(zero? 1)" will throw because of the '?'
@U013JFLRFS8 This may just be a shell-specific thing? Which shell is this, powershell or cmd.exe?
@U013JFLRFS8 Ah yes, I see the issue