#babashka
2023-12-18
ccfontes14:12:14

How do I force this to happen even when it isn't needed?

Clojure tools not yet in expected location: /home/app/.deps.clj/1.11.1.1413/ClojureTools/clojure-tools-1.11.1.1413.jar
Downloading  to /home/app/.deps.clj/1.11.1.1413/ClojureTools/clojure-tools.zip
Unzipping /home/app/.deps.clj/1.11.1.1413/ClojureTools/clojure-tools.zip ...
Successfully installed clojure tools!
Use case: I need this setup to happen early on in a Dockerfile, to avoid doing the setup over and over again at a later step (triggered by COPY). Thanks!

borkdude15:12:40

In this Docker image, do you intend to run an application with dependencies in bb.edn? I think the best way to do this is to produce an uberjar first and then run the uberjar

borkdude15:12:49

such that no dependencies have to be fetched at runtime

borkdude15:12:14

you can do this with:

bb uberjar

ccfontes15:12:13

Thanks. In this case I'm fetching the dependencies when building the docker image (the user triggers the build), using bb prepare. The problem is that the clojure tools setup always runs when there are changes to the user's source code files (triggered by COPY). I want this setup to happen only in the first build, thus in a step before the COPY. I made the code available in a branch for better context: https://github.com/ccfontes/faas-bb/blob/setup-clj-tools/template/bb/Dockerfile#L24 To be fair, this only happens when function/bb.edn changes, which is a bit rarer than changes to other source files.

borkdude15:12:17

bb prepare should also trigger the "Clojure tools not yet in expected location" setup when you are building the docker image. But you have to preserve the ~/.deps.clj directory in the image for it not to happen again

ccfontes15:12:52

sure, it's preserving

borkdude15:12:43

then what is the issue?

borkdude15:12:46

I don't see how the clojure tools setup would run multiple times, even with bb.edn changes

borkdude15:12:57

unless you're not preserving the clojure tools directory

ccfontes15:12:13

because of COPY. it busts the cache

borkdude15:12:24

what cache?

borkdude15:12:54

It sounds like this is a docker problem, not a bb problem then?

borkdude15:12:32

if you can make a repro Dockerfile I could have another look, not sure what to do about it now

borkdude15:12:11

maybe it helps to know that the tools dir is configurable: DEPS_CLJ_TOOLS_DIR

👍 1
ccfontes15:12:17

If I could force the clj tools setup before COPY, then it would solve my problem.

borkdude15:12:39

running bb prepare would force that

ccfontes15:12:40

I mean, I could do some hacking I guess with a dummy project

borkdude15:12:49

do you mean you don't have a bb.edn?

ccfontes15:12:06

yes, I don't have bb.edn earlier than COPY

borkdude15:12:11

then you can do it like this:

bb -Sdeps '{:deps {some-fake-deps}}' prepare

ccfontes15:12:23

oh jeez of course. genius!

Ingy döt Net20:12:44

My ys program is written in Clojure, evals with SCI, and is compiled to a binary with native-image, like bb is. ys is installed as a symlink to ys-0.1.28, which is the real binary file. I want to have a couple of subcommands that invoke a bash script, say ys-sh, that is installed in the same dir as ys. I don't know if I should make ys the bash thing that calls the binary ys-0.1.28, or if I can shell out from the compiled ys-0.1.28 to ys-sh. I would want the binary to invoke the sh script using the same directory it was in (not a PATH lookup).

borkdude20:12:12

using babashka.process, you can do an exec call (which is a true OS exec call) which might help. not sure about the rest of the trade-offs

Ingy döt Net20:12:30

that's awesome, and it answers another question that I had, so thanks. I was just going to use https://clojuredocs.org/clojure.java.shell/sh, but how do I get the abs path of my binary program at runtime?

borkdude20:12:09

clojure.java.shell does not let you do an exec call btw, it will create another child process

borkdude20:12:37

This is how you can get the binary name of the invoked process:

$ bb -e '(-> (java.lang.ProcessHandle/current) .info .command .get)'
"/Users/borkdude/dev/babashka/bb"
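Combining the two suggestions above, a minimal sketch of exec-ing a script that sits next to the running binary could look like this. The helper names self-path and sibling-path are made up for illustration, ys-sh is the script name from the question, and babashka.process/exec is expected to replace the current process only when running as a native image.

```clojure
(require '[babashka.fs :as fs]
         '[babashka.process :as p])

(defn self-path
  "Absolute path of the currently running executable (hypothetical helper)."
  []
  (-> (java.lang.ProcessHandle/current) .info .command .get))

(defn sibling-path
  "Path of `file-name` in the same directory as `exe` (hypothetical helper)."
  [exe file-name]
  (str (fs/path (fs/parent exe) file-name)))

(comment
  ;; Replaces the current process with ys-sh, forwarding the CLI args.
  ;; Kept inside `comment` so that loading this file does not exec anything.
  (apply p/exec (sibling-path (self-path) "ys-sh") *command-line-args*))
```

Resolving the script against the executable's own directory avoids a PATH lookup, which is what the question asked for.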

Ingy döt Net20:12:11

thanks! I'll try it.

Ingy döt Net20:12:05

it's preferable for me for the binary to call or exec the bash after it does the CLI option parsing, so I don't have to repeat that in bash

Ingy döt Net02:12:31

I'm getting

Method info on class java.lang.ProcessHandleImpl not allowed!
when I call basically
(sci/eval-string "(-> (java.lang.ProcessHandle/current) .info .command .get)"
  {:classes {'java.lang.ProcessHandle java.lang.ProcessHandle}})

Ingy döt Net02:12:04

in the repl, not with graal...

borkdude09:12:26

check the bb classes.clj file, it has a function called public-class which casts the Impl class back to the public class

borkdude09:12:38

why do you need to do this in SCI btw?

Ingy döt Net17:12:01

You're right. I don't need this in SCI for what I'm trying to do here. Thanks for noticing that!

Ingy döt Net17:12:26

I graal compiled ys with it in there and it gets the install path right. \o/

Ingy döt Net17:12:18

when not graal compiled it gives /usr/lib/jvm/java-17-openjdk-amd64/bin/java though

Ingy döt Net17:12:22

I'll probably just default to a PATH lookup in that case since I want to be able to test without a full native-image compile.

Ingy döt Net23:12:55

process is very nicely done.

❤️ 1
nyor.tr22:12:27

Is it possible to add an import to a Babashka script? I would like to use the com.deepbeginnings/eximia dependency from Babashka, but I get the following error:

----- Error --------------------------------------------------------------------
Type:     java.lang.Exception
Message:  Unable to resolve classname: javax.xml.stream.XMLInputFactory
Location: eximia/core.clj:21:3

----- Context ------------------------------------------------------------------
17: 
18:   CDATA blocks can also be read and written as [[CData]] records, processing instructions as [[ProcessingInstruction]]s
19:   and comments as [[Comment]]s."
20:   (:refer-clojure :exclude [read])
21:   (:import [javax.xml.stream XMLInputFactory XMLStreamReader XMLStreamWriter XMLStreamConstants XMLOutputFactory]
      ^--- Unable to resolve classname: javax.xml.stream.XMLInputFactory
22:            [javax.xml.namespace QName]
23:            [javax.xml XMLConstants]
24:            [java.io Reader Writer InputStream OutputStream StringReader StringWriter]
25:            [clojure.lang IPersistentMap]))
26: 

----- Stack trace --------------------------------------------------------------
eximia.core - eximia/core.clj:21:3
xml-utils2  - /tools/dataset-utils/xml_utils2.clj:6:3
Doesn't Babashka already include the javax.xml dependencies? I would like to use eximia instead of data.xml because it's faster with large files. This is the code:
(require '[babashka.deps :as deps])

(deps/add-deps '{:deps {com.deepbeginnings/eximia {:mvn/version "0.1.3"}}})

(ns xml-utils2
  (:require [eximia.core :as exml]
            [clojure.java.io :as io]))

(defn read-xml
  "Reads the xml from a file"
  [file]
  (try
    (->
     (with-open [input (io/input-stream file)]
       (exml/read input  {:tag-fn exml/qname->unq-keyword})))
    (catch Exception e
      (println (str "Could not read xml file '" file "': " e))
      (System/exit 1))))

borkdude22:12:59

None of the javax.xml classes are currently exposed in babashka, but you can use clojure.data.xml instead

borkdude22:12:05

which is already built-in

borkdude22:12:28

I'll have a look later to see if we can expose those javax.xml classes, but I doubt that eximia will be faster when running in bb due to interpretation overhead. Still, it's worth testing. Do you have an example with big XML files that compares the speed of clojure.data.xml and eximia?

borkdude22:12:19

if you can prepare a repository that I can clone locally and just run with both, that would be a nice start

nyor.tr22:12:03

Yes, I do. I have tried it already with data.xml and it is slower at parsing and especially at updating. Currently they are company files, but I'll see if I can generate some similar ones. The XML files are several MB each.

borkdude22:12:27

if you want to experiment yourself, you can add the javax.xml classes here: https://github.com/babashka/babashka/blob/master/src/babashka/impl/classes.clj and then compile bb locally

nyor.tr22:12:32

OK, thanks, I'll try

borkdude22:12:24

no, that's it

👍 1
borkdude22:12:34

you could try to first run with the JVM-version of bb to see if it works:

clj -M:babashka/dev my_script.clj
this gives you faster feedback (where babashka/dev is an alias which points to your local clone of bb)
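For reference, such an alias could be sketched as below in a deps.edn file. This is an assumption about the setup rather than something shown in the thread: the :local/root path is a placeholder for your own clone, and the babashka/babashka coordinate and the babashka.main entry point should be checked against the babashka dev docs.

```clojure
;; ~/.clojure/deps.edn (sketch; :local/root is a placeholder path)
{:aliases
 {:babashka/dev
  {:extra-deps {babashka/babashka {:local/root "/path/to/your/babashka"}}
   :main-opts ["-m" "babashka.main"]}}}
```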

nyor.tr18:12:45

Hi again, I added the javax.xml classes to babashka as you suggested. It compiled with no problems. Now, when I run the script to read an xml file I don't get the Unable to resolve classname error anymore, but now I get the following error:

Method setProperty on class com.sun.xml.internal.stream.XMLInputFactoryImpl not allowed!
Is there a way to solve this?

borkdude18:12:35

yes, let me check

borkdude18:12:09

Add to the :public-fn casts:

(instance? javax.xml.stream.XMLInputFactory v)
javax.xml.stream.XMLInputFactory

borkdude18:12:28

:public-class I mean:

borkdude18:12:37

add it near the bottom

nyor.tr18:12:18

OK, I'll try

nyor.tr18:12:19

Yes!, the error message disappeared, but now I get the error:

Method close on class com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl not allowed!
I suppose I have to do the same with the javax.xml.stream.XMLStreamReader class?

borkdude18:12:01

you got it

👍 1
nyor.tr18:12:34

It worked! Thanks!! I will add and test the functions to write xml files, I might have to add some more classes to the :public-class.

nyor.tr19:12:54

Trying to write an XML file, I got an error:

Type:     java.lang.NoSuchFieldException
Message:  close
I'm not sure if this is the same case as before.

nyor.tr19:12:48

It does write the XML file but stops almost at the end. The file is incomplete and invalid XML, and it is not that big (541KB).

borkdude19:12:51

This is a reflection issue on something in a with-open form

borkdude19:12:17

at least .close is called on something, but the .close method isn't available in the native image. perhaps you can track down which class this is

nyor.tr19:12:04

It shows this in the error:

----- Context ------------------------------------------------------------------
25: (defn save-xml
26:   "Saves the xml to a file"
27:   [xml path]
28:   (let [opts {:tag-fn exml/keyword->qname}]
29:     (exml/write xml (io/output-stream path) opts)))
        ^--- close

borkdude19:12:58

let's see if I can reproduce that

👍 1
borkdude19:12:10

This does work:

$ bb -e '(with-open [out (java.io.StringWriter.)] (.write out "foo") (str out))'
"foo"

nyor.tr19:12:21

How about with a long file like more than 500KB?

borkdude19:12:14

don't know. I think it would be useful to clone eximia locally and insert some printlns etc to see where exactly this is happening and then make a repro to fix

nyor.tr20:12:37

I debugged eximia, and putting a (.close out) after https://github.com/nilern/Eximia/blob/d57c6268eb3ee4a5c3f5baad7336a02693662ae0/src/eximia/core.clj#L135 generates the whole xml file, but the error remains.

borkdude20:12:57

and what is the type of out here? can you print (prn (type out))?

nyor.tr20:12:40

Seems to be: com.sun.xml.internal.stream.writers.XMLStreamWriterImpl

borkdude20:12:39

and you have javax.xml.stream.XMLStreamWriter in the class map + public-class fn?

borkdude20:12:56

Try to put this before the Map conversion

borkdude20:12:11

Before `(instance? java.util.Map v) java.util.Map`

borkdude20:12:29

since .close will be called on a Map if you do the cast later
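The ordering issue can be illustrated with a simplified stand-in for public-class, run on the JVM. This is not the real function from classes.clj: the two function names and the closeable-map value are invented for the example, and java.io.Closeable stands in for XMLStreamWriter. The point is only that cond returns the class of the first matching clause, so a value that is both a Map and something else gets cast to whichever comes first.

```clojure
;; Map is checked first: anything that is also a Map gets cast to Map.
(defn public-class-map-first [v]
  (cond
    (instance? java.util.Map v)     java.util.Map
    (instance? java.io.Closeable v) java.io.Closeable))

;; The more specific interface is checked before the Map fallback.
(defn public-class-specific-first [v]
  (cond
    (instance? java.io.Closeable v) java.io.Closeable
    (instance? java.util.Map v)     java.util.Map))

;; Hypothetical value that is both a Map and Closeable.
(def closeable-map
  (proxy [java.util.AbstractMap java.io.Closeable] []
    (entrySet [] #{})
    (close [] nil)))

(public-class-map-first closeable-map)      ;; => java.util.Map
(public-class-specific-first closeable-map) ;; => java.io.Closeable
```

With the Map clause first, a later .close lookup happens against java.util.Map, which has no such method.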

nyor.tr20:12:03

I think so, I'll try.

nyor.tr20:12:25

Yes, that worked! Again thanks! Reading and writing xml files with eximia!

borkdude20:12:26

Now, is it faster than clojure.xml?

nyor.tr21:12:22

Unfortunately no :(, here are some quick benchmarks. There was no transformation of the XML; it only wrote back the XML that was read from the file.

reading eximia babashka file size: 541K "Elapsed time: 236.811919 msecs"
reading clojure.data.xml babashka file size: 541K "Elapsed time: 0.562374 msecs"
writing eximia babashka file size: 541K "Elapsed time: 374.571785 msecs"
writing clojure.data.xml babashka file size: 541K "Elapsed time: 117.879349 msecs"
---
reading eximia babashka file size: 14M "Elapsed time: 5233.275265 msecs"
reading clojure.data.xml babashka file size: 14M "Elapsed time: 0.565281 msecs"
writing eximia babashka file size: 14M "Elapsed time: 8450.353057 msecs"
writing clojure.data.xml babashka file size: 14M "Elapsed time: 3803.034851 msecs"
---
reading eximia clj file size: 541K "Elapsed time: 49.630511 msecs"
reading clojure.data.xml clj file size: 541K "Elapsed time: 19.460798 msecs"
writing eximia clj file size: 541K "Elapsed time: 87.266801 msecs"
writing clojure.data.xml clj file size: 541K "Elapsed time: 432.58637 msecs"
---
reading eximia clj file size: 14M "Elapsed time: 284.419659 msecs"
reading clojure.data.xml clj file size: 14M "Elapsed time: 19.567271 msecs"
writing eximia clj file size: 14M "Elapsed time: 497.282376 msecs"
writing clojure.data.xml clj file size: 14M "Elapsed time: 2173.910879 msecs"
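One caveat when reading the clojure.data.xml read times: clojure.data.xml/parse returns a lazy tree, so timing the parse call alone may measure little more than setup, which could explain read figures that barely change between 541K and 14M. The thread does not show how the timings were taken, so this is only a possibility. A sketch of a read timing that forces the whole tree, with count-nodes and timed-full-read as hypothetical helper names:

```clojure
(require '[clojure.data.xml :as xml]
         '[clojure.java.io :as io])

(defn count-nodes
  "Walks the entire parsed tree, realizing every lazy part of it.
  Only elements (which have a :tag) are treated as branches."
  [root]
  (count (tree-seq :tag :content root)))

(defn timed-full-read
  "Times parsing plus a full traversal; the stream stays open during the walk."
  [file]
  (with-open [in (io/input-stream file)]
    (time (count-nodes (xml/parse in)))))
```

Calling (timed-full-read "some-file.xml") prints the elapsed time and returns the node count.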

borkdude21:12:38

This is what I suspected, the interpretation overhead is simply too big

nyor.tr22:12:37

Well, it was a nice experiment. I think I'll try my scripts with babashka and clojure.data.xml. Thanks for your time!

borkdude22:12:45

perhaps it's useful to make a PR with it. then we can close it, but at least we have the code showing how to add those xml classes in case we do need them at some point

👍 1
nyor.tr22:12:41

Here is a git patch with the changes I made, if it's quicker:

diff --git a/src/babashka/impl/classes.clj b/src/babashka/impl/classes.clj
index 5c2e808d..25998d78 100644
--- a/src/babashka/impl/classes.clj
+++ b/src/babashka/impl/classes.clj
@@ -553,7 +553,16 @@
                 ~(symbol "[Ljava.util.regex.Pattern;")
                 ~(symbol "[Lclojure.core$range;")])
           ~@(when features/yaml? '[org.yaml.snakeyaml.error.YAMLException])
-          ~@(when features/hsqldb? '[org.hsqldb.jdbcDriver])]
+          ~@(when features/hsqldb? '[org.hsqldb.jdbcDriver])
+          ;; for eximia (xml parser)
+          javax.xml.stream.XMLInputFactory
+          javax.xml.stream.XMLStreamReader
+          javax.xml.stream.XMLStreamWriter
+          javax.xml.stream.XMLStreamConstants
+          javax.xml.stream.XMLOutputFactory
+          javax.xml.namespace.QName
+          javax.xml.XMLConstants
+          ]
     :constructors [clojure.lang.Delay
                    clojure.lang.LineNumberingPushbackReader
                    java.io.EOFException]
@@ -675,6 +684,15 @@
                          java.lang.ProcessHandle
                          (instance? java.lang.ProcessHandle$Info v)
                          java.lang.ProcessHandle$Info
+                          ;; for eximia (xml parser)
+                         (instance? javax.xml.stream.XMLInputFactory v)
+                         javax.xml.stream.XMLInputFactory
+                         (instance? javax.xml.stream.XMLStreamReader v)
+                         javax.xml.stream.XMLStreamReader
+                         (instance? javax.xml.stream.XMLOutputFactory v)
+                         javax.xml.stream.XMLOutputFactory
+                         (instance? javax.xml.stream.XMLStreamWriter v)
+                         javax.xml.stream.XMLStreamWriter
                          ;; added for calling .put on .environment from ProcessBuilder
                          (instance? java.util.Map v)
                          java.util.Map

borkdude22:12:16

sure. that will do, thanks :)

👍 1