#babashka
2020-12-03
jumar08:12:40

In https://book.babashka.org/#_scripts there's this:
> When writing scripts instead of one-liners on the command line, it is not recommended to use `*input*`
May I ask for more details on why this isn't recommended? E.g. Nate uses user/*input* in his scripts: https://github.com/justone/bb-scripts/blob/master/src/comb.clj#L31

borkdude08:12:11

@jumar I prefer not to since it's not obvious from the script what the corresponding i/o flags on the command line should be

jumar08:12:44

Ah, so the shape of *input* will vary depending on -i vs -I but *in* stays the same, right? It would be great to add a little note to the babashka book's section referenced above

borkdude08:12:25

ok, will do
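
A minimal sketch of the point above, assuming a script that wants plain lines from standard input. With `*input*`, the value depends on the flag bb was started with (-i gives a seq of lines, -I a seq of EDN values), while reading `*in*` explicitly keeps the input handling visible in the script itself. The helper name read-lines is hypothetical, not part of babashka.

```clojure
(require '[clojure.java.io :as io])

;; read-lines is a hypothetical helper name used for illustration only.
;; It reads every line from the given reader into a vector of strings,
;; independent of any -i / -I flag on the bb command line.
(defn read-lines [rdr]
  (vec (line-seq (io/reader rdr))))

;; Typical use inside a script:
;; (doseq [line (read-lines *in*)]
;;   (println line))
```

Because the function takes the reader as an argument, it can also be exercised from a REPL with a java.io.StringReader instead of real standard input.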

jumar08:12:41

Great, many thanks!

👍 3
jumar09:12:02

What's the recommended way to parse HTML in a babashka script? I looked at https://github.com/borkdude/babashka/blob/master/doc/projects.md#pods and found https://github.com/jaydeesimon/pod-jaydeesimon-jsoup but I'm not sure what these "pods" are about and how they differ from "libraries".

borkdude09:12:05

@jumar pods are CLIs that you can use as libraries from within babashka. So instead of shelling out manually you can just require a namespace and call functions.

borkdude09:12:10

@jumar the bootleg pod is the most common one to use for HTML parsing and production: https://github.com/retrogradeorbit/bootleg#babashka-pod-usage

jumar11:12:14

I've been trying to use bootleg to parse an HTML string for a while but was unable to do so. How can I do that? I basically want to parse the string into a navigable structure - hiccup or similar. Do you know if bootleg exposes a function that I can use?

borkdude11:12:39

ok, let me try it

borkdude11:12:46

(ns bootleg-script
  (:require [babashka.pods :as pods]))

(pods/load-pod "bootleg")
(require '[pod.retrogradeorbit.bootleg.utils :as utils])

(-> "<a>Hello</a>"
    (utils/convert-to :hiccup))

🎉 3
borkdude11:12:20

The utils/convert-to function is versatile: it converts HTML strings to other representations (such as hiccup) and EDN (hiccup) back to HTML
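
A short sketch of both directions, assuming the bootleg pod is installed and on the PATH as in the script above. The :hiccup target comes from this conversation. The :html target is taken from the bootleg README as I recall it, so treat it as an assumption and check it against your bootleg version.

```clojure
(require '[babashka.pods :as pods])

;; Assumes the bootleg binary is installed and available on the PATH.
(pods/load-pod "bootleg")
(require '[pod.retrogradeorbit.bootleg.utils :as utils])

;; HTML string to hiccup (the direction shown earlier in this thread).
(prn (utils/convert-to "<a>Hello</a>" :hiccup))

;; Hiccup back to an HTML string (assumed :html target, see note above).
(prn (utils/convert-to [:p "ahoj"] :html))
```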

borkdude11:12:13

@U1QQJJK89 might also be able to chime in here if he wants. I know he's been busy with other stuff lately. (Hope you're doing alright!)

jumar11:12:31

Hmm, I now tried it in the bb repl and it works as you showed. My problem is that I tried to use this in a Clojure repl (deps.edn - something like Nate uses). deps.edn:

{:aliases {:clj {:extra-deps {;; Additional libs for clojure to match babashka includes
                              org.clojure/tools.cli {:mvn/version "1.0.194"}
                              org.clojure/data.csv {:mvn/version "1.0.0"}
                              org.clojure/data.xml {:mvn/version "0.2.0-alpha6"}
                              babashka/babashka.curl {:mvn/version "0.0.1"}
                              ;; used pods to load bootleg for HTML parsing: 
                              babashka/babashka.pods {:git/url ""
                                                      ;; 
                                                      :sha "1417f30fc4001cc9490b5f83c68630ea877d92d6"}
                              cheshire/cheshire {:mvn/version "5.10.0"}}
                 :extra-paths ["dev"]}}
 :paths ["src"]
 :deps {
        clj-tagsoup/clj-tagsoup {:mvn/version "0.3.0"}}}
In this script:
(ns html
  (:require [babashka.curl :as curl]
            #_[pl.danieljanus.tagsoup :as html]
            [babashka.pods :as pods]))

;;; use pods to load bootleg for html parsing:
;;; Note that you have to install the pod on the system first!
(pods/load-pod "bootleg")
(require '[pod.retrogradeorbit.bootleg.utils :as utils])

(utils/html->hiccup "<html><body><p>ahoj</p></body></html>")
It just freezes on the last line so I guess this isn't a supported use case?

borkdude11:12:04

Works fine over here:

$ bb /tmp/bootleg.clj
[:a "Hello"]
[:html [:body [:p "ahoj"]]]

borkdude11:12:30

ah a Clojure repl

jumar11:12:03

(Yes, working in cider and starting the repl as usual with cider-jack-in using the deps.edn file)

borkdude11:12:31

I think this is due to an incompatibility with pods on the JVM and in sci. For now you could try to use the babashka nREPL

borkdude11:12:44

And please make an issue about this at babashka pods

borkdude11:12:41

Nm, I will make an issue now

jumar11:12:41

Perfect, thanks!

borkdude11:12:24

Ah, I found it. Put a (require '[clojure.zip]) before loading the pod

borkdude11:12:45

@jumar So, this should work on the JVM:

(ns html  (:require [babashka.pods :as pods]))

(require '[clojure.zip])
(def pod (pods/load-pod "bootleg"))
(require '[pod.retrogradeorbit.bootleg.utils :as utils])
(prn (utils/html->hiccup "<html><body><p>ahoj</p></body></html>"))

(pods/unload-pod pod)
(shutdown-agents)

borkdude11:12:10

Note that I added:

(pods/unload-pod pod)
(shutdown-agents)
to make the JVM exit normally.

jumar11:12:07

Excellent! Thanks again.

jumar09:12:06

Hmm, but I'd like to just use a library without having to install any additional tool. Is there another option?

borkdude09:12:08

Not currently. But babashka will try to make working with pods easier in the future, so the pod will be automatically downloaded if you put it in the babashka.edn (which does not exist yet).

👍 3
💯 3
Michaël Salihi10:12:32

Yeah, that's a great feature!

borkdude10:12:17

I'm brainstorming here: https://github.com/borkdude/babashka/issues/473 Feel free to chime in with ideas.

jumar09:12:20

Ok, thanks for your help!

isak18:12:04

What do you guys do to report progress to the user?

isak19:12:53

yea it was very cool to see that it just worked 🙂

nate19:12:53

great summary, thanks for putting it together!

dgb2320:12:06

nice touch: <details><summary> is used in the https://github.com/xapix-io/matchete readme. First time I've seen this in a readme. It's a simple trick to give the layout a bit more hierarchy.

dgb2320:12:58

(the lib is mentioned in the babashka news article)

borkdude20:12:43

@denis.baudinot yeah, certainly. Maybe I'm going to convert the projects.md and news.md pages to this format as well: https://book.babashka.org/ So you will have http://babashka.org/news and http://babashka.org/projects with a nice UI