#clojure
2021-06-16
Juλian (he/him)12:06:47

what's the proper way to parse HTML? there's clj-xpath, but that expects valid XML. should I have a look at hickory? or is there some other library I didn't find yet?

jumar12:06:22

I used Hickory in the past and it worked well

borkdude13:06:57

I personally like jsoup a lot

alexmiller14:06:13

I thought you were supposed to use regex?

😆 6
borkdude14:06:02

@U064X3EF3 I think using clojure.spec.alpha is the correct approach

noisesmith14:06:04

it's been some years, but a huge gotcha for the jsoup bindings I found: it returns a lazy-seq of document tags and data, and it creates a gc root for the underlying jsoup object in a helper thread. it only allows the jsoup object to be freed if you consume the entire document. the issue I ran into is that I wanted to lazily scan each document for a specific tag containing matching data (it's a lazy-seq, this is what they are for, right?), and the book-keeping the library did meant I had to choose between reading an entire document I don't need or leaving the jsoup object hanging as unrecoverable garbage in the vm.

noisesmith14:06:48

yet another example of "never mix laziness and state"
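
A generic illustration of the "never mix laziness and state" point, not the jsoup bindings discussed above: a lazy seq backed by a reader is returned after the reader has already been closed. The names lazy-lines and eager-lines are made up for this sketch.

```clojure
;; Sketch only: lazy-lines and eager-lines are hypothetical helper names.

(defn lazy-lines
  "Returns a lazy seq whose reader is already closed by the time it is used."
  [text]
  (with-open [r (java.io.BufferedReader. (java.io.StringReader. text))]
    (line-seq r)))

(defn eager-lines
  "Realizes every line while the reader is still open."
  [text]
  (with-open [r (java.io.BufferedReader. (java.io.StringReader. text))]
    (doall (line-seq r))))

;; line-seq reads the first line eagerly, so this much still works.
(def first-line (first (lazy-lines "a\nb\nc")))

;; Realizing the rest touches the closed reader and throws.
(def lazy-outcome
  (try
    (doall (lazy-lines "a\nb\nc"))
    :no-error
    (catch Exception _ :failed)))

;; Forcing the seq inside with-open avoids the problem.
(def eager-outcome (eager-lines "a\nb\nc"))
```

The trade-off is the same one described above: either the whole input is consumed up front, or the state behind the seq outlives (or, here, dies before) its consumer.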

borkdude14:06:01

jsoup bindings? just use jsoup directly

noisesmith14:06:18

(it was years ago but my tech lead at the time was highly averse to adding any java interop to our codebase)

Juλian (he/him)06:06:27

Thanks, I'll have a look at jsoup and hickory

km19:06:04

Hey guys. Trying to use clojure.spec to write a simple string calculator, i.e. "2+2" => 4. It works, but it seems really verbose! I also don't like how I used regex-replace before conforming to spec (I'd rather use 100% spec). Any ideas how I could clean it up?

(ns demo.calculator
  (:require
   [clojure.spec.alpha :as s]
   [clojure.string :as string]))

(s/def ::digit (set "0123456789.n"))
(s/def ::operator (s/cat :kw #{:operator} :val char?))
(s/def ::expression (s/* (s/alt :number number?
                                :operator #{\/ \* \+ \-}
                                :parenthetical ::parenthetical)))
(s/def ::parenthetical 
       (s/cat :open #{\(}
              :body (s/* (s/alt :number number?
                                :operator #{\/ \* \+ \-}
                                :parenthetical ::parenthetical))
              :close #{\)}))

(def order-of-operations
  [[:operator \*]
   [:operator \/]
   [:operator \+]
   [:operator \-]])

(defn some-index [coll ops]
      (if (empty? ops) nil
          (let [i (.indexOf coll (first ops))]
            (if (= i -1)
              (some-index coll (rest ops))
              i))))

(defn evaluate [expression]
  (let [index (some-index expression order-of-operations)
        or-paren #(let [e (Exception. (str "invalid arithmetic at " %))]
                    (case (first %)
                    :parenthetical (-> % second :body evaluate)
                    :number (second %)
                    (throw e)
                    ))]
    (if-not index (or-paren (first expression))
            (let [[left-hand operator right-hand :as operation]
                  (take 3 (drop (dec index) expression))
                  before (take (dec index) expression)
                  after (drop (+ 2 index) expression)
                  result [:number ((resolve (symbol (str (second operator))))
                                   (or-paren left-hand)
                                   (or-paren right-hand))]]
              (evaluate
               (concat before
                       [result]
                       after))))))

(defn parse-math [arithmetic]
  (->>
   (string/replace
    (string/replace arithmetic #" " "")  #"(^|[^0-9])-([0-9]+)" "$1n$2")
   seq       
   (partition-by #(s/valid? ::digit %))
   (map #(if (s/valid? ::digit (last %))
           (Integer/parseInt (string/replace (apply str %) #"n" "-"))
           %))
   flatten
   (s/conform ::expression)
   evaluate))

(= 1/2 (parse-math "1/2"))
(= 42 (parse-math "(((((((((((42)))))))))))"))
(try (parse-math "2+*2")
     (catch Exception e true))
(= -4 (parse-math "-1 * (2 * 6 / 3)"))

zendevil19:06:16

What is the time and space complexity of the core count function?

ghadi19:06:58

On a vector or map, O(1)

ghadi19:06:17

On a lazy seq, you have to realize the seq to count it

zendevil19:06:07

why the hell does the racket implementation of lisp have a O(n) time for the built-in length function?

isak19:06:31

don't they use linked lists?

phronmophobic19:06:36

clojure has counted? to check if a coll implements count in constant time
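
A quick REPL-style sketch of counted? on a few built-in collections. The var names are only for illustration.

```clojure
;; Collections that carry their own count report true.
(def vector-counted (counted? [1 2 3]))
(def map-counted    (counted? {:a 1}))
(def list-counted   (counted? '(1 2 3)))

;; Seqs that have to be walked to be counted report false.
(def lazy-counted   (counted? (map inc [1 2 3])))
(def cons-counted   (counted? (cons 0 [1 2 3])))

;; count still works on everything; it just has to realize the lazy seq first.
(def lazy-count (count (map inc [1 2 3])))
```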

zendevil19:06:32

How are vectors and maps implemented differently in Clojure to give O(1) time for count as opposed to O(n) time for length in racket? And is the space complexity for count O(1) too?

phronmophobic19:06:24

There's lots to be written about how to implement data structures with different time/space complexities for various operations. I think why may be more interesting than how. From https://download.clojure.org/papers/clojure-hopl-iv-final.pdf
> 3.4 Clojure’s Data Structures
>
> While linked lists are available and can be implemented in most programming languages, and have a certain elegance and functional heritage, their use is dominated in professional practice by the use of vectors/arrays and hashtables/associative maps. Of course, lists are sequential like arrays and one can access the nth element, and e.g., Common Lisp has a notion of association lists and a function ‘assoc’ for creating mappings and thus treating a list of pairs like a map. But it highlights an important aspect of programming and difference from mathematics in that merely getting the right answer is not good enough. Not all isomorphisms are viable alternatives; the leverage provided by indexing support matters. I felt Clojure would be a non-starter for practitioners without credible substitutes for O(1) arrays and maps.

👍 3
phronmophobic20:06:57

Answering how clojure implements its data structures is a little complicated since maps and vectors are primarily defined by their interface rather than their implementation. In fact, the implementation can "change" as these collections shrink and grow even though the interface stays the same.

👍 6
quoll20:06:20

user=> (type (zipmap (range 8) (range 8)))
clojure.lang.PersistentArrayMap
user=> (type (zipmap (range 9) (range 9)))
clojure.lang.PersistentHashMap

👆 6
didibus20:06:09

The simple "how" is just that Clojure map and vector keep track of the count as elements are added/removed, so when you ask for the count, it's already been computed, you don't have to go through and count things.

Elliot Stern14:06:11

count is O(1) in Racket on vectors and can be O(1) or O(n) on hash tables. The big difference is that cons lists and assoc lists (cons lists of key value pairs) are much more idiomatic in Racket, and the vector implementation is much less generally useful than in Clojure.

Elliot Stern14:06:15

Racket doesn’t provide a persistent vector with a constant time vector-set! function; you can only update mutable vectors. Clojure gets around that by using a comparatively recent data structure for maps and vectors, the Hash Array Mapped Trie, which allows amortized constant updates and access.

didibus02:06:23

I don't think anything about Hash Array Mapped Trie is needed for the O(1) count though, Clojure just literally counts elements as calls to assoc are made.

zendevil06:06:44

@U0K064KQV can you point me to the loc’s where this mutation on count’s value happens?

didibus06:06:03

Ya, if you look here: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentHashMap.java#L139
For maps, each assoc implementation returns a new map with the count parameter set to count + 1, and here for vector on cons is the same: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L228
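
The count-tracking idea can be sketched with a toy persistent stack built from plain maps. This is not how PersistentHashMap or PersistentVector are written; empty-stack, push and size are hypothetical names used only to show each new version carrying count + 1.

```clojure
;; Toy sketch: every "version" of the stack stores its own element count.

(def empty-stack {:cnt 0 :head nil :tail nil})

(defn push
  "Returns a new stack; the old one is untouched and keeps its old count."
  [stack x]
  {:cnt (inc (:cnt stack)) :head x :tail stack})

(defn size
  "Constant time: reads the stored count instead of walking the tail."
  [stack]
  (:cnt stack))

(def s2 (-> empty-stack (push :a) (push :b)))
(def s3 (push s2 :c))
```

Nothing is mutated here: s2 still reports 2 after s3 is created, which is the same pattern as returning a new map with the count set to count + 1.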

dpsutton22:06:02

is there a way to remove a var from the current namespace? I did a (apply require clojure.main/repl-requires) not realizing there is a function in this ns called source. Now re-evaluating the file errors saying that source already refers to: #'clojure.repl/source in namespace

Dane Filipczak22:06:18

is it not (ns-unmap *ns* 'do-something)

dpsutton22:06:07

that's exactly it

dpsutton22:06:24

i had forgotten it was unmap. i was looking at apropos results for remove and var

dpsutton22:06:26

thanks dane!
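
A minimal sketch of the ns-unmap fix, run against a throwaway namespace instead of the current one. The namespace name demo.unmap-sketch and the var names are assumptions for the example.

```clojure
(require 'clojure.repl)

;; Throwaway namespace standing in for the one being re-evaluated.
(def sandbox (create-ns 'demo.unmap-sketch))

;; Simulate the accidental refer of clojure.repl/source.
(binding [*ns* sandbox]
  (refer 'clojure.repl :only '[source]))

(def before (ns-resolve sandbox 'source)) ; the referred clojure.repl var

;; Defining our own `source` now clashes with the referred var.
(def clash
  (try
    (intern sandbox 'source (fn [] :mine))
    :ok
    (catch Exception _ :already-refers)))

;; Remove the mapping, after which the name is free again.
(ns-unmap sandbox 'source)
(def after (ns-resolve sandbox 'source))

(def own-var (intern sandbox 'source (fn [] :mine)))
```

At the REPL the same thing is just (ns-unmap *ns* 'source) followed by re-evaluating the file.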

seancorfield22:06:41

Removes aliases, interned names (public + private), refers except core.

Dane Filipczak22:06:16

@U04V70XH6 so cool, borrowing this : )

dpsutton22:06:23

neat. thanks sean

seancorfield22:06:07

It’s nice because it cleans out a namespace without actually destroying the ns itself, so loading the file “does the right thing” and any other nses that hold references to this ns — via :require — don’t get broken. I don’t have to use it very much, but it’s “just enough cleanup” to avoid any of those reload/refresh things…
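
The snippet being discussed is not preserved in this log. Going only by the description (remove aliases, interned names, and refers except core), a cleanup along those lines might look like the sketch below; clean-ns! and demo.scratch are made-up names, and this is not Sean's actual code.

```clojure
(require 'clojure.repl 'clojure.string)

(defn clean-ns!
  "Sketch: strips aliases, interned vars and non-core refers from a namespace
  without removing the namespace object itself."
  [ns-sym]
  (let [n (the-ns ns-sym)]
    (doseq [alias-sym (keys (ns-aliases n))]
      (ns-unalias n alias-sym))
    (doseq [sym (keys (ns-interns n))]
      (ns-unmap n sym))
    (doseq [[sym v] (ns-refers n)
            :when (not= 'clojure.core (ns-name (:ns (meta v))))]
      (ns-unmap n sym))
    n))

;; Build a scratch namespace with one alias, one interned var and two refers.
(def scratch (create-ns 'demo.scratch))

(binding [*ns* scratch]
  (refer 'clojure.core)
  (refer 'clojure.repl :only '[source])
  (alias 'string 'clojure.string))

(intern scratch 'helper (fn [] :old))

(clean-ns! 'demo.scratch)
```

Because the namespace object survives, other namespaces that required it keep a valid reference, and reloading the file repopulates it.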

dpsutton22:06:43

oh i'm jealous reading this config in cljs instead of elisp 🙂

seancorfield23:06:26

VS Code/Clover or Atom/Chlorine — exact same config (and hot keys) work on both.

coby22:06:23

I need to serialize HTTP (Ring) requests for a debugging tool I'm writing in ClojureScript. Each request is enriched with extra data, such as plugins (functions) that have been loaded, and random stuff like Reitit Match records. These are things that the ClojureScript environment isn't going to know about so I need a generic, legible way to represent them to the user. What's a good strategy for doing this? I found this article about serializing and deserializing Clojure fns with Nippy (https://tech.redplanetlabs.com/2020/01/06/serializing-and-deserializing-clojure-fns-with-nippy/), but I think that's overkill for what I need (and Nippy doesn't work in CLJS).

phronmophobic23:06:15

If nippy is overkill, then maybe serializing your data as edn is appropriate?

phronmophobic23:06:14

If you don't need to store data and are just transmitting data between applications, then transit might be a good fit, https://github.com/cognitect/transit-clj

coby23:06:21

> maybe serializing your data as edn is appropriate?
That's the idea, but that results in fns that the EDN reader doesn't know how to deserialize client-side. Here's the current process, for context:
• Enrich each request on the server as it comes in
• Call (prn-str req) and send it over a websocket to the debugger
• On the debugger client side, do (edn/read-string req)
I think the issue here is with prn-str being too simplistic

phronmophobic23:06:06

If you're serializing to transmit between applications and aren't storing the data, then check out transit

phronmophobic23:06:07

I'm not aware of any serializer that can generically serialize functions. You'll probably have to preprocess the data or ignore that part of the data.

phronmophobic23:06:09

There are also some caveats when using pr-str like *print-length*. Let me see if I can remember a good reference that covers them.
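
A small sketch of the *print-length* caveat on the JVM side: if the var happens to be bound (some REPL setups do this), pr-str silently truncates. safe-pr-str is a hypothetical helper name.

```clojure
(require 'clojure.edn)

(def data (vec (range 5)))

;; With *print-length* bound, the printed form is cut short.
(def truncated
  (binding [*print-length* 2]
    (pr-str data))) ; "[0 1 ...]"

(defn safe-pr-str
  "Prints with the truncating vars explicitly unbound."
  [x]
  (binding [*print-length* nil
            *print-level*  nil]
    (pr-str x)))

;; Round trip survives even when the caller has *print-length* set.
(def round-tripped
  (binding [*print-length* 2]
    (clojure.edn/read-string (safe-pr-str data))))
```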

coby23:06:16

Yeah, I guess what I'm asking about is the pre-processing part. I need to transform functions and other "unknown" entities like third-party records into simple symbols or strings, something that won't result in tagged values like #object that EDN/CLJS would need to load extensions for.

coby23:06:43

Transit is great, I will probably switch to it eventually to avoid the *print-length* trap etc. But AFAICT it'll still come up against the limitation of not knowing how to deserialize random library object client-side. It's okay to lose some fidelity in those cases and just render strings instead...so maybe what I'm looking at is extending Datafiable for fns etc. via metadata? Just a little hazy on how that would work. 🙂
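
One way the Datafiable-via-metadata idea could work on the server (JVM, Clojure 1.10+), as a rough sketch: attach a datafy implementation to the fn's metadata so that datafy returns a plain label. describe-as and the label text are assumptions for the example.

```clojure
(require 'clojure.core.protocols 'clojure.datafy)

(defn describe-as
  "Returns f with metadata that makes datafy yield the given label."
  [f label]
  (with-meta f {'clojure.core.protocols/datafy (fn [_] label)}))

(def tagged-inc (describe-as inc "fn: clojure.core/inc"))

;; The wrapped fn is still callable...
(def still-works (tagged-inc 1))

;; ...and datafy now returns something any EDN reader can handle.
(def as-data (clojure.datafy/datafy tagged-inc))
```

This only helps for values you get to wrap yourself; third-party records would need a protocol extension or a walk over the data instead.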

phronmophobic23:06:55

another option is to use clojure.walk/prewalk or similar and remove any types you don't recognize
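
A rough sketch of the prewalk approach, written for the server side. Instead of removing unknown values it swaps them for a class-name string, and it flattens records into plain maps. printable-leaf?, sanitize and the Match record here are hypothetical stand-ins.

```clojure
(require 'clojure.walk 'clojure.edn)

(defn printable-leaf?
  "Leaves that pr-str and the EDN reader handle without extra tags."
  [x]
  (or (nil? x) (boolean? x) (number? x) (string? x)
      (keyword? x) (symbol? x) (char? x)))

(defn sanitize
  "Replaces anything EDN would print as a tagged or opaque value."
  [data]
  (clojure.walk/prewalk
   (fn [x]
     (cond
       (record? x)         (into {} x)             ; record -> plain map, then walked
       (coll? x)           x                       ; keep structure, walk children
       (printable-leaf? x) x
       :else               (.getName (class x))))  ; fns, objects -> class name string
   data))

;; Stand-in for a third-party record type.
(defrecord Match [path])

(def request
  {:uri     "/ping"
   :handler inc
   :match   (->Match "/ping")
   :plugins [str]})

(def clean (sanitize request))

;; The sanitized value survives a print/read round trip.
(def round-tripped (clojure.edn/read-string (pr-str clean)))
```

Fidelity is lost for fns and records, which matches the "just render strings instead" trade-off mentioned earlier in the thread.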

coby23:06:32

Right, makes sense. Talking through this has been helpful, thanks!

phronmophobic23:06:13

Unfortunately, I can't find anything recent that does a good job of explaining the "right" way to write edn.

coby23:06:53

I'm not too worried about it as I'll likely switch to Transit and translate anything unrecognized into stuff it can handle out of the box.

👍 3
colliderwriter23:06:15

I need to make an appeal to the collective memory here: somewhere on youtube, there is a video of a presentation about (as far as i remember and it's killing me) building a DSL out of ASTs constructed from a sample data structure about planets(?). My best guess for a time frame is three or four years ago. Does that ring any bells?

dpsutton23:06:38

That’s Tim Baldridge making a logic engine at the Denver Clojure meetup

colliderwriter23:06:27

Thanks. This has been tormenting me for weeks.

dpsutton23:06:08

On GitHub he’s halgari. Look in his repos for “logic” and you’ll find it

Donald Pittard23:06:19

Has anyone played around with gitpod and a clojure environment?

colliderwriter23:06:23

Aha. It was searchproof because it's about zippers!