#beginners
2024-06-21
Noah Bogart01:06:34

scenario: i have an ns form that calls (:require [some-namespace :refer [foo]]) and then later want to use a different function of the same name and change the require to mention the new namespace and then run into this error

; (err) Execution error (IllegalStateException) at noahtheduke.splint.cli-test/eval37390$loading (REPL:5).
; (err) match? already refers to: #'matcher-combinators.test/match? in namespace: noahtheduke.splint.cli-test
(ns-unalias *ns* 'match?) doesn't work, (ns-unalias 'noahtheduke.splint.cli-test 'match?) doesn't work, (ns-unalias (the-ns 'matcher-combinators.test) 'match?) doesn't work, so what is the correct formulation to make this work?

phronmophobic01:06:13

(ns-unmap *ns* 'match?)
and then require again

phronmophobic01:06:22

ns-unalias is for aliases: with (require '[foo.bar :as foo]) you could unalias foo.

👍 1
Noah Bogart01:06:23

i thought unmap was for vars defined in the namespace
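For reference, a minimal sketch of the distinction (the matcher-combinators require mirrors the error above; the clojure.string alias is just illustrative):

;; :refer creates a symbol->var mapping in the namespace; ns-unmap removes it
(require '[matcher-combinators.test :refer [match?]])
(ns-unmap *ns* 'match?)    ; 'match? can now be referred from a different namespace

;; :as creates a namespace alias; ns-unalias removes it
(require '[clojure.string :as str])
(ns-unalias *ns* 'str)     ; removes the str alias, leaves referred vars alone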

seancorfield01:06:45

I can't remember what editor you use but here's my "nuclear" option for VS Code: https://github.com/seancorfield/vscode-calva-setup/blob/develop/calva/config.edn#L121-L140 -- a bunch of ns-unalias and ns-unmap calls. I rarely need it, but it has the benefit of very local destruction 🙂

😂 1
👀 1
Noah Bogart01:06:04

that's great, thank you

growthesque07:06:10

What is the preferred way to traverse linked lists/seqs? From what I understand, loop/recur or reduce are supposedly more efficient than get/nth, because even though both have to traverse the whole sequence to get to the last item, get and nth are designed for random access and do more things than loop/recur and reduce. But with loop/recur you also need to check for empty? and use an if block, so possibly reduce is best?

daveliepmann08:06:26

Depends — what are you trying to do?

growthesque08:06:27

get the last item of a sequence without using last

daveliepmann08:06:40

i'd try to have the items in a vector and use https://clojuredocs.org/clojure.core/peek

🙏 1
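For reference, a quick sketch of that suggestion (the collections here are just illustrative):

;; peek on a vector returns the last element in effectively constant time
(peek [1 2 3])   ;; => 3

;; but peek on a list returns the first element, so the collection type matters
(peek '(1 2 3))  ;; => 1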
growthesque08:06:33

is peek still efficient in this scenario when factoring in the cost of vectorizing the sequence?

daveliepmann08:06:53

i don't know; I suspect it depends

growthesque08:06:22

i guess the conversion process still takes O(n) time, but if vec is faster than get, then it's probably better

exitsandman08:06:07

vec then peek will very likely just be slower than last

daveliepmann08:06:33

how many times will you peek ?

exitsandman08:06:06

If you're looking at multiple random access ops then putting the seq in a vec will be good, but for a one-off you're best off with nth and similar

daveliepmann08:06:53

do you need the rest of the sequence?

growthesque08:06:06

@U074Z2B4FGA supposedly nth and get are designed for random access and check indices and stuff like that so reduce could be better as it just iterates and returns? @U05092LD5 no

exitsandman08:06:58

nth and friends are bad when repeatedly used; when used once they're the best option

daveliepmann08:06:59

is it possible to produce the last value as a single value with e.g. reduce instead of producing a sequence and then calling last on it?
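For reference, two single-pass sketches along those lines (the function names are illustrative):

;; keep only the most recent element while reducing; nil for an empty seq
(defn last-via-reduce [xs]
  (reduce (fn [_ x] x) nil xs))

;; explicit loop/recur version, also a single pass
(defn last-via-loop [xs]
  (loop [s (seq xs)]
    (when s
      (if (next s)
        (recur (next s))
        (first s)))))

(last-via-reduce [1 2 3]) ;; => 3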

daveliepmann08:06:13

and is this a large enough sequence that any of this matters?

growthesque08:06:22

@U05092LD5 it doesn't matter for practical purposes, I am asking for learning's sake.

👍 1
growthesque08:06:01

can I use time reliably to compare performance of different approaches?
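For what it's worth, a rough single-shot comparison with time might look like the sketch below; time measures one evaluation, so results vary between runs (JIT warm-up, GC), and repeating each form and comparing the later runs gives a more reliable picture.

(time (last (range 1000000)))                     ;; prints "Elapsed time: ... msecs"
(time (reduce (fn [_ x] x) nil (range 1000000)))  ;; prints "Elapsed time: ... msecs"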

exitsandman08:06:06

Overall, in cases where you build a seq just to throw it all away aside from its nth element, the main optimization is reconsidering your code to avoid having to build the seq in the first place.

2
exitsandman08:06:21

That said, if you aren't too worried about optimization and your seq represents an iterative process (in which case the linear traversal cost is inherent to the algorithm you're using) then you can mostly just do what's immediate to you.

Jim Newton08:06:55

I have some questions about some example code I downloaded from Andrey Fadeev: https://gist.github.com/andfadeev/35c351682c2df7d168c0473e4b96323e

Jim Newton08:06:57

There is the following curious function definition:

(defn parallel-requests
  []
  (let [urls (->> (client/get ""
                              {:as :json})
                  :body
                  :results
                  (map :url))]

    (->> urls
         (map (fn [url]
                (future
                  (-> (client/get url {:as :json})
                      :body
                      (select-keys [:name :shape])))))
         (map deref)
         (doall))))

Jim Newton08:06:34

In particular the part that begins (->> urls (map …. future) (map deref) …)

Jim Newton08:06:58

what does this do, considering that map is lazy? To me it means to create a lazy sequence of futures, then lazily deref them. so the first future starts, it gets derefed, when we try to deref the second, the first lazy sequence then starts the second future and it gets derefed, then the third future is started and it gets derefed. Am I interpreting it correctly? It seems the author’s intention was to start all the futures and then deref them.

Jim Newton08:06:55

But that’s not what happens when I actually call the function.

daveliepmann08:06:42

> so the first future starts, it gets derefed, when we try to deref the second, the first lazy sequence then starts the second future and it gets derefed, then the third future is started and it gets derefed.

this doesn't sound right. map's laziness is chunked. I think after each chunk it gets to work on the next chunk, not on the next map. so the first future is sent off, then the second, and so on to the nth equal to chunk size, and it does the same with the next chunk until the sequence is consumed. then the derefs start, again in a chunked manner, meaning each chunk waits for all its futures to finish dereferencing before futures in the next chunk are dereferenced

Jim Newton08:06:39

yikes! so when I call (first (filter (fn [x] some-extremely-expensive-computation) some-sequence)), then the expensive function may be called a large number of times?

daveliepmann08:06:08

"large" in the sense of "up to the chunk size" which IIRC is 32

daveliepmann08:06:30

i'm a fan of some in those scenarios

Jim Newton08:06:57

some has different semantics. it finds the first truthy return value, not the first input value which matches the predicate.
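For reference, a small sketch of that difference:

;; some returns the first truthy value produced by the predicate
(some even? [1 3 4 5])            ;; => true

;; first + filter returns the first input that satisfies the predicate
(first (filter even? [1 3 4 5]))  ;; => 4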

Jim Newton08:06:05

I seem to recall having this conversation once before.

Jim Newton08:06:15

I find this very confusing. This code from Andrey Fadeev depends on the fact that lazy sequences are chunked.

daveliepmann08:06:14

(re: some i understand that when we discussed this last, you specifically wanted something which handled false and nil in a particular way which some doesn't. in actual practice I don't find this to be so much of an issue — curious if it's an issue in this particular extremely-expensive-computation)

Jim Newton08:06:57

Yes, those are two different issues. But my interpretation of the future code above was based on the fact that I forgot about chunking. If you suppose the chunk size is 1, then the code above becomes serial. Somehow this seems like bad semantics to me.

Jim Newton08:06:32

It seems that the intent of the code is to 1) start the futures, 2) wait on them, and return a non-lazy seq of the return values, as evidenced by the call to doall.

lassemaatta08:06:38

I think it's generally considered a poor idea to mix side-effects and lazy sequences

1
1
💯 1
Jim Newton08:06:05

@U0178V2SLAY that sounds like a better way to express my concern.

Jim Newton08:06:34

the author tries to make the code non-lazy with the call to doall but he needs to call doall twice, once for each lazy sequence

Jim Newton08:06:59

then it would work regardless of chunk size.

daveliepmann08:06:25

what about it doesn't "work"?

Jim Newton08:06:43

the code accidentally works because the chunk size is 32.

Jim Newton08:06:34

by work I mean demonstrate the performance gain. The author is illustrating to beginners how it is faster to fetch all the pokemon urls in parallel, and then extract tags from their json representations. Thus you basically parallelize the wait times of the http requests

👍 1
Jim Newton08:06:51

If I change the code to the following, then it starts all the futures, then derefs them all

(defn parallel-requests
  []
  (let [urls (->> (client/get ""
                              {:as :json})
                  :body
                  :results
                  (map :url))]

    (->> urls
         (map (fn [url]
                (pr-info (format "starting future for url: %s" url))
                (future
                  (-> (client/get url {:as :json})
                      :body
                      (select-keys [:name :shape])))))
         (doall)
         (map (fn [f] 
                (pr-info (format "deref %s" f))
                (deref f)))
         (doall))))

Jim Newton08:06:55

I have a related question, but I’ll ask in the main channel

Jim Newton09:06:16

I suppose this also means that (reduce f sx) will realize more of the sequence as well even if f returns a reduced value.
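A small sketch of that supposition, assuming a chunked lazy source built with map (the println only marks realization):

;; reduce stops calling f once it sees the reduced value, but the chunked map
;; has already realized the whole 32-element chunk
(reduce (fn [acc x]
          (if (= x 3) (reduced acc) (conj acc x)))
        []
        (map #(do (println "realizing" %) %) (range 100)))
;; prints "realizing 0" through "realizing 31", returns [0 1 2]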

lassemaatta09:06:35

(not sure if I'm qualified to answer but..) one thing to also keep in mind is that (as far as I understand) nothing actually guarantees/specifies that every lazy seq is always chunked; it's up to the original data source to choose to chunk or not. And of course the chunking can be altered with stuff like that re-chunk, where the consumers of the returned seq see a different chunking behaviour. For example (as I just learned) it seems e.g. (range) ends up calling (iterate inc' 0), which calls clojure.lang.Iterate/create and supplies inc' as the function f. But Iterate does no chunking; it calls f on demand once every time someone asks for the next value. So doing something like (->> (range) (map foo) (map bar) ...) processes items one at a time. But e.g. (range 100) works differently and chunks.
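A quick sketch of that difference between (range) and (range 100) (the println only marks realization):

;; unbounded range is backed by Iterate, which is not chunked
(first (map #(do (println "realizing" %) %) (range)))
;; prints only "realizing 0"

;; bounded range is chunked, so a whole 32-element chunk is realized
(first (map #(do (println "realizing" %) %) (range 100)))
;; prints "realizing 0" through "realizing 31"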

Jim Newton08:06:07

Can someone please help me understand how this works? Basically, if a given lazy sequence has a chunk size of, say, 32, how does this code avoid the inner sequence still chunking by 32?

(defn re-chunk 
  "Given a lazy sequence, change the chunking buffer size to n.
  This code was taken directly from
  "
  [n xs]
  (lazy-seq
    (when-let [s (seq (take n xs))]
      (let [cb (chunk-buffer n)]
        (doseq [x s] (chunk-append cb x))
        (chunk-cons (chunk cb) (re-chunk n (drop n xs)))))))

Jim Newton08:06:14

it seems to me that (take n xs) would trigger 32 values to be computed, and a seq of n of them to be returned, similar to (first (filter p xs)). Why is (take n …) different from first in this regard?
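For reference, a sketch of what a downstream consumer of re-chunk sees, assuming a chunked source like (range 100) and n = 1 (the println only marks realization); the take inside re-chunk may still pull values out of the original source at its own chunk size, but consumers of the returned seq only ever see chunks of size n:

;; downstream of (re-chunk 1 ...), map realizes one element at a time
(first (map #(do (println "mapping" %) %)
            (re-chunk 1 (range 100))))
;; prints only "mapping 0"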