#beginners
2021-03-17
Chase13:03:28

Why does (clojure.string/split (clojure.string/lower-case "Hello world") #"\W+") give me my intended behavior but (->> "Hello world" (str/lower-case) #(str/split % #"\W+")) doesn't? I think I am misunderstanding how to use anonymous functions inside a threading macro.

alexmiller13:03:35

yeah, you can't do that - it's helpful to look at the expansion:

user=> (pprint (macroexpand '(->> "Hello world" (str/lower-case) #(str/split % #"\W+"))))
(fn* [p1__8#] (str/split p1__8# #"\W+") (str/lower-case "Hello world"))

alexmiller13:03:25

the #() is expanded to (fn* [...] ...) first, and then the ->> is threaded into the last expression of the (fn* ...), which is not what you intended

Chase13:03:17

ahhh, ok. Very good. I'll take a different approach. This is part of a bigger transformation so I was trying to shoehorn it into a threading macro

alexmiller13:03:36

you can make it work by wrapping the #():

(->> "Hello world" (str/lower-case) (#(str/split % #"\W+")))

Chase13:03:58

ahh. is that idiomatic though?

alexmiller13:03:00

but generally I find that to be more confusing to read than not doing it in the first place

Chase13:03:13

fair enough

Chase13:03:51

Yep, I was looking at those. Of course, this was the only part of a roughly 5-step process that wasn't thread-last

Chase13:03:35

How would you approach this:

(let [s "The foo the foo the\ndefenestration the"] (doseq [w (->> s (str/lower-case) (#(str/split % #"\W+")) (frequencies) (sort-by val >) (map #(str (key %) " " (val %))))] (println w)))

Chase14:03:05

not sure why the formatting got a little funky there

Chase14:03:26

I was just playing around with this while reading this article: https://benhoyt.com/writings/count-words/

manutter5114:03:38

(Triple backticks to format multi-line code)

manutter5114:03:49

What about something like this:

(let [s "The foo the foo the\ndefenestration the"
      word-split #(str/split % #"\W+")]
  (doseq [w (->> s
                 (str/lower-case)
                 (word-split)
                 (frequencies)
                 (sort-by val >)
                 (map #(str (key %) " " (val %))))]
    (println w)))

Chase14:03:29

Yeah, I like this approach, ty

👍 3
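
Another option for a pipeline where only the first steps are not thread-last is to start with `->` and switch to `->>` partway through. This is a minimal sketch, not something proposed in the conversation; the names `text` and `counts` are made up for illustration. It relies on `str/lower-case` and `str/split` both taking the string as their first argument, and on `->` inserting the threaded value as the first argument of the nested `->>` form.

```clojure
(require '[clojure.string :as str])

;; Sample input taken from the question above.
(def text "The foo the foo the\ndefenestration the")

;; -> handles the string-first steps; the nested ->> takes over
;; for the sequence steps, which want the collection last.
(def counts
  (-> text
      str/lower-case
      (str/split #"\W+")
      (->> frequencies
           (sort-by val >))))

(doseq [[word n] counts]
  (println word n))
```

Whether this reads better than naming the helper in a `let` is a matter of taste; it mainly avoids the extra parentheses around `#()`.
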
Hagenek14:03:49

I'm doing some data validation stuff in Clojure and I have encountered something I don't understand well enough yet. How do you gracefully handle e.g. AssertionErrors in Clojure? In another language I would return out of the function and console.log the error, while in Clojure I would need some `if` logic to not run the rest of the function? The validate-user-data function returns nil on successful validation, and the validation error as a map on unsuccessful validation. Here is what I am working with:

(defn user-vector-builder
  "Takes a user-map and returns a validated and formatted user vector
   expects keys :email and :age and both values should be of string type"
  [user-m]
  (try
    (assert (= (validate-user-data user-m) nil) (str "Invalid data: " (validate-user-data user-m)))
    (catch AssertionError e (println (:case  e))))
  [(:age user-m) ; Header: alder
   (:email user-m) ; Csv-header epost
   (if (>= (Integer/parseInt (:age user-m)) 18)
     "1"
     "0") ; Csv-header myndig
   (str (t/ago (t/millis 1337))) ; Csv-header timestamp
   ])

Hagenek14:03:27

Do not hold back on criticism of any part of this code btw, I always want to improve as much as possible

gon14:03:26

I would use a different approach here ... instead of try/catch

(if (validate-user-data ....)
  (do fail here)
  (do my-code-here))
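
A slightly fuller sketch of that idea, under some assumptions: since validate-user-data is described as returning nil on success and an error map on failure, the truthy branch is the failure case, and `if-let` can name the error at the same time. The `validate-user-data` below is a hypothetical stand-in written only so the example runs, and the timestamp column is left out to avoid depending on the time library.

```clojure
;; Hypothetical stand-in for the real validate-user-data:
;; returns nil when the data is valid, otherwise a map describing the problem.
(defn validate-user-data [user-m]
  (cond
    (not (string? (:email user-m)))         {:error "email must be a string"}
    (not (string? (:age user-m)))           {:error "age must be a string"}
    (not (re-matches #"\d+" (:age user-m))) {:error "age must be numeric"}
    :else nil))

(defn user-vector-builder
  "Returns [age email adult-flag] for valid input.
   For invalid input, prints the validation error and returns nil."
  [user-m]
  (if-let [error (validate-user-data user-m)]
    (println "Invalid data:" error) ; println returns nil
    [(:age user-m)
     (:email user-m)
     (if (>= (Integer/parseInt (:age user-m)) 18) "1" "0")]))
```

With this shape the vector is only built when validation passes, so there is no need for try/catch around an assert. Callers do have to handle the nil result; returning the error map instead of printing it would be another reasonable choice.
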

Chase16:03:39

So back to my previous discussion about that count words benchmark article I posted earlier, I made the program into a babashka script to see how it compares to the other languages listed. I came up with this:

#!/usr/bin/env bb

(require '[clojure.string :as str])
(require '[clojure.java.io :as io])

(defn read-lines [file]
  (with-open [r (io/reader file)]
    (doall (line-seq r))))

(let [file (first *command-line-args*)
      word-split #(str/split % #"\W+")]
  (doseq [w (->> (read-lines file)
                 (str/lower-case)
                 (word-split)
                 (frequencies)
                 (sort-by val >)
                 (map #(str (key %) " " (val %))))]
    (println w)))

It seems to be quite performant and looks way "simpler" to my biased self! What would you folks do differently?

delaguardo16:03:37

you could use transducers instead of the threading macro to avoid constructing intermediate collections

Chase16:03:05

Ooh, that would be a good exercise. I don't know why but I've let this whole transducer thing intimidate me

andy.fingerhut16:03:23

Using existing transducers is way less intimidating than implementing a new one (which is usually unnecessary), or fully understanding all aspects / tradeoffs of their implementation.
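
For a rough idea of what that could look like using only transducers that ship with clojure.core, here is a minimal sketch. The function name `word-counts` and the sample lines are made up for illustration, and this is one possible arrangement rather than the version anyone in the channel wrote. It takes a collection of lines, splits and lower-cases each line inside the transducer, and accumulates counts directly into a map.

```clojure
(require '[clojure.string :as str])

(defn word-counts
  "Counts lower-cased words across a collection of lines."
  [lines]
  (transduce
    ;; mapcat splits each line into words; remove drops empty strings
    ;; that str/split can produce at the start of a line.
    (comp (mapcat #(str/split (str/lower-case %) #"\W+"))
          (remove str/blank?))
    ;; completing supplies the one-argument arity that transduce requires.
    (completing (fn [counts word] (update counts word (fnil inc 0))))
    {}
    lines))

(def sorted-counts
  (sort-by val > (word-counts ["The foo the foo the" "defenestration the"])))
```

The sort still happens on the final map, but no intermediate sequence of all words is built along the way.
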

dpsutton16:03:06

https://chrispenner.ca/posts/wc is an excellent post with a line of thinking exactly towards this problem and about 13 different ways they approached it