Fork me on GitHub
#code-reviews
<
2018-04-16
>
Bravi14:04:30

Hi everyone. I’ve created a little script that crawls a specific website and downloads all of the mp3 files there. It’s a Georgian site, and only Georgians would understand this 🙂 But I wanted to perhaps get some suggestions code-wise. It all works, but there’s always a room for improvement right?

(ns play-around.core
  (:require [net.cgrand.enlive-html :as html]
            [org.httpkit.client :as http]
            [ :as io]
            [clojure.string :as string])
  (:gen-class))

(def urls #{"" ""})

(defn- get-dom
  [url]
  (html/html-snippet (:body @(http/get url))))

(defn- build-urls
  [audio-links]
  (for [audio-link audio-links]
    {:download-url (format "" audio-link)
     :filename audio-link}))

(defn- extract-links
  [dom]
  (map #(get-in % [:attrs :href]) (html/select dom [:td :a])))

(defn- download-file
  [{:keys [download-url filename] :as file}]
  (let [filename (string/replace filename #"\/" "_")
        file (io/file "audio" filename)]
    (if-not (.exists file)
      (do
        (println "Downloading" filename)
        (with-open [in (io/input-stream download-url)
                    out (io/output-stream file)]
          (io/copy in out)))
      (println "File" filename "already exists, skipping"))))

(defn- download-files
  [files]
  (doall (map download-file files)))

(defn- is-mp3?
  [link]
  (re-matches #"(?i).*\.mp3" link))

(defn -main
  [& args]
  (when-not (.isDirectory (io/file "audio"))
    (.mkdir (io/file "audio")))
  (->> urls
       (mapcat get-dom)
       extract-links
       (filter is-mp3?)
       build-urls
       download-files))

Charles Fourdrignier14:04:37

@bravilogy My 2 cents, build-urls should be a map, no ? May be you should rewrite the main like

(->>
    urls
   (mapcat get-dom)
   (map ...)
   (filter ...)
   (map ...)
  ...
)
May be this will help you to see patterns, opportunity for transducers,...

Bravi14:04:41

oh good point

Bravi14:04:35

the thing about the last map is that I need to run it through doall somehow

Bravi14:04:43

otherwise it’s not being ‘realized’

Bravi14:04:12

so should I chuck doall in ->> after everything?

Bravi14:04:04

or perhaps the thread’s output should then be passed to doall separately like

(doall (->> urls
   (mapcat ...)
   ....))

Bravi14:04:23

or even have it in a let block

Bravi14:04:54

(defn -main
  [& args]
  (when-not (.isDirectory (io/file "audio"))
    (.mkdir (io/file "audio")))
  (let [downloads (->> urls
                       (mapcat get-dom)
                       extract-links
                       (filter is-mp3?)
                       (map build-url)
                       (map download-file))]
    (doall downloads)))

Charles Fourdrignier15:04:39

My Clojure experience is very limited, so take my advices with caution. 🙂 I think you should finish you (->>) with a download-files, but all previous are map and filter, and as reader, I would appreciate to identify it quickly.

ghadi15:04:39

be wary of functions that simply map over a collection like build-urls @bravilogy

Charles Fourdrignier15:04:42

Could you explain the point ?

ghadi15:04:49

sure, if you have a function that takes a collection but does nothing but call (map (fn something ]) coll), what you want instead is to (defn something ...) and inline the map something part at the usage site

Charles Fourdrignier15:04:32

OK, I prefer this kind of code style, too.

ghadi15:04:51

yeah it encourages reuse and idiomatic core library usage

👍 4
ajs15:04:00

with this you are trying to do side effects (doall (map download-file files))) but really doall/map is not a good combo for that. just use a doseq

Bravi15:04:00

(defn- download-files
  [files]
  (doseq [file files]
    (download-file file)))

(defn -main
  [& args]
  (when-not (.isDirectory (io/file "audio"))
    (.mkdir (io/file "audio")))
  (->> urls
       (mapcat get-dom)
       extract-links
       (filter is-mp3?)
       (map build-url)
       download-files))

Bravi15:04:19

thanks everyone 🙂

madstap20:04:26

@bravilogy There's also run! which is to map as doseq is to for. In this case I'd prefer (->> ,,, ,,, (run! download-file)).

👍 4