This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2018-04-16
Channels
- # atlanta-clojurians (1)
- # aws (1)
- # beginners (65)
- # boot (4)
- # cider (81)
- # cljs-dev (25)
- # cljsrn (27)
- # clojure (129)
- # clojure-dusseldorf (12)
- # clojure-italy (68)
- # clojure-norway (5)
- # clojure-poland (4)
- # clojure-spec (14)
- # clojure-uk (72)
- # clojurescript (144)
- # code-reviews (19)
- # copenhagen-clojurians (5)
- # cursive (16)
- # datomic (21)
- # editors (1)
- # emacs (15)
- # events (1)
- # figwheel (6)
- # fulcro (54)
- # graphql (1)
- # hoplon (24)
- # jobs (6)
- # jobs-discuss (2)
- # keechma (4)
- # leiningen (6)
- # luminus (17)
- # lumo (2)
- # off-topic (43)
- # onyx (6)
- # pedestal (2)
- # perun (2)
- # portkey (3)
- # re-frame (22)
- # reagent (11)
- # ring-swagger (5)
- # shadow-cljs (46)
- # specter (8)
- # test-check (2)
- # testing (3)
- # vim (16)
- # yada (1)
Hi everyone. I’ve created a little script that crawls a specific website and downloads all of the mp3 files there. It’s a Georgian site, and only Georgians would understand this 🙂 But I wanted to perhaps get some suggestions code-wise. It all works, but there’s always a room for improvement right?
(ns play-around.core
(:require [net.cgrand.enlive-html :as html]
[org.httpkit.client :as http]
[ :as io]
[clojure.string :as string])
(:gen-class))
(def urls #{"" ""})
(defn- get-dom
[url]
(html/html-snippet (:body @(http/get url))))
(defn- build-urls
[audio-links]
(for [audio-link audio-links]
{:download-url (format "" audio-link)
:filename audio-link}))
(defn- extract-links
[dom]
(map #(get-in % [:attrs :href]) (html/select dom [:td :a])))
(defn- download-file
[{:keys [download-url filename] :as file}]
(let [filename (string/replace filename #"\/" "_")
file (io/file "audio" filename)]
(if-not (.exists file)
(do
(println "Downloading" filename)
(with-open [in (io/input-stream download-url)
out (io/output-stream file)]
(io/copy in out)))
(println "File" filename "already exists, skipping"))))
(defn- download-files
[files]
(doall (map download-file files)))
(defn- is-mp3?
[link]
(re-matches #"(?i).*\.mp3" link))
(defn -main
[& args]
(when-not (.isDirectory (io/file "audio"))
(.mkdir (io/file "audio")))
(->> urls
(mapcat get-dom)
extract-links
(filter is-mp3?)
build-urls
download-files))
@bravilogy My 2 cents, build-urls should be a map
, no ?
May be you should rewrite the main like
(->>
urls
(mapcat get-dom)
(map ...)
(filter ...)
(map ...)
...
)
May be this will help you to see patterns, opportunity for transducers,...or perhaps the thread’s output should then be passed to doall
separately like
(doall (->> urls
(mapcat ...)
....))
(defn -main
[& args]
(when-not (.isDirectory (io/file "audio"))
(.mkdir (io/file "audio")))
(let [downloads (->> urls
(mapcat get-dom)
extract-links
(filter is-mp3?)
(map build-url)
(map download-file))]
(doall downloads)))
My Clojure experience is very limited, so take my advices with caution. 🙂
I think you should finish you (->>)
with a download-files
, but all previous are map and filter, and as reader, I would appreciate to identify it quickly.
be wary of functions that simply map over a collection like build-urls
@bravilogy
Could you explain the point ?
sure, if you have a function that takes a collection but does nothing but call (map (fn something ]) coll), what you want instead is to (defn something ...)
and inline the map something
part at the usage site
OK, I prefer this kind of code style, too.
with this you are trying to do side effects (doall (map download-file files)))
but really doall/map is not a good combo for that. just use a doseq
(defn- download-files
[files]
(doseq [file files]
(download-file file)))
(defn -main
[& args]
(when-not (.isDirectory (io/file "audio"))
(.mkdir (io/file "audio")))
(->> urls
(mapcat get-dom)
extract-links
(filter is-mp3?)
(map build-url)
download-files))
@bravilogy There's also run!
which is to map
as doseq
is to for
. In this case I'd prefer (->> ,,, ,,, (run! download-file))
.