This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2018-12-17
Wrote this fn to parse a line from our log files. Idiomatic?
(require '[java-time :as t])
(defn parse-log-entry [line]
  (let [log-format (array-map :timestamp "\\[([^\\]]+)\\]"
                              :file "\\[Fl:\\s*([^\\]]+)\\]"
                              :class "\\[Cl:\\s*([^\\]]+)\\]"
                              :method "\\[Fn:\\s*([^\\]]+)\\]"
                              :line "\\[Ln:\\s*([^\\]]+)\\]"
                              :level "\\[([^\\]]+)\\]"
                              :summary "\\[([^\\]]+)\\]"
                              :thread-id "([^_]+_[^\\s]+)\\s"
                              :server-ip "([\\d.]+)"
                              :client-ip "([\\d.]+)"
                              :msg "\\[(.+)$")
        log-entry-regex (->> (concat "^" (vals log-format) "$")
                             (clojure.string/join "\\s*")
                             re-pattern)
        entry (->> (re-find log-entry-regex line)
                   rest
                   (map clojure.string/trim)
                   (map vector (keys log-format))
                   (into {}))
        parse-timestamp #(->> (concat % (repeat \0))
                              (take 24)
                              (apply str)
                              (t/local-date-time (t/formatter "dd-MM-yyyy HH:mm:ss.SSSS")))]
    (-> entry
        (update :line #(Integer/parseInt %))
        (update :timestamp parse-timestamp))))
Yeah, instaparse seems like overkill, and bashing strings together to create regexps seems like underkill. Do you reckon instaparse is a better fit here?
Performance might have to be taken into consideration too (if your logs are large). Besides being hard to read and prone to subtle errors, regexes aren't terribly performant. However, instaparse is also known to be quite slow (antlr is much better, but as you say, that's probably overkill in your case).
Thanks Dominic!
Yeah @U06BE1L6T this is just for a script that helps me analyze logs easily on my machine in the REPL. Will use a proper parser generator if we decide to put this somewhere at scale.
Logs of an application (Apache + PHP) at work.
array-map is fragile - when you want order, you usually don't want maps. I'd use a vector of tuples instead.
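To illustrate the fragility point, here is a minimal sketch, assuming JVM Clojure. An array-map only keeps insertion order while it stays an array map; assoc-ing past 8 entries returns a hash map, which makes no ordering promise. The names small, grown and fields are hypothetical, and fields is a two-entry subset of the log format above, kept short for brevity.

```clojure
;; A small array-map keeps insertion order.
(def small (array-map :a 1 :b 2 :c 3))
(keys small) ; => (:a :b :c)

;; assoc-ing past the array-map size threshold (8 entries) returns a
;; hash map, which makes no promise about key order.
(def grown
  (reduce (fn [m i] (assoc m (keyword (str "k" i)) i))
          small
          (range 10)))
(instance? clojure.lang.PersistentHashMap grown) ; => true

;; A vector of [key pattern] pairs stays ordered however it is built.
;; Hypothetical two-field subset of the log format.
(def fields
  [[:level     "\\[([^\\]]+)\\]"]
   [:server-ip "([\\d.]+)"]])

(mapv first fields)  ; => [:level :server-ip]
(mapv second fields) ; the patterns, in the same order
```

With pairs in a vector, the keys and the patterns can be pulled out separately and are guaranteed to line up.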
I would also do some initialization outside of the function body - here you're compiling a big regex for each line of log!
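A rough sketch of hoisting the regex out of the function, so the pattern is compiled once when the namespace loads instead of once per log line. The names field-patterns, entry-regex and parse-entry, the two-field format and the sample line are all made up for illustration; this is not the original log format.

```clojure
(require '[clojure.string :as str])

;; Hypothetical two-field format, only for illustration.
(def field-patterns
  [[:level "\\[([^\\]]+)\\]"]
   [:msg   "(.+)"]])

;; Built and compiled once, at load time.
(def entry-regex
  (re-pattern
    (str "^" (str/join "\\s*" (map second field-patterns)) "$")))

(defn parse-entry
  "Returns the trimmed capture groups for line, or nil if it does not match."
  [line]
  (when-let [[_ & groups] (re-find entry-regex line)]
    (mapv str/trim groups)))

(parse-entry "[ERROR] disk full") ; => ["ERROR" "disk full"]
(parse-entry "no brackets here")  ; => nil
```

The function body now only runs the match; all the string joining and the re-pattern call happen in the top-level def.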
Likewise, it seems that zipmap is what you want for computing entry.
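For reference, a small sketch of the zipmap suggestion: it pairs a sequence of keys with a sequence of values, which is what the map vector / into {} steps in the snippet do by hand. The ks and vs values here are made-up sample data.

```clojure
;; Hypothetical keys and captured values.
(def ks [:level :msg])
(def vs ["ERROR" "disk full"])

;; The hand-rolled version from the snippet...
(def by-hand (into {} (map vector ks vs)))

;; ...and the zipmap equivalent.
(def zipped (zipmap ks vs))

(= by-hand zipped) ; => true
(:level zipped)    ; => "ERROR"
```

Either way the result is an ordinary map, so lookups by key work but the field order should not be relied on.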
TIL zipmap.
I initially defined log-entry-regex as a var, but moved it in here later. Thank you for your insight.