This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-12-14
I'm playing around with instaparse, and for kicks and giggles I wrote a parser to parse some log files I have lying around
i.e. if I just want to gobble up a few characters into a tree node and don't care about the content there, is that possible?
Fixed width? Maybe #'.{N}'?
right, yes regex does the job but is probably not very performant for just "take substring of 10 from where you are"
I think regex is the most performant way instaparse offers to grab a non-static run of characters
: ) well I should probably mention that I think instaparse is excellent and by far the best parser lib I've run across....so my intent was not to come here and critique it
Thanks! And no worries, I was just answering your question from the perspective of what instaparse actually supports
that being said... if I parse 2G of log files (without instaparse) and compare the simplest regex match with (subs line 10 20), regex performance doesn't exactly shine
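For anyone who wants to try that comparison themselves, here is a minimal sketch in plain Clojure (no instaparse). The sample line and the names slice-subs and slice-regex are made up for illustration; the real log format isn't shown in this thread.

```clojure
;; Hypothetical sample line, standing in for one line of a log file.
(def line "2017-12-14 12:34:56 INFO something happened")

;; Fixed-width slice by index: characters 10 (inclusive) to 20 (exclusive).
(defn slice-subs [s]
  (subs s 10 20))

;; Same slice via regex: skip 10 characters, capture the next 10.
(defn slice-regex [s]
  (second (re-find #"^.{10}(.{10})" s)))

;; Both calls return the same 10-character string for this line.
(slice-subs line)
(slice-regex line)
```

Wrapping each call in something like (time (dotimes [_ 1000000] ...)) gives a rough feel for the difference, though a proper benchmark would need warm-up and real data.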
But I see your point that if it theoretically supported a dedicated "substring" combinator, that would be faster
anyway, figured I would ask, but regex does indeed do the job and perhaps what I'm doing with this parser is a bit of an edge case
Maybe we should support "custom combinators" so people like you with special use cases can write their own more performant specialized versions
you would have to add some kind of extension point to the instaparse bnf syntax I guess
Maybe, or we don't allow extensions to the EBNF syntax and just let people make custom combinators for the combinator syntax
right now I'm considering writing my own mini language for this log parsing. I could use instaparse to parse that language and then do custom, optimized parsing based on the format specification tree that comes out of instaparse... so still useful
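A rough sketch of what the "optimized parsing driven by a format spec" half of that idea could look like, assuming only fixed-width fields. The spec shape (a vector of maps with :name and :width) and the function slice-fields are hypothetical; nothing here is part of instaparse.

```clojure
;; Hypothetical spec: one map per field, in the order the fields appear.
(def spec
  [{:name "date" :width 10}
   {:name "sep"  :width 1}
   {:name "time" :width 8}])

(defn slice-fields
  "Cuts `line` into consecutive fixed-width fields using `subs`.
   Fields that run past the end of the line are truncated."
  [specs line]
  (loop [specs specs, pos 0, out {}]
    (if-let [{field :name, width :width} (first specs)]
      (let [end (min (count line) (+ pos width))]
        (recur (rest specs) end (assoc out field (subs line pos end))))
      out)))

(slice-fields spec "2017-12-14 12:34:56 INFO something happened")
```

In the setup described above, the spec would come from transforming the instaparse parse tree of the mini language rather than being written by hand.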
hmm, how come I need to double escape the not-inclusive rule in the following grammar:
(def my-p
  (instaparse.core/parser
    "spec = (field-spec <' '?>)+
     field-spec = <'['> name ' '* <':'> ' '* (width | not-inclusive | not-exclusive | rest) <']'>
     name = #'[^:]+'
     width = <'{'> #'\\d+' <'}'>
     not-inclusive = <'\\\\'> #'.'
     not-exclusive = <'/'> #'.'
     rest = '*'
    "))
you mean the '\\\\'?
because there are two layers of escaping: 1) you need to tell Clojure that the backslash is a literal character and not the start of an escape within a Clojure string, and 2) you need to tell Instaparse the same thing within a string combinator
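The first layer can be checked in plain Clojure without instaparse. This is just an illustration; the names typed-four and typed-two are made up.

```clojure
;; What is typed in Clojure source vs. the characters the string holds,
;; i.e. what instaparse receives as grammar text.
(def typed-four "\\\\")  ; four backslashes in source
(def typed-two  "\\d+")  ; two backslashes in source

(count typed-four) ;=> 2, the reader turned four backslashes into two
(count typed-two)  ;=> 3, a backslash, a d, and a plus

;; Printing shows the text instaparse actually sees.
(println typed-four)
(println typed-two)
```

Instaparse then applies its own escaping to the two remaining backslashes inside the quoted string combinator, which leaves a single literal backslash to match.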