This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-12-31
instaparse requires keywords for the names of the whatchamacallits?
I think I might be using instaparse in a weird enough way for that to be a very mild problem
because I have to gensym the names and so it's a memory leak
@gfredericks It outputs either hiccup or enlive notation, so yes it probably would want keywords in reverse.
;; (keyword 0) returns nil, so build each keyword from a string
(def all-keywords-ever (map (comp keyword str) (range)))
;; each time you dynamically create a parser
(let [my-syms ...
      kws (zipmap my-syms all-keywords-ever)]
  ...)
That might be a way to conserve on keywords
Or do a string replace in the grammar to substitute non terminals with reusable symbols, then postwalk the resulting tree to convert back
I'm using the combinators, so it shouldn't be too hard to do something like that if I decide this matters
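A minimal sketch of the keyword-pool idea above, assuming the generated names only need to be unique within a single parser. The names `keyword-pool` and `pool-names` are made up for illustration and are not part of instaparse.

```clojure
;; A shared, lazily realised pool of keywords: :k0, :k1, :k2, ...
;; Only as many keywords are ever interned as the largest grammar needs.
(def keyword-pool (map #(keyword (str "k" %)) (range)))

(defn pool-names
  "Maps each generated symbol to a keyword drawn from the shared pool."
  [syms]
  (zipmap syms keyword-pool))

;; Two dynamically created parsers, each with three fresh gensyms.
(def names-a (pool-names (repeatedly 3 gensym)))
(def names-b (pool-names (repeatedly 3 gensym)))

;; The symbols differ, but both parsers reuse the same keywords.
(println (= (set (vals names-a)) (set (vals names-b))))
```

Inverting the map (for example with `clojure.set/map-invert`) would give the lookup needed to translate tags in the parse tree back to the original symbols.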
@gfredericks @aengelberg if we can actually get generating from grammars going I'd still be really stoked
I've been working on https://github.com/zmaril/instaparse-c the past few weeks and am getting within spitting distance of doing some fun stuff.
It can basically parse C at this point and I'm working on finishing the macro preprocessor now.
The goal is to get the output into datascript and queryable. But a side product of this is that if you have something that can generate strings from grammars then we already have something that can produce c programs (sans macros).
@zmaril do you or anybody know if all instaparse grammars are implemented using the combinators?
s/grammars/parser/
My understanding is that the ebnf notation that everybody uses is actually parsed by a parser expressed in the combinators that transforms the output into combinators
I just glanced at the combinator list -- I think only the lookaheads are problematic, but that's probably a big deal for sophisticated parsers
so...oh well.
you could implement them with gen/such-that
but the generator would fail if the lookahead condition is unlikely to pass by chance
I have no idea how that would play out IRL
That should be fine then. For the parsers I write lookahead is typically used to implement reserved keywords.
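A rough sketch of the reserved-keyword case with `gen/such-that`, assuming test.check is on the classpath. The `reserved` set and `identifier-gen` are hypothetical stand-ins for what a negative lookahead in a real grammar would express, not something derived from an instaparse parser.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Hypothetical reserved words that a negative lookahead would reject.
(def reserved #{"if" "for" "let"})

;; Generate non-empty alphabetic strings, then filter out reserved words.
;; such-that retries a limited number of times (10 by default) and throws
;; if the predicate keeps failing, which is the failure mode discussed above.
(def identifier-gen
  (gen/such-that (complement reserved)
                 (gen/fmap #(apply str %)
                           (gen/not-empty (gen/vector gen/char-alpha)))))

(println (gen/sample identifier-gen 5))
```

Here the predicate almost always passes, so the retries are cheap. A lookahead that rejects most candidates would make the same generator throw instead.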
when I made the regex→string generator I just decided not to support look[ahead|behind] for the same reason
it might not be too hard to throw together a PoC
in fact that would potentially be useful for what I'm working on right now
So I imagine we could use generators the same way spec does and it would work well (fingers crossed)
😂 I just realized that it would require using string-from-regex from test.chuck to support regexes in the grammars, and string-from-regex uses instaparse to parse the regex.
indeed
just catching up
After I wrote "instagenerate" I realized going the generator route (as opposed to core.logic) would probably be easier, despite the lookahead such-that problem
But what do you want to do about hide-tags?
It depends on what you expect the "input" to the generator to be
a parse tree still?
it'd be the combinator
it would generate totally random parsable things
not based on some partial input
ok, in that case I don't really have a problem with hide tags despite just waking up
I think if we got something going that just took a grammar and gave back random strings, that would be a good first step
part of why I did core.logic in instagenerate is @zmaril's initial request to go from partial input -> parseable strings, so I felt the need to put in the sophistication of logic programming as a general solver for all cases
oh, if we want to do partial input, we can provide skeletons with places to start generating from
(def p (insta/parser "
  S = A B A | B A B
  <A> = ('a' <'c'> 'b')+
  <B> = ('b' 'a')+
"))
(generate p [:S "a" "b" "b" "a" "a" "b"])
=> ("acbbaacb")
seems hard to performantly solve generally
🙂 fair enough
but a generator approach using such-that may never complete on a large enough grammar
let me know if I can help out in whichever path you decide to try out
yeah generators aren't generally for production stuff
I want a combinator that doesn't match anything
I thought maybe (combo/alt) but that returns ε
a combinator, not a generator
I guess I can do negative lookahead with epsilon?
(string (str (java.util.UUID/randomUUID)))
I have an alternate thing in my codebase that could be called a parser, but instaparse also has something by that name so I called it a parsifier instead
and it's hard to remember that word because it could also have been parsinator
(defn enlive-output->datascript-datums [m]
  (if-not (map? m)
    {:type :value :value m}
    (as-> m $
      (assoc $ :meta (meta m))
      (assoc $ :db/id (d/tempid :mcc))
      (transform [:content ALL] enlive-output->datascript-datums $))))
This will take enlive output and make it so you can query it from datascript
does instaparse use its own regex engine?
I just got a misparse where the thing matches the regex but instaparse disagrees
and reordering a disjunction in the regex fixes it
this is the instaparse-cljs thing in particular, but still on the jvm
here's the failing version: https://www.refheap.com/124435
"0/2" is not supposed to parse o_O
I see that's my fault though
I second !epsilon as the "don't parse"
also instaparse fails on infinite loop grammars, so this might work
never-succeed = never-succeed
(then use never-succeed wherever)
@aengelberg do you think the current behavior of (combo/alt) is bad/weird?
my hunch is that According To Math it should either throw or not match anything
yeah I agree with your instinct. Not really sure what the thinking was in that design.
my argument is that because (combo/alt p) probably does not match ε, neither should (combo/alt)
Maybe since "don't parse anything" isn't really a common use case
you shouldn't parse more things by removing an arg from combo/alt
agreed
yeah I always end up finding the uncommon use cases
for a while every time I tried to use CLJS I ended up creating a jira ticket
#gobigorgohome
I think I know why your parser is failing
The regex for the denominator, when given "25" as input, may arbitrarily decide to match either "2" or "25"
In instaparse, whatever the regex decides is the one and only possible parse
user=> (re-matches #"[2-9]|[1-9][0-9]+" "25")
"25"
user=> (re-seq #"[2-9]|[1-9][0-9]+" "25")
("2" "5")
user=> (re-find #"[2-9]|[1-9][0-9]+" "25")
"2"
oh it's about re-matches vs re-find?
oh I think I see
you could instead do #"[2-9]" | #"[1-9][0-9]+"
If you move logic from regexes into instaparse, you get flexibility at the cost of speed
so the fact that I fixed it by rearranging the regex is sort of an implementation detail I guess?
Yes, so I would call rearranging the regex an improper solution
but #"[2-9]" | #"[1-9][0-9]+" is proper
okay fine I'll switch it 😛
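To see why reordering the disjunction changed the result, here is a small sketch using only java.util.regex. It assumes a regex terminal commits to whatever prefix the regex engine returns at the current position. The `prefix-match` helper is written for this illustration and is not instaparse's API.

```clojure
(defn prefix-match
  "Returns the prefix of s that re matches at position 0, or nil."
  [re s]
  (let [m (re-matcher re s)]
    (when (.lookingAt m)
      (.group m))))

;; Java tries alternatives left to right, so the first one that matches wins.
(println (prefix-match #"[2-9]|[1-9][0-9]+" "25"))  ; "2" wins, the "5" is left over
(println (prefix-match #"[1-9][0-9]+|[2-9]" "25"))  ; the multi-digit branch is tried first
(println (prefix-match #"[1-9][0-9]+|[2-9]" "2"))   ; falls back to the single digit
```

Reordering only works because of how the regex engine orders alternatives. Splitting the two branches into separate terminals lets the parser itself try both, which is the flexibility-for-speed trade mentioned above.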