#instaparse
2018-11-22
Vincent Cantin 06:11:12

I found some strange behavior with #'\\Z'. I wonder whether it is a bug or expected.

((insta/parser
   "Paragraph = NonBlankLine+ BlankLine+
    BlankLine = #'[ \\t]'* EOL
    NonBlankLine = #'\\S'+ EOL
    EOL = (#'\\n' | EOF)
    EOF = #'\\Z'")
 "abc\ndef\n")

=> 
[:Paragraph
 [:NonBlankLine "a" "b" "c" [:EOL "\n"]]
 [:NonBlankLine "d" "e" "f" [:EOL [:EOF ""]]]
 [:BlankLine [:EOL "\n"]]]

Vincent Cantin 06:11:03

EOF appears before "\n" in the parsed result.
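
Note: a likely explanation, assuming instaparse on the JVM hands #'...' terminals to java.util.regex and matches them at the front of the remaining input. In Java regex, \Z matches at the end of the input or just before a final line terminator, while \z matches only at the very end. The helper name matches-at-front? below is hypothetical and only illustrates the regex semantics; it is not instaparse's API.

```clojure
;; Hypothetical helper: does the regex match at the very start of s?
;; Uses java.util.regex.Matcher#lookingAt via Clojure interop.
(defn matches-at-front? [re s]
  (.lookingAt (re-matcher re s)))

;; Remaining input is the final "\n" of "abc\ndef\n".
(matches-at-front? #"\Z" "\n") ; => true  (\Z accepts a trailing line terminator)
(matches-at-front? #"\z" "\n") ; => false (\z requires the true end of input)
(matches-at-front? #"\z" "")   ; => true
```

If that assumption holds, EOF = #'\\Z' can succeed before the last "\n", which is consistent with the tree above. Using #'\\z' instead is an untested suggestion, not something confirmed in this thread.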

Vincent Cantin 06:11:23

This other approach, which uses negative lookahead, does put the "\n" in the right place in the result, but there is another problem: the BlankLine is missing from the result. That may be a bug in instaparse. I am using version 1.4.9.

((insta/parser
   "Paragraph = NonBlankLine+ BlankLine+
    BlankLine = #'[ \\t]'* EOL
    NonBlankLine = #'\\S'+ EOL
    EOL = (#'\\n' | EOF)
    EOF = !#'.'")
 "abc\ndef\n")
=>
[:Paragraph [:NonBlankLine "a" "b" "c" [:EOL "\n"]]
            [:NonBlankLine "d" "e" "f" [:EOL "\n"]]]
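
Note: one thing worth checking with this variant, again assuming the regex terminals go through java.util.regex. By default . does not match line terminators, so !#'.' succeeds not only at the end of the input but also immediately before a "\n". The (?s) flag (DOTALL) makes . match any character. The helper name below is hypothetical, and whether this accounts for the missing BlankLine is not established here.

```clojure
;; Hypothetical helper: does the regex match at the very start of s?
(defn matches-at-front? [re s]
  (.lookingAt (re-matcher re s)))

(matches-at-front? #"." "\n")     ; => false, so a !#'.' lookahead would succeed before "\n"
(matches-at-front? #"(?s)." "\n") ; => true,  so a negated (?s). would fail before "\n"
(matches-at-front? #"(?s)." "")   ; => false, so a negated (?s). would succeed only at the true end
```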

Vincent Cantin 07:11:54

I am going to use this workaround for now: append “EOF” at the end of the input and parse it. It works very well 🙂

((insta/parser
   "Paragraph = NonBlankLine+ BlankLine+
    BlankLine = #'[ \\t]'* EOL
    NonBlankLine = #'\\S'+ EOL
    EOL = (#'\\n' | EOF)
    EOF = 'EOF' #'\\Z'") ; works as well with !#'.'
 "abc\ndef\nEOF")
=>
[:Paragraph
 [:NonBlankLine "a" "b" "c" [:EOL "\n"]]
 [:NonBlankLine "d" "e" "f" [:EOL "\n"]]
 [:BlankLine [:EOL [:EOF "EOF" ""]]]]

sova-soars-the-sora 15:11:36

love instaparse
