@mmeix i think you probably want to parse this into a tree and go over it tree-style if you need opening/closing tag harmony. using a CFG to do that might be possible, but it's more for orientable sequences. in reality, if you wanted to, you could just run over that thing with a regex and turn opening tags into [: and closing tags into ], throwing away </span> and </p> in favor of ]. i would definitely write some sample input, some sample output, and then see what tool is best for the job. instaparse is indeed powerful, i used it to create an EBNF grammar representation of Japanese.
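a rough sketch of the regex idea, in Python just for illustration. the function name `tags_to_brackets`, the two patterns, and the exact output shape (`[:tagname ` for an opening tag, attributes dropped) are my assumptions, not something from the original input. it does not check that tags are balanced, and self-closing tags like <br/> would come out unbalanced:

```python
import re

# Hypothetical patterns for illustration; real markup may need more care.
# An opening tag: '<', a tag name, optional attributes, '>'.
OPEN_TAG = re.compile(r"<([a-zA-Z][\w-]*)[^>]*>")
# A closing tag: '</', a tag name, optional whitespace, '>'.
CLOSE_TAG = re.compile(r"</[a-zA-Z][\w-]*\s*>")


def tags_to_brackets(markup: str) -> str:
    """Rewrite <tag ...> as '[:tag ' and every closing tag as ']'.

    Attributes are discarded. No nesting check is performed, so
    unbalanced input produces unbalanced output.
    """
    # Replace closing tags first; OPEN_TAG cannot match them anyway,
    # because it requires a letter right after '<'.
    without_closers = CLOSE_TAG.sub("]", markup)
    return OPEN_TAG.sub(lambda m: "[:" + m.group(1) + " ", without_closers)


if __name__ == "__main__":
    sample = '<p>hello <span class="x">world</span></p>'
    print(tags_to_brackets(sample))
```

running it on a piece of sample input and comparing against the output you actually want is a quick way to see whether a regex pass is enough or whether you need a real tree.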