#reagent
2022-09-07
pez 13:09:52

I need to parse html that I get from an API, then modify that html and render it. The html will contain input elements that I want to connect to my app such that I'll have them as part of my state (re-frame, most probably). I've made an experiment with parsing the html to an AST using posthtml-parser (from npm) and then I convert that to hiccup, which I let reagent render. It seems to work... My ast->hiccup is pretty naïve, though.

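(defn ast->hiccup
  "Converts a parsed HTML AST (a seq of strings and element maps with
  :tag, :attrs, and :content) into a vector of hiccup forms."
  [ast]
  (mapv
   (fn [element]
     (if (:tag element)
       ;; Element node: [:tag attrs & children]; attrs may be nil.
       (let [{:keys [tag attrs content]} element]
         (into [(keyword tag) attrs] (ast->hiccup content)))
       ;; Text node: pass through unchanged.
       element))
   ast))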
I'm not super worried about the stack yet, but I'd like to be able to edit the AST as well as query it for some things I am interested in. I've been looking at the hickory README a bit and maybe it is a good fit. However, I'd like to be able to parse on Node as well (my tests run there), and it seems tricky to do this with hickory. I wonder what my options might be. Keep doing it the naïve way until it hurts? Go for a zipper solution? Hickory (and scrap the parsing inside tests)? Something else entirely?
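For reference, querying and editing an AST of this kind can be sketched with nothing beyond `tree-seq` and `clojure.walk`. This is only a minimal sketch: it assumes the AST has already been converted to Clojure data with keyword keys, and `sample-ast`, `element?`, `elements`, `inputs`, and `update-elements` are hypothetical names, not part of posthtml-parser or hickory.

```clojure
(ns example.ast
  (:require [clojure.walk :as walk]))

;; Assumed AST shape: a vector of nodes, each either a string (text) or a
;; map with :tag, optional :attrs, and optional :content.
(def sample-ast
  [{:tag "form"
    :content [{:tag "input" :attrs {:name "email" :type "text"}}
              "Some text"
              {:tag "p" :content ["Hello"]}]}])

(defn element?
  "True for element nodes, false for text nodes and anything else."
  [node]
  (and (map? node) (contains? node :tag)))

(defn elements
  "Returns all element nodes in the AST, depth first."
  [ast]
  (->> ast
       (mapcat #(tree-seq element? :content %))
       (filter element?)))

(defn inputs
  "Query example: every <input> element in the AST."
  [ast]
  (filter #(= "input" (:tag %)) (elements ast)))

(defn update-elements
  "Edit example: applies f to every element node that satisfies pred."
  [ast pred f]
  (walk/postwalk
   (fn [node]
     (if (and (element? node) (pred node))
       (f node)
       node))
   ast))
```

For example, `(inputs sample-ast)` finds the one input element, and `(update-elements sample-ast #(= "input" (:tag %)) #(assoc-in % [:attrs :data-bound] "on"))` returns a new AST with that attribute added. A zipper would give finer control over navigation, but a walk like this may be enough until the naïve way starts to hurt.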

p-himik 14:09:16

I would probably forgo the Hiccup way completely and instead, assuming you could trust the HTML, I would use this: https://github.com/reagent-project/reagent/blob/master/doc/FAQ/dangerouslySetInnerHTML.md
All modifications would then be done immediately after the initial rendering.

pez 14:09:21

I can trust the HTML. However, it is unclear to me how I would do "all modifications" if I go that way.

p-himik 14:09:41

Depends on the kind of modifications. But basically, exactly the same way you'd have done it before React, in the jQuery era. :) Only without jQuery, using the native DOM API. Node.appendChild and stuff like that.
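A rough sketch of what that could look like in Reagent, assuming trusted HTML that stays the same for the lifetime of the component. `wire-inputs!`, `raw-html`, and the `on-change` callback are hypothetical names; in a re-frame app `on-change` might dispatch an event instead.

```clojure
(ns example.raw-html)

(defn wire-inputs!
  "Attaches a native \"input\" listener to every <input> under root.
  on-change is a hypothetical callback taking the input's name and value."
  [^js root on-change]
  (doseq [input (js/Array.from (.querySelectorAll root "input"))]
    (.addEventListener
     input "input"
     (fn [^js e]
       (on-change (.. e -target -name) (.. e -target -value))))))

(defn raw-html
  "Form-2 Reagent component that renders trusted HTML as-is and wires up
  its inputs once the DOM node exists. The ref callback is created once,
  in the outer function, so it keeps the same identity across re-renders
  and React only calls it on mount and unmount."
  [_html on-change]
  (let [attach! (fn [el]
                  ;; React passes nil on unmount, so guard against it.
                  (when el
                    (wire-inputs! el on-change)))]
    (fn [html _on-change]
      [:div {:dangerouslySetInnerHTML {:__html html}
             :ref attach!}])))
```

Usage would be something like `[raw-html html-from-api (fn [name value] ...)]`. If the HTML can change after mounting, the listeners would need to be attached again, which this sketch does not handle.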

pez 14:09:38

And why would this be preferable to doing the transforms on the AST?

p-himik 14:09:56

• Hiccup is just unpleasant to transform, unless your HTML parser always gives you an attribute map, even if it's empty, and always puts classes in a single location in a specific format
• In general, HTML is not "clean". There can easily be broken stuff that browsers handle in some way but an HTML parser might not handle, or might handle differently
• There's also a chance of some data being lost or mangled during that round trip. It's easy enough to imagine how e.g. some data or aria attribute gets decoded into Hiccup in one way and then encoded back into HTML in another

Just way too many moving parts for my taste.

pez 14:09:41

The HTML parser gives me an AST, not hiccup. Just so that is clear. It has some knobs that I haven't explored, but by default it skips outputting an attributes map if there are no attributes. But that just means that when I do (:attrs element), I get nil. Which is fine, I think.

p-himik 14:09:32

Ah, alright. And if attrs is just a map of string keys, then it should be fine. But still - such a solution has an HTML->AST->Hiccup pipeline, whereas with what I proposed it's just HTML. There's no need for any libraries even. The only issue with my suggestion, AFAICT, is being able to run it on Node. But there are probably libraries out there that mimic the DOM API on Node, specifically for headless tests, without having to use a headless browser.