I'm rewriting HTML that I parsed with crouton and trying to collect the text from within nested tags, but I'm missing something in the flattening part. What's missing in this pattern?
{:tag (m/or :code :span :p :div :em :a)
:content (m/some [!x ...])}
[(m/cata !x) ...]
I started writing code like https://clojurians.slack.com/archives/C03S1KBA2/p1644357312001629 instead of cata. It lets me debug it more easily.
I'd still rather do it with cata...
It’s not a list is it?
(m/rewrite '(1 2 3)
[!x ...] [!x ...])
;; nil
No, it's nested parsed HTML
Sorry, I’m not sure from your answer.
If the value of :content isn’t a vector then [!x ...] won’t match.
(m/rewrite '(1 2 3)
(m/seqable !x ...) [!x ...])
;; [1 2 3]
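A quick plain-Clojure check of the distinction above, with no meander involved. The assumption here (not spelled out in the thread) is that a `[...]` pattern only accepts values for which `vector?` is true, while `m/seqable` accepts anything `seqable?`. The names `as-list` and `as-vec` are made up for illustration.

```clojure
;; Hypothetical sample values: the same elements as a list and as a vector.
(def as-list '(1 2 3))
(def as-vec [1 2 3])

;; A quoted list is seqable but is not a vector,
;; which is why the vector pattern above returned nil for it.
(vector? as-list)  ;; false
(seqable? as-list) ;; true

;; A vector satisfies both predicates.
(vector? as-vec)   ;; true
(seqable? as-vec)  ;; true

;; Converting up front is another option if a vector is required.
(vec as-list)      ;; [1 2 3]
```

If the parser always returns vectors for :content, the vector pattern is fine and the problem lies elsewhere.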
The value of :content is a vector of strings or maps, and those maps contain more :content.
I recommend passing a keyword to cata as the first element of its argument, which will tell us what should happen.
Not sure I follow. Specifically, I'm parsing an HTML table:
(m/rewrite (second tables)
{:tag :table
:content [(m/cata !m) ...]}
[!m ...]
{:tag :tbody :content nil} {}
{:tag :thead
:content
[{:content
[{:content [?parameter]}
{:content [?description]}]}]}
{:parameter ?parameter
:description ?description}
{:tag :tbody
:content [!tr ...]}
[(m/cata !tr) ...]
{:tag :tr
:content
[{:tag :td :content [?key ?desc]}
{:tag :td :content [(m/cata !doc) ...]}]}
{:field (m/cata ?key)
:type (m/cata ?desc)
:doc [!doc ...]}
{:tag (m/or :a :code :span :p :div :em :ul :li :i)
:content (m/some [(m/cata !xs) ...])} [!xs ...]
{:tag (m/or :a :code :span :p :div :em :ul :li :i)} nil
{:tag :br} "\n"
?x ?x)
{:tag (m/or :code :span :p :div :em :a)
:content (m/some [(m/cata [:flatten !x]) ...])}
[:flatten [!xs ...]]
[!xs ...]
[:flatten ?x]
?x
ah, tag the data
this way, every time you use cata in any place, it will either unpack the vector or return the argument unchanged
much easier to debug and read code IMHO
I'm missing the rhs for the map example
can you give a piece of HTML so we have the same?
btw, try this
{:tag :table
:content [& [(m/cata !m) ...]]}
[!m ...]
it should work like into
Even a tiny example like
{:tag :p
:content
[{:tag :p
:content
[{:tag :p
:content ["Hello"]}
{:tag :p
:content ["world"]}]}
{:tag :p
:content
[{:tag :p
:content ["Yes"
{:tag :p
:content ["No"]}]}]}]}
(m/rewrite data
{:tag :table
:content [& [(m/cata !m) ...]]}
[!m ...]
{:tag :tbody :content nil} {}
{:tag :thead
:content
[{:content
[{:content [?parameter]}
{:content [?description]}]}]}
{:parameter ?parameter
:description ?description}
{:tag :tbody
:content [!tr ...]}
[& [(m/cata !tr) ...]]
{:tag :tr
:content
[{:tag :td :content [?key ?desc]}
{:tag :td :content [& [(m/cata !doc) ...]]}]}
{:field (m/cata ?key)
:type (m/cata ?desc)
:doc [!doc ...]}
{:tag (m/or :a :code :span :p :div :em :ul :li :i)
:content (m/some [(m/cata !xs) ...])} (m/cata [!xs ...])
{:tag (m/or :a :code :span :p :div :em :ul :li :i)} nil
{:tag :br} "\n"
(m/with [%a (m/some !xs)
%b [(m/or %b %a) ...]
%c (m/or %b %a)]
%c)
[!xs ...]
?x ?x)
;; => ["Hello" "world" "Yes" "No"]
This works for now, but it can be done better.
👍
This + hiccup example:
;; Collect all content
(m/with [%p {:content [(m/or (m/pred string? !s) %p %q) ...]}
%q {:content nil}]
%p)
[!s ...]
This collects all the strings directly, so there's no need to collect and then flatten.
Slightly more verbose but clearer what's going on:
(m/with [%s (m/pred string? !s)   ;; string leaf, collected into !s
         %q {:content nil}        ;; empty element (no content)
         %c (m/or %s %p %q)       ;; a content item is a string, a nested element, or empty
         %p {:content [%c ...]}   ;; element; mutually recursive with %c
         ]
  %p)
[!s ...]
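As a cross-check for the expected result, here is a minimal sketch in plain Clojure using `tree-seq` rather than meander. It is a different technique, not what the patterns above do internally. `data` is the tiny example posted earlier in the thread, and `collect-strings` is a hypothetical helper name.

```clojure
;; The tiny nested example from earlier in the thread.
(def data
  {:tag :p
   :content
   [{:tag :p
     :content [{:tag :p :content ["Hello"]}
               {:tag :p :content ["world"]}]}
    {:tag :p
     :content [{:tag :p
                :content ["Yes"
                          {:tag :p :content ["No"]}]}]}]})

(defn collect-strings
  "Walks node depth-first and returns every string leaf in document order.
  Maps are treated as branches whose children are the value of :content."
  [node]
  (->> (tree-seq map? :content node)
       (filter string?)
       vec))

(collect-strings data)
;; => ["Hello" "world" "Yes" "No"]
```

Comparing this against the output of the meander version is a cheap way to confirm the pattern visits every string in order.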