This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-07-21
I understand the reasoning but it still seems weird that when I search for a phrase containing the word "property" on the clojurescript website, it also finds results with no property but with "properly" in them. It doesn't do that if I just search for "property" alone.
I’m looking for some advice on how to transform complex xml documents to a clear edn format. I know about the various libraries to parse xml and have played around with some of them (clojure.data.xml, tupelo.forest, etc) and I get how to extract specific parts of the xml. However the specification for the xml I’m dealing with (if you can call it a specification…) allows for a lot of variation. So different documents will have the same information at slightly different locations/paths. Also some documents don’t follow the spec completely. Has anyone dealt with similar problems? Is there a smart way to tackle this?
one big picture approach to try: use normal libraries to turn xml into a tree of clojure data (eg. clojure.data.xml), then use tree-seq plus filter to find subtrees matching some pattern. Using the args to tree-seq you can "prune" subtrees (eg. if the parent is a comment form, don't search into the child nodes for data; only look for certain tags inside specific relevant parent tags...)
that said I find xml frustrating (I think we all do), and I've never been fully satisfied with how I've handled xml data
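A minimal sketch of the tree-seq plus filter idea described above, not a definitive implementation. The sample document, the `:meta` tag that gets pruned, and the helper names `branch?` and `ids` are all hypothetical; the data is written as plain maps in the `{:tag :attrs :content}` shape that clojure.data.xml produces, so the snippet runs without any dependency.

```clojure
;; Hypothetical sample document, shaped like parsed XML elements:
;; maps with :tag, :attrs and :content, with strings as text nodes.
(def doc
  {:tag :order :attrs {}
   :content [{:tag :meta :attrs {}
              :content [{:tag :id :attrs {} :content ["ignore-me"]}]}
             {:tag :customer :attrs {}
              :content [{:tag :id :attrs {} :content ["c-1"]}]}
             {:tag :lines :attrs {}
              :content [{:tag :line :attrs {}
                         :content [{:tag :id :attrs {} :content ["l-1"]}]}]}]})

(defn branch?
  "Descend only into element maps, and prune every :meta subtree.
   (Pruning :meta is an assumption made up for this example.)"
  [node]
  (and (map? node) (not= :meta (:tag node))))

(defn ids
  "Text of every :id element outside the pruned subtrees, in document order."
  [root]
  (->> (tree-seq branch? :content root)          ; lazy seq of all visited nodes
       (filter #(and (map? %) (= :id (:tag %)))) ; keep only :id elements
       (map #(apply str (:content %)))))         ; join their text nodes

(ids doc)
;; => ("c-1" "l-1")
```

The `:meta` element itself still shows up in the sequence, but nothing beneath it does, which is what keeps its nested `:id` out of the result.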
Thanks for the tips. Last idea I had was using a zipper to walk depth first through the doc, keeping track of where you are in the tree and matching on patterns like you said @U051SS2EU. That way I can build up the simpler tree I want to get out of the data.
yeah - the difference between a zipper and tree-seq is that with tree-seq you get every subtree in a lazy-seq, and can filter out the ones you think are relevant based on the full path, while with a zipper you have a place-oriented navigation API. Both can accomplish the same thing; I am prejudiced toward values over places
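For comparison, the zipper walk mentioned above might look roughly like this. It is a sketch under assumptions: the sample document and the name `leaf-texts` are hypothetical, and it uses `clojure.zip/xml-zip` from Clojure's standard library to walk depth first while reading the current position from `zip/path`.

```clojure
(require '[clojure.zip :as zip])

;; Hypothetical sample document in the {:tag :content} element shape.
(def doc
  {:tag :order
   :content [{:tag :customer
              :content [{:tag :id :content ["c-1"]}
                        {:tag :name :content ["Ada"]}]}]})

(defn leaf-texts
  "Walk the tree depth first and return a flat map from tag path to text.
   A repeated path overwrites the earlier value in this simple version."
  [root]
  (loop [loc (zip/xml-zip root)
         acc {}]
    (cond
      (zip/end? loc)
      acc

      ;; Text node: zip/path holds its ancestors, from the root down.
      (string? (zip/node loc))
      (recur (zip/next loc)
             (assoc acc (mapv :tag (zip/path loc)) (zip/node loc)))

      :else
      (recur (zip/next loc) acc))))

(leaf-texts doc)
;; => {[:order :customer :id] "c-1", [:order :customer :name] "Ada"}
```

Once the data is flattened to tag paths, documents that keep the same information at slightly different locations could be handled by matching on a path suffix rather than the exact path.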
I think I see what you mean using tree-seq, although I guess it could be quite finicky to rebuild parts of the tree. But I will try again tomorrow
yeah - that approach is about extracting values, rather than transforming the tree
Using XPath (e.g. via https://github.com/eerohele/sigel) might also be an option.