#beginners
2016-03-09
swizzard00:03:43

i need some algorithm help

swizzard00:03:33

i’ve got 2 if-lets and a cond inside of a loop

swizzard00:03:58

and i worry it’ll only get worse without outside assistance

swizzard00:03:12

i’ve pulled meta tags out of a web page with enlive, and i want to combine them into maps

swizzard00:03:05

i was using partition, but there’s no guarantee they repeat in the same order

Chris O’Donnell00:03:13

@swizzard: so what form is the data you have, and what form do you want it to be in?

swizzard00:03:15

the input is a seq of maps that look like this: {:attrs {:content "17" :itemprop "numTracks"} :content nil :tag :meta}

swizzard00:03:44

there are like 5 different itemprops that i want

swizzard00:03:11

so i want to turn a seq like ({:attrs {:content "17" :itemprop "numTracks"} :content nil :tag :meta} {:attrs {:content "foo" :itemprop "name"} :content nil :tag :meta}…) into a seq like ({"numTracks" "17", "name" "foo",…}…)

Chris O’Donnell00:03:59

so you want to extract the :itemprop and :content keys from :attrs from each element and then accumulate them into a map. is that right?

swizzard00:03:24

but actually a seq of maps

swizzard00:03:38

({:attrs {:content "17" :itemprop "numTracks"} :content nil :tag :meta}
 {:attrs {:content "2009" :itemprop "copyrightYear"} :content nil :tag :meta}
 {:attrs {:content "10 Cc" :itemprop "byArtist"} :content nil :tag :meta}
 {:attrs {:content "Rock" :itemprop "genre"} :content nil :tag :meta}
 {:attrs {:content ""
          :itemprop "url"}
  :content nil
  :tag :meta}
 {:attrs {:itemprop "name"}
  :content ("Live in Concert: Clever Clogs")
  :tag :span}
 {:attrs {:content "14" :itemprop "numTracks"} :content nil :tag :meta}
 {:attrs {:content "1995" :itemprop "copyrightYear"} :content nil :tag :meta}
 {:attrs {:content "10 Cc" :itemprop "byArtist"} :content nil :tag :meta}
 {:attrs {:content "Rock" :itemprop "genre"} :content nil :tag :meta}
 {:attrs {:content ""
          :itemprop "url"}
  :content nil
  :tag :meta}
 {:attrs {:content "Avex" :itemprop "publisher"} :content nil :tag :meta}
 {:attrs {:itemprop "name"} :content ("Mirror Mirror") :tag :span}
 {:attrs {:content "8" :itemprop "numTracks"} :content nil :tag :meta}
 {:attrs {:content "1975" :itemprop "copyrightYear"} :content nil :tag :meta}
 {:attrs {:content "10 Cc" :itemprop "byArtist"} :content nil :tag :meta}
 {:attrs {:content "Rock" :itemprop "genre"} :content nil :tag :meta}
 {:attrs {:content ""
          :itemprop "url"}
  :content nil
  :tag :meta}
 {:attrs {:itemprop "name"} :content ("Original Soundtrack") :tag :span}
 {:attrs {:content "10" :itemprop "numTracks"} :content nil :tag :meta}
 {:attrs {:content "0" :itemprop "copyrightYear"} :content nil :tag :meta}
 {:attrs {:content "10 Cc" :itemprop "byArtist"} :content nil :tag :meta}
 {:attrs {:content "Rock" :itemprop "genre"} :content nil :tag :meta}
 {:attrs {:content ""
          :itemprop "url"}
  :content nil
  :tag :meta}
 {:attrs {:itemprop "name"} :content ("...meanwhile") :tag :span})

swizzard00:03:49

that’s the full input

swizzard00:03:25

which i want to yield 3 maps

swizzard00:03:20

i was using partition and then reduce merge

swizzard00:03:45

but that bakes in assumptions that make me really nervous

Chris O’Donnell00:03:31

Are you getting 3 seqs by splitting at the span tags?

swizzard00:03:32

i was just doing partition 6

swizzard00:03:29

but if there’s more or fewer attrs it’ll throw everything off

swizzard00:03:13

i think i could like write a transducer that wraps completed maps in reduced but that’s scary

Chris O’Donnell00:03:31

You could use something like split-with to grab all tags until you get to a particular one. If you did that iteratively, it would have the same effect as your partition call.

swizzard00:03:49

yeah, but i’m not sure they’ll repeat in the same order

Chris O’Donnell01:03:34

here's what I'm thinking (pseudocode): (reduce (fn [[cur-album & rest :as albums] tag] (if (already-has-tag cur-album tag) (conj albums (new-album tag)) (conj rest (merge cur-album tag)))) [] tags)

Chris O’Donnell01:03:17

that should probably be a list, not a vector as the initial value

swizzard01:03:29

that makes sense, yeah

swizzard01:03:56

the initial val should be something like '({}), right?

Chris O’Donnell01:03:19

the way I wrote it, you'd need an initial entry

Chris O’Donnell01:03:19

@swizzard: I think it would make sense to map your tags into a more amenable format before combining them into albums. It's a lot easier to work with '({:num-tracks 17} {:copyright-year 2009} ...) than the format you get from enlive.

swizzard01:03:58

i can use map destructuring though

swizzard01:03:42

(fn [[cur-album & r :as albums] {{:keys [:itemprop :content]} :attrs cntnt :content}]…

Chris O’Donnell01:03:37

I think the reason I shy away from that is it feels like making one piece of code do two things.

swizzard01:03:06

i hear that

Chris O’Donnell01:03:28

I think it's fine.

Chris O’Donnell01:03:48

I am a serial over-refactorer. 😛

swizzard01:03:32

honestly i’ve already got the map-validation function & don’t want to write another little helper function

Chris O’Donnell01:03:06

that sounds very fair
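
A minimal sketch of the reduce idea discussed above, assuming a repeated itemprop always marks the start of the next album. The names `tags`, `tag->entry`, and `group-albums` are hypothetical, the sample input is abridged from the log, and the accumulator is a vector used with `peek`/`pop` rather than the list-plus-destructuring form from the pseudocode:

```clojure
;; Abridged sample input in the shape enlive returned in the log above.
(def tags
  [{:attrs {:content "17" :itemprop "numTracks"} :content nil :tag :meta}
   {:attrs {:content "2009" :itemprop "copyrightYear"} :content nil :tag :meta}
   {:attrs {:itemprop "name"} :content '("Live in Concert: Clever Clogs") :tag :span}
   {:attrs {:content "14" :itemprop "numTracks"} :content nil :tag :meta}
   {:attrs {:content "Avex" :itemprop "publisher"} :content nil :tag :meta}
   {:attrs {:itemprop "name"} :content '("Mirror Mirror") :tag :span}])

(defn tag->entry
  "Hypothetical helper: turns one node into an [itemprop value] pair.
  Meta tags carry the value in (:content attrs); span tags carry it
  as the first item of the node's own :content."
  [{{:keys [itemprop content]} :attrs text :content}]
  [itemprop (or content (first text))])

(defn group-albums
  "Adds each pair to the current (last) map, or starts a new map when
  the current one already holds that itemprop."
  [tags]
  (reduce (fn [albums tag]
            (let [[k v] (tag->entry tag)
                  cur   (peek albums)]
              (if (contains? cur k)
                (conj albums {k v})                    ; key repeats: new album
                (conj (pop albums) (assoc cur k v))))) ; extend current album
          [{}]                                         ; needs one initial entry
          tags))

(group-albums tags)
```

Because the split is driven by repeated keys rather than a fixed count, an album with an extra itemprop (such as "publisher") or a different ordering does not shift the boundaries the way `partition 6` would. Note that an empty input yields `[{}]` in this sketch.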

chadhs20:03:08

whats the difference between using str or pr-str

donaldball20:03:15

The latter emits a string that can be parsed by the clojure reader

chadhs20:03:23

good to know, but admittedly im not quite sure what that means

swizzard20:03:03

@chadhs: i don't know if this will actually help, but it's analogous to the difference between str and repr in python

swizzard20:03:57

the idea is that you could e.g. copy the result of (pr-str my-thing) into the repl or some source code and get back my-thing

ghadi20:03:29

@chadhs: it's easier to understand the difference looking at pr vs print:

ghadi20:03:26

i shouldn't have used numbers, but notice the pr version roundtrips exactly if you use read

ghadi20:03:51

print/println is for "human" friendly, pr/prn is machine friendly

chadhs20:03:19

that makes sense @ghadi

ghadi20:03:28

pr-str just captures what it would have printed into a string, rather than printing

chadhs20:03:48

i just did a test route to show that in the browser

chadhs20:03:53

:body (str "Yo! " (:name (:route-params req)) "!")

ghadi20:03:03

and str calls .toString in java, which may not do any of the above

chadhs20:03:03

:body (pr-str "Yo! " (:name (:route-params req)) "!")

chadhs20:03:36

and i could see the human friendly printed string vs the actual quoted strings
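
A small sketch of the difference described above, runnable at a REPL. The values `user-name` and `data` are made up for illustration, and `clojure.edn/read-string` stands in for "the reader" here:

```clojure
(require '[clojure.edn :as edn])

(def user-name "chad")        ; hypothetical route param
(def data {:name user-name})  ; hypothetical value to round-trip

;; str concatenates its arguments' string forms: human friendly.
(def plain (str "Yo! " user-name "!"))

;; pr-str prints each argument readably (strings keep their quotes)
;; and separates the arguments with spaces.
(def readable (pr-str "Yo! " user-name "!"))

;; pr-str output can be read back into an equal value...
(def round-trip-pr (edn/read-string (pr-str data)))

;; ...while print-str drops the quotes, so "chad" reads back as the
;; symbol chad and the result no longer equals the original.
(def round-trip-print (edn/read-string (print-str data)))

[plain readable (= data round-trip-pr) (= data round-trip-print)]
```

This is why the `str` version of the route showed a plain greeting in the browser while the `pr-str` version showed three separately quoted strings.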