clojure 2020-10-25 | Slack Archive

this is probably a typo, right? https://github.com/cognitect/transit-clj#usage

(def reader (transit/reader in :json))

rakyi11:10:46

I don’t see the typo. What do you mean specifically?

joshkh14:10:19

specifically the in symbol. i had a flashback to coffeescript

Alex Miller (Clojure team)14:10:34

That’s the input arg to read from

misha14:10:50

(def in (ByteArrayInputStream. (.toByteArray out)))
(def reader (transit/reader in :json))
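For context, the snippet above refers to an `out` stream that is not shown in the thread. A minimal round-trip sketch, assuming `com.cognitect/transit-clj` is on the classpath and following the shape of the README usage section (the sample value `{:a [1 2]}` and the name `round-tripped` are only illustrative):

```clojure
(require '[cognitect.transit :as transit])
(import '[java.io ByteArrayInputStream ByteArrayOutputStream])

;; Write a value to an in-memory output stream.
(def out (ByteArrayOutputStream. 4096))
(def writer (transit/writer out :json))
(transit/write writer {:a [1 2]})

;; `in` is just a name bound to the input stream to read from.
(def in (ByteArrayInputStream. (.toByteArray out)))
(def reader (transit/reader in :json))

(def round-tripped (transit/read reader))
```

Here `in` is an ordinary var name for the input stream, not a keyword or special form.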

joshkh15:10:03

ever stare at something for so long that you don't see the forest through the trees? 😳

misha15:10:13

happens all the time, it's ok

misha14:10:49

greetings! what is the fastest way to reduce size of seq of maps by comparing only subset of keys, eg: [{:id (uuid) :a 1 :b 1} {:id (uuid) :a 1 :b 1}] => ;;just [{:id (uuid) :a 1 :b 1}]

misha14:10:36

seq is millions of items. the only thing I can think of - is to put those extra unique fields into meta, and accumulate maps into set from the get go

p-himik14:10:00

clojure.core/distinct

p-himik14:10:26

Or do you mean comparing only by :a and :b?

p-himik14:10:53

If so, which of the two :id values will be used?

misha14:10:19

compare by equivalent of (dissoc m :id) or (select-keys m [:a :b])

p-himik14:10:55

But what ID will you use? Or are you OK with a random ID?

p-himik14:10:29

I mean the result, not for comparison.

p-himik14:10:15

If dissoc + distinct + adding a new ID (all via transducers) is not fast enough, then I would look into tries.

misha14:10:17

id format is not important, but what is important - next states will have :prev-state-id

misha14:10:01

so I 1) cannot assign just random ids after everything is done. 2) given millions of items = just scanning through them and assigning ids after some step - takes very long too

matthewad15:10:30

You could dissoc the id, put the original value in the meta data, use distinct, then retrieve the original data from the meta data. I have no idea if this will be suitable performance-wise, but it avoids the 2 issues you listed.

(defn distinct-ignore-id [items]
  (map #(:original-item (meta %))
       (distinct
        (map (fn [item]
               (with-meta (dissoc item :id)
                 {:original-item item}))
             items))))

matthewad15:10:00

Sorry, I just noticed this is basically the same solution you suggested in the top level comment. Nevermind.
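As an alternative to the metadata round-trip, the same idea can be sketched as a stateful transducer that remembers which comparison keys it has already seen and passes through the first item for each. `distinct-by` is a hypothetical helper, not part of clojure.core, and the integer ids stand in for the uuids in the question. Whether it is fast enough for millions of items is untested here.

```clojure
(defn distinct-by
  "Returns a transducer that keeps the first item for each distinct (f item)."
  [f]
  (fn [rf]
    (let [seen (volatile! #{})]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result item]
         (let [k (f item)]
           (if (contains? @seen k)
             result
             (do (vswap! seen conj k)
                 (rf result item)))))))))

(def items
  [{:id 1 :a 1 :b 1}
   {:id 2 :a 1 :b 1}
   {:id 3 :a 2 :b 1}])

(def deduped
  (into [] (distinct-by #(dissoc % :id)) items))
;; => [{:id 1, :a 1, :b 1} {:id 3, :a 2, :b 1}]
```

Because the surviving item keeps its original :id, later states can still point at it through :prev-state-id.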

Faiz Halde04:10:06

or you could use a sorted-set with a custom comparator that dissoc's :id before comparing (or select keys :a and :b) .. you can get rid of the meta usage that way
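A rough sketch of the sorted-set idea. One caveat: Clojure maps are not Comparable, so `compare` cannot be applied to `(dissoc m :id)` directly. This version compares vectors of the selected values instead, and assumes those values are mutually comparable (numbers, strings, keywords). `key-comparator` is a hypothetical helper name.

```clojure
(defn key-comparator
  "Builds a comparator that orders maps by the values of ks only."
  [ks]
  (fn [x y]
    (compare (mapv x ks) (mapv y ks))))

(def states
  (into (sorted-set-by (key-comparator [:a :b]))
        [{:id 1 :a 1 :b 1}
         {:id 2 :a 1 :b 1}
         {:id 3 :a 2 :b 1}]))

(vec states)
;; => [{:id 1, :a 1, :b 1} {:id 3, :a 2, :b 1}]
```

When a new item compares equal to one already in the set, the existing item is kept, so the first :id seen wins.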

misha08:10:53

looking into it, thanks

misha14:10:37

the task itself is a game simulation, where you start from initial state, and given some rules, generate possible next states from it, and iterate until out of memory :) so with each iteration set of next states grows exponentially in addition to already calculated ones

Nico17:10:49

hi, if I want to take a string like this

this is a [test](test.md) of inline [links]()

and remove the links but keep track of them, like this:

{:text "this is a test of inline links" :links [{:name "test" :path "test.md"} {:name "links" :path ""}]}

what would be the best way to do this? (I am parsing a markdown file, but a full markdown parser is unavailable)

andy.fingerhut17:10:34

One way (not necessarily the best) is to use a parser like instaparse and write a small grammar for the things you want to recognize differently from the rest of the text. That might be tricky for handling arbitrary Github-flavored markdown, since I know that some of their constructs are dependent on what comes first on the line, and other constructs like [link to this text](something.md) can have the contents between [] spanning multiple lines.

andy.fingerhut17:10:44

This StackOverflow question has some answers that might lead you to a full parser for Github-flavored markdown, but perhaps not written in Clojure.

andy.fingerhut17:10:45

https://stackoverflow.com/questions/39560644/what-library-does-github-use-for-parsing-markdown

Nico17:10:19

I know that I'm only going to have link contents on one line, and all I need to do is extract links rather than do proper markdown parsing

andy.fingerhut17:10:25

This sundown library is mentioned: https://github.com/vmg/sundown Its README says it has bindings for many languages, including Python, Ruby, JavaScript, Haskell, and Go, but I don't see Java there. The JavaScript library might be easily callable from ClojureScript.

Nico17:10:48

I'm also running in a babashka environment, so the libraries that would be available aren't here

andy.fingerhut17:10:04

If you know such links are always going to be within a single line, you could attempt to use regex matching.

andy.fingerhut17:10:16

Do you ever expect the text to have comments in parentheses or square brackets, that aren't links? e.g. could someone write "the foo (which is variant of a bar) can be blurg [see reference 5]"

Nico17:10:10

yeah

Nico17:10:46

I know how to regex match to find if a line contains a link but not what the link is or how to then remove it

Nico17:10:43

I'm not sure how you'd do that with regex

Ed17:10:20

(let [s "this is a [test](test.md) of inline [links]()"
      re #"\[([^\]]*)\]\(([^)]*)\)"]
  {:links (->> (re-seq re s) (map (fn [[_ n p]] {:name n :path p})))
   :text (clojure.string/replace s re "$1")})

Ed17:10:27

maybe something like that?

Nico17:10:27

ah, thanks

Nico17:10:32

didn't know re-seq existed

👍 3
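The regex approach can be wrapped in a small function. This is a sketch only: `extract-links` and `link-re` are hypothetical names, and the path group here stops at the first closing parenthesis. It uses just clojure.core and clojure.string, so it should also run under babashka. Brackets or parentheses that are not followed by the `](` pairing are left alone, which covers the "[see reference 5]" case raised earlier.

```clojure
(require '[clojure.string :as str])

;; [name](path), both parts on a single line.
(def link-re #"\[([^\]]*)\]\(([^)]*)\)")

(defn extract-links [s]
  {:text  (str/replace s link-re "$1")
   :links (mapv (fn [[_ name path]] {:name name :path path})
                (re-seq link-re s))})

(extract-links "this is a [test](test.md) of inline [links]()")
;; => {:text "this is a test of inline links"
;;     :links [{:name "test" :path "test.md"} {:name "links" :path ""}]}

(extract-links "the foo (a bar) can be blurg [see reference 5]")
;; => {:text "the foo (a bar) can be blurg [see reference 5]" :links []}
```

`re-seq` returns one vector per match, with the whole match first and the capture groups after it, which is why the destructuring skips the first element.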

solf17:10:46

Is there existing tooling that would take function docstrings from a project to populate an API section of its README? The idea being to avoid having to maintain documentation in two places

lukasz17:10:06

Something like this https://cljdoc.org or https://github.com/weavejester/codox ?

solf18:10:45

Codox might be able to target the README, using a custom writer. It does seem rather over-featured for what I had in mind. I'm not sure how easy it would be to use it through clj

solf18:10:26

I had in mind a light program, using probably clj-kondo under the hood, to just lift some docstrings from .clj files to the README and using an api like this imaginary one:

clj -A:readme "./src/api.clj"  --output README.md
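One possible shape for such a tool, sketched with runtime reflection rather than clj-kondo: load the namespace, read `:doc` and `:arglists` from each public var's metadata, and render Markdown. `api-markdown` is a hypothetical function, and unlike the static-analysis idea above it has to evaluate the namespace. Writing the result into README.md is left out.

```clojure
(require '[clojure.string :as str])

(defn api-markdown
  "Renders the docstrings of ns-sym's public vars as a Markdown string."
  [ns-sym]
  (require ns-sym)
  (->> (ns-publics ns-sym)
       (sort-by key)
       (keep (fn [[sym v]]
               (let [{:keys [doc arglists]} (meta v)]
                 ;; Skip vars without a docstring.
                 (when doc
                   (str "### `" sym "`\n\n"
                        (when arglists
                          (str "`" (pr-str arglists) "`\n\n"))
                        doc "\n")))))
       (str/join "\n")))

;; Example with a namespace that ships with Clojure:
(println (api-markdown 'clojure.set))
```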
