Fork me on GitHub
#rewrite-clj
<
2019-11-05
>
pez09:11:10

Forgive me for stupid questions. I am quite unfamiliar with what is going on in rewrite-clj and only have the vaguest idea about zippers and stuff. Eventually I will fix that, but right now I am just a user by proxy and I wonder about how you guys think about parsing things that are not strictly EDN. My use cases are pretty printing, and syntax highlighting and such. Here’s an example:

(comment 
 (do
   (def db-uri "datomic:")
   (d/create-database db-uri)
   (def conn (§d/connect db-uri))
   @(d/transact conn [{:db/doc "Hello world"}])))
Evaluating the (do... form here gives back something like this:
{:db-before [email protected], :db-after [email protected], :tx-data [#datom[13194139534316 50 #inst "2019-11-05T09:00:57.870-00:00" 13194139534316 true] #datom[17592186045421 62 "Hello world" 13194139534316 true]], :tempids {-9223301668109598141 17592186045421}}
Trying to pretty-print that with anything other than clojure.core/pprint blows up, for somewhat different reasons, depending on which pretty printer is tried. But if rewrite.clj(s) is involved it blows up on the db values there, trying to parse the stuff after the @ as a number. On some level that makes sense, but from a Calva user perspective it most often doesn’t.

pez09:11:45

So, my question. Is there a way today to get rewrite-clj to not totally croak on that input? And if it isn’t, do you think it would be something worth trying to add?

pez09:11:26

Another thing that confuses me is that when trying to pretty print the result above, nREPL server side, all pretty printers I have tried, except clojure.core/pprint, start to evaluate the database values, so with the query above, zprint, for instance, prints out the whole database. Twice. This isn’t happening in rewrite-clj, right`?

borkdude09:11:46

@pez db values? can you give an example of the error?

pez10:11:00

The db values are those keyed by :db-before and :db-after in the above result.

borkdude10:11:06

This is what I'm seeing with rewrite-clj:

$ clj -Sdeps '{:deps {rewrite-clj {:mvn/version "RELEASE"}}}'
Clojure 1.10.1
user=> (require '[rewrite-clj.parser :as p])
nil
user=> (p/parse-string "{:db-before [email protected], :db-after [email protected], :tx-data [#datom[13194139534316 50 #inst \"2019-11-05T09:00:57.870-00:00\" 13194139534316 true] #datom[17592186045421 62 \"Hello world\" 13194139534316 true]], :tempids {-9223301668109598141 17592186045421}}")
Execution error (ExceptionInfo) at clojure.tools.reader.impl.errors/throw-ex (errors.clj:34).
Invalid number: 298c26ad.

borkdude10:11:45

This is the same error you get from clojure.core/read-string

borkdude10:11:56

so the problem is that this value doesn't round-trip as valid clojure code

pez10:11:50

Exactly.

borkdude10:11:55

using clojure.pprint on this doesn't require you to parse the value from a string, because it can print the value directly from memory. that's why it works

pez10:11:15

Is rewrite-clj using read-string?

borkdude10:11:35

rewrite-clj is designed to rewrite files with code/EDN in it

borkdude10:11:50

so it parses the text, and then rewrites it and then writes it back as text

pez10:11:18

Yeah. hence my question, if it would be something worth considering, to have a switch where it won’t croak when something doesn’t parse to a valid EDN type.

borkdude10:11:02

if it isn't valid code, it should probably croak. it should not try to be better than clojure.core/read-string

borkdude10:11:58

use the right tool for the right job, rewrite-clj isn't a pretty-printer, it's a rewriter from text -> text

pez10:11:48

Agree somewhat. I’m just trying to avoid having to write my own parser. Which I have already done, as it happens, so I could probably solve this that wat, but I would like to be able to use libraries built on top of rewrite-clj.

borkdude10:11:51

why are you first printing it to a string and then trying it to re-parse it, if the goal is to have pretty-printing?

pez10:11:59

I know rewrite-clj isn’t a pretty printer, but it is used by pretty printers 😃

borkdude10:11:50

it makes sense if you're trying to pretty-print text files (so code as you write it, re-formatting), but not for pretty-printing in memory values

pez10:11:53

So, there are reasons. Trying to do this server side, I get other, unacceptable side effects (as mentioned above), so I am trying to solve it client side and then what I have is a string, because that is what nrepl gives me. Maybe it is time to look at the EDN transport there…

pez10:11:39

Even if I wonder if it can be of help since it isn’t valid EDN to begin with…

borkdude10:11:06

rewrite-clj is concerned with code, not only EDN, but still same problem

pez10:11:33

Yeah, Calva wants to use it for code as well.

borkdude10:11:43

nREPL should give you access to the value as is, which you can then pprint, no serialization in between

borkdude10:11:54

that's the only way to deal with this afaik

pez10:11:01

Well, bencode is bencode.

borkdude10:11:15

or try/catch and then fall back on something else

borkdude10:11:27

print as is maybe

pez10:11:31

That’s what I do now.

pez10:11:41

But pretty printing is not the only use case for me. I also do code formatting, using cljfmt. And it creates a bad Ux when it refuses to format results like the above.

pez10:11:00

In a way it sounds like cljfmt, zprint and other formatters/pretty printers should not be using rewrite-clj. Is that correctly understood?

borkdude10:11:21

The above value cannot be parsed by Clojure itself so it should probably not occur in code. So I don't see a problem for cljfmt there?

borkdude10:11:09

Not completely. Using rewrite-clj for pprinting is fine as long as you start with valid textual code

borkdude10:11:33

Saved on disk. Serialized.

pez10:11:58

Code files contain a lot of stuff. So what I often do is that I paste the result of some evaluation inside a (comment ...) form to have as a reference. For that usage pretty printing is very helpful.

borkdude10:11:54

[email protected] is something that Java prints when an Object doesn't have an overriden toString method. The one who serializes it could chose to serialize it as "<<object>>" so it will remain valid when parsed back. I think that's where it should/could be fixed.

borkdude10:11:22

> Code files contain a lot of stuff. Yes, but if that code cannot be read by clojure.core/read-string you're f*cked anyway.

borkdude10:11:20

Just try to insert the above db value in one of your project and then re-start your REPL.

borkdude10:11:56

This is what I'm seeing for example:

pez10:11:21

Yeah, I’ve seen that often, trying to load a file with one of those (comment invalid-form) in it. 😃

pez10:11:55

So, I know that read-string doesn’t like it. I have tried a lot of things when trying to fix this problem in Calva, believe me. haha

borkdude10:11:27

it's a common source of frustration that pr-str doesn't always output strings that can be read back with read-string. Same with edn/read-string.

borkdude10:11:47

Maybe you should not try to fix that problem 😉

pez10:11:13

Haha, well, I like for Calva to be as useful as I can make it. So I turn many stones.

borkdude10:11:13

user=> (read-string (pr-str {:a (symbol "foo bar")}))
Execution error at user/eval9 (REPL:1).
Map literal must contain an even number of forms

pez10:11:51

However, rewrite-clj doesn’t complain. 😃

pez10:11:45

Would be horrible if it did, because then I couldn’t use cljfmt for my formatting at all.

pez10:11:51

Anyway, I think I have the answer to my question. My search continues!

borkdude10:11:24

The parser doesn't complain but:

user=> (rewrite-clj.node/sexpr (p/parse-string "{:a foo bar}"))
Execution error (IllegalArgumentException) at rewrite-clj.node.seq/map-node$fn (seq.clj:110).
No value supplied for key: bar

pez10:11:48

Yeah, that’s good!

plexus11:11:15

@pez I would look into extending print-method so it prints datomic.db.Db values in a readable way

plexus11:11:19

say #datomic/db {:ts 1234567}

plexus11:11:32

actually reading that back to a Db value is a different story, but at least you'll be able to process it with rewrite-clj, and you can define a dummy data reader so this syntax round trips, even if reading it doesn't yield an actual valid Db instance

borkdude11:11:46

that works, but as an IDE you can't possibly foresee all the kinds of objects that people are going to print

borkdude11:11:10

maybe something like this could work:

(set! *warn-on-reflection* true)
(let [old-method (get-method print-method Object)]
  (defmethod print-method Object [v ^java.io.Writer w]
    (if old-method (old-method v w)
        (.write w (format "<<%s>>" (.getName (.getClass ^Object v)))))))
(deftype XYZ [])
(XYZ.) ;;=> <<user.XYZ>>
(remove-method print-method Object)

borkdude11:11:24

not sure if/how that is going to interfere with existing print-methods...

borkdude11:11:05

maybe restoring the print-method after printing is best, so it doesn't interfere with other tooling

borkdude11:11:15

so maybe:

(let [old-method (get-method print-method Object)]
  (defmethod print-method Object [v ^java.io.Writer w]
    (.write w (format "<<%s>>" (.getName (.getClass ^Object v)))))
  ;; pprint here
  ;; restore method
  (when old-method
    (defmethod print-method Object [v w]
      (old-method v w))))

borkdude11:11:46

This is likely going to interfere with other threads that are running in the same JVM

borkdude12:11:17

@pez Maybe you could also use print-dup. That throws when there is no valid method defined:

user=> (let [sw (java.io.StringWriter.)] (print-dup (XYZ.) sw) (str sw))
Execution error (IllegalArgumentException) at user/eval146 (REPL:1).
No method in multimethod 'print-dup' for dispatch value: class 

pez15:11:33

Thanks, both of you! I will experiment a bit and we’ll see if I can get this as user friendly as I’d like to.

sogaiu16:11:54

@pez btw, regarding zippers, if you are interested, one of the most helpful videos i saw was plexus': https://lambdaisland.com/blog/2018-11-26-art-tree-shaping-clojure-zip -- i found the diagrams presented with the explanations to be particularly good.

lread16:11:47

and also Tim Baldridge's videos are great (not entirely free, but also very inexpensive) https://tbaldridge.pivotshare.com/media/zippers-episode-1/11348/feature?t=0

sogaiu16:11:49

oh yes, i have found that following along with tim baldridge's presentations at the repl to be particularly instructive. though it is effort, i have usually found it to be worth it :thumbsup:

lread16:11:10

yeah he's a smart guy and goes deep!

sogaiu16:11:34

on the note of zippers, not sure about how relevant it might be, but @martinklepsch noticed that in a java "port" of tree-sitter, there is a clojure api: https://github.com/JetBrains/jsitter/tree/master/clj -- it has some zipper stuff in it. apart from that though i wonder whether might be some ideas / synergy for rewrite-clj*...

borkdude16:11:26

I also enjoyed plexus' zipper video (small sidenote: my findings about performance were different than his conclusion at the end)

sogaiu16:11:51

thanks for sharing the bit about your findings, that is interesting indeed

plexus19:11:01

Can you elaborate @borkdude? I don't remember what I said exactly about performance, besides the recommendation to use fast-zip

plexus19:11:10

Also yay glad to see people find that talk useful. The viz code is on github somewhere, although I never got around to releasing it properly.

borkdude21:11:38

@plexus https://www.youtube.com/watch?v=5Nm56YvTKZY&amp;feature=youtu.be&amp;t=2558 "one of the benefits of using zippers is the performance gain", although you state that's not the main reason why you would use them. I didn't find them to be more performant than manual editing (actually the opposite).

borkdude21:11:21

But overall quite nice introduction to zippers, thanks for that.

sogaiu21:11:51

just so i understand, he was referring to the modifications being done all together vs each one separately, right?

borkdude21:11:19

I think so, but traversing with zippers also causes data structures to be modified, so I guess it depends on the amount of overhead vs. the amount of updates you're trying to accomplish. With typical things I was doing with rewrite-clj manual updates were much faster. I guess we need benchmarks.

sogaiu21:11:29

it's often in the details i guess 🙂