Fork me on GitHub
#rewrite-clj
<
2021-03-15
>
reefersleep11:03:20

So, I’m using the zip API to navigate through a file. I need to find a particular function call and edit it. I’ve found out how to do this, and I’m composing it into a context where I recursively do this to all files under a directory. Yay! However, there are problems. I’m using find-next to find the particular s-expression that I’m interested in, and in my predicate function, I’m using z/sexpr to be able to inspect the s-expression as a Clojure data structure. Some of the files under the directory has discard blocks at the root level, and this causes z/sexpr to throw. I could (and I’m probably going to) iterate through the nodes using z/right instead and interpret the type of node, or just catch, to figure out whether I’m at a discard node (or some other un-sexpr-able node) that I need to skip, and skip it. I’m curious that z/find-next doesn’t already have this option, though. I see that z/right, for example, internally has a call to skip-whitespace. Why not also have skip-unsexprable as an option somewhere in the call chain?

borkdude11:03:27

I think z/sexpr should not throw when a node has unevals/discards

reefersleep11:03:03

I see the rationale, though (I think! 🙂 ); what would you return in place of a discard node?

borkdude11:03:12

it should be treated the same as whitespace: ignored

reefersleep11:03:17

nil is misleading. It’s literally discarded, not returned as nil

reefersleep11:03:40

hm, but what would you return?

reefersleep11:03:49

don’t visit them at all

reefersleep11:03:52

actually skip them 🙂

borkdude11:03:01

oh you are calling z/sexpr on an uneval node itself? not something with an uneval node inside of it?

reefersleep11:03:12

the latter works fine

borkdude11:03:12

yeah, don't do that :)

reefersleep11:03:27

I’m trying to work my way around it 🙂

borkdude11:03:38

you can just look at the :tag of the node

borkdude11:03:41

and then skip it

reefersleep11:03:51

yeh, that’s what I figured. Interpret the type of node.

reefersleep11:03:08

It forces me into oldschool looping rather than the neat find-next functionality 🙂

reefersleep11:03:39

And I was wondering if that could maybe be an area that could improved. I don’t know the history of rewrite-clj, though

borkdude11:03:43

you can also do a kind of postwalk using z/next maybe

borkdude11:03:48

not sure if that helps over find-next, I never used the latter

borkdude11:03:07

you should always be a bit careful with z/sexpr

borkdude11:03:12

this has always been the case

borkdude11:03:38

it can throw on invalid formed maps as well, e.g. {:a 1 :b}, there's just nothing rewrite-clj can or should do about this I think

reefersleep11:03:35

So you might actually generally have to have the try/catch safeguard

reefersleep11:03:13

Depending on your scenario ofc

borkdude11:03:40

yes, and/or first check with tag

lread12:03:30

@reefersleep there is an open issue on skipping uneval nodes https://github.com/clj-commons/rewrite-clj/issues/70, we are discussing a decent way to solve.

lread12:03:13

I sometimes use zip API walk functions when I want to only visit certain nodes in a zipper.

reefersleep12:03:15

@lee I’ll have a look! Meanwhile, I made my own find-next.

(defn sturdy-find-next
  ([zloc pred]
   (sturdy-find-next zloc z/right pred))
  ([zloc f pred]
   (if (z/end? zloc)
     nil
     (let [next-zloc (f zloc)]
       (cond (z/end? next-zloc)
             nil

             (= :uneval (n/tag (z/node next-zloc)))
             (recur next-zloc f pred)

             (pred next-zloc)
             next-zloc

             :else
             (recur next-zloc f pred))))))

reefersleep12:03:32

(Just a quick WIP)

👍 3
reefersleep12:03:03

I think the idea of further configurability of what is skipped is interesting, but of course needs some hammock time 🙂

reefersleep12:03:14

(what @sogaiu mentions in the thread)

borkdude12:03:51

@lee Without looking into the issue, so forgive my ignorance, why is uneval not just treated as whitespace? Because one might want to rewrite it right?

reefersleep09:03:41

That’s what I thought, and I

reefersleep09:03:50

I think I’m in the same state of mind as you; maybe it should be up to the user what they want to skip, and what not. In an ideal world, anyway. But I’ve no idea if that’d pollute the interface, how it’d impact performance and so on.

lread12:03:29

I’m guessing the original authors just made a choice on what to skip. I see their point, it really is not technically whitespace.

borkdude12:03:48

But same could be true for commas?

borkdude12:03:04

What if I want to rewrite all commas into whitespace for example?

lread12:03:23

But commas are technically Clojure whitespace.

borkdude12:03:46

yes, I know, but you might want to get rid of those using a zipper in rewrite-clj

lread12:03:36

It is a fuzzy area, I don’t think the default choice was wrong, and am thinking something configurable probably makes sense. The fact that rewrite-clj skips comments by default is likely why cljfmt does not format them!

borkdude12:03:05

The difference from a Clojure perspective between ;; and # is that # actually goes through the reader. Anything invalid in # will cause the reader to throw. E.g. #{:a}. So technically it doesn't belong in the "whitespace" category of things that are plainly ignored by the reader.

3
lread12:03:50

Good observation, yeah

lread12:03:43

To finish my ramble on cljfmt... it does also use raw movement functions (effectively right* next* etc) to examine and effect changes to whitespace.

lread13:03:57

On a different note, I’m thinking I should probably start measuring how changes we make to rewrite-clj affect its performance. I know this can be a difficult thing to do in a meaningful way, so might not pursue. I’m particularly interested how rewrite-clj v1 compares to rewrite-cljs as I turfed some of the rewrite-cljs tunings in favour of a common code base. But I’m also generally interested. Not sure what a good strategy would be yet. Do folks know of any projects that do measure and compare performance? I do run 3rd party libs tests suits (ex. zprint, cljfmt, etc) against rewrite-clj v1. I wonder if somehow I could use the test suite runs to help me measure if rewrite-clj changes improve/degrade performance.

🙌 3
lread23:03:11

I am planning to upgrade cljfmt to rewrite-clj v1 (if weavejester agrees). Since cljfmt uses rewrite-cljs, I decided to take a https://github.com/weavejester/cljfmt/issues/211#issuecomment-799813626. Result are to be taken with a grain of salt, but under my little test I am seeing rewrite-clj v1 is 20% slower than rewrite-cljs for cljfmt. I guessed there might be a hit and this seems not too horrible. We’ll see what others think.

👍 3
🙌 3
vemv23:03:54

btw have you profiled execution looking for slow spots? btw I have a WIP lib that auto-profiles a whole codebase with https://github.com/ptaoussanis/tufte/ , so one can know the slow functions very quickly I could share a sample report if that helps

lread03:03:11

No, not yet at all. Tufte looks very interesting! I’m not sure I want to dive into a performance tuning wormhole unlesss I need to. 🙂 Then again, I suppose we might find easy wins?

👀 3
reefersleep09:03:01

Tufte is great for manually working down towards bottlenecks. Auto-profiling sounds even better!

reefersleep09:03:55

And, dipping into performance tuning for the easy wins feels awesome. You could timebox it and prevent wasting too much time :)

🙂 3
vemv10:03:20

Yeah perf can be important. In a use case like clj-kondo, integrated with some sort of "squiggly lines" IDE functionality one wants a particularly tight feedback loop ...while for cljfmt it's less important IMO, as one is not reformatting files the whole time (especially for most IDEs, which basic formattings will match with cljfmt's)

lread10:03:43

Good news, weavejester does not consider potential cljs slowdowns with rewrite-clj v1 as blockers, so I’ll proceed without optimizations for now.

👍 9