Fork me on GitHub
#rewrite-clj
<
2022-09-29
>
skylize15:09:47

Current behavior of the rewrite-clj's reader is making editor tooling unnecessarily difficult. This discussion started at least as far back as January 2021 https://github.com/clj-commons/rewrite-clj/issues/87, though I wasn't around for that. I have recently stumbled over numerous warts using Calva, where reader exceptions bubble their way up from rewrite-clj through clojure-lsp, cljfmt, or some other route, all the way to the user as a thrown-exception popup. When this happens, useful help for linting, formatting, etc. are all prety much useless until the code is edited back into a "readable" state. From a user perspective, all these errors seem random and arbitrary. It significantly degrades the user experience, and misses out on opportunities for tools to target help for fixing broken code. The examples shown in the thread are minimal cases, but occur easily and frequently inside or around much larger blocks of code, where the "cause" can be quite difficult to discover. Some no longer throw, but only because they are now patched over with a hack to swallow the exception. As noted by @ericdallo (edit in brackets to say clojure-lsp where he accidentally said rewrite-clj) > we've been hacking that on kondo and [clojure-lsp], we should probably have that on rewrite behind some setting or flag > https://clojurians.slack.com/archives/CPABC1H61/p1664463806428379?thread_ts=1663954653.160439&amp;cid=CPABC1H61 It seems to me the reader needs to be augmented with error recovery, to look forward or backward from the point of failure in search of a place where lexing can continue. Then report errors alongside all the valid nodes (hopefully with sufficient data that a tool like clj-kondo can make a good guess at what went wrong).

💡 1
skylize15:09:16

> • Type into document such that {:foo :bar} becomes {#_#:foo :bar}. > • See error pop up: Request textDocument/codeAction failed. > https://github.com/clj-commons/rewrite-clj/issues/87 > Repro: (`|` represents cursor) > clojure-lsp version used: 2022.09.20-16.33.44-nightly >

{:foo b|}
> Type <backspace> to trigger. > https://clojurians.slack.com/archives/CPABC1H61/p1663701774338309 >
()#_() 
> With cursor anywhere except between the first 2 parens. > Then use mouse or arrow keys to place cursor inside uncommented form to trigger. >
(|)#_()
> https://clojurians.slack.com/archives/CPABC1H61/p1663884706635019 >
({:foo}) ;; 💣
> Cursor anywhere except after the entire form to throw exception. > https://clojurians.slack.com/archives/CPABC1H61/p1663954653160439 >
(/foo :bar)
> Here /foo is an invalid symbol, as indicated by the linter Invalid symbol: /foo. Okay. > > But after command Barf Forward >
(/foo :bar)    -->    (/foo) :bar
> and again after Slurp Forward on that result, >
(/foo) :bar    -->    (/foo :bar)
> the error emits as a popup. > https://clojurians.slack.com/archives/CBE668G4R/p1663863858149529

lread15:09:11

Thanks for investing all this time to summarize your findings on this topic @U90R0EPHA.

☺️ 1
lread16:09:28

With https://cljdoc.org/d/rewrite-clj/rewrite-clj/1.1.45/doc/user-guide#_parsing_peculiarities, rewrite-clj currently only parses valid Clojure code, but that’s at all saying that we won’t handle invalid code in the future. We would just have to invest the time in a solution that makes sense.

👍 1
lread16:09:40

We might be able to steal some strategies from clj-kondo’s internalized and customized version of rewrite-clj. Not sure exactly how far it goes. It must at least deal with invalid symbols, but maybe not unbalanced braces, etc.

snoe16:09:20

parinfer could possibly be used to guess where unbalanced braces should be placed and allow delineation of top level forms

lread16:09:44

interesting idea @U0BUV7XSA, deciding when and how to recover from unbalanced braces can be tricky.

snoe16:09:25

yeah, but just identifying unreadable tokens similar to uneval nodes might be a great step

lread16:09:56

True, we don’t have to do it all at once. Could tackle the easier bits first.

lread16:09:05

@U90R0EPHA for your {:foo b|} example, rewrite-clj happens to support unbalanced maps today, so there might be a workable solution for that specific case. To demonstrate:

(require '[rewrite-clj.zip :as z])

(-> "{:foo b}"
    z/of-string
    z/down
    z/right
    z/remove
    z/root-string)
;; => "{:foo}"

lread16:09:29

For ()#_() are you modifying anything? (you probably are, lemme know) Rewrite-clj can read and write this structure:

(-> "()#_()"
    z/of-string
    z/root-string)
;; => "()#_()"

lread16:09:06

Same question for:

(-> "({:foo}) ;; 💣"
    z/of-string
    z/root-string)
;; => "({:foo}) ;; 💣"

lread16:09:14

I do see your /foo issue:

(-> "(/foo :bar)"
    z/of-string)
;; => ExceptionInfo Invalid symbol: /foo.  clojure.core/ex-info (core.clj:4617)

lread16:09:42

@ericdallo when you have a moment, please feel free to chime in on above too.

ericdallo16:09:48

I see lots of possibilities here, so maybe we could prioritize some of the most common invalid codes https://github.com/clojure-lsp/clojure-lsp/blob/master/lib/src/clojure_lsp/parser.clj#L46-L99 is where clojure-lsp do its custom ugly parse 😅 @U04V15CAJ may be interested on that as well.

skylize16:09:57

> or ()#_() are you modifying anything? When examples say "place cursor ...", there is no further editing required. The errors will trigger just by moving the cursor into regions specified. (Not clear to me how that ties in with what rewrite is doing. But when I asked @ericdallo, he indicated this was still more-or-less the same issue as the /foo problem I discussed with you.) Check this out for significant elaboration on those 2 examples. https://github.com/clojure-lsp/clojure-lsp/issues/1268

lread16:09:18

@U90R0EPHA thanks, watcha think @ericdallo? I don’t yet see rewrite-clj as being the problem for some examples I showed to work with rewrite-clj above. But… I totally fine with being wrong about that.

lread17:09:41

There’s some good ideas and initial discussion in https://github.com/clj-commons/rewrite-clj/issues/87, I’ll study that more carefully soon to freshen my brain on this topic.

pez18:09:43

It could be that rewrite-clj should be split into two different libraries. I've totally not thought this through, it's just something that strikes me. Some editor tooling needs quite different help from what other tooling needs in terms of Clojure source code reading and manipulation. While typing, about anything should be accepted. Other use cases wants stricter and more semantic help. Is it the same library that should cater to both?

snoe18:09:42

So, personally I feel that rewrite-clj was always a text parser. If we're talking about transforming sexps tools.reader would have been sufficient. But we want to maintain whitespace and comments; things outside sexps. unreadable errors in source code happen and, for me, it would fit well in rewrite-clj. If you look at the biggest use cases: formatting, analysis, refactoring i think they would all benefit from error handling.

borkdude18:09:18

I haven't read all of the conversation here (having a day off), but every time someone took the effort to report a case where clj-kondo should continue parsing with invalid code instead of throwing, it's been solved. So please keep doing that.

metal 1
lread19:09:17

I don’t think we need to split rewrite-clj to two different libraries. We can figure out something sensible.

pez19:09:26

Haha, yeah, maybe it is non-sensible. 😃

borkdude19:09:58

in clj-kondo I vendored both rewrite-clj and tools.reader upon which it depends to deal with these problems

borkdude19:09:11

(also to get better error messages especially from tools.reader)

lread19:09:14

Oh no, didn’t at all mean your idea was not appreciated @U0ETXRFEW! We are at the brainstorming stage and all input is great.

❤️ 1
borkdude19:09:20

this was before rewrite-clj v1 was out

lread19:09:54

Yeah, I had to futz tools.reader a bit to deal with line endings, but did not inline. Yet.

borkdude19:09:34

you might as well use edamame and propose a feature request there if you need something weird ;)

lread19:09:58

See, all ideas are good! simple_smile

borkdude19:09:11

I think it has the similar better error messages that the inlined tools.reader has

lread19:09:43

It’s an option.

borkdude19:09:51

perhaps one day I could undo the vendoring if we go that way :-D

🤞 1
borkdude19:09:26

I also vendored rewrite-clj since it was a bit unmaintained and I didn't want to be limited by that

borkdude19:09:41

but this is no longer true

lread19:09:18

Yeah but you had some tweaks too. Like skipping all whitespace for perf, and treating meta differently I think.

borkdude19:09:17

Yes. The skipping of whitespace could be easily an option in rewrite-clj though

lread19:09:24

True, true.

borkdude19:09:01

I mean, it doesn't really fit with the "rewrite" use case, but the tweak is very minimal

👍 1
lread19:09:22

When borkdude takes a day off, how many borks join him? Do you leave a few borks behind to keep working?

😂 1
borkdude19:09:38

I'm on the train home now... catching up ;)

lread19:09:27

Which bork am I talking to? I never know. Maybe it really doesn’t matter.

😂 1
pez19:09:34

I only use rewrite-clj by proxy, so am actually not very familiar with it or informed about the boundaries and stuff it operates within.

lread19:09:32

Indirectly through clojure-lsp, clj-kondo, zprint (and formerly cljfmt)?

pez19:09:53

Yes, Those.

pez19:09:11

Formerly cljfmt?

lread19:09:52

I thought you might have switched to zprint? I dunno, maybe a false rumour.

pez19:09:36

Ah, I thought you was just telling me cljfmt had switched to something else. 😃

lread19:09:28

Oh yes, (I mean no, cljfmt still uses rewrite-clj), communication, eh? Can be tricky. simple_smile

pez19:09:34

Calva is very much still using cljfmt. And also zprint.

👍 1
pez19:09:06

cljfmt for formatting, and zprint for pretty printing

lread19:09:30

Interesting.

lread19:09:00

I know that zprint has many more knobs to turn, but have not ever really dug in to learn it.

pez19:09:07

I want to support zprint for formatting as well, but it gets complicated.

pez19:09:38

Maybe some day. 😃

lread19:09:52

Maybe tomorrow, maybe never. simple_smile

pez19:09:12

Haha. something like that.

lread19:09:17

(an expression I learned when visiting Mexico many years ago)

pez19:09:01

I'll be stealing it.

lread19:09:28

It’s a good one, you definitely should.

pez07:09:45

> parinfer could possibly be used to guess where unbalanced braces should be placed Not sure, but Parinfer can only infer parens on correctly indented code, right?

snoe16:09:16

>> parinfer could possibly be used to guess where unbalanced braces should be placed > Not sure, but Parinfer can only infer parens on correctly indented code, right? Mostly, but when I made the vim plugin, I would run paren mode to correct indentation based on parens first, then turn on infer mode that would move indentation without changing structure. I think smart mode was added later to do something similar. In my experience, at least top level forms start on col 0, so if you split the text at those points, you could possibly try parsing each chunk until you hit one you can't read then wrap it in a [:inferred (infer chunk)] node, and continue with the rest of the text. That way, linters could signal that's something wrong, but analysis or whatever could still be applied internally to that inferred node. The goal would be to read as much of the file as possible normally, and infer only that one chunk in the middle that missed a closing paren.

lread16:09:15

To mull: One thing that rewrite-clj does now (with the exception of normalizing all OS newline variants to \n) is spit out exactly what it reads in. It think it will have to do the same even in the unbalanced delimiters scenario. I also am not sure what this means for formatters like zprint, cljfmt and cljstyle. I would think they’d probably be fine if rewrite-clj started to not barf on invalid symbols and keywords, but not sure if they want to deal with code with unbalanced delimiters.

💯 1