2021-10-16 nrepl | Clojure Slack Archive

nrepl

dominicm 2021-10-16T08:37:56.060200Z

I'm just doing some profiling in vim, and a lot of time goes to finding the completion context. How are other editors doing this fast? I'm considering whether I should just send the whole buffer or the last 400/next 400 lines or something like that.

dominicm 2021-10-16T10:31:53.061500Z

Dumb idea of just dumping the whole file doesn't work because compliment only reads a single form 🙂

😂 1

dominicm 2021-10-16T08:40:08.060900Z

https://github.com/tpope/vim-fireplace/blob/c9a155b88d629221628937eadb64a3e87f8d82d9/autoload/fireplace.vim#L195-L224 here's how it's being done in fireplace/vim.

pez 2021-10-17T07:32:10.065300Z

I’m not sure how it was done in zmacs and its precursors, but Calva maintains a parsed representation of the document that is fast to navigate.

dominicm 2021-10-17T08:27:09.065500Z

Does VSCode provide tools for assisting with maintenance of that parsed representation?

dominicm 2021-10-17T08:27:17.065700Z

I can't imagine writing that myself!

pez 2021-10-17T09:23:14.065900Z

I wish. And it is pretty advanced tech, for sure. I got help creating the infrastructure from a developer who does such things in the sleep.

dominicm 2021-10-17T20:51:04.066100Z

Nice, I'll have to poke about some time.

pez 2021-10-16T09:29:38.061100Z

Calva does something similar. Take the text from start of top level form to in front on what is completed, take the text from end of what is completed to end of top level form, concatenate. Where is the time spend in the above snippet?

dominicm 2021-10-16T10:31:36.061300Z

On finding the opening paren. In bigger functions this is pretty slow.

pez 2021-10-16T11:35:11.061800Z

Is it slow as in that the user notices a delay?

pez 2021-10-16T11:42:49.062Z

Calva uses its Token Cursor for all things like this. So finding top level form is a super quick operation, because it is just to go downList() until you can’t anymore, then the top level form is the same as the current form. If you have something like a token cursor, or zipper, or whatever, then probably that would be quick. OTH, maybe searchpairpos() uses something like that already?

dominicm 2021-10-16T16:51:54.063Z

Sorry, missed this. It's so slow I noticed delay while trying to work and just disabled everything. I felt like I was using vi 🙂

dominicm 2021-10-16T16:52:37.063200Z

I'm not sure what a Token Cursor is, but my suspicion is that you've got some kind of caching/incremental parsing going on there. Vim isn't that smart. It has to search the whole file every time to figure out if there's a cheeky ( at the start of the file to match.

dominicm 2021-10-16T16:55:50.063400Z

Benchmarks indicate a reduction of 59ms to 0.26ms per call.

dominicm 2021-10-16T16:58:00.063600Z

Apparently a 99.56% decrease 😄

pez 2021-10-16T20:23:46.063900Z

A Token Cursor is an old invention for navigating LISP code. Here’s Up List from an 80s zmacs manual: > Moves forward out of nesting levels. It moves forward one level of list structure. It searches for a close parenthesis and leaves point to the right of that close parenthesis. Also, if called inside of a string, it moves up out of that string, leaving point to the right of its ending quote. In Calva we maintain a sequence of tokens for all documents that are being edited. And then we use a structural “cursor” to navigate the code. Totally inspired by that age old invention.

pez 2021-10-16T20:25:02.064100Z

When you reached for TreeSitter you took a similar approach, I would say. Our token sequence is comparable to the AST that TreeSitter builds and uses.

pez 2021-10-16T20:29:34.064300Z

Interestingly, the same manual says this about top level expressions: > A Lisp file contains a sequence of expressions that we call top-level expressions, to distinguish them from their own sUbexpressions. Zmacs assumes that top-level expressions begin with an open parenthesis against the left margin. It does not parse top-level expressions by balancing parentheses, since parentheses do not always balance while programs are being written. The indentation represents the programmer’s conception of program structure, and provides a better guide. So by top-level expression, we mean a section of text delimited by open parentheses at the beginning of two lines.

pez 2021-10-16T20:30:43.064500Z

Calva instead helps with keeping the document always balanced, and when it is not offers almost no help for editing it. 😃

dominicm 2021-10-16T20:39:10.064700Z

Ah, so it just searches backwards for ^( and then forwards for the same...

dominicm 2021-10-16T20:39:47.064900Z

Unfortunately, compliment only works on complete forms anyway 🙂

dominicm 2021-10-16T20:40:43.065100Z

I might put forward that suggestion to tpope from the zmacs manual and see what he thinks.

flowthing 2021-10-18T07:44:58.068800Z

> Vim isn't that smart. It has to search the whole file every time to figure out if there's a cheeky `(` at the start of the file to match. I got bit by the same thing (in a different context) in the thing I'm making for Sublime Text. I solved it by just assuming that if ( is at the first column, it's the outermost form. 🤷 It's not perfect, but it probably solves 99.9% of the cases. 😛

pez 2021-10-18T08:20:46.069100Z

That’s how zmacs solved it too. 😃 But for a different reason.

flowthing 2021-10-18T08:26:28.069300Z

Oh yes, I see that now in the quote you pasted earlier. 🙂

pez 2021-10-18T08:27:26.069500Z

To be clear. VS Code is not helping much with this for Calva. We have added it with the help of only the onDocumentContentChanged events (which are also very bare and could be much more helpful, but there it is). We maintain a “mirror” document, which gets updated whenever the vs code document gets updated, and vice versa. This mirror document is parsed and we maintain the an array of tokens which we then use for navigating the document. The code for finding the top level form looks like so:

rangeForDefun(offset: number, commentCreatesTopLevel = true): [number, number] {
        const cursor = this.doc.getTokenCursor(offset);
        let lastCandidateRange: [number, number] = cursor.rangeForCurrentForm(offset);
        while (cursor.forwardList() && cursor.upList()) {
            const commentCursor = cursor.clone();
            commentCursor.backwardDownList();
            if (!commentCreatesTopLevel || commentCursor.getToken().raw !== ')' || commentCursor.getFunctionName() !== 'comment') {
                lastCandidateRange = cursor.rangeForCurrentForm(cursor.offsetStart);
            }
        }
        return lastCandidateRange;
    }

Including the handling of rich comment top level forms.

pez 2021-10-18T08:31:58.069700Z

It’s not only used for navigation. All structural edits in Calva are performed on this mirror document which is then mirrored to the “real” document.

flowthing 2021-10-18T08:45:27.069900Z

Goodness. That does sound sophisticated. I'm basically just moving parens around. 🤪 And it's not too expensive to maintain that "mirror" document in the background?

pez 2021-10-18T11:37:42.070100Z

Because LISP this cursor thing is also basically just moving parens around. Memorywise, every document consumes at least double (text + token array). And maybe more, I don’t know how smart VS Code is about in-active documents, but Calva keeps the mirror around. That doesn’t seem to be too expensive, though. Parsing the documents can be quite a lot of work, but still below 200ms for Clojure core. That could get a lot more efficient if we used DFA regexes, but so far I have resisted going down that rabbit hole. Traversing the token array can also get expensive if the forms are very large. Calva hates huge EDN structures. But they need to get pretty huge first. Calva’s rainbow parens and other highlight services built on this take care only to traverse forms that are in view, but with a huge EDN form, that doesn’t help much.

pez 2021-10-18T11:40:21.070300Z

I’ve been trying to understand this article a bit, to see if I could use some of the approach, but so far it mostly passes over my head. https://code.visualstudio.com/blogs/2021/09/29/bracket-pair-colorization

flowthing 2021-10-18T13:05:24.075900Z

Interesting, thanks for the info!

flowthing 2021-10-18T13:05:56.076200Z

And yeah, I glanced at that bracket pair colorization article earlier, but, well, I'm an English major, so... 😛

😂 1

dominicm 2021-10-16T15:02:23.062600Z

I ended up rewriting it using treesitter/lua and now it doesn't even show up in the profile it's so fast.... Oh yeah!

pez 2021-10-16T16:13:33.062800Z

That’s a similar path as Calva uses, even if we are using a homegrown syntax parser.

Clojurians Log v2

nrepl