#vscode
2020-02-15
sogaiu13:02:08

@pez i think i have something that might be worth trying out now -- have you used tree sitter at all? if not, i can write up some notes on how to use things. if you're already familiar with tree sitter stuff, you can probably just replace the grammar.js file in one or the other of the tree-sitter-clojure repositories with this: https://gist.github.com/sogaiu/a6c7a6dca4259480ca960c4f8a594019 (i used tree-sitter-cli 0.16.4)

pez13:02:26

No, never used tree sitter. And I'm not sure where I would plug it in, even if I think I have an idea about how it could fit in.

sogaiu13:02:17

the tree-sitter cli provides a parse command which can measure execution time -- so one can at least get some sense by running it on individual clojure files.

pez13:02:21

My processing of core.clj takes 150 ms, btw. Measured by checking the clock before and after. I don't know how much of that is the parsing. There is quite a lot going on in that process.

sogaiu13:02:07

ah thanks for the numbers. borkdude and i did some measurements with clj-kondo's parsing -- we both got about what you got.

sogaiu13:02:39

ofc we are all using different machines though 🙂

sogaiu13:02:12

if you're interested, i'll write up some instructions. i've also made a few sample vscode extensions that use the grammar.

pez13:02:26

I have a pretty speedy machine, even if it is a laptop.

sogaiu13:02:04

i've been testing on a notebook as well.

pez13:02:38

I'd like some instructions, yes. Not sure when I will have time to act on it, but I'd like to do some experimentation to try to figure out a bit more about what kind of work would be involved.

sogaiu13:02:03

ok, i'll work on putting a repository together with instructions then.

pez13:02:25

If you want to do the measurement yourself, here's where I put my time stamps: https://github.com/BetterThanTomorrow/calva/blob/dev/src/cursor-doc/model.ts#L388

pez13:02:39

At entry of that function and at exit.
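
A minimal sketch of that kind of before-and-after measurement in TypeScript. The `timed` wrapper and the `processText` stand-in are hypothetical names for illustration, not Calva code; the real function being timed is the one linked above.

```typescript
// Hypothetical helper, not part of Calva: reads the clock at entry and at exit.
function timed<T>(label: string, fn: () => T): { result: T; elapsedMs: number } {
    const start = Date.now();
    const result = fn();
    const elapsedMs = Date.now() - start;
    console.log(`${label}: ${elapsedMs} ms`);
    return { result, elapsedMs };
}

// Stand-in for the real work (assumption): counts opening parens in the text.
function processText(text: string): number {
    let count = 0;
    for (const ch of text.split('')) {
        if (ch === '(') {
            count++;
        }
    }
    return count;
}

const sample = '(defn f [x] (inc x))';
const measured = timed('processText', () => processText(sample));
```

A number obtained this way covers everything the wrapped function does, so it is an upper bound on the parsing time, not the parsing time itself.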

sogaiu13:02:01

ty! i hope to take a look soon -- gotta sleep first though 🙂

pez13:02:18

Crazy TZ diffs! 😃

sogaiu13:02:36

he he -- we can leverage the differences to get more work done sooner 😉

pez13:02:11

vscode Clojure has someone at the watch 24/7.

pez13:02:13

I'll try to time the rainbow paren parser as well. Even if that one is in no way a full parser.

pez13:02:21

I imagine tree sitter would be the thing returning my tokens here: https://github.com/BetterThanTomorrow/calva/blob/dev/src/cursor-doc/model.ts#L41

sogaiu14:02:17

one way to use the results is to use tree sitter's tree traversal api. another way to use the tree is to construct a query to it and then get back matching info.

sogaiu14:02:56

i also fed the root node to:

(defn node-seq
  [node]
  (tree-seq
    #(< 0 (.-childCount ^js %)) ; branch?
    #(.-children ^js %) ; children
    node)) ; root
to get back a sequence of nodes to process.
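
For readers more at home in TypeScript, here is a rough equivalent of the `node-seq` idea above: a depth-first, pre-order walk that flattens a tree into a list. The `NodeLike` interface and the `mk` helper are assumptions for illustration; they only model the `childCount` and `children` properties the ClojureScript reads, and unlike `tree-seq` this version is eager, not lazy.

```typescript
// Minimal node shape (assumption): only the properties the snippet above reads.
interface NodeLike {
    type: string;
    childCount: number;
    children: NodeLike[];
}

// Depth-first, pre-order walk: the node itself first, then each subtree in order.
function nodeSeq(node: NodeLike): NodeLike[] {
    const out: NodeLike[] = [node];
    if (node.childCount > 0) {
        for (const child of node.children) {
            out.push(...nodeSeq(child));
        }
    }
    return out;
}

// Hypothetical helper for building a hand-made tree to try the walk on.
function mk(type: string, children: NodeLike[] = []): NodeLike {
    return { type, childCount: children.length, children };
}

const root = mk('source', [
    mk('list', [mk('symbol'), mk('symbol'), mk('vector', [mk('symbol')])]),
]);
const types = nodeSeq(root).map((n) => n.type);
```

With a real parse tree, the root node would take the place of the hand-built `root`.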

sogaiu14:02:32

it's a bit ugly, but here are two ways of doing the same thing:

(defn doc-syms
  [^js doc ^js tok]
  (if (= "clojure" (.-languageId doc))
    (if-let [uri (.-uri doc)]
      (if-let [^js tree (get @trees uri)]
        (let [id-nodes (find-id-nodes (node-seq (.-rootNode tree)))]
          (->> id-nodes
            (map (fn [id-node]
                   (make-fn-sym-info (.-text id-node)
                     doc
                     (make-range id-node))))
            clj->js))
        #js [])
      #js [])
    #js [])) ; XXX: how to reduce this repetition

sogaiu14:02:45

;; uses tree sitter query api
(defn doc-syms-2
  [^js doc ^js tok]
  (if (= "clojure" (.-languageId doc))
    (if-let [uri (.-uri doc)]
      (if-let [^js tree (get @trees uri)]
        (let [q (.query @clj-lang
                  (str "((list "
                       "   (symbol) @head "
                       "   (symbol) @name) "
                       " (match? @head \"^def[a-zA-Z]*\"))"))
              ms (.matches q (.-rootNode tree))]
          (->> ms
            (keep (fn [match]
                    (when match
                      (when-let [caps (go/get match "captures")]
                        (when (= 2 (count caps))
                          (when-let [name-node (go/get (nth caps 1) "node")]
                            (make-fn-sym-info (.-text name-node)
                              doc
                              (make-range name-node))))))))
            clj->js))
        #js [])
      #js [])
    #js [])) ; XXX: how to reduce this repetition

sogaiu14:02:44

those basically implemented provideDocumentSymbols

pez14:02:11

I think it would be nice to build Calva's token cursor from a parser that more projects cared about.

pez14:02:23

But I'm pretty fond of the token cursor as such, because it is very natural and intuitive to work with.

sogaiu14:02:56

i'll have to study it to appreciate those observations 🙂

pez14:02:15

I might whip up a symbol provider using the token cursor and we can compare, for the fun of it. 😃

sogaiu14:02:51

sounds good to me -- hope that fits in your schedule 🙂

pez14:02:44

Haha, well, not that I really have a schedule, but yeah, lots of things to do, no doubt.

sogaiu14:02:24

i'm off for the evening -- good luck with your endeavors 👋

pez16:02:33

So this is how symbols would be collected using Calva's LispTokenCursor:

function docSyms(document: vscode.TextDocument): vscode.SymbolInformation[] {
    const cursor: LispTokenCursor = docMirror.getDocument(document).getTokenCursor(0);
    let symbols: vscode.SymbolInformation[] = [];
    do {
        cursor.forwardWhitespace();
        cursor.downList();
        cursor.forwardWhitespace();
        let token: Token = cursor.getToken();
        if (token.type === 'id' && token.raw.startsWith('def')) {
            while (cursor.forwardSexp()) {
                cursor.forwardWhitespace();
                // Re-read the token: the cursor has moved since the check above.
                token = cursor.getToken();
                if (token.type === 'id') {
                    symbols.push(makeSymbol(document, cursor));
                    break;
                }
            }
        }
        cursor.forwardList();
        cursor.upList();
    } while (!cursor.atEnd());
    return symbols;
}

pez16:02:15

I would implement symbols using nrepl, probably, but anyway, if a static method would be called for, the token cursor is very neat, imo.