tree-sitter

mauricio.szabo 2023-09-15T19:33:34.631449Z

I don't know if anyone is following, but I've been thinking about some very useful things for Clojure support in Pulsar's Tree-Sitter integration

mauricio.szabo 2023-09-15T19:34:07.622689Z

Like this one - syntax-quoting will make vars that end with # highlight differently, so we know we're using a local var
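
For context, a minimal sketch of the reader behaviour being discussed: inside a syntax quote, a symbol ending in # is replaced by a generated symbol, while under a plain quote it stays as written. The names expanded and plain are made up for illustration.

```clojure
;; Inside a syntax quote, x# is rewritten by the reader into a generated symbol.
;; Both occurrences within the same syntax quote map to the same generated symbol.
(def expanded `(let [x# 1] x#))

;; Under a plain quote, x# is just a symbol whose name happens to end in #.
(def plain '(let [x# 1] x#))

(first (second plain))     ;; the symbol x#, unchanged
(first (second expanded))  ;; a generated symbol, not x#
```

This is why the highlight is only meaningful when the symbol sits inside a syntax quote.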

lispers-anonymous 2023-09-16T13:44:38.265279Z

The current tree-sitter grammar marks the start of quoting and unquoting, but once you are inside the quoted thing you don't have a reliable way to tell that the current form is quoted.

lispers-anonymous 2023-09-16T13:46:32.181699Z

Since you can arbitrarily nest things in a quoted symbol, I don't think it's possible to write a query to know when something like my-gensym# will expand to a gensym'd symbol, or just be a plain symbol ending in #, right?

mauricio.szabo 2023-09-16T14:08:58.466079Z

Yes - currently, tree-sitter doesn't support recursive queries (like all descendants of a specific parent) so we're making specific #set! variables that capture some of these situations. It's not perfect, and I am actually thinking about extending some of these variables too
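
To make "all descendants of a specific parent" concrete, here is a rough sketch of the check as a plain Clojure walk over a hand-built tree, not as a tree-sitter query. The node type names are meant to mirror tree-sitter-clojure (syn_quoting_lit, unquoting_lit, unquote_splicing_lit, sym_lit), but the map shape, the sample tree, and the function gensym-literals are assumptions for illustration only. This is not how Pulsar implements it.

```clojure
(require '[clojure.string :as str])

;; Node types that switch the "inside a syntax quote" flag on or off.
(def quote-openers #{:syn_quoting_lit})
(def quote-closers #{:unquoting_lit :unquote_splicing_lit})

(defn gensym-literals
  "Returns the text of every symbol node that ends in # and whose
  nearest quoting ancestor is a syntax quote rather than an unquote."
  ([node] (gensym-literals node false))
  ([{node-type :type, :keys [text children]} quoted?]
   (let [quoted? (cond
                   (quote-openers node-type) true
                   (quote-closers node-type) false
                   :else quoted?)]
     (if (= node-type :sym_lit)
       (if (and quoted? (str/ends-with? text "#")) [text] [])
       (vec (mapcat #(gensym-literals % quoted?) children))))))

;; Hand-built tree for: (x# `(let [y# 1] ~z#))
(def sample-tree
  {:type :list_lit
   :children [{:type :sym_lit :text "x#"}
              {:type :syn_quoting_lit
               :children [{:type :list_lit
                           :children [{:type :sym_lit :text "let"}
                                      {:type :vec_lit
                                       :children [{:type :sym_lit :text "y#"}
                                                  {:type :num_lit :text "1"}]}
                                      {:type :unquoting_lit
                                       :children [{:type :sym_lit :text "z#"}]}]}]}]})

(gensym-literals sample-tree) ;; => ["y#"]
```

The flag has to be carried down through arbitrarily deep nesting, which is the part a non-recursive query can only approximate.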

mauricio.szabo 2023-09-16T14:09:39.848029Z

I would think that what we're doing is a hack, the problem is that everybody else is doing the same thing, so.... 🤷

lispers-anonymous 2023-09-16T14:15:09.347549Z

With the clojure grammar, matching just about anything semantic is a hack, since the grammar only exposes basic lisp syntax. Even trying to match a function definition can be problematic if someone is writing funny code.

(defmacro foo [n args & body] ;; untested
  (println "I'm compiling!" `(defn ~n ~args ...))
  `(let [x# "i'm a macro"]
      (defn ~n ~args 
        (println x#)
        ~@body)))
How do we know whether the inner defns are actually defining functions when using the grammar? We can't really. It's all best effort.

lispers-anonymous 2023-09-16T14:15:55.732489Z

Tree-sitter wasn't really designed to work with languages that have these powerful macros. That's why the clojure grammar doesn't try to expose defn nodes. You can't really know it's a defn until you evaluate it.

➕ 1
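
A small sketch of the point about evaluation: a quoted defn is only a list, and no function exists until something evaluates it. The names quoted-form, example-fn-from-data, before-eval, and after-eval are made up for illustration, and the sketch assumes nothing else in the namespace defines example-fn-from-data.

```clojure
;; Reading a quoted defn produces a list; nothing is defined yet.
(def quoted-form '(defn example-fn-from-data [x] x))

(def before-eval (resolve 'example-fn-from-data)) ;; nil at this point

;; Only evaluation turns the same list into a definition.
(eval quoted-form)

(def after-eval (resolve 'example-fn-from-data))  ;; now a var
```

A syntax tree sees the same tokens in both cases, so it cannot tell the two situations apart.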
lispers-anonymous 2023-09-16T14:16:33.250389Z

All that said, highlighting gensym literals is a neat idea.

➕ 1
2023-09-16T15:38:23.508529Z

Yeah agreed

2023-09-15T19:34:33.428549Z

that would be very cool

2023-09-15T19:35:09.593259Z

i experimented with highlighting queries for locals for clojure.core functions, but couldn't ever get it to work properly so i gave up

lispers-anonymous 2023-09-15T23:58:24.462269Z

I've thought about a clojure grammar that understands quoting and unquoting. so like

(`(foo (bar (~baz sym))))
Might produce a tree that looks like
(list 
  (syntax_quote
    (quoted_list quoted_symbol 
      (quoted_list quoted_symbol 
        (quoted_list (unquote unquoted_symbol) quoted_symbol)))))
There would essentially be an unquoted and a quoted variant of most nodes. It would also make it possible to give nodes like ba# some kind of type like gensym_symbol. Without something like this, tree-sitter doesn't really know if you're in a quote or not, since it can't backtrack.

mauricio.szabo 2023-09-16T03:00:47.177529Z

The current tree-sitter-clojure does have this tree, right?

mauricio.szabo 2023-09-16T03:01:46.497639Z

Yep - it does:

mauricio.szabo 2023-09-16T03:01:56.123579Z

It's possible to handle these cases with some specific queries I believe

mauricio.szabo 2023-09-16T03:02:17.224129Z

(The problem is that currently, tree-sitter queries are quite limited and basically every editor is doing their own thing - Pulsar included)

mauricio.szabo 2023-09-16T03:03:18.559689Z

It doesn't have the quoted_list parent, but does have a syn_quoting_lit that is a parent of a list_lit

mauricio.szabo 2023-09-18T22:59:19.645169Z

Yup, doable :)

😍 1