2023-01-27 tree-sitter | Clojure Slack Archive

tree-sitter

2023-01-27T13:38:30.189509Z

They should, but ts clojure has no built-in semantic markings, only data. The maintainer feels that, given the extensible nature of Clojure, trying to model locals, function definitions, etc. is fruitless.

lispers-anonymous 2023-01-27T14:59:02.515889Z

Regular tree-sitter queries should work. There are examples of Emacs-flavored queries here: https://github.com/clojure-emacs/clojure-ts-mode/blob/c9f1ed357d1cc9b73dfa53ef239a846a7ef17bd2/clojure-ts-mode.el#L198

lispers-anonymous 2023-01-27T15:00:09.380579Z

Noah is correct though, the grammar only describes the raw syntax of Clojure, i.e. this thing is a list, this thing is a symbol, etc. There are no nodes for things like functions or namespace declarations.

mauricio.szabo 2023-01-27T17:45:33.273839Z

I'm still not sure then how syntax highlighting is supposed to work, honestly. I thought we need to define queries and these queries will return what needs to be highlighted... did I understand that thing wrong about tree-sitter?

2023-01-27T19:07:05.069059Z

Ehh, sort of? Tree sitter has the parser, which reads in the file (a string) and creates nodes: (def foo ::cool) is turned into something like (list_node [(sym_node (sym_name)) (kwd_node (kwd_name))])

2023-01-27T19:09:00.938099Z

That's the AST that the queries then run over. The syntax highlighting maps kwd_node to "purple", or (list_node [(sym_node (sym_name)) ...]) where sym_name is def to "green", etc.
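A toy sketch of that two-step idea (parse into nodes, then map nodes to colors), written in Python so it runs without any tree-sitter bindings. The node names `list_node`, `sym_node` and `kwd_node` follow the message above and are hypothetical, not the real tree-sitter-clojure node types; the reader only handles parentheses, symbols and keywords.

```python
import re

# Hypothetical tokenizer: parentheses are tokens, everything else is an atom.
TOKEN = re.compile(r"\s*([()]|[^\s()]+)")

def tokenize(src):
    return TOKEN.findall(src)

def read(tokens, pos=0):
    """Read one form starting at tokens[pos]; return (node, next_pos)."""
    tok = tokens[pos]
    if tok == "(":
        children, pos = [], pos + 1
        while tokens[pos] != ")":
            child, pos = read(tokens, pos)
            children.append(child)
        return ("list_node", children), pos + 1
    kind = "kwd_node" if tok.startswith(":") else "sym_node"
    return (kind, tok), pos + 1

def highlight(node, out=None):
    """Walk the tree and collect (text, color) pairs."""
    out = [] if out is None else out
    kind, value = node
    if kind == "kwd_node":
        out.append((value, "purple"))
    elif kind == "list_node":
        # A list whose first child is the symbol `def` gets its head colored green.
        if value and value[0] == ("sym_node", "def"):
            out.append(("def", "green"))
        for child in value:
            highlight(child, out)
    return out

tree, _ = read(tokenize("(def foo ::cool)"))
colors = highlight(tree)
print(tree)
print(colors)
```

The parser knows nothing about `def`; only the highlighting step gives it meaning, which is the split being described here.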

borkdude 2023-01-27T19:10:01.003849Z

maybe a tree-sitter plugin can be made or enhanced with clj-kondo since it does have this semantic information (which is also used by lsp clients for highlighting, it's called semantic tokens)

lispers-anonymous 2023-01-29T14:10:23.143209Z

For editors like Emacs that is probably overkill. We can infer enough semantic information 99% of the time to get good syntax highlighting that works better than the existing regex-based highlighting. For example, we know it is mostly safe to say something like

(list_lit (sym_lit) @def_kw
          (sym_lit) @def_name)

To match definition forms, when @def_kw matches a regex like "^def.*". The actual function name matching is more complicated, but it's not too crazy.
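A rough Python sketch of what that query plus regex predicate amounts to, assuming the parsed forms are already available as nested lists with symbols as plain strings (a simplification for illustration, not how tree-sitter exposes nodes). Unlike the query above, this sketch only looks at the first two items of each list. The function name `match_definitions` is made up.

```python
import re

DEF_RE = re.compile(r"^def")

def match_definitions(forms):
    """Return (def_kw, def_name) for lists whose first two items are
    symbols and whose first symbol matches ^def."""
    found = []
    for form in forms:
        if (isinstance(form, list) and len(form) >= 2
                and isinstance(form[0], str) and isinstance(form[1], str)
                and DEF_RE.match(form[0])):
            found.append((form[0], form[1]))
    return found

forms = [
    ["def", "answer", "42"],
    ["defn", "add", ["a", "b"], ["+", "a", "b"]],
    ["println", "answer"],
    ["default-handler", "request"],  # not a definition, but ^def still matches
]

matches = match_definitions(forms)
print(matches)
```

The last form shows why this is only "mostly safe": anything whose head happens to start with `def` gets picked up too.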

lispers-anonymous 2023-01-29T14:11:33.363639Z

Beyond syntax highlighting, navigation and indentation are really the only things to worry about. Indentation is complicated, but not overly so.

borkdude 2023-01-29T14:11:46.387259Z

ah yes, makes sense. so by matching def + defn you can also infer the "function" names maybe?

borkdude 2023-01-29T14:12:00.676579Z

and what about locals?

lispers-anonymous 2023-01-29T14:12:44.390699Z

That can also be done by writing queries to match bindings inside let type blocks.

lispers-anonymous 2023-01-29T14:14:27.846729Z

BUT all this can be thwarted by users. Imagine

(defmacro my-improved-let ...)

Similar with def. Users can always sidestep this stuff with macros. That's why we don't codify any of that in tree-sitter-clojure. We can get false negatives and false positives trying to identify semantic things in the language, because users can change the language's semantics.
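To make the locals idea and its failure mode concrete, here is a small Python sketch under the same simplifying assumption as before: forms are nested lists and symbols are strings. The set `LET_HEADS` and the function `local_names` are hypothetical, and destructuring is ignored.

```python
# Assumed set of "let type" forms the heuristic knows about.
LET_HEADS = {"let", "loop", "when-let", "if-let"}

def local_names(form, found=None):
    """Collect binding names from known let-like forms, recursively."""
    found = [] if found is None else found
    if isinstance(form, list) and form:
        head = form[0]
        if (isinstance(head, str) and head in LET_HEADS
                and len(form) > 1 and isinstance(form[1], list)):
            bindings = form[1]
            # Names sit at even positions: [name value name value ...]
            found.extend(b for b in bindings[0::2] if isinstance(b, str))
        for child in form:
            local_names(child, found)
    return found

known = ["let", ["x", "1", "y", ["inc", "x"]], ["+", "x", "y"]]
custom = ["my-improved-let", ["x", "1"], ["inc", "x"]]

print(local_names(known))
print(local_names(custom))
```

The second call finds nothing, because nothing in the syntax says that `my-improved-let` introduces bindings. That is the false negative being described.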

borkdude 2023-01-29T14:17:07.367779Z

@ericdallo Does clojure-lsp colorize locals based on clj-kondo analysis? I think so, right? This also has support for macros (when you configure it correctly).

borkdude 2023-01-29T14:17:41.841359Z

Also navigation to locals works like you would expect, e.g. even when they have the same name as a var. But I assume you can also make that work with tree-sitter.

lispers-anonymous 2023-01-29T14:18:34.807039Z

Maybe. Tree sitter lacks the context of the rest of the document though. It doesn't know what was declared as a macro somewhere else, or a binding up above.

lispers-anonymous 2023-01-29T14:19:00.983029Z

In trying to do that with tree-sitter I would end up doing a lot of the same work that clj-kondo does, and clojure-lsp

borkdude 2023-01-29T14:19:24.505159Z

I wonder what the difference in experience is with tree-sitter vs clojure-lsp for highlighting, I hope @ericdallo knows a bit about this.

ericdallo 2023-01-29T14:19:56.913439Z

Clojure-lsp does colorize locals as variables using kondo analysis

lispers-anonymous 2023-01-29T14:20:22.095129Z

Yeah, it doesn't do general highlighting for the entire document, right?

ericdallo 2023-01-29T14:20:40.254839Z

IMO clojure-lsp semantic tokens are smarter since we have more data from kondo, so we know, for example, whether a call is to a function or a macro, among other things.

ericdallo 2023-01-29T14:22:08.635929Z

https://clojure-lsp.io/features/#semantic-tokens

lispers-anonymous 2023-01-29T14:22:54.534389Z

Yeah, I don't have access to that info at all in tree-sitter. BUT the semantic tokens work very well on top of tree-sitter highlighting in Emacs. I've used them both at the same time.

ericdallo 2023-01-29T14:23:12.196609Z

That's good

lispers-anonymous 2023-01-29T14:29:57.469109Z

Tree-sitter is great at describing the syntax of a language. Clojure doesn't have much syntax, so as a result the tree-sitter grammar is extremely simple. In other languages that are not as flexible as Clojure or other Lisps, it can be good at understanding the semantics as well. But for Clojure, to understand the semantics means to understand the program, because the program can change the semantics. Tree-sitter can't do that, it's not even possible for it to backtrack a single character when parsing. Some basic semantics, like the def example above, can be considered, but it's always possible for it to be wrong. Normal clojure-mode can run into this as well. That's why more sophisticated tools like clojure-lsp are such a good supplement.

👍 1

ericdallo 2023-01-29T14:33:13.671839Z

That's my point of view as well, agreed

mauricio.szabo 2023-01-27T19:15:46.482979Z

Well, I was trying to follow this: https://tree-sitter.github.io/tree-sitter/syntax-highlighting, where it does mention something like:

> The Tree-sitter highlighting system works by annotating ranges of source code with logical “highlight names” like function.method, type.builtin, keyword, etc. In order to decide what color should be used for rendering each highlight, a theme is needed.

When I tried to use tree-sitter-clojure, none of these names like function were returned anywhere, so I'm completely lost...
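For what it's worth, the "highlight names plus theme" idea from that page can be sketched in a few lines of Python. Everything here is illustrative: the theme, the ranges, and the rule that a dotted name falls back to its shorter prefixes are assumptions of this sketch, not a description of any particular editor or of the tree-sitter CLI.

```python
# Hypothetical theme: highlight name -> color.
THEME = {"keyword": "purple", "function": "blue"}

def color_for(highlight_name, theme):
    """Resolve a dotted highlight name, falling back to shorter prefixes."""
    parts = highlight_name.split(".")
    while parts:
        key = ".".join(parts)
        if key in theme:
            return theme[key]
        parts.pop()
    return None

# (start, end, highlight_name) ranges, as a highlighter might report them.
ranges = [(1, 5, "keyword"), (6, 9, "function.method"), (10, 13, "type.builtin")]
styled = [(start, end, color_for(name, THEME)) for start, end, name in ranges]
print(styled)
```

The names only exist where something assigns them to ranges, which fits the observation that the grammar on its own returns none of them.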

mauricio.szabo 2023-01-27T04:09:11.729929Z

Folks, do queries already work with tree-sitter Clojure?

mauricio.szabo 2023-01-27T04:09:33.650419Z

I was trying to use it and I'm getting syntax errors on queries...
