This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-01-27
- # tree-sitter (29)
Folks, do queries already work with tree-sitter Clojure?
I was trying to use it and I'm getting syntax errors on queries...
They should, but tree-sitter-clojure has no built-in semantic markings, only data. The maintainer feels that, given the extensible nature of Clojure, trying to model locals, function definitions, etc. is fruitless.
Regular tree-sitter queries should work. There are examples of Emacs-flavored queries here: https://github.com/clojure-emacs/clojure-ts-mode/blob/c9f1ed357d1cc9b73dfa53ef239a846a7ef17bd2/clojure-ts-mode.el#L198
Noah is correct though, the grammar only describes the raw syntax of Clojure. Aka, this thing is a list, this thing is a symbol, etc. There are no nodes for things like functions or namespace declarations.
I'm still not sure then how syntax highlighting is supposed to work, honestly. I thought we need to define queries and these queries will return what needs to be highlighted... did I understand that thing wrong about tree-sitter?
Ehh, sort of? Tree-sitter has the parser, which reads in the file (a string) and creates nodes: (def foo ::cool) is turned into something like (list_node [(sym_node (sym_name)) (kwd_node (kwd_name))]). That's the AST that the queries then run over. The syntax highlighting maps kwd_node to "purple", or (list_node [(sym_node (sym_name)) ...]) where sym_name is def to green, etc.
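A minimal sketch of the idea in the message above, assuming a hand-built tree instead of a real parser. The node names (list_node, sym_node, kwd_node) follow the illustrative names used in the message, not the actual tree-sitter-clojure grammar, and THEME and highlight are hypothetical names made up for this example.

```python
# Hand-built stand-in for the tree a parser might produce for: (def foo ::cool)
# Node type names are the illustrative ones from the chat, not a real grammar.
tree = ("list_node", [
    ("sym_node", "def"),
    ("sym_node", "foo"),
    ("kwd_node", "::cool"),
])

# Hypothetical theme: highlight name -> color.
THEME = {"keyword": "purple", "definition": "green"}


def highlight(node):
    """Return (text, color) pairs for the leaves of a node, in source order."""
    kind, payload = node
    if kind == "kwd_node":
        return [(payload, THEME["keyword"])]
    if kind == "sym_node":
        return [(payload, None)]  # plain symbols get no color by default
    # list_node: color the head symbol when it is exactly `def`
    out = []
    for i, child in enumerate(payload):
        pairs = highlight(child)
        if i == 0 and child == ("sym_node", "def"):
            pairs = [(child[1], THEME["definition"])]
        out.extend(pairs)
    return out


print(highlight(tree))
```

The point is only that the coloring decisions come from patterns over node types, which is the role queries play on top of the real parser.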
maybe a tree-sitter plugin can be made or enhanced with clj-kondo since it does have this semantic information (which is also used by lsp clients for highlighting, it's called semantic tokens)
For editors like Emacs that is probably overkill. We can infer enough semantic information 99% of the time to get good syntax highlighting that works better than the existing regex-based highlighting. For example, we know it is mostly safe to say something like
(list_lit (sym_lit) @def_kw
(sym_lit) @def_name)
to match definition forms, when @def_kw matches a regex like "^def.*"
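The heuristic described above can be sketched without tree-sitter at all. The following is a toy illustration, assuming a tiny hand-written s-expression reader in place of the real parser; read, match_defs, and DEF_RE are hypothetical names, and every atom is treated as a symbol for simplicity. The default-port form is an invented input added to show how a pure name-based match can produce a false positive.

```python
import re


def read(src):
    """Tiny s-expression reader: returns nested lists of atom strings.
    Handles only parentheses and whitespace-separated atoms (an assumption
    made for brevity; real Clojure syntax is richer)."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def form():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(form())
            pos += 1  # skip the closing ")"
            return items
        return tok

    forms = []
    while pos < len(tokens):
        forms.append(form())
    return forms


DEF_RE = re.compile(r"^def.*")


def match_defs(forms):
    """Collect (def_kw, def_name) pairs from lists whose first two children
    are atoms and whose head matches DEF_RE, searching nested lists too."""
    found = []
    for f in forms:
        if isinstance(f, list):
            if (len(f) >= 2 and isinstance(f[0], str)
                    and isinstance(f[1], str) and DEF_RE.match(f[0])):
                found.append((f[0], f[1]))
            found.extend(match_defs(f))
    return found


src = "(def foo 1) (defn bar [x] x) (println foo) (default-port config)"
print(match_defs(read(src)))
```

Here def and defn are captured as intended, but default-port is captured as well, because the match looks only at the shape of the form and the spelling of its head symbol.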
The actual function name matching is more complicated, but it's not too crazy.
Beyond syntax highlighting, navigation and indentation are really the only things to worry about. Indentation is complicated but not too much.
ah yes, makes sense. so by matching def + defn you can also infer the "function" names maybe?
That can also be done by writing queries to match bindings inside let-type blocks.
BUT all this can be thwarted by users. Imagine
(defmacro my-improved-let ...)
Similar with def
Users can always sidestep this stuff with macros. That's why we don't codify any of that in tree-sitter-clojure. We can get false negatives and false positives trying to identify semantic things in the language because users can change the language's semantics.
@UKFSJSM38 Does clojure-lsp colorize locals based on clj-kondo analysis? I think so, right? This also has support for macros (when you configure it correctly).
Also navigation to locals works like you would expect. Also e.g. when they have the same name as a var. But I assume you can also make that work with tree-sitter
Maybe. Tree sitter lacks the context of the rest of the document though. It doesn't know what was declared as a macro somewhere else, or a binding up above.
In trying to do that with tree-sitter I would end up doing a lot of the same work that clj-kondo does, and clojure-lsp
I wonder what the difference in experience is with tree-sitter vs clojure-lsp for highlighting, I hope @UKFSJSM38 knows a bit about this.
Yeah, it doesn't do general highlighting for the entire document right?
IMO clojure-lsp semantic tokens are smarter since we have more data from kondo, so we know if a call is from a function or macro for example and other things
Yeah, I don't have access to that info at all in tree-sitter. BUT the semantic tokens work very well on top of tree-sitter highlighting in Emacs. I've used them both at the same time.
Tree-sitter is great at describing the syntax of a language. Clojure doesn't have much syntax, so as a result the tree-sitter grammar is extremely simple. In other languages that are not as flexible as Clojure or other Lisps, it can also be good at understanding the semantics. But for Clojure, to understand the semantics means to understand that program, because the program can change the semantics. Tree-sitter can't do that; it's not even possible for it to backtrack a single character when parsing. Some basic semantics, like the def example above, can be considered, but it's always possible for it to be wrong. Normal clojure-mode can run into this as well. That's why more sophisticated tools like clojure-lsp are such a good supplement.
Well, I was trying to follow this: https://tree-sitter.github.io/tree-sitter/syntax-highlighting, where it does mention something like:
> The Tree-sitter highlighting system works by annotating ranges of source code with logical “highlight names” like function.method, type.builtin, keyword, etc. In order to decide what color should be used for rendering each highlight, a theme is needed.
When I tried to use tree-sitter-clojure, none of these names like function were returned anywhere, so I'm completely lost...