#lsp
2022-04-09
andrewzhurov09:04:58

I've taken a look at how clean-ns works, and it seems that the data model of the ns being cleaned is managed with https://github.com/clojure-lsp/clojure-lsp/blob/87ec9773b63e884304499fa592c0c8c858a79490/lib/src/clojure_lsp/feature/clean_ns.clj#L352, and formatting is part of that data model (it includes whitespace). My original thought was that you'd have a semantic representation of the namespace you work on, and when you want to serialize it as text, you'd put clj-fmt to use. So I was surprised to see that there is no semantic representation of the ns, only the ns as its data DSL + formatting. This DSL was meant for humans to write, and it shows: https://github.com/clojure-lsp/clojure-lsp/blob/87ec9773b63e884304499fa592c0c8c858a79490/lib/src/clojure_lsp/feature/clean_ns.clj#L176 looks scary. 😄 I can only imagine how much trouble you had with it, guys. E.g., https://github.com/clojure-lsp/clojure-lsp/blob/87ec9773b63e884304499fa592c0c8c858a79490/lib/src/clojure_lsp/feature/clean_ns.clj#L157. Why is it done this way?

andrewzhurov09:04:40

I guess one benefit of having a zipper on ns-dsl+formatting is performance, as it seems to be faster than:
ns-dsl->ns-semantics
<cleaning ns-semantics>
ns-semantics->ns-dsl
ns-dsl->formatted-text
And that is given that clj-fmt can work on data, allowing for a straight ns-dsl->formatted-text. Otherwise it would be:
ns-dsl-(pr-str)>text
text-(clj-fmt)>formatted-text

andrewzhurov13:04:08

it seems to be possible to use clj-fmt for ns-dsl->formatted-text, as the ns-dsl used by clojure-lsp is in rewrite-clj's data model and clj-fmt works on the exact same data model, as seen at https://github.com/weavejester/cljfmt/blob/da37b7aef08be99dbb8e2e8772d95308150d7adb/cljfmt/src/cljfmt/core.cljc#L460

ericdallo14:04:55

ATM we use cljfmt only for the format feature. I know cljfmt has some performance issues too, so if we manage to do clean-ns without it, sounds good. The tradeoff is not that huge, I think

ericdallo14:04:45

when one uses clean-ns via the API, performance is important, and making clean-ns slower by using cljfmt would not be a good idea, but we may need to test the performance tradeoff

snoe19:04:43

I think the main problem is that the ns macro is just an incoherent, inconsistent grab bag of features, some of which haven't been used in a decade, others that are getting added with every release. Whether you go to an intermediate step or not, the biggest challenge is analyzing it. With lsp though, whitespace and comments matter a lot too. And, in general, people want what they have written, with an unused require removed. Your intermediate step has to capture all of that so it can be put back. Ultimately, just modifying what's there seemed much more straightforward than inventing a dsl that rewrite already gave us.

snoe19:04:55

semantically to clojure, there's no difference between (:require a.b.c), (:require [a.b.c]) and (:require [a.b [c]]), but within a project they may have semantics that we should maintain
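
A minimal sketch of that equivalence, using only clojure.core. The helper names libspec? and required-namespaces are hypothetical (they are not part of clojure-lsp or clj-kondo), and the expansion rules are a simplified reading of how require treats libspecs and prefix lists:

```clojure
(defn libspec?
  "True for a bare symbol, or a vector such as [a.b.c] or [a.b.c :as x]."
  [x]
  (or (symbol? x)
      (and (vector? x)
           (or (nil? (second x))
               (keyword? (second x))))))

(defn required-namespaces
  "Expands the arguments of one (:require ...) clause into a set of
  namespace symbols. Anything that is not a libspec is treated as a
  prefix list: a prefix followed by libspecs."
  [args]
  (set
   (mapcat
    (fn [arg]
      (cond
        (keyword? arg) []                                  ; flags such as :reload
        (libspec? arg) [(if (symbol? arg) arg (first arg))]
        :else (let [[prefix & specs] arg]
                (for [spec specs]
                  (symbol (str prefix "."
                               (if (symbol? spec) spec (first spec))))))))
    args)))

;; The three spellings from the message above:
(def as-symbol (required-namespaces '[a.b.c]))      ; (:require a.b.c)
(def as-vector (required-namespaces '[[a.b.c]]))    ; (:require [a.b.c])
(def as-prefix (required-namespaces '[[a.b [c]]]))  ; (:require [a.b [c]])
```

All three calls produce the same set, which is exactly why a purely semantic representation cannot tell them apart, while the written form may still carry a project convention worth keeping.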

jacob.maine00:04:41

@U0ZQT0K2N clojure-lsp uses two tools, for two purposes:
• rewrite-clj, a zipper of the code structure, which maintains whitespace and comments
• clj-kondo, which outputs semantic analysis (and also points out semantic and syntactic errors, a.k.a. linting)
When you need to change the structure of someone’s code, you must preserve whitespace and comments. They might not mean anything to the compiler, but they are crucial to humans who need to understand the code. rewrite-clj is an important tool for that.
When you need to show or change the meaning of someone’s code, you need some sort of semantic analysis. clj-kondo provides that, in the form of, among other things, how symbols relate to each other, and how function calls work. (There’s a lot more to semantic analysis, especially around types, which clj-kondo can do to a limited extent.)
So, when you’re reading clojure-lsp, think about both tools. Many features use only one, but some use both. clean-ns is a good example. Looking at that code, it appears that it’s mostly a LOT of manual manipulation of the structure of the code, ignoring its semantics. But I think you’re focusing on the fact that it takes some real work to re-structure code in a whitespace-and-comment preserving way, especially when that structure is particularly baroque, as with ns.
But, all that zipper manipulation obscures the fact that clojure-lsp does need the semantics of a namespace to clean it. It has to know which requires are unused or are duplicates, among other things. To get that data, it queries the clj-kondo analysis. See https://github.com/clojure-lsp/clojure-lsp/blob/1d6e554ec0afd3a04c113e513f6b6724bbe0f709/lib/src/clojure_lsp/feature/clean_ns.clj#L359-L362. In that sense, I think it’s wrong to say that “there’s no semantical representation of ns”.
You’ve posted in several different channels about this. I haven’t read all your comments (you’re a bit wordy, a problem I have myself, as you can see 🙂) so I’m not entirely sure I understand what you’re looking for. If you’re strictly interested in semantic analysis, I suggest looking at clj-kondo again. It’s advertised as a linter, but at its core, it’s a semantic analysis tool, with a linter attached to it. If you’re interested in manipulating code as text, clojure-lsp has a lot of lessons.

❤️ 1
ericdallo02:04:33

Thanks for the great explanation @U07M2C8TT :)

andrewzhurov08:04:37

> semantically to clojure, there's no difference between (:require a.b.c), (:require [a.b.c]) and (:require [a.b [c]]), but within a project they may have semantics that we should maintain
@U0BUV7XSA so the code semantics are the same; these are different ways to present/format/view the semantics. Ideally, how to view semantics should be up to the end user, but currently we need to maintain one view of program semantics, so to make it easier for people to follow the styleguide, something like clj-fmt is put to use. If we're going down that road, I'd expect clj-fmt to take care of presenting ns semantics as well, formatting them uniformly as a.b.c or [a.b.c] or whatever else happens to be the project's styleguide

andrewzhurov08:04:12

> than inventing a dsl that rewrite already gave us
ah, I thought that a semantic representation of an ns could be derived from rewrite-clj's representation of the ns (that's what I referred to as ns-dsl). We would have:

ns-dsl-text -(rewrite-clj)> ns-dsl-data -> ns-semantics -> cleaned-ns-semantics -> ns-dsl-data -(clj-fmt)> ns-dsl-text
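
A rough sketch of a data-level round trip like the one above, under loud assumptions: it uses clojure.core's read-string and pr-str in place of rewrite-clj and clj-fmt, the helper clean-ns-form is hypothetical, and the set of unused namespaces is simply passed in (in clojure-lsp that information comes from the clj-kondo analysis). It is only meant to show what is lost when the ns form goes through plain data:

```clojure
(def ns-text
  "(ns my.app\n  (:require [a.b :as b] ; used for parsing\n            [c.d :as d]))")

(defn clean-ns-form
  "Removes every libspec whose namespace is in `unused` from the
  :require clauses of an ns form that has been read as plain data."
  [ns-form unused]
  (map (fn [clause]
         (if (and (seq? clause) (= :require (first clause)))
           (cons :require
                 (remove #(contains? unused (if (vector? %) (first %) %))
                         (rest clause)))
           clause))
       ns-form))

;; text -> data -> cleaned data -> text
(def cleaned
  (pr-str (clean-ns-form (read-string ns-text) #{'c.d})))
;; => "(ns my.app (:require [a.b :as b]))"
```

The unused require is gone, but so are the line break and the "; used for parsing" comment, because the reader discards both. A formatter could restore some layout afterwards, but not the comment.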

andrewzhurov08:04:33

> When you need to change the structure of someone’s code, you must preserve whitespace and comments. They might not mean anything to the compiler, but they are crucial to humans who need to understand the code. rewrite-clj is an important tool for that.
Ah, indeed. I took it as given that a project uses clj-fmt, but it may not, and in that case preserving its way of styling is needed.

andrewzhurov08:04:19

> clean-ns is a good example. Looking at that code, it appears that it’s mostly a LOT of manual manipulation of the structure of the code, ignoring its semantics. But I think you’re focusing on the fact that it takes some real work to re-structure code in a whitespace-and-comment preserving way, especially when that structure is particularly baroque, as with ns.
yup, using a zipper to clean the ns is what caught my eye. I imagine it took a huge amount of effort and is an impressive feat to pull off, but at the same time I was puzzled why it was decided to work on that representation of the ns rather than deriving a more semantic one. I have learned of two good reasons:
• the need to preserve structure (formatting), as clj-fmt may not be used in a project
• zippers are fast, clj-fmt may be too slow

andrewzhurov08:04:45

> But, all that zipper manipulation obscures the fact that clojure-lsp does need the semantics of a namespace to clean it. It has to know which requires are unused or are duplicates, among other things. To get that data, it queries the clj-kondo analysis. See https://github.com/clojure-lsp/clojure-lsp/blob/1d6e554ec0afd3a04c113e513f6b6724bbe0f709/lib/src/clojure_lsp/feature/clean_ns.clj#L359-L362. In that sense, I think it’s wrong to say that “there’s no semantical representation of ns”.
Ah, indeed, there is a semantic representation of the ns! Thanks for pointing that out. 🙂

andrewzhurov08:04:18

> You’ve posted in several different channels about this. I haven’t read all your comments (you’re a bit wordy, a problem I have myself, as you can see 🙂) so I’m not entirely sure I understand what you’re looking for. If you’re strictly interested in semantic analysis, I suggest looking at clj-kondo again. It’s advertised as a linter, but at its core, it’s a semantic analysis tool, with a linter attached to it. If you’re interested in manipulating code as text, clojure-lsp has a lot of lessons.
I'm interested in having a Clojure program as an immutable data structure with names as user-level entities, and for that I'm taking a look at existing tools that could be leveraged. Semantic analysis tools would be of use for that. Seeing how clj-kondo is grounded in linting, I got the impression that it does not derive a semantic representation, but that impression was wrong. As you point out, it does have a semantic representation and grounds it to text for linting. I will take a look at clj-kondo next, thanks for the mention. 🙂

jacob.maine01:04:44

@U0ZQT0K2N in that case, you might be interested in https://blog.datomic.com/2012/10/codeq.html, a project that Rich Hickey was working on way back in 2012. He might still be working on this field in the background, but that repository hasn’t been updated in a long time.

jacob.maine01:04:25

He discussed some similar topics in a talk called Spec-ulation https://www.youtube.com/watch?v=oyLBGkS5ICk

andrewzhurov07:04:43

I find codeq interesting for its git integration, but it doesn't seem to go into deriving semantics much: it analyzes a program down to definitions and not further, keeping the code as text. There is also https://github.com/quoll/cast, which does a good job at parsing semantics (it stores them in Datomic)

ericdallo18:04:20

clojure-lsp: We reached 800 GitHub stars! Thank you everyone!

🎉 10
🌟 9