
So I have basic completion in Emacs going using clj-kondo's analysis output


But now I'm realizing, for completing the vars from the "require" clause, I'll need a better notion of a project. Because right now I'm just analyzing the buffer, and that gives me the vars in the current buffer, but not the required ones, hum...


@didibus I'm curious, is this an exercise to see what can be done without a REPL running?


Eventually I'd like to complement CIDER's nREPL completion with clj-kondo, since they don't always overlap. For example, CIDER will not complete a var until it is loaded in the REPL (or it'll fall back to text completion and just list out all the words in the file)


I think it would be nice to have completion, jump-to-definition, and find-usages without nREPL as well, as a fallback if you don't have nREPL set up.


I guess I can use clj-kondo's own notion of projects, or I could use projectile.


Well, I'm always happy to see tooling that doesn't rely on the 800lb gorilla (CIDER/nREPL) so this is good work!


haha, ya, while I do love all the power features of CIDER/nREPL, sometimes I'd be happy just using inf-clojure, especially if I have clj-kondo for linting, completion, and basic jumps. Useful when I remote-connect to a REPL that doesn't have nREPL set up, for example.


It's a huge plus point for me to have tooling that works exactly the same locally and remotely without any additional server-side dependencies (and without any code injection).


Why not code injection? Unrepl style?


@didibus when I connect my editor to a Socket REPL running in a production process (or even a QA process), I want it to interfere as little as possible. When we first got started with Clojure, we used to run nREPL servers on QA and production processes with a fair bit of middleware, but it really felt like a "smell" to have all of that as non-development dependencies. That's why we switched to Socket REPLs, and for a while we put up with unrepl injection (of compliment), but now, with the way Atom/Chlorine works, it no longer needs compliment for completion, and it is slowly moving away from even the unrepl level of injection, which is a good thing as far as we're concerned.


Hum, ya that's a fair point for prod REPLs


But I've looked at some of the Chlorine code, and it seems to all be handled through code injection. How is it going to move away from it without dropping features?


using clj-kondo, alc.index-defs does an initial indexing to produce either tags or TAGS -- these can be used by emacs or whatever other editor has support for them. it works across dependencies as well, including handling the namespace prefixes on symbols. so you get jump-to-definition without a repl running, and so far it has been pretty accurate. iiuc, in emacs, TAGS info can also be used toward completion. the main downside is that indexing across dependencies takes a while initially. i've used the results in atom, emacs, vim, and vscode.

👍 8

Ya, but I'm not a big fan of TAGS. They're a bit annoying to use. I wanted something more integrated.


ah, would you mind sharing specifically how you find them to be annoying to use?


It's two separate processes: you have to first run the command to analyse and generate the tags, and then tell Emacs where to find them. And you kind of have to regenerate them every now and then as you make more changes to the code.


yes, that is true. do you have any better ideas for handling the indexing across all of the dependencies?


i think that the info in the dependencies doesn't change that often. i don't know what's typically the number of files one edits, but it's probably not too many? so a hybrid approach might work out where there is a base-indexing (plus updating when dependencies are updated). for individual file analysis, it may be that clj-kondo is fast enough. also, i wonder if instantaneously updating is really necessary.
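The staleness check for such a hybrid approach could be sketched roughly like this (Python for illustration only; the function name and the shape of the index are my own invention, not anything clj-kondo provides):

```python
import os

def stale_paths(index, paths):
    """Return the paths that need re-analysis: files we have never indexed,
    or whose on-disk mtime is newer than the mtime recorded at index time.
    `index` maps path -> (mtime_at_index_time, analysis_data)."""
    needs_work = []
    for path in paths:
        entry = index.get(path)
        if entry is None or os.path.getmtime(path) > entry[0]:
            needs_work.append(path)
    return needs_work
```

Everything else would be served from the base index, so only the handful of recently edited files pay the re-analysis cost.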


(btw, i think for the use-case of just reading a code base (especially someone else's), i don't think the initial indexing is too much of a price to pay)


Hum, well I am just starting to think about it


Ya, for reading and navigating TAGS are nice


Right now I call clj-kondo to analyse the current buffer on every completion-at-point invocation.


Seems pretty fast; for single files, clj-kondo takes less than 100ms to complete. But I'm guessing it might take longer when run on a whole project


nice! may i ask what typical buffer sizes you have tried with? what i try to benchmark against for editor-type stuff is clojure.core.


when i feel like feeling pain i go for clojurescript's core 🙂


Haven't tried clojure.core yet; I thought I should, since that namespace is huge, probably at the upper bound of the sizes you'll find in the wild


clojure.core is probably not typical for most projects


yes, i have over 100,000 clojure files here and clojure.core is definitely not typical 🙂


How is clj-kondo handling the cache?


Does it skip files that haven't been touched since the last time they were linted/analyzed?


@didibus Your editor drives how clj-kondo is invoked. clj-kondo just does what it is told to by the editor


it doesn't check whether a file is newer than when it was last linted or anything like that; it just lints the file it is told to lint


What about when told to lint a directory? Like how it can find the root with .clj-kondo to identify the project and lint the entire project


@didibus To lint an entire project including dependencies you typically invoke it like:

PROJECT_ROOT $ clj-kondo --lint $(clojure -Spath)


clj-kondo looks from the current working dir upwards to find a .clj-kondo directory
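That upward search can be sketched like this (a Python illustration of the general technique; the function name is mine, and clj-kondo's actual implementation may differ):

```python
from pathlib import Path

def find_project_root(start, marker=".clj-kondo"):
    """Walk upwards from `start` until a directory containing `marker`
    is found. Returns that directory, or None at the filesystem root."""
    current = Path(start).resolve()
    for candidate in [current, *current.parents]:
        if (candidate / marker).is_dir():
            return candidate
    return None
```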


So when it lints multiple files at a time like that, like say given a directory or using a full classpath, what is the cache used for?


@didibus the cache is more or less a database with information which is used for linting, like arities from library functions


But it should in theory make re-linting a full project a second time faster correct? Or does it serve a different purpose?


no, it's not used like that. it's only used to hold on to information that it's seen before


Hum, I'm not sure I understand that part. If it's afterwards looking for that info again, it would first find it in the cache and so wouldn't need to recompute it? Is that right? So wouldn't that make it faster?


say you are linting the library cheshire. it finds cheshire.core and now it saves the fact that cheshire.core/generate-string takes x arguments to the cache. when you are using this function from your editor and call it with n != x args, you will get a lint warning, because clj-kondo was able to get that information from the cache. the cache is not used to prevent files from being linted again.
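The idea of the cache as an arity "db" can be illustrated with a toy sketch (Python; the real cache is transit files and an implementation detail, so this only shows the concept, not clj-kondo's format):

```python
# Maps a fully qualified var name to the set of arities it accepts.
arity_db = {}

def record_defn(qualified_name, arities):
    """Record known arities for a var, as when linting its defining library."""
    arity_db[qualified_name] = set(arities)

def check_call(qualified_name, n_args):
    """Return a warning if the call site uses an unknown arity, else None.
    Vars we have never seen produce no warning."""
    known = arity_db.get(qualified_name)
    if known is not None and n_args not in known:
        return f"{qualified_name} called with {n_args} args, expects {sorted(known)}"
    return None
```

So a later lint of your own code can flag a bad call to cheshire.core/generate-string without re-linting cheshire itself.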


a better name is really "db"


Ah ok. Ya, so I understand that. Let me put my question a different way. If I generate analyses for a full project by doing clj-kondo --lint $(clojure -Spath) --config '{:analyses true}' (I know I got the config command wrong, but you get the point), and after doing that I have clj-kondo analyse one single file from that same project, would that analysis now be faster?


the analysis itself doesn't go any faster, it just lints that single file


it also doesn't return the analysis information for anything but that single file


Ok ya, I see. But it seems the cache might have all that's needed for the required namespaces' "analyses"? How hard do you think it would be to extend the analysis feature to support, say, an additional argument that also includes in its output the var-definitions of the required namespaces?


@didibus I think clj-kondo should be fairly agnostic about what you do with the analysis information. For your purposes it might be better to process the analysis output, extract only the information you need and save it elsewhere (possibly also in the .cache directory). Then make incremental updates to that using the analysis output of the single file.


Processing the analysis output of the entire project on every completion will be more expensive than pre-processing it once.


So after analysing a file, the current cache does not already contain its list of var-defs?


I guess I'm just trying to see if I'm just going to recreate a cache for what clj-kondo already caches.


@didibus clj-kondo does not use the analysis export internally, the information in the cache is slightly different, so it cannot provide you the analysis export from other namespaces, even if they are in the cache.


also the information from the cache itself is an implementation detail one should not rely on


Ya I guess that's a fair point


(i.e. one should not reverse engineer the transit files and expect that to keep working over versions)


So what you said above was my idea. I'd analyse the whole project, probably when the minor mode is initialized in a buffer, and maybe also afterwards when the deps.edn, project.clj, or similar file changes. I'd cache, for each file, the analysis data as well as the file's last-modified timestamp. Then on completion I'll ask clj-kondo to analyse the current buffer, update the cache for the current file with its result, and grab the var-definitions of that file and all its required files from the cache, merging them into the completion list I give Emacs
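The merge step at completion time might look something like this (a Python sketch; the key names loosely mirror clj-kondo's analysis output, :var-definitions and :namespace-usages, but the cache layout here is entirely made up for illustration):

```python
def completion_candidates(project_cache, buffer_analysis):
    """project_cache: namespace name -> list of var-definition maps,
    built from a prior whole-project analysis run.
    buffer_analysis: analysis of the current buffer, with
    "var-definitions" and "namespace-usages" entries.
    Returns the merged completion candidates for the buffer."""
    candidates = [v["name"] for v in buffer_analysis["var-definitions"]]
    for usage in buffer_analysis["namespace-usages"]:
        ns = usage["to"]
        for v in project_cache.get(ns, []):
            candidates.append(f"{ns}/{v['name']}")
    return candidates
```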


Yep. Where cache = the thing specialized for your purposes.


btw, typical times for analyzing the whole project range in the minutes here


that's including dependencies


Ya, my own cache, not clj-kondo's


sorry, that's including the post-analysis phase


@sogaiu I would say that's fairly untypical too, clj-kondo can lint itself including deps in 10s


you're right -- i'm misremembering. though the total times i see can exceed 20s if including the post-analysis work. the point i'm trying to make is that this is not a great duration from the perspective of editor use. one might want to parallelize or work in the background.


yeah, see the pmap branch


that is about what i get for clj-kondo, yes


Maybe I should analyse things as they are encountered in the buffer instead


clj-kondo is not that large


as awesome as it is 🙂


@didibus Yes, analyze per buffer, but an initial lint of dependencies can be on demand


Like when you open a file in the editor: on opening the file I'd analyse it, and then analyse the files from its requires.


In an async process and cache that


Then all completion is looked up in the cache


That should amortize the amount of time the user is ever waiting


Can clj-kondo work in parallel? Or would it trip up?


If say I use 4 clj-kondo process at a time linting different files?
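One way an editor could drive several lint jobs at once is a small worker pool (a Python sketch; lint_file is a stub standing in for shelling out to a clj-kondo process, and whether concurrent writes to clj-kondo's cache are safe is a question for clj-kondo itself):

```python
from concurrent.futures import ThreadPoolExecutor

def lint_file(path):
    # Stand-in for something like:
    #   subprocess.run(["clj-kondo", "--lint", path], capture_output=True)
    # Returns (path, findings); the stub reports no findings.
    return (path, [])

def lint_all(paths, workers=4):
    """Fan out over the files with a fixed-size pool. Threads suffice here
    because the heavy lifting would happen in external clj-kondo processes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lint_file, paths))
```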


Anyways, I think one or more combination of these ideas would probably work well enough. I'll see how far I can get on that in the next few weeks

👍 4

btw, here's some distribution of file sizes across clojars-fetched clojure source:

  1k:  73042
  2k:  22604
  4k:  16474
  8k:   7887
 16k:   3027
 32k:   1014
 64k:    331
128k:    108
256k:     21
512k:      6
  1M:      2
  4M:      3


so the majority of things seem to be under 1k


That's good


@didibus I have one issue in the clj-kondo issue tracker for linting in parallel. This work is in the pmap branch. It seems to work (using pmap), but there is one strange issue when I pmap over the sources of a jar file. Linting a directory does work. I still have to get to the bottom of this