cljdoc

Casey 2025-07-25T15:38:28.376539Z

👋 hi folks, I've been working on a tool to convert the cljdoc download bundles to offline docsets that can be loaded by Zeal or Dash. I know Dash users already have the custom cljdoc integration, but those of us not on macOS who have to use Zeal have been left out. I want to check whether my generated docsets are up to snuff by comparing them with the ones Dash generates. Could a macOS Dash user kindly share one of the generated docset bundles with me?

Casey 2025-07-29T07:46:59.504239Z

Thanks, this was really helpful!

🚀 1
Casey 2025-07-29T15:39:58.543159Z

OK, I've published the repo here: https://github.com/Ramblurr/cljdocset It's a babashka CLI tool, and I think the docs it generates are on par with the auto-Dash docs, and even better in some ways. Would love some feedback if people take it for a spin. Ideas for improvements:
1. Add a web server and host Dash/Zeal feeds so no one needs to install this tool directly (they can then add the feed URL to Dash/Zeal)
2. Explore how to integrate into http://cljdoc.org
@lee If you want to give the code a gander and let me know if you see a way forward with this, I'll be happy to talk about what to do next.

🙏 1
🚀 1
🎉 1
lread 2025-07-25T16:57:29.842269Z

Heya Casey! Are you planning on adding support to cljdoc? There's an issue for that: https://github.com/cljdoc/cljdoc/issues/646

lread 2025-07-25T16:58:45.063109Z

I'm not a Dash user, but do have a license for cljdoc support reasons. If no other Dash enthusiast chimes in, I'll try to help you out.

Casey 2025-07-25T17:44:33.204899Z

Yup, I found both during my research. Currently I am implementing option 1 from that issue. I'm using hickory to parse the HTML and feed the vars into the SQLite database. It's working well! I wouldn't object to it being integrated into cljdoc itself, but to start I wanted something standalone just to prove it could work.
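
For readers unfamiliar with the docset format, here is a minimal Python sketch of the SQLite step, assuming the standard Dash docset layout (a `searchIndex` table stored in `Contents/Resources/docSet.dsidx`). It is not the cljdocset implementation, which is written in Clojure; the entry names and paths are hypothetical.

```python
import sqlite3

# Hypothetical entries: (name, type, path relative to the docset's Documents dir).
# In a real tool these would come from the parsed HTML.
ENTRIES = [
    ("rewrite-clj.zip", "Namespace", "api/rewrite-clj.zip.html"),
    ("rewrite-clj.zip/of-string", "Function", "api/rewrite-clj.zip.html#of-string"),
]


def build_index(db_path, entries):
    """Create the Dash-style search index and return the number of rows in it."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS searchIndex("
                "id INTEGER PRIMARY KEY, name TEXT, type TEXT, path TEXT)"
            )
            # The unique index lets INSERT OR IGNORE skip duplicate entries.
            conn.execute(
                "CREATE UNIQUE INDEX IF NOT EXISTS anchor "
                "ON searchIndex (name, type, path)"
            )
            conn.executemany(
                "INSERT OR IGNORE INTO searchIndex(name, type, path) "
                "VALUES (?, ?, ?)",
                entries,
            )
        return conn.execute("SELECT COUNT(*) FROM searchIndex").fetchone()[0]
    finally:
        conn.close()
```

Calling `build_index("docSet.dsidx", ENTRIES)` writes the index file that Zeal and Dash read when searching.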

👍 1
Casey 2025-07-25T17:47:16.641969Z

The Dash integration you mention is something that's only in the macOS app, and I suppose it is doing something similar to what my code is doing. On Linux we have Zeal, which can only consume prebaked docsets.

2025-07-25T17:56:02.186039Z

❤️ 2
2025-07-25T17:57:14.156329Z

Let me know if you want/need any more

Casey 2025-07-25T17:58:39.866669Z

Could you please share metosin/reitit and weavejester/hiccup?

👍 1
2025-07-25T18:04:23.587219Z

2025-07-25T18:04:46.740249Z

Casey 2025-08-17T06:26:29.955389Z

@lee I've been using this heavily the past weeks and it is very useful! I think it would be great to integrate this into cljdoc. We could also provide XML feeds so users get auto updates of the docs. How do you suggest we proceed? I am happy to help maintain/own the code in cljdoc.

lread 2025-08-17T11:18:01.879699Z

That’s great Casey, I’ll take a peek at your code.

lread 2025-08-17T11:20:09.603319Z

And I’ll try out your tool with Zeal.

lread 2025-08-17T11:21:26.095299Z

If we integrate, would this be interesting to the author of Dash too?

lread 2025-08-17T13:10:51.019069Z

FYI: The bb example from the README did not work for me, but I can run via clojure like so:

clojure -M -m cljdocset.cli build --output-dir ./out rewrite-clj/rewrite-clj

lread 2025-08-17T13:14:26.971119Z

Ah interesting. You also download all images so docset is more truly offline.

lread 2025-08-17T13:25:52.905169Z

FYI: medley is in the deps.edn but seems unused.

lread 2025-08-17T13:39:25.814619Z

FYI: I installed Zeal via flatpak. The docset directory was different from what your README stated. I discovered the correct docset dir for my Zeal via Zeal->Edit->Preferences->Docset storage. For me it was: ~/.var/app/org.zealdocs.Zeal/data/Zeal/Zeal/docsets

lread 2025-08-17T13:41:09.705809Z

I expect this is the same for Dash (only used ages ago to test) but I don't find the separation of the API into "Functions", "Macros" and "Variables" terribly useful for Clojure libs.

lread 2025-08-17T13:43:31.745399Z

Also probably same for Dash, but "Sections" section seems a bit weird, don't you think? It has headings from Guides with no context.

lread 2025-08-17T14:52:56.630509Z

The Zeal docs say I can search within a single docset via <docsetname>:<search string>. What determines <docsetname>? I can't seem to get it to work with rewrite-clj, for example.

lread 2025-08-17T14:58:04.618669Z

Thinking about Dash again... Zeal and Dash seem to have a symbiotic relationship. Dash provided Zeal a bunch of docsets, and in return, Zeal promotes Dash as the solution on macOS. I don't think integrating this work into cljdoc would hurt Dash at all, but I'll probably ping the Dash author out of courtesy. I'm guessing he'd prefer your docsets because he would have to do no processing on them.

lread 2025-08-17T15:05:13.619669Z

So, this all looks very interesting for integration. But what problems might we have?
1. Will the processing time and resources be an issue? We regenerate the offline zip download for each request (no caching). This is pretty heavy but I haven't seen an issue with it yet. Probably due to light demand? Your docset processing is heavier. I wonder if that could cause an issue.
2. What about bundling the images? This is great in that the docset is now truly offline. But the download is bigger. And do we potentially get into any trouble for redistributing images that might be copyrighted?

Casey 2025-08-17T16:22:53.364689Z

> Ah interesting. You also download all images so docset is more truly offline.
That's important to me, I like to be able to work offline.
Re medley: good catch, thanks.
Re install dir: you must have installed it via a flatpak or something? It depends on your install method. But I can add the instructions for finding the dir. Good idea.
Re Sections/Macros/Functions: I think Sections isn't that useful; I added it after looking at the docsets from Dash that Anders shared. The only reason to have them is so that they are indexed and can thus be searched. I think the others are useful, not for browsing, but because when searching you get a nice little icon.
Re Dash integration: I agree with you. Dash has a bunch of features that Zeal lacks and will probably never get. One of those is whatever feature exists in Dash now to show cljdoc. I don't think he will be peeved; he may be inclined to use cljdoc's feed if you choose to adopt this. I doubt he will care one way or another if it stays in my little repo like it is now.
Re rest: will return to this thread later this evening 🙏

2025-08-17T16:26:34.054469Z

Absolutely loving this btw. When my last macbook bites the dust the path for my return to linux has been paved.

Casey 2025-08-18T08:35:47.715599Z

("this evening" is now "next morning", but better late than never)
> The zeal docs say I can search within a single docset via <docsetname>:<search string>. What determines <docsetname>? I can't seem to get it to work with rewrite-clj, for example.
Good point! I will investigate.
> Will the processing time and resources be an issue?
I can't really speak to that. I of course don't know the compute resources / budget of http://cljdoc.org. I think the docset should only be generated when it's first requested, but it should be cached, maybe with a TTL. Demand won't be that high I imagine, but that's always relative to http://cljdoc.org's capacity.
> What about bundling the images? do we potentially get into any trouble for redistributing images that might be copyrighted?
IANAL. But why would images be a special case versus the text of the documentation itself? From a cursory look at http://cljdoc.org I can't find a TOS, so I think bundling for offline use seems reasonable and aligned with the spirit of documentation hosting.
> This is great in that the docset is now truly offline. But the download is bigger.
The download is bigger for containing images, but my experience says 99% of docsets do not contain images, and if they do it's a badge or two in the README. I don't think it makes a meaningful difference. We could add some safeguards like max images / max size processed if we think it is a concern.
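
To make the "generate on first request, then cache with a TTL" idea concrete, here is a small Python sketch. It is only an illustration under assumptions: `DocsetCache` and the `build` callback are hypothetical names, the cache lives in memory, and nothing here reflects how cljdoc actually serves downloads.

```python
import time


class DocsetCache:
    """Build a docset on first request and reuse it until the TTL expires."""

    def __init__(self, build, ttl_seconds, clock=time.monotonic):
        self._build = build      # hypothetical callback: (project, version) -> artifact
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # (project, version) -> (expires_at, artifact)

    def get(self, project, version):
        key = (project, version)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]        # still fresh: skip the expensive build
        artifact = self._build(project, version)
        self._entries[key] = (now + self._ttl, artifact)
        return artifact
```

A real deployment would more likely cache the built archive on disk or in object storage, but the expiry logic would look much the same.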

lread 2025-08-18T13:32:02.475469Z

Cljdoc hosting is generously sponsored by Exoscale. We are currently running on a single compute instance with 8 GB RAM and 4 CPUs. We make use of their Object Store for db backups. We out-source API analysis to CircleCI jobs, with their blessing (at least historically!). I'll ping the author of Dash, just to let him know what we are musing about, and to keep him in the loop.

Casey 2025-08-18T13:55:03.061219Z

> We out-source API analysis to CircleCI jobs, with their blessing (at least historically!).
Interesting! We could do the same for the docset builds if CircleCI is still ok with it?

lread 2025-08-18T14:34:13.614219Z

That's a possibility.

lread 2025-08-18T14:43:51.665199Z

I think an initial, clear, and easy win would be to describe your new tool in the cljdoc docs. I'll go ahead and do that.

Casey 2025-08-18T14:55:14.296069Z

Sorry I don't follow, what do you mean by "describe the tool in the cljdocs"?

lread 2025-08-18T15:09:24.958719Z

I think you misread maybe? I wrote: "describe your new tool in the cljdoc docs". I'll describe it in https://github.com/cljdoc/cljdoc/blob/master/doc/userguide/for-users.adoc.

👍 1
lread 2025-08-18T15:09:52.571239Z

BTW, Dash author got back to me... no concerns at all.

lread 2025-08-18T15:31:43.638789Z

cljdoc docs update: https://github.com/cljdoc/cljdoc/pull/1068

lread 2025-08-18T15:32:51.074489Z

We could consider this done if we are happy with Option 1 of https://github.com/cljdoc/cljdoc/issues/646

lread 2025-08-18T15:39:51.390689Z

I just re-read issue 646, nice work by that lread fellow! I'd like to meet him someday!

😄 1
Casey 2025-08-18T21:33:00.177139Z

I was thinking more about how to support feeds. In Zeal you can add a feed URL, which points at a bit of XML described in 646. Each project in cljdoc would get its own feed URL. Presumably a user could copy and paste a feed URL from somewhere (maybe the footer). The issue is how and when cljdoc generates the docsets for the feeds. That could be tricky.
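
As a rough illustration of what such a per-project feed could contain, here is a Python sketch that assumes the `<entry>` / `<version>` / `<url>` layout Dash documents for docset feeds. The `docset_feed` helper and the example URL are hypothetical, not taken from issue 646 or from cljdoc.

```python
import xml.etree.ElementTree as ET


def docset_feed(version, urls):
    """Return a feed document with one <version> and one <url> per archive location."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "version").text = version
    for url in urls:
        # Each URL should point at a .tgz archive of the docset.
        ET.SubElement(entry, "url").text = url
    return ET.tostring(entry, encoding="unicode")


# Hypothetical location; a real feed would point at wherever the archive is hosted.
feed_xml = docset_feed("1.0.0", ["https://example.invalid/docsets/rewrite-clj.tgz"])
```

Zeal or Dash would then poll this XML and fetch the archive again when the version changes, which is why the question of when the archive gets built matters.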

lread 2025-08-19T03:25:29.371529Z

Well, we could be done for now, @ramblurr. My analysis from 646 suggested Option 1 (which you have kindly implemented!) if there are only a handful of people interested in this feature. I think that's the case here. Watcha think? Does that make sense for now?