Since we may be about to lose historical search: If someone from this community wanted to build a fast search UI on top of the Slack data (e.g. sourcing it from the Zulip mirror or Clojurians log), would that be OK? Would they need to be preapproved?
Given the policy we got consensus on wrt reusing all these logs for AI training data, I think that would be a preapproved effort
For those who are interested in either improving or verifying the completeness of existing Clojurians archives, I did a little research on this for another workspace recently and I figured I’d leave some helpful links: • Slack has an export feature available to admins that could be used to backfill, verify, or bootstrap an archive, https://slack.com/help/articles/201658943-Export-your-workspace-data. It includes all messages and links to files from public channels. • There’s also https://github.com/rusq/slackdump, an active open-source project that makes tools for archiving slack workspaces. It includes all public messages and channels, emojis, users, and can be configured to also download all attachments. In its readme, it also has links to some viewer programs that will read its dump format. It does not require admin access, only an API key, though it can only archive DMs the calling user was involved in. My 2c: I think it makes more sense to rely on a well-maintained public project to archive activity in a workspace than to maintain a custom impl. If I had the bandwidth to maintain a Clojurians archive, I would probably do the following: 1. Decide I’m going to build a new archiver based on slackdump. It’s well-maintained and less likely to change/disappear than the built-in export feature. 2. Run slackdump now before the workspace’s pro license expires 3. Set up slackdump to run automatically somewhere on a regular interval. There’s a script in the repo’s contrib/ for fetching changes since the last run. 4. Investigate existing viewers for the dump format, and either pick, fork, or build one depending on what that investigation yields.
I would just leave it zipped up - you can split the file into several sub 100 MB chunks to upload it somewhere and those that need the file can reassemble it after they download the chunks. Uploading the individual files will take much longer and much more space.
Splitting into 100 MB chunks can be done with:
split -b 99m slack-export.zip slack-export-part-
To reassemble:
cat slack-export-part-?? > slack-export-again.zip
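For anyone who wants to confirm that a reassembled archive matches the original, here is a minimal Python sketch of the same reassembly step plus a checksum comparison. The function names `sha256_of` and `reassemble` are made up for illustration, and it assumes the part files sort by name in the order they were written (which is the case for the `aa`, `ab`, ... suffixes that `split` produces by default).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def reassemble(parts: list[Path], target: Path) -> None:
    """Concatenate the parts in sorted name order, like `cat part-??`."""
    with target.open("wb") as out:
        for part in sorted(parts):
            out.write(part.read_bytes())
```

If whoever uploads the chunks also posts the digest of the original zip, downloaders can compare it with `sha256_of(Path("slack-export-again.zip"))` after reassembly.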
https://unix.stackexchange.com/questions/751593/what-is-the-state-of-the-art-of-splitting-a-binary-file-by-size
I too would be interested in the data set. Three potential projects I could do if helpful to the community: 1. Conventional (non-vector) web search UI backed by Elastic (see search screenshot) 2. LLM fine-tune to write better/idiomatic Clojure code (open-weights model published to Hugging Face). 3. A browsable history of the Slack data. Note this would not be indexed by search engines - just for humans to view (not for SEO) These would all be non-commercial and non-revenue-generating (no ads). Projects are just meant to help and grow the Clojure community. Also I would be happy to provide additional details and conform to branding/naming/attribution guidelines etc on all of the above. Also I think there is no problem with having several (overlapping/competing) implementations of something (e.g. AI models or archive search engines etc) - it’s possible this may lead to better results. We have a strong precedent of using this data through Clojurians log web pages and sync to Zulip. I understand we haven’t always had search over Slack history (before we had Slack Pro) - but by some lucky turn of events we do have it now and I think we should make the most of it.
Can this also be used to backfill channels in zulip?
Don’t see why not
Even though Sean already ran an export, I think it’d still be a good idea to do a slackdump run, since its format is understood by other programs and it pulls the assets in too
Depends whether Zulip allows it or not, I guess.
And there are issues around matching users here to users there, I guess. Sounds very valuable, though.
I see the Slack archive as a potentially invaluable piece of important history and an all-round useful resource. It sounds like we have some partial archives (Clojurians log and Zulip) but no complete archives anywhere. Potentially we only have a couple of weeks left to save the data.
All someone needs to export the public stuff is an API key, that seems relatively achievable
Even if we have a Pro account, if we want to preserve the data then keeping an independent backup outside of Slack feels like a good idea.
I think probably the only missing piece is someone willing to take initiative and ask the admins for an api key
@max.r.rothman if @seancorfield or another admin got you an API key, would you be willing to lead the charge on this effort? A number of us want this for a number of reasons, so I'm sure folks will chip in.
I unfortunately don’t have the bandwidth to lead the effort, though I’m happy to contribute
Since it is very low effort, I have initiated a standard Slack export of all public messages for the entire history of this Slack. It warned me that might take "several days" so I'll post back once I have that and know how large it is...
This will not include any private channels or DMs or actual files (only links to files within Slack -- but the links have tokens that allow the files to be accessed from outside Slack).
Well... The dump completed (faster than I expected!) and the .zip is 613MB. Inside it is a folder for every channel that has ever existed, and inside each of those folders is a .json file for every day (for ten years).
This is a sample message from today in #beginners so folks can see what we'd be dealing with for any sort of search/viewer/import to another service.
Wow
This is fantastic
Doesn't appear to have all of today's beginner messages. Is that one of three different thread json files for today?
That is a single message. Today's JSON file has all the messages.
Gotcha
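To make the export layout described above concrete (one folder per channel, one .json file per day), here is a rough Python sketch for walking an unzipped export. It assumes each day file holds a JSON array of message objects, which this thread has not confirmed; `iter_messages` and `export_root` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_messages(export_root: Path) -> Iterator[tuple[str, str, dict]]:
    """Yield (channel, day, message) for every message in an unzipped export.

    Assumption: each <channel>/<day>.json file is a JSON array of messages.
    Top-level files that are not channel folders are skipped.
    """
    for channel_dir in sorted(p for p in export_root.iterdir() if p.is_dir()):
        for day_file in sorted(channel_dir.glob("*.json")):
            with day_file.open(encoding="utf-8") as fh:
                messages = json.load(fh)
            for message in messages:
                yield channel_dir.name, day_file.stem, message
```

A search, viewer, or import tool could consume this generator without loading ten years of history into memory at once.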
I'm very interested in this dataset. A fairly straightforward use case I would like to try is to put every thread in a vector db and make it queryable via web interface so you could search for conversations/threads by semantic meaning.
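As a possible first step toward indexing whole threads rather than single messages, here is a small Python sketch that groups messages into threads before any embedding happens. It assumes every message has a `ts` field and that replies carry a `thread_ts` equal to the parent's `ts`; the sample message was not reproduced here, so treat those field names as an assumption to check against the actual export. `group_threads` and `thread_text` are hypothetical helper names, and the embedding and vector database parts are deliberately left out.

```python
from collections import defaultdict
from decimal import Decimal


def group_threads(messages: list[dict]) -> dict[str, list[dict]]:
    """Group messages by thread root timestamp, each thread in time order.

    Assumption: replies have "thread_ts" set to the parent's "ts";
    a message without "thread_ts" is treated as its own thread.
    """
    threads: dict[str, list[dict]] = defaultdict(list)
    for message in messages:
        root = message.get("thread_ts") or message["ts"]
        threads[root].append(message)
    for replies in threads.values():
        # Decimal avoids float rounding on microsecond-precision timestamps.
        replies.sort(key=lambda m: Decimal(m["ts"]))
    return dict(threads)


def thread_text(thread: list[dict]) -> str:
    """Join a thread's message texts into one document for indexing."""
    return "\n".join(m.get("text", "") for m in thread)
```

Each string returned by `thread_text` would then be the unit handed to whatever embedding model or search index is chosen.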
I also have dozens of other ideas that I think would be useful for the broader community.
looks like hosting under 2G per repo is free on github https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage
However, > GitHub blocks files larger than 100 MiB. > To track files beyond this limit, you must use Git Large File Storage (Git LFS).
I'm starting this thread so folks can discuss the pros and cons of various Slack alternatives they'd prefer to use if we lose Pro sponsorship, and therefore lose our history and search (and, again, a reminder that we used to have very limited history and search for years until we were fortunate enough to get sponsored by Slack, and then have that sponsorship renewed by Salesforce for a year). A reminder that the active Clojure communities are all listed here: https://clojure.org/community/resources The most active community that is similar to Slack is Zulip (Clojurians Zulip Chat) which has channels and threads and decent moderation functionality (plus the ability to move threads etc). There's a somewhat active Discord server (Discljord) -- I'm not familiar with the moderation functionality there. There's still an IRC community (`#clojure` on libera.chat) for folks who like things "old-school". I don't see https://www.reddit.com/r/Clojure/ (Reddit) listed but that in turn has a few more communities listed in the right sidebar (an unmoderated Discord server and a matrix/riot-im room).
I lean towards Zulip as well. With my team we tried Telegram and Element/Matrix, and ended up pretty happy with Zulip. Granted, we're a small team, but what made us make the jump is: • We own the data, hosted on our own server • Topic-based conversation makes it easier to navigate many topics and discussions (see Clojurians Slack for an example of this) • Fast and snappy (unlike Element) • Account management built in (no phone required too) • Web/Mobile/Phone clients are well • https://github.com/zulip • https://zulip.com/help/import-from-slack exists but I haven't tried it before Alex, what moderation capabilities do you find important? The only downside is that mobile push notifications require a payment even when self-hosting. I assume that's one of the ways they maintain themselves, which is fair. It costs $6.6 monthly for yearly billing and $8 monthly for monthly billing. I don't know how many people use mobile, but my team decided the pros are worth this con and just use web/desktop clients for notifications
My only gripe with Discord is that it constantly tries to shove Nitro down your throat. So much so that I only use the web client now where I can at least use my ad blocker.
Ah, no - not the only. Channels. Discord has two types of servers - Friends and Community. The latter is most suitable for us. On Community servers, all new channels are hidden by default. AFAIK, even if you add that channel to the Default group that's supposed to be visible by default - it will only be shown to new members, but all existing members won't have that new channel visible for them unless they enable it manually. I'm also not sure if it's possible to see how many members have a particular channel enabled for them. Another one, albeit minor - link previews. We have disabled link previews for specific websites here for various reasons. Apparently, there's just no way to do it on Discord. All that's possible is for senders to remove a preview from a specific message before sending it, for mods to remove it after the message is sent, and for viewers to disable all link previews just for them.
I think we should make a DM for this (yes I've been watching Conj vids), I think it'd be a good way to at least get a grip on what matters when choosing an alternative https://docs.google.com/spreadsheets/d/1rcnGOBH6nIiMD1oXfItcVJYjiDh8lc0DjaDp8DfE9p0/edit?gid=0#gid=0
Eh. Could be a neat exercise but there are not that many platforms to go around, and at the end of the day, even if you find something that's better than Slack in every regard (to me it's Zulip actually - I don't have any gripes with it at all), the largest hurdle is actually people going there - not the software itself.
I assume people would start going there once Slack changes should the Pro membership drop, e.g. no history view and everything else changing and getting limited
Only the history will be removed, nothing else, as far as I understand. And as Sean has written - we've been without history for much, much longer than with it.
@adham.rasoul I see that you put "Good search through text messages" into the Telegram column. That is absolutely not the case, the search there is one of the worst ones I've ever used. It supports only two things: specifying text (fuzzy matching by default, double quotes supported) and specifying the user. If you navigate away from the search results panel, you cannot go back to it - you have to enter all the search details again and find the place where you were. There's no highlight of why a particular message was matched to the search query, so you have to skim through all the messages to find the bit that interests you. It's awful.
And you put "Bad search experience" for Discord, but I find it to be the opposite. Explicit pagination, a separate persistent panel, multiple predicates.
I seem to be outdated on the Discord search, do you want to edit the cells? your points are all valid and the sheet should allow editing
Discord search has been like this for at least 7 years. :) Sure, I'll update it.
I've only been messaging in Discord lately and only infrequently, so I haven't really searched searched
Thanks!
In one of the last rounds of Slackocalypse discussions, someone created a DM but people's feelings about features were very subjective, so it was very hard to get a fair and objective comparison. It was also hard to get a good handle on administrative tooling / moderation -- because so few people have moderated more than one service so, again, it was very patchy and subjective. Based on what's already in that DM, plus what's been said about it here, I'd say Telegram is absolutely out of consideration. I find Discord a fairly horrible experience, compared to Slack and Zulip, but I'm in several servers -- including Discljord -- so it's not so horrible that I won't use it 🙂
I think it's great for us to discuss and consider alternative online venues for this community. I'm very much open to the possibilities. I'm not against self hosted options so we're free from corporate whims. However even if we lose Pro features, I think it's very likely my preference will be for us to just stay put and remain on Slack if we are able to put together a decent search/browse over the message history.
> someone created a DM but people's feelings about features were very subjective That's okay, people will hold different opinions but I believe discussing them and putting them out in some written form is good too > Telegram is absolutely out of consideration I agree.
> we're free from corporate whims. The downside is now we're at the whims of whoever is maintaining it, which I hope is less whim-y than corporate 😆
Many thanks for this discussion. In the last 6 years, Zulip has been a game changer for us at Scicloj. It allowed us to maintain decent knowledge management along the years. Knowledge simply emerges from conversations in a reasonably structured and interlinked way. Arguably, this matters so much in open-source collaboration where often new members join, people leave for a few months and return, etc. Zulip's structure and ergonomics offer a good balance between fast chat and slow long-term discussion. This matters. A lot. Everything is simple, every channel, topic thread, and message has a URL that can be easily opened in a browser tab. Copying and pasting URLs is easy and behaves so nicely with channels and topic threads, showing their names in a sensible way. These details matter so much for staying coordinated and offering clarity to each other. There is great support for images, markdown, code highlighting, and more.
While I agree that we should consider Zulip as the main alternative, I do not agree that everything is simple over at Zulip. Actually I find the UX utterly confusing. Everything I try to do either works very differently from what I am used to, or I fail to figure it out. Maybe it takes 6 years to get to where everything is simple? 😃 I think that we should make sure people have a nice time switching by putting together a guide for people coming from Clojurians Slack to Clojurians Zulip: • Doing A: ◦ In Slack you did like so ◦ At Zulip you do like so • Doing B: ◦ In Slack you did like so ◦ At Zulip you can’t do this (reason about possible mitigating features)
Old habits die hard. As we learnt last time we tried to push for Zulip.
Thanks for looking into it. You are right, there is some friction at the beginning. Do you think it may help to organize an online meeting where a few of us can demonstrate and practice some Zulip workflows?
That would help for putting together the guide, yes.
I'd love to attend such a meeting
Great 🙏 Here is a poll for those who wish to attend a meeting about the Clojurians Zulip chat: https://bit.ly/clj-zulip-poll We will practice using it and discuss some recommended practices.
@pez @adham.rasoul thanks for your responses, I just scheduled a call for Wednesday, hoping it still works for you. https://clojureverse.org/t/intro-to-the-clojurians-zulip-chat/ Anybody who is interested in Zulip is invited ☝️
Dropping some thoughts, which Slack and Zulip admins may have already discussed. A "Getting started" prompt will be useful for Slack -> Zulip emigres. • Can we land newcomers on a specific topic that links to a curated set of Zulip's own documentation and a handy youtube demo or two? • Personally, I much prefer Zulip over slack and discord and group mail too, but it takes a few hints / prompts to learn to use it successfully. e.g. 1 Topic name hints: A well-organized Zulip is useful. However, thinking of a topic name is always a question... Some channels can benefit from structure. Others can be more open-ended. • Generically: https://zulip.com/help/introduction-to-topics#how-to-start-a-new-topic. • Channel-specific hints: ◦ In the #introduce-yourself channel at Zulip, it's more useful to put one's own name in the topic, instead of a generic hello. ◦ Library channels can start topics for upcoming releases, potentially pull in github issues. These things get discussed anyway across slack / github. Having issue/tag/release-specific topics will help maintainers stay more sane. (https://zulip.com/integrations/ should help streamline this). ◦ Announcements can have a topic per <thing> (library / podcast / project) etc. which allows continuity of history for <thing>. e.g. 2 Reading guide: (because most people read / browse, rather than write) • Reading strategies: https://zulip.com/help/reading-strategies • Finding a conversation to read: https://zulip.com/help/finding-a-conversation-to-read • Decent-looking general use demo (covers a bunch of useful features): https://www.youtube.com/watch?v=9sAgPaj8OZk e.g. 3 Slurping allied topical discussions: • Similar to the HN channel, can we slurp Ask Clojure or Clojureverse or stackoverflow questions or even mailing list items as topics? ◦ These structurally align quite well with Zulip's threading model, not so much in Slack's model. 
◦ Admins can make a choice to keep these "announce-only" as it may be better for discussions to happen at the remote site. ◦ Sidebar: I wonder if the discussion itself can be slurped and archived into message(s) under the topic created. • This will take work, over time, but the huge advantage of "web-public" Zulip is that it can become a searchable archive of topical discussions for all sorts of stuff slurped in from across the WWW.
Some of this is already being discussed in various topics over in the #zulip channel... but this is all good feedback!
I don't mind Slack, as I have to use it for work anyways, but I've noticed a strong trend in other programming communities to move to Discord and Zulip in recent years. I myself favor Discord today just because it's convenient for me to use fewer tools, but I'll definitely frequent the Slack server as long as I have to use Slack in general.
I do find it somewhat ironic that Slack was supposed to be a chat for developers and Discord for gamers, and lately Discord seems more geared towards developers than Slack. (but I guess it all changed with Salesforce)
I'm active in nearly all of those but my preference for "something like Slack" would be Zulip -- which is where the SciClojure community moved to some time ago -- and it has a fairly complete searchable archive of Slack channels, from the @zulip-mirror-bot
There are several Discord servers with varying levels of activity. Due to Discord's common uses, the moderation capabilities are quite extensive, way beyond most other systems. Of all of them, that would be the one I would entertain trying to build something new with core team resources, as we are already invested in it for core dev and the Conj. The Conj server would be one option that we could even repurpose