spam-reports

Rupert (Sevva/All Street) 2024-12-14T14:37:54.571069Z

Since we may be about to lose historical search: If someone from this community wanted to build a fast search UI on top of the Slack data (e.g. sourcing it from the Zulip mirror or Clojurians log), would that be OK? Do they need to be preapproved?

john 2024-12-14T16:13:27.114919Z

Given the policy we got consensus on wrt reusing all these logs for AI training data, I think that would be a preapproved effort

Max 2024-12-14T16:58:27.564139Z

For those who are interested in either improving or verifying the completeness of existing Clojurians archives, I did a little research on this for another workspace recently and I figured I’d leave some helpful links:
• Slack has an export feature available to admins that could be used to backfill, verify, or bootstrap an archive, https://slack.com/help/articles/201658943-Export-your-workspace-data. It includes all messages and links to files from public channels.
• There’s also https://github.com/rusq/slackdump, an active open-source project that makes tools for archiving Slack workspaces. It includes all public messages and channels, emojis, users, and can be configured to also download all attachments. In its readme, it also has links to some viewer programs that will read its dump format. It does not require admin access, only an API key, though it can only archive DMs the calling user was involved in.
My 2c: I think it makes more sense to rely on a well-maintained public project to archive activity in a workspace than to maintain a custom impl. If I had the bandwidth to maintain a Clojurians archive, I would probably do the following:
1. Decide I’m going to build a new archiver based on slackdump. It’s well-maintained and less likely to change/disappear than the built-in export feature.
2. Run slackdump now before the workspace’s pro license expires.
3. Set up slackdump to run automatically somewhere on a regular interval. There’s a script in the repo’s contrib/ for fetching changes since the last run.
4. Investigate existing viewers for the dump format, and either pick, fork, or build one depending on what that investigation yields.

➕ 2
Rupert (Sevva/All Street) 2024-12-15T09:15:33.580739Z

I would just leave it zipped up - you can split the file into several sub 100 MB chunks to upload it somewhere and those that need the file can reassemble it after they download the chunks. Uploading the individual files will take much longer and much more space.

Rupert (Sevva/All Street) 2024-12-15T10:24:30.077079Z

Splitting into sub-100 MB chunks can be done with:

split -b 99m slack-export.zip slack-export-part-
To reassemble:
cat slack-export-part-?? > slack-export-again.zip
[https://unix.stackexchange.com/questions/751593/what-is-the-state-of-the-art-of-splitting-a-binary-file-by-size]
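
A split-and-reassemble round trip like the one above is easy to check with a checksum before and after. The following is only a minimal sketch of that check, not part of the original suggestion: the helper names are hypothetical, the payload is a small stand-in rather than the real export, and the chunks get numeric suffixes instead of the alphabetic ones `split` produces by default.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size blocks so a large archive never sits fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while block := handle.read(block_size):
            digest.update(block)
    return digest.hexdigest()


def split_file(source: Path, chunk_bytes: int, prefix: str) -> list[Path]:
    """Write numbered chunks next to the source file, similar in spirit to `split -b`."""
    parts = []
    with source.open("rb") as handle:
        index = 0
        # Each chunk is read into memory whole; fine for a sketch, not tuned for huge chunks.
        while chunk := handle.read(chunk_bytes):
            part = source.with_name(f"{prefix}{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def reassemble(parts: list[Path], target: Path) -> None:
    """Concatenate chunks in name order, similar in spirit to `cat part-* > file`."""
    with target.open("wb") as out:
        for part in sorted(parts):
            out.write(part.read_bytes())


def roundtrip_demo() -> tuple[int, bool]:
    """Split a stand-in file, rebuild it, and report (chunk count, hashes match)."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "slack-export.zip"
        original.write_bytes(bytes(range(256)) * 1000)  # 256,000-byte stand-in payload
        parts = split_file(original, chunk_bytes=99_000, prefix="slack-export-part-")
        rebuilt = Path(tmp) / "slack-export-again.zip"
        reassemble(parts, rebuilt)
        return len(parts), sha256_of(original) == sha256_of(rebuilt)
```

Publishing the original file's SHA-256 alongside the chunks would let anyone who downloads and reassembles them confirm they got the same bytes.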

Rupert (Sevva/All Street) 2024-12-15T11:31:13.509719Z

I too would be interested in the data set. Three potential projects I could do if helpful to the community:
1. Conventional (non-vector) web search UI backed by Elastic (see search screenshot).
2. LLM fine-tune to write better/idiomatic Clojure code (open-weights model published to Hugging Face).
3. A browsable history of the Slack data. Note this would not be indexed by search engines - just for humans to view (not for SEO).
These would all be non-commercial and non-revenue-generating (no ads). Projects are just meant to help and grow the Clojure community. Also I would be happy to provide additional details and conform to branding/naming/attribution guidelines etc. on all of the above. Also I think there is no problem with having several (overlapping/competing) implementations of something (e.g. AI models or archive search engines etc.) - it’s possible this may lead to better results. We have a strong precedent of using this data through Clojurians log web pages and sync to Zulip. I understand we haven’t always had search over Slack history (before we had Slack Pro) - but by some lucky turn of events we do have it now and I think we should make the most of it.
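
For readers unfamiliar with how a conventional (non-vector) search works, the core idea is an inverted index: a map from each term to the messages that contain it. The sketch below is a toy illustration of that idea only, not how Elastic would actually be configured; the function names, message ids, and sample texts are all made up.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> set[str]:
    """Lowercase the text and keep runs of letters and digits as terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(messages: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of message ids that contain it."""
    index = defaultdict(set)
    for message_id, text in messages.items():
        for token in tokenize(text):
            index[token].add(message_id)
    return index


def search_all(index: dict[str, set[str]], query: str) -> set[str]:
    """Return ids of messages containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Made-up sample messages keyed by hypothetical ids.
MESSAGES = {
    "m1": "Transducers compose with comp",
    "m2": "comp composes functions right to left",
    "m3": "Use transducers with into",
}
INDEX = build_index(MESSAGES)
```

With this sample, `search_all(INDEX, "transducers with")` matches `m1` and `m3`. A real search engine adds ranking, stemming, and phrase queries on top of the same structure.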

cfleming 2024-12-15T20:12:30.197859Z

Can this also be used to backfill channels in zulip?

Max 2024-12-15T20:12:55.724749Z

Don’t see why not

Max 2024-12-15T20:14:54.646609Z

Even though Sean already ran an export, I think it’d still be a good idea to do a slackdump run, since its format is understood by other programs and it pulls the assets in too

cfleming 2024-12-15T20:14:59.843219Z

Depends whether Zulip allows it or not, I guess.

cfleming 2024-12-15T20:15:31.500589Z

And there are issues around matching users here to users there, I guess. Sounds very valuable, though.

Rupert (Sevva/All Street) 2024-12-14T17:33:11.434759Z

I see the Slack archive as a potentially invaluable piece of history and an all-round useful resource. It sounds like we have some partial archives (Clojurians log and Zulip) but no complete archive anywhere. Potentially we only have a couple of weeks left to save the data.

Max 2024-12-14T17:34:03.317509Z

All someone needs to export the public stuff is an API key, that seems relatively achievable

➕ 1
Rupert (Sevva/All Street) 2024-12-14T17:58:11.125899Z

Even if we have a Pro account, if we want to preserve the data then keeping an independent backup outside of Slack feels like a good idea.

Max 2024-12-14T18:15:57.649329Z

I think probably the only missing piece is someone willing to take initiative and ask the admins for an api key

john 2024-12-14T20:58:39.021159Z

@max.r.rothman if @seancorfield or another admin got you an API key, would you be willing to lead the charge on this effort? A number of us want this for a number of reasons, so I'm sure folks will chip in.

Max 2024-12-14T20:59:52.834539Z

I unfortunately don’t have the bandwidth to lead the effort, though I’m happy to contribute

seancorfield 2024-12-14T21:12:09.819859Z

Since it is very low effort, I have initiated a standard Slack export of all public messages for the entire history of this Slack. It warned me that might take "several days" so I'll post back once I have that and know how large it is...

🎉 1
seancorfield 2024-12-14T21:14:04.878619Z

This will not include any private channels or DMs or actual files (only links to files within Slack -- but the links have tokens that allow the files to be accessed from outside Slack).

👍 2
🙏 1
seancorfield 2024-12-14T22:34:53.320449Z

Well... The dump completed (faster than I expected!) and the .zip is 613MB. Inside it is a folder for every channel that has ever existed, and inside each of those folders is a .json file for every day (for ten years).

🎉 3
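
The layout described above (one folder per channel, one .json file per day) can be walked directly from the .zip without unpacking it. This is a rough sketch under assumptions: each day file is taken to be a JSON array of message objects, top-level files such as the `channels.json` used here are assumed to be metadata and are skipped, and the sample archive is invented for illustration.

```python
import io
import json
import zipfile
from collections import Counter


def messages_per_channel(archive: zipfile.ZipFile) -> Counter:
    """Count messages per channel folder in a `<channel>/<day>.json` layout."""
    counts = Counter()
    for name in archive.namelist():
        parts = name.split("/")
        # Skip top-level files and directory entries; keep only channel/day.json members.
        if len(parts) != 2 or not parts[1].endswith(".json"):
            continue
        channel = parts[0]
        messages = json.loads(archive.read(name))  # assumed: a JSON array of messages
        counts[channel] += len(messages)
    return counts


def build_sample_archive() -> zipfile.ZipFile:
    """Build a tiny in-memory archive with made-up data in the described layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("channels.json", json.dumps([{"name": "beginners"}]))
        archive.writestr("beginners/2024-12-13.json", json.dumps([{"text": "hi"}]))
        archive.writestr(
            "beginners/2024-12-14.json", json.dumps([{"text": "a"}, {"text": "b"}])
        )
        archive.writestr("clojurescript/2024-12-14.json", json.dumps([{"text": "c"}]))
    buffer.seek(0)
    return zipfile.ZipFile(buffer)
```

Against the real export, the same function could be called as `messages_per_channel(zipfile.ZipFile("slack-export.zip"))`, where the file name is again only a placeholder.
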
seancorfield 2024-12-14T22:36:40.085739Z

This is a sample message from today in #beginners so folks can see what we'd be dealing with for any sort of search/viewer/import to another service.

john 2024-12-14T22:37:00.137489Z

Wow

john 2024-12-14T22:38:42.699839Z

This is fantastic

john 2024-12-14T22:40:22.381509Z

Doesn't appear to have all of today's beginner messages. Is that one of three different thread json files for today?

seancorfield 2024-12-14T22:40:56.509389Z

That is a single message. Today's JSON file has all the messages.

john 2024-12-14T22:41:17.373979Z

Gotcha

phronmophobic 2024-12-14T22:42:56.385719Z

I'm very interested in this dataset. A fairly straightforward use case I would like to try is to put every thread in a vector db and make it queryable via a web interface so you could search for conversations/threads by semantic meaning.

🔥 2
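
To make the thread-search idea above concrete, here is a small sketch of the two steps involved: grouping messages into threads, then ranking threads by vector similarity to a query. It rests on assumptions: replies are assumed to carry a `thread_ts` field pointing at the root message's `ts`, the sample messages are invented, and `vectorize` is a plain word-count stand-in. A real vector db would use an embedding model there, which is what would make the search semantic rather than lexical.

```python
import math
import re
from collections import Counter, defaultdict


def group_threads(messages: list[dict]) -> dict[str, str]:
    """Join each thread's messages into one document keyed by its root timestamp."""
    threads = defaultdict(list)
    for message in messages:
        # Assumption: replies carry `thread_ts`; standalone messages fall back to `ts`.
        root = message.get("thread_ts", message["ts"])
        threads[root].append(message["text"])
    return {root: " ".join(texts) for root, texts in threads.items()}


def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(threads: dict[str, str], query: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Rank thread roots by similarity to the query, best first, dropping zero scores."""
    query_vec = vectorize(query)
    scored = [(root, cosine(vectorize(text), query_vec)) for root, text in threads.items()]
    scored = [(root, score) for root, score in scored if score > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Invented sample messages; field names follow the assumption stated above.
SAMPLE = [
    {"ts": "1.0", "text": "How do I destructure a map in a let binding?"},
    {"ts": "1.1", "thread_ts": "1.0", "text": "Use {:keys [a b]} to destructure the map."},
    {"ts": "2.0", "text": "Any tips for deploying an uberjar to production?"},
]
```

Calling `search(group_threads(SAMPLE), "destructure map")` returns only the thread rooted at `"1.0"`, since the other thread shares no terms with the query.
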
phronmophobic 2024-12-14T22:43:58.323019Z

I also have dozens of other ideas that I think would be useful for the broader community.

john 2024-12-15T00:50:27.616159Z

looks like hosting under 2G per repo is free on github https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage

john 2024-12-15T00:51:13.756909Z

However,
> GitHub blocks files larger than 100 MiB.
> To track files beyond this limit, you must use Git Large File Storage (Git LFS).
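
As a back-of-the-envelope check on how the export would fit under that per-file limit, the arithmetic can be sketched as below. The assumptions are that the reported 613MB is read as 613 MiB (the thread does not say which unit was meant) and that a `99m` chunk size means 99 MiB, as it does in common `split` implementations.

```python
MIB = 1024 * 1024
GITHUB_FILE_LIMIT = 100 * MIB  # per-file limit quoted from the GitHub docs
CHUNK_SIZE = 99 * MIB          # assumed meaning of a 99m chunk size


def chunks_needed(total_bytes: int, chunk_bytes: int = CHUNK_SIZE) -> int:
    """Ceiling division: number of chunks needed to hold total_bytes."""
    return -(-total_bytes // chunk_bytes)


# Assumption: the reported export size is treated as 613 MiB.
EXPORT_BYTES = 613 * MIB
CHUNK_COUNT = chunks_needed(EXPORT_BYTES)                   # 7
LAST_CHUNK = EXPORT_BYTES - (CHUNK_COUNT - 1) * CHUNK_SIZE  # 19 MiB
```

Under those assumptions the export splits into seven files, six full 99 MiB chunks plus a 19 MiB remainder, each below the 100 MiB limit.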

seancorfield 2024-12-14T22:51:42.001119Z

I'm starting this thread so folks can discuss the pros and cons of various Slack alternatives they'd prefer to use if we lose Pro sponsorship, and therefore lose our history and search (and, again, a reminder that we used to have very limited history and search for years until we were fortunate enough to get sponsored by Slack, and then have that sponsorship renewed by Salesforce for a year). A reminder that the active Clojure communities are all listed here: https://clojure.org/community/resources The most active community that is similar to Slack is Zulip (Clojurians Zulip Chat) which has channels and threads and decent moderation functionality (plus the ability to move threads etc). There's a somewhat active Discord server (Discljord) -- I'm not familiar with the moderation functionality there. There's still an IRC community (`#clojure` on libera.chat) for folks who like things "old-school". I don't see https://www.reddit.com/r/Clojure/ (Reddit) listed but that in turn has a few more communities listed in the right sidebar (an unmoderated Discord server and a matrix/riot-im room).

👍 1
adham 2024-12-15T08:08:29.745039Z

I lean towards Zulip as well. With my team we tried Telegram and Element/Matrix, and ended up pretty happy with Zulip. Granted, we're a small team, but what made us make the jump is:
• We own the data, hosted on our own server
• Topic-based conversation makes it easier to navigate many topics and discussions (see Clojurians Slack for an example of this)
• Fast and snappy (unlike Element)
• Account management built in (no phone required too)
• Web/Mobile/Phone clients work well
• A Slack importer exists (https://github.com/zulip, https://zulip.com/help/import-from-slack) but I haven't tried it before
Alex, what moderation capabilities do you find important?
The only downside is that mobile push notifications require a payment even when self-hosting. I assume that's one of the ways they maintain themselves, which is fair. It costs $6.6 monthly for yearly billing and $8 monthly for monthly billing. I don't know how many people use mobile, but my team decided the pros are worth this con and just use web/desktop clients for notifications.

👍 1
p-himik 2024-12-15T08:10:42.951349Z

My only gripe with Discord is that it constantly tries to shove Nitro down your throat. So much so that I only use the web client now where I can at least use my ad blocker.

p-himik 2024-12-15T09:00:12.244419Z

Ah, no - not the only. Channels. Discord has two types of servers - Friends and Community. The latter is most suitable for us. On Community servers, all new channels are hidden by default. AFAIK, even if you add that channel to the Default group that's supposed to be visible by default - it will only be shown to new members, but all existing members won't have that new channel visible for them unless they enable it manually. I'm also not sure if it's possible to see how many members have a particular channel enabled for them. Another one, albeit minor - link previews. We have disabled link previews for specific websites here for various reasons. Apparently, there's just no way to do it on Discord. All that's possible is for senders to remove a preview from a specific message before sending it, for mods to remove it after the message is sent, and for viewers to disable all link previews just for them.

adham 2024-12-15T13:03:45.167539Z

I think we should make a DM for this (yes I've been watching Conj vids), I think it'd be a good way to at least get a grip on what matters when choosing an alternative https://docs.google.com/spreadsheets/d/1rcnGOBH6nIiMD1oXfItcVJYjiDh8lc0DjaDp8DfE9p0/edit?gid=0#gid=0

p-himik 2024-12-15T13:13:23.759359Z

Eh. Could be a neat exercise but there are not that many platforms to go around, and at the end of the day, even if you find something that's better than Slack in every regard (to me it's Zulip actually - I don't have any gripes with it at all), the largest hurdle is actually people going there - not the software itself.

adham 2024-12-15T13:24:33.386479Z

I assume people would start going there once Slack changes, should the Pro membership drop, e.g. no history view and everything else changing and getting limited

p-himik 2024-12-15T13:30:39.052309Z

Only the history will be removed, nothing else, as far as I understand. And as Sean has written - we've been without history for much, much longer than with it.

p-himik 2024-12-15T13:44:43.768219Z

@adham.rasoul I see that you put "Good search through text messages" into the Telegram column. That is absolutely not the case, the search there is one of the worst ones I've ever used. It supports only two things: specifying text (fuzzy matching by default, double quotes supported) and specifying the user. If you navigate away from the search results panel, you cannot go back to it - you have to enter all the search details again and find the place where you were. There's no highlight of why a particular message was matched to the search query, so you have to skim through all the messages to find the bit that interests you. It's awful.

p-himik 2024-12-15T13:45:23.714089Z

And you put "Bad search experience" for Discord, but I find it to be the opposite. Explicit pagination, a separate persistent panel, multiple predicates.

adham 2024-12-15T13:48:34.446489Z

I seem to be outdated on the Discord search, do you want to edit the cells? Your points are all valid and the sheet should allow editing

p-himik 2024-12-15T13:51:01.146249Z

Discord search has been like this for at least 7 years. :) Sure, I'll update it.

adham 2024-12-15T13:51:58.654069Z

I've only been messaging in Discord lately and only infrequently, so I haven't really searched searched

adham 2024-12-15T13:52:03.756469Z

Thanks!

seancorfield 2024-12-15T17:16:03.000139Z

In one of the last rounds of Slackocalypse discussions, someone created a DM but people's feelings about features were very subjective, so it was very hard to get a fair and objective comparison. It was also hard to get a good handle on administrative tooling / moderation -- because so few people have moderated more than one service so, again, it was very patchy and subjective. Based on what's already in that DM, plus what's been said about it here, I'd say Telegram is absolutely out of consideration. I find Discord a fairly horrible experience, compared to Slack and Zulip, but I'm in several servers -- including Discljord -- so it's not so horrible that I won't use it 🙂

Rupert (Sevva/All Street) 2024-12-15T18:51:55.336309Z

I think it's great for us to discuss and consider alternative online venues for this community. I'm very much open to the possibilities. I'm not against self hosted options so we're free from corporate whims. However even if we lose Pro features, I think it's very likely my preference will be for us to just stay put and remain on Slack if we are able to put together a decent search/browse over the message history.

adham 2024-12-15T19:00:34.721369Z

> someone created a DM but people's feelings about features were very subjective
That's okay, people will hold different opinions but I believe discussing them and putting them out in some written form is good too
> Telegram is absolutely out of consideration
I agree.

adham 2024-12-15T19:01:15.122889Z

> we're free from corporate whims.
The downside is now we're at the whims of whoever is maintaining it, which I hope is less whim-y than corporate 😆

Daniel Slutsky 2024-12-16T10:54:00.012879Z

Many thanks for this discussion. In the last 6 years, Zulip has been a game changer for us at Scicloj. It allowed us to maintain decent knowledge management over the years. Knowledge simply emerges from conversations in a reasonably structured and interlinked way. Arguably, this matters so much in open-source collaboration where often new members join, people leave for a few months and return, etc. Zulip's structure and ergonomics offer a good balance between fast chat and slow long-term discussion. This matters. A lot. Everything is simple, every channel, topic thread, and message has a URL that can be easily opened in a browser tab. Copying and pasting URLs is easy and behaves so nicely with channels and topic threads, showing their names in a sensible way. These details matter so much for staying coordinated and offering clarity to each other. There is great support for images, markdown, code highlighting, and more.

👍 10
👍🏻 1
👍🏼 1
💯 6
pez 2024-12-16T12:39:07.062279Z

While I agree that we should consider Zulip as the main alternative, I do not agree that everything is simple over at Zulip. Actually I find the UX utterly confusing. Everything I try to do either works very differently from what I am used to, or I fail to figure it out. Maybe it takes 6 years to get to where everything is simple? 😃 I think that we should make sure people have a nice time switching by putting together a guide for people coming from Clojurians Slack to Clojurians Zulip:
• Doing A:
  ◦ In Slack you did like so
  ◦ At Zulip you do like so
• Doing B:
  ◦ In Slack you did like so
  ◦ At Zulip you can’t do this (reason about possible mitigating features)

pez 2024-12-16T12:39:40.143049Z

Old habits die hard. As we learnt last time we tried to push for Zulip.

Daniel Slutsky 2024-12-16T12:41:50.510109Z

Thanks for looking into it. You are right, there is some friction at the beginning. Do you think it may help to organize an online meeting where a few of us can demonstrate and practice some Zulip workflows?

pez 2024-12-16T12:42:29.904539Z

That would help for putting together the guide, yes.

adham 2024-12-16T12:47:41.515269Z

I'd love to attend such a meeting

Daniel Slutsky 2024-12-16T12:50:47.612149Z

Great 🙏 Here is a poll for those who wish to attend a meeting about the Clojurians Zulip chat: https://bit.ly/clj-zulip-poll We will practice using it and discuss some recommended practices.

Daniel Slutsky 2024-12-16T15:53:30.396169Z

@pez @adham.rasoul thanks for your responses, I just scheduled a call for Wednesday, hoping it still works for you. https://clojureverse.org/t/intro-to-the-clojurians-zulip-chat/ Anybody who is interested in Zulip is invited ☝️

adi 2024-12-22T06:47:03.908799Z

Dropping some thoughts, which Slack and Zulip admins may have already discussed. A "Getting started" prompt will be useful for Slack -> Zulip emigres.
• Can we land newcomers on a specific topic that links to a curated set of Zulip's own documentation and a handy youtube demo or two?
• Personally, I much prefer Zulip over slack and discord and group mail too, but it takes a few hints / prompts to learn to use it successfully.
e.g. 1 Topic name hints: A well-organized Zulip is useful. However, thinking of a topic name is always a question... Some channels can benefit from structure. Others can be more open-ended.
• Generically: https://zulip.com/help/introduction-to-topics#how-to-start-a-new-topic.
• Channel-specific hints:
  ◦ In the #introduce-yourself channel at Zulip, it's more useful to put one's own name in the topic, instead of a generic hello.
  ◦ Library channels can start topics for upcoming releases, potentially pull in github issues. These things get discussed anyway across slack / github. Having issue/tag/release-specific topics will help maintainers stay more sane. (https://zulip.com/integrations/ should help streamline this).
  ◦ Announcements can have a topic per <thing> (library / podcast / project) etc. which allows continuity of history for <thing>.
e.g. 2 Reading guide: (because most people read / browse, rather than write)
• Reading strategies: https://zulip.com/help/reading-strategies
• Finding a conversation to read: https://zulip.com/help/finding-a-conversation-to-read
• Decent-looking general use demo (covers a bunch of useful features): https://www.youtube.com/watch?v=9sAgPaj8OZk
e.g. 3 Slurping allied topical discussions:
• Similar to the HN channel, can we slurp Ask Clojure or Clojureverse or stackoverflow questions or even mailing list items as topics?
  ◦ These structurally align quite well with Zulip's threading model, not so much in Slack's model.
  ◦ Admins can make a choice to keep these "announce-only" as it may be better for discussions to happen at the remote site.
  ◦ Sidebar: I wonder if the discussion itself can be slurped and archived into message(s) under the topic created.
• This will take work, over time, but the huge advantage of "web-public" Zulip is that it can become a searchable archive of topical discussions for all sorts of stuff slurped in from across the WWW.

❤️ 3
seancorfield 2024-12-22T06:48:52.088049Z

Some of this is already being discussed in various topics over in the #zulip channel... but this is all good feedback!

🙏 2
bozhidar 2025-04-15T11:35:40.970939Z

I don't mind Slack, as I have to use it for work anyways, but I've noticed a strong trend in other programming communities to move to Discord and Zulip in recent years. I myself favor Discord today just because it's convenient for me to use fewer tools, but I'll definitely frequent the Slack server as long as I have to use Slack in general.

bozhidar 2025-04-15T11:36:26.826129Z

I do find it somewhat ironic that Slack was supposed to be a chat for developers and Discord for gamers, and lately Discord seems more geared towards developers than Slack. (but I guess it all changed with Salesforce)

seancorfield 2024-12-14T22:52:52.413049Z

I'm active in nearly all of those but my preference for "something like Slack" would be Zulip -- which is where the SciClojure community moved to some time ago -- and it has a fairly complete searchable archive of Slack channels, from the @zulip-mirror-bot

➕ 9
Alex Miller (Clojure team) 2024-12-15T00:31:49.675879Z

There are several Discord servers with varying levels of activity. Due to Discord's common uses, the moderation capabilities are quite extensive, way beyond most other systems. Of all of them, that would be the one I would entertain trying to build something new with core team resources, as we are already invested in it for core dev and the Conj. The Conj server would be one option that we could even repurpose

🚀 2
🤔 1