Fork me on GitHub

What would be the fastest way to find all disjoint supersets of many smaller sets? My google-fu helped me to find solutions for the opposite problem - superset decomposition/partitioning. And my intuition tells me that I can just use an algorithm for finding all components of a graph (where an input set represents a fully connected graph), but I'm not sure that's the best approach. Just to make what I want more clear. Suppose I have sets #{1, 2, 3}, #{2, 3, 4}, #{7, 8, 9}, #{9, 10}. I want to end up with #{1, 2, 3, 4} and #{7, 8, 9, 10} since the first two sets both have 2 and 3 and the last two sets both have 9.

Vincent Cantin09:09:26

1. give each set an ID (it may work faster than to test set equality each time) 2. Build an index element -> set of set-ids 3. Use the index to connect all your graphs. You should find an algorithm which is close to linear w.r.t. the total number of elements.

Vincent Cantin09:09:30

the algorithm is indeed a graph component painting

Vincent Cantin09:09:16

That’s also based on my intuition.

Vincent Cantin09:09:57

At that point, I would use Google, no need to re-invent the wheel.


Indeed. :) The idea that graph component painting may be done in linear time makes inventing something else unnecessary since I highly doubt a more efficient algorithm is possible. But of course I may be wrong.

Vincent Cantin09:09:57

When I was young, I found an algorithm to solve this problem in n * log_2(n) for the worst case scenario, so I think it’s either this or faster via hash-maps which I did not consider at that time.

Vincent Cantin09:09:41

on internet, people call it graph component labeling


Awesome, thanks! "Connected-component labeling (CCL), connected-component analysis (CCA), blob extraction, region labeling, blob discovery, or region extraction" - huh, that's a lot of names.

Vincent Cantin10:09:16

I think that the link between your problem and the connected-component labeling is where you have elements which belong to multiple sets at the same time. The index I mentioned let you know about those elements.

Vincent Cantin10:09:00

I think the graph in your case is a hashmap of set-id -> set of set-ids

Vincent Cantin10:09:22

the elements in the sets are not important at that point.


From that Wiki page I found out about and oh boy is it a rabbit hole.

👍 3
Vincent Cantin10:09:25

“The best is the enemy of the good”. I suggest to pick a solution good enough and move to your next task :-)


That's exactly what I'm about to do. :) Already found that has an implementation for that. I'll just make sure that it has good enough timing for the data sizes that I have.

👍 3
Shantanu Kumar09:09:33

The website seems to in error state right now.

😕 3

"my server is too old. So now I'm upgrading Ubuntu." - the main issue is that he has chosen the most arduous path. He just reaped what he has sown. Also there's a mixed talk about macOS and Ubuntu - I have no idea what the initial message author is doing.


And to be fair, most of the issues that author describes would manifest in the exact same way with Clojure. Simply because the issues come from outside of the application. Like Facebook API, SSL certificates, HTTPS on localhost, MongoDB not working, migrating from SysV to Upstart.

👍 3

A badge of honor to have a project live that long


@U2FRKM4TW although I mostly agree with you, there's a crucial point we disagree - that the issues would manifest in Clojure. Most things in the Rails world are closely tied - some specific version of Rails depends on a specific version of Ruby, and most libraries also depend on a specific version. Also, Rails have zero worries about backward compatibility, and is itself composed by 5-7 libraries, and all versions should match.


So, if you want to upgrade Rails (the framework) you have to upgrade 5-7 libraries, all that have slightly different APIS for each version bump even when no feature really changed - most of the times is just a change in the methods. Also... the test libraries' versions also are tied with specific Rails versions, so you can't really depend on your tests...


@UPGJNT3UY We still have code in production that we wrote in 2009 (legacy code, prior to our adoption of Clojure, that we still plan to rewrite to Clojure at some point). A lot of projects I've seen over my career have survived for a decade (or more).


An insurance company that I worked at, as one of my first jobs, had projects that had been in continuous production usage for close to twenty years 👀


(COBOL code that had been written in the very early days of the language that still ran, although it had been ported to more modern servers)


@U3Y18N0UC I was talking specifically about issues that are not related to the app. I have even listed them explicitly. Do you really think using Clojure would somehow help you avoid those issues?


Maybe some parts. For example, Facebook login would probably not need an upgrade from clojure then ring/pedestal/whatever. Then, he would not need to upgrade the server, for example. But yeah, other issues would arise in the same way. That's the parts we agree :)


Co-incidentally I just wanted to make a small tweak to my blog, which is based on octopress / Ruby, but I had to set up rvm all over again and figure things out for the nth time, probably because I'm only touching this once or twice a year

Drew Verlee14:09:52

I switched to croygen because I didn't want to waste time on these types of issues.


yes, I should switch too


Cryogen looks nice, having a look


I've used hugo for a while which works well enough too


I'm still waiting for spec2 to stabilize, so I can convert my mini cms project for a blog/personnal site. Been 2 years, but should be fine I guess.

Drew Verlee19:09:19

i'm re watching phillip walders "categories for the working hacker" and he has this example (see picture). Anyone clear on what "h" is here? I can't imagine this is a typo of some sorts but right now the best i can tell is that he should have written id_C. That C appears in the top picture but not in the bottom is also confusing.


I don't know any CT, but h seems to be a function, just like f and g. The bottom line just shows how function composition works (i.e. (comp (comp f1 f2) f3) is the same as (comp f1 (comp f2 f3)) )


I was gonna say, it just looks like he's showing that function composition is associative, in that if you have a series of function to compose, like f g h You can start composing the individual functions together (to eventually form the final big composed function) in any which order; first combining f and g for instance, and then composing that with h, or first composing g and h, and composing the resulting function with f


Well, if we're speaking categories, I guess those aren't functions


Its been a while, I suppose those'd be morphisms / arrows?

Drew Verlee21:09:33

That is what i believe is being communicated here cameron. But H is a symbol that doesn't appear in the arrow and object model unlike any other symbol. It's easier for me to assume that i'm missing something then he didn't translate it correctly.


@U0DJ4T5U1 Looking at that diagram, I can only assume that he didn't want to make the diagram more complex by adding a D (circle) with an h (function) from C to D. He shows f ; g going from A to C, and illustrates the identity functions, and then associativity.


This page has notes on the talk that confirms that -- "Another law states that if you compose three functions, it doesn't matter what order you compose them, i.e., composition is associative." (a law not illustrated by the diagram)

👍 3

He does say in the video “say that h goes from C to D”, so I think Sean is spot on. Ended up watching the whole thing, Phil is a great teacher. It brought me to “the First Monad tutorial” I’m finally reaching the point of understanding monads, and watching this had me wondering “sure, but how do I compose them??”. After some searching I found “monad transformers” and once again am completely intimidated 😅 To track state and output to the console and potentially fail, the types in Haskell become gnarly pretty quickly.