This page is not created by, affiliated with, or supported by Slack Technologies, Inc.

## 2020-09-27

## Channels

- # announcements (1)
- # babashka (7)
- # beginners (60)
- # calva (24)
- # cider (2)
- # cljdoc (9)
- # clojure (21)
- # clojure-australia (1)
- # clojure-europe (2)
- # clojurescript (26)
- # core-async (89)
- # datalog (11)
- # datomic (40)
- # fulcro (24)
- # graalvm (5)
- # luminus (1)
- # meander (1)
- # numerical-computing (1)
- # off-topic (3)
- # shadow-cljs (6)
- # test-check (10)
- # tools-deps (7)
- # xtdb (18)

If I have a websocket component that has an internal `clojure.core.async/mult`

to distribute all incoming messages, does it make sense to have that component extend the `clojure.core.async/Mult`

protocol? Is that the intended use of the protocol? or is it better to just make my own `tap`

functions in my own namespace

Given N sets, how do I find out the largest possible set (or sets) that is a subset of all N sets?

`(magic [#{1 1} #{1 2} #{1 3}]) ;; => #{1}`

Brute-force solutions wouldn't cut it for the size of my inputsThat's not what I want, as the inputs can be disparate and most likely the intersection will be `#{}`

99% of the times
So I should reword: `of all N sets`

-> `of the most sets as possible`

I still don't understand. What would your desired function return for `[#{1 2} #{3 4}]`

, for example?

Or better yet, can you describe the algorithm? Because I cannot parse that sentence with "->". :)

something like the union of the results of intersecting of each subset with the union of all subsets?

yeah, oops, I guess you could intersect each subset with union of all the other subsets?

I don't know. I think the task is not defined clearly enough for us to come up with any robust solution.

But it feels like a task of turning a 2D metric into a 1D one, and that always requires some particular approach to dimensionality reduction.

> What would your desired function return for [#{1 2} #{3 4}], for example?
that's a pretty degenerate case (for my needs) since it's very small and there's nothing in common. I don't care if it returns #{} or something else
I'm trying to find the largest possible subsets given N sets of size ~~1000 to ~~100000 each, which are presumed to have quite a lot in common.
There will be many possible answers, and what a 'good' anwer is can be tweaked. e.g. maybe for me a set with 300 elements that is a subset of 40 other sets (but *not* a subset of 20 other sets), is interesting enough

Combine the sets into a bag having all elements. Then use frequencies on the bag to find the largest count?

What would that return for `[#{1 2} #{1 2} #{3} #{3}]`

? The frequencies are the same, yet there's no `#{1 2 3}`

subset.

`#{1 2}`

seems a good answer, because its size is large, even if it's not a subset of all inputs. Ultimately I'm not seeking an unequivocal mathematical function, but something that gives good answers to humans for the desired size range (1000-100000).
at sizes 1-10 it's probably easy to get lost because the differences are too small to be evident.
for example, for an input of 100 sets sized 10000 each:
* an answer of a set sized 5000 which is a subset of 80 sets is very valuable
* an answer of a set sized 9999 which is a subset of 1 set is not very valuable
* an answer of a set sized 2 which is a subset of 80 sets is not very valuable

...maybe my problem is too specific, sorry for that. at first I thought it could be a commonly solved problem.

The main issue is that you have 2 numbers (subset size and the amount of its supersets) that you have to reduce to just one ("score" of a particular subset). It's a dimensionality reduction problem, and there never is a cookie-cutter solution. You will have to somehow write that function that takes two numbers and outputs just one.

This is a quick informative read: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006907