This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2018-08-31
I'd pay someone to implement a BloomFilterHashMap, like in this paper http://webee.technion.ac.il/~ayellet/Ps/nelson.pdf, in cljc with something like this https://github.com/davidwclin/cljc-bloom/tree/master/src
Well I think it'd be generally good for everyone to have one around, for the same reasons described in the paper
Personally, I'd like to implement a graph db over an indexical log over a BFHM and play with turning the error rate way up, but pass a ton of data through it, to see if you can get probabilistic inference in a way similar to ANNs.
Looks like there's a java implementation here https://github.com/egrim/java-bloomier-filter
I'm still not sure I get what the structure does. It caches a function where that function sends the vast majority of inputs to one output and a small subset to different outputs. It guarantees that the small subset is accurate and the vast majority is mostly accurate?
I think what's happening with that partitioning structure is that if you happen to get a positive, since it might be a false positive, it sends it through another bloom filter. And it does so in a cascading way, in order to reduce false positives in a nested domain sort of way
Though I could be wrong. But I'm still unsure how a functional relation is embedded in what still seems to me to be an existence check, however nested.
One use case is distributed malware detection https://arxiv.org/pdf/1601.01405.pdf
Fast scanning speed with less memory usage: By layering a cache-efficient bloom filter on top of the more costly bloomier filter, BitAV manages to increase end-to-end throughput of the average-case input by 14×, and requires less memory to do so than traditional algorithms
another paper on bloomier filters: https://arxiv.org/pdf/0807.0928.pdf
Some bloomier constraints I just read on a ppt:
- Extend [bloom filters] to handle approximate functions.
- Each element of set has associated function value.
- Non-set elements should return null.
- Want to always return correct function value for set elements.
- A false positive returns a function value for a non-set element.
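To make that contract concrete, here is a rough Python sketch. It is not the Bloomier construction from the papers linked above; it is a much simpler fingerprint table that only mimics the same interface, and the names `ApproxMap`, `_fingerprint`, `bits`, and `seed` are made up for illustration. Set elements always get their stored value back, and non-set elements return `None` unless their fingerprint happens to collide with a stored one.

```python
import hashlib


def _fingerprint(key, seed, bits):
    # Hypothetical helper: hash (seed, key) down to a `bits`-bit integer.
    data = f"{seed}:{key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % (1 << bits)


class ApproxMap:
    """Toy stand-in for the Bloomier contract (NOT the paper's construction).

    Only fingerprints are stored, never the keys themselves.
    """

    def __init__(self, mapping, bits=32):
        self.bits = bits
        seed = 0
        while True:
            table = {}
            for key, value in mapping.items():
                fp = _fingerprint(key, seed, bits)
                if fp in table:
                    break  # two set elements collided, try another seed
                table[fp] = value
            else:
                # No collisions among set elements, so lookups for them are exact.
                self.seed = seed
                self.table = table
                return
            seed += 1

    def get(self, key):
        # Set element: always the right value.
        # Non-set element: None, or (rarely) some other element's value.
        return self.table.get(_fingerprint(key, self.seed, self.bits))


signatures = ApproxMap({"a.exe": "trojan", "b.dll": "worm"})
print(signatures.get("a.exe"))        # value stored for a set element
print(signatures.get("notepad.exe"))  # almost certainly None
```

Turning `bits` down raises the false-positive rate for non-set keys while lookups for set elements stay exact, which is the knob the constraints above are describing.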
another cool thing related to this, theta sketches: https://datasketches.github.io/docs/TheChallenge.html
@dpsutton For example, you can have a large set of weak passwords in a bloom filter on the front end, and you can ask it whether the password chosen by the user is in the set. A "no" is definite and a "yes" means "probably" (it can be a false positive), so on a "yes" you can warn the user. The Bloom filter makes doing this both fast and light on memory.
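A minimal Python sketch of that weak-password check, assuming a plain Bloom filter with double hashing over one SHA-256 digest. The class name, the sizes (`num_bits`, `num_hashes`), and the sample passwords are illustrative choices, not taken from any library mentioned here.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means probably present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


weak_passwords = BloomFilter()
for pw in ["123456", "password", "qwerty"]:
    weak_passwords.add(pw)

if weak_passwords.might_contain("password"):
    print("That password is probably on the weak list, pick another one.")
```

Note there is no way to enumerate or delete entries; the filter only answers membership questions, which is why the function-valued variant discussed in this thread needs a different structure.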
Yeah, I'm trying to grok what an "approximate function" is, in the context of a hash collision domain
Yeah, I guess this paper is exploring along the lines of my hunch https://www.computer.org/csdl/proceedings/iccvw/2009/4442/00/05457544-abs.html
Well, that java implementation seems pretty simple. It's always awkward converting a codebase to clojure, just so you can understand the code base 😆
But I'd really like to understand it and I've never met a piece of pure clojure code I couldn't manage to understand, which is why I'm willing to commission the work 🙂 Already got one offer via DM for $500
Similar to the vein of "Papers we love," I'd like to start a "papers we'd like to commission the replication of in clojure"
Okay, so we've got one offer for $500. The offer is still open, if others would like to do it for cheaper. Also, please chime in on a thread here or DM me directly if you'd like to pitch in on the fund to get this implemented. Also, if you have any recommendations for how to organize the public funding of a data structure like this, please advise, thanks!
keep up the great work guys we need more libs 🙂
I'd also pitch in on a 'clojure-Xprize' for Ewin Tang's quantum-slaying recommender algorithm mentioned above. I wonder if there's a platform for that sort of thing that would fit with the clojure community.
@vijaykiran Are more episodes coming on defn sir?