#releases
2021-02-13
dpsutton 21:02:12

Hi. It's not really a release, as it's only addressable as git coordinates right now, but I wanted some feedback on whether this is a general thing people might want. If you ever have a sequence operation that ends with (->> ... (sort-by f) (take N)), you pay the price of realizing the entire collection; sorting kinda naturally falls outside of transduction contexts. This uses a bounded min-heap to accumulate the largest N seen so far, enabling (transduce xf (keep-heap 100 compare-fn) reducible-coll). It was motivated by querying a db, scoring in Clojure, and only wanting to keep the top 100 results sorted by score. The heap lets you take N sorted results without keeping all of the elements in memory to sort them. https://github.com/dpsutton/heap-keep
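
For anyone who wants to see the idea without opening the repo, here is a rough sketch of the bounded min-heap technique. It is not the heap-keep implementation: `top-n` is a hypothetical name, and backing it with `java.util.PriorityQueue` is an assumption made for illustration. The heap never holds more than N + 1 items, so memory stays bounded no matter how large the input is.

```clojure
(import '(java.util PriorityQueue))

(defn top-n
  "Hypothetical reducing function (not the heap-keep API). Keeps the n
  largest items according to comparator cmp and returns them as a vector
  in descending order."
  [n ^java.util.Comparator cmp]
  (fn
    ;; init: an empty min-heap, so the smallest kept item sits at the head
    ([] (PriorityQueue. (int (inc n)) cmp))
    ;; completion: sort the (at most n) survivors, largest first
    ([^PriorityQueue heap]
     (vec (sort (fn [a b] (.compare cmp b a)) heap)))
    ;; step: add the item, then evict the smallest once the heap exceeds n
    ([^PriorityQueue heap x]
     (.offer heap x)
     (when (> (.size heap) n)
       (.poll heap))
     heap)))

;; Top 3 squares of 0..9, without sorting all ten values
(transduce (map #(* % %)) (top-n 3 compare) (range 10))
;; => [81 64 49]
```

The accumulator here is a mutable Java object, so each reduction needs its own heap; the zero-arity call that `transduce` makes when no init value is supplied takes care of that.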

pithyless 16:02:14

This is cool - thanks for sharing. There is an x/sort-by in https://github.com/cgrand/xforms (that just builds up a complete collection). Have you considered possibly submitting keep-heap as an additional transducer to xforms (vs releasing a new lib)?

dpsutton 16:02:10

i hadn't. as it's only a single function, i wasn't sure where it would be most useful or if many people would even find it useful

dpsutton 16:02:33

and yeah i looked at the x/sort-by but it realizes the whole collection which is the primary evil i'm trying to avoid