This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2018-05-30
Hi. I would like to propose adding to the Clojure library a variant of the reduce function which wraps its result via reduced on early termination. That would simplify the implementation of a lot of transducers which currently use the reduce function, and it would also make them more efficient: they won't need to wrap the reducing functions that they use inside their calls to reduce, and they won't need to test whether each value to be reduced is already reduced (this is a performance waste - https://github.com/clojure/clojure/blob/clojure-1.9.0/src/clj/clojure/core.clj#L7535).
For example, the clojure.core/cat function: https://github.com/clojure/clojure/blob/clojure-1.9.0/src/clj/clojure/core.clj#L7544
Maybe that new function can be called wrap-reduce or rreduce or reduce* or something else that you would see fit.
Is this Slack channel the right place to propose new functions?
@vincent.cantin How many transducers use reduce tho'? Isn't that the only instance in core?
I am not sure, but I often have to use reduce in my own transducers.
Some other transducers in the core may also use the same wrap/unwrap caveats but written differently.
Hmm, I haven't used reduce in any of my transducers... what sort of stuff are you doing? (genuinely curious)
I will take a look in detail and come back here to report. (within a few days)
The issue with cat is that it needs to double wrap the value because both reduce and the transducers unwrap reduced values.
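A minimal sketch of the double wrapping described here, written to mirror how cat is usually explained rather than to reproduce the core source. The names preserving-reduced* and my-cat are hypothetical and exist only for this illustration.

```clojure
;; Hypothetical helper: wraps rf so that an already-reduced result gets a
;; second reduced layer. The inner reduce strips one layer, so the caller
;; of the transducer still sees a reduced value and stops as well.
(defn preserving-reduced*
  [rf]
  (fn [acc x]
    (let [ret (rf acc x)]
      (if (reduced? ret)   ; this check runs for every element
        (reduced ret)
        ret))))

;; Hypothetical expanding transducer: feeds every element of each input
;; collection to the downstream reducing function rf.
(defn my-cat
  [rf]
  (let [rrf (preserving-reduced* rf)]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input] (reduce rrf result input)))))

(into [] (comp my-cat (take 3)) [[1 2] [3 4] [5 6]])
;; => [1 2 3]
```

Without the extra wrapping, the inner reduce would return a plain value and the outer process would keep feeding inputs after (take 3) had already finished.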
You get the point.
using reduce is a HUGE performance gain.
I'm struggling to think of the sort of transducers that would use reduce -- other than cat.
(my-repeat n) for example
Any transducer that spits out more output than input.
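The my-repeat transducer itself is not shown in the conversation, so the following is only a guess at its shape: a hypothetical expanding transducer that emits each input n times and uses reduce internally, which forces it to re-wrap reduced values by hand.

```clojure
;; Hypothetical reconstruction, not the speaker's actual code.
(defn my-repeat
  [n]
  (fn [rf]
    ;; rrf adds a second reduced layer so that early termination
    ;; survives the inner reduce, which unwraps one layer.
    (let [rrf (fn [acc x]
                (let [ret (rf acc x)]
                  (if (reduced? ret) (reduced ret) ret)))]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input] (reduce rrf result (repeat n input)))))))

(into [] (my-repeat 2) [:a :b])
;; => [:a :a :b :b]

(into [] (comp (my-repeat 3) (take 4)) [:a :b :c])
;; => [:a :a :a :b]
```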
tree traversers are also another example.
I'll be interested to hear what the Clojure/core folks say tomorrow about how common they consider that and whether they think it warrants a custom version of reduce (which would not unwrap a reduced value).
So, if you're writing transducers that use reduce, you're having to do the equivalent of preserving-reduced already in your own code?
(i.e., in real world production Clojure code that you're writing)
The only transducers that typically call reduce are expanding transducers (ones that produce more than one value per “step”), the primary one being mapcat, which gets this via cat.
preserving-reduced and all that are there to support these cases. I don’t think a custom version of reduce is a good idea. I do agree this is a subtle area.
In fact, the reduce function may be called in 2 types of transformations: 1. less data in the output than in the input due to a repeating process (the filter functions are not concerned). 2. more data in the output than in the input due to a repeating process.
hm ... please ignore what I said for 1. as it does not need to use the reduce function, it can be done incrementally.
That transducer would benefit from a variant of reduce: https://github.com/cgrand/xforms/issues/20#issuecomment-366954688
To summarize, the reduce function destroys the information about whether or not its reduction was terminated early, and the only existing way to recover that information is a hacky workaround that uses more memory and CPU.
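A small REPL-style illustration of that point. The reducing functions below are made up for the example; only reduce, reduced and reduced? are from clojure.core.

```clojure
;; Stops early when it sees 3: the accumulator at that point is 0 + 1 + 2.
(def early
  (reduce (fn [acc x] (if (= x 3) (reduced acc) (+ acc x))) 0 [1 2 3 4]))

;; Runs to completion over a shorter input.
(def complete
  (reduce + 0 [1 2]))

[early complete]
;; => [3 3]   the caller cannot tell which run was cut short

;; The workaround: wrap twice, so one reduced layer survives reduce.
(def tagged
  (reduce (fn [acc x] (if (= x 3) (reduced (reduced acc)) (+ acc x))) 0 [1 2 3 4]))

[(reduced? tagged) @tagged]
;; => [true 3]
```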
Early termination is an infrequent case (unless you (take 1) a lot). So its cost is spread over the processing of all items that led to it.
In x/for I have arbitrarily nested reduce calls, which means that on early termination you may have arbitrarily nested reduced values. Turns out it’s useful because the nesting depth lets you tell apart early termination from :while (which should just terminate the current level).
My issue is about the performance cost of the additional reduced? test on the values returned by the downstream function rf (it has to run for every single element) before each value is passed back to the reduce function.
The double wrapping in reduced only happens once for each transducing process, so that part is not an issue.
I think I should write some code as a proof of concept once I have time; that would be clearer.
I propose to have an intermediary function reduce* (for instance) that does not deref the reduced result.
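A rough user-land sketch of what such a function might look like. Nothing here exists in clojure.core: reduce* and cat* are hypothetical names, and this seq-based loop skips the optimized reduce paths that collections provide, so it shows the semantics only and says nothing about the performance claim.

```clojure
;; Hypothetical: like reduce with an init value, but on early termination
;; the reduced wrapper is returned as-is instead of being dereferenced.
(defn reduce*
  [rf init coll]
  (loop [acc init
         s   (seq coll)]
    (if (or (reduced? acc) (nil? s))
      acc
      (recur (rf acc (first s)) (next s)))))

;; Hypothetical cat variant: no wrapping of rf and no extra reduced? check
;; in the reducing function, because reduce* keeps the reduced wrapper.
(defn cat*
  [rf]
  (fn
    ([] (rf))
    ([result] (rf result))
    ([result input] (reduce* rf result input))))

(reduce* + 0 [1 2 3])
;; => 6

(into [] (comp cat* (take 3)) [[1 2] [3 4] [5 6]])
;; => [1 2 3]
```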
given that Rich decided not to do that, I’m going to assume he didn’t like it
I write reducing transducers and need preserving-reduced fairly regularly, but even then it's about once for every app I work on. Tree reducers are the most common case, but still that's pretty rare.