#clojure
2024-02-18
Lidor Cohen14:02:50

Hello everyone! I have a nested transducer that looks something like this:

(into data
      (comp
       (map some-map-fn)
       (map (fn [v]
              (into {}
                    (comp
                     (filter some-filter-fn)
                     (map another-map-fn))
                    v))))
      other-data)
I wanted to know if I can prevent the realization of the nested transducer somehow? I know that (into {} ...) realizes the computation and I would like to skip that intermediate representation...

Noah Bogart14:02:59

Why does it have to be nested?

Lidor Cohen14:02:16

Can every nested transducer be written as unnested?

Lidor Cohen14:02:54

In this case I iterate over other-data in a nested fashion (i.e. it is a collection of collections).

Noah Bogart15:02:18

Oh i see, interesting. No, i don’t think you can avoid the inner realization. Imagine you were returning an empty map for each item in the current step of the collection. You’d still need to count or map over the current step, right?

🙏 1
Noah Bogart15:02:04

If they’re consistent lengths, you could mapcat and then partition, but you’d need to look at the performance/memory difference to see if it’s worth it
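
A minimal sketch of the mapcat-then-partition idea, assuming every inner collection has exactly 3 items and no step drops elements (a filter would break the regrouping). The data and the use of inc are made up for illustration; cat plays the role of the flattening half of mapcat.

```clojure
;; Hypothetical input: a collection of equally sized collections.
(def other-data [[1 2 3] [4 5 6]])

(def regrouped
  (into []
        (comp
         cat                 ; flatten the inner collections into one stream
         (map inc)           ; per-item work, stand-in for the real mapping fn
         (partition-all 3))  ; regroup into chunks of the known inner length
        other-data))
;; regrouped => [[2 3 4] [5 6 7]]
```

Whether this beats the nested version depends on the data, so it is worth measuring before committing to it.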

p-himik17:02:00

@U8RHR1V60 Is data a map? If yes, you can use mapcat. If no, then those maps created by (into {} ...) are not in any way intermediate - they end up directly in the result.

🙏 1
👍 1
Lidor Cohen17:02:38

Yes, it is, actually. Thank you 🙏 I'll try mapcat

p-himik17:02:55

Then, unless I'm missing something, it can be written as

(into data
      (comp
        (mapcat some-map-fn)
        (filter some-filter-fn)
        (map another-map-fn))
      other-data)
Note that there's also keep - sometimes it's more useful than a filter + map combination.
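
To make the rewrite concrete, here is a small sketch with stand-in functions. The definitions of some-map-fn, some-filter-fn, and another-map-fn below are hypothetical (the thread never shows the real ones), as are data and other-data. It checks that the nested and flattened pipelines build the same map, and shows keep replacing a filter + map pair.

```clojure
;; Hypothetical stand-ins; the real functions are not shown in the thread.
(def data {:seed 0})
(def other-data [{:a 1 :b 2} {:c 3 :d 4}])

(defn some-map-fn [m] m)                     ; returns a collection of map entries
(defn some-filter-fn [[_ v]] (even? v))      ; keep entries with even values
(defn another-map-fn [[k v]] [k (* 10 v)])   ; scale the value

;; Nested version: builds one small map per element before merging into data.
(def nested
  (into data
        (comp
         (map some-map-fn)
         (map (fn [v]
                (into {}
                      (comp
                       (filter some-filter-fn)
                       (map another-map-fn))
                      v))))
        other-data))

;; Flattened version: entries flow straight into data.
(def flat
  (into data
        (comp
         (mapcat some-map-fn)
         (filter some-filter-fn)
         (map another-map-fn))
        other-data))

;; Same thing with keep: returning nil drops the entry.
(def with-keep
  (into data
        (comp
         (mapcat some-map-fn)
         (keep (fn [[k v]]
                 (when (even? v)
                   [k (* 10 v)]))))
        other-data))
;; nested, flat, and with-keep are all {:seed 0, :b 20, :d 40}
```

Note that keep only drops nil results, so a mapping function that can legitimately return nil is not a good fit for it.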

Lidor Cohen17:02:42

Great 😃 this is very helpful! Thank you 🙏

👍 2
Lidor Cohen09:02:13

So I tried using mapcat and keep, but I think I abstracted away an important detail from the example that I can't seem to get past: each element of other-data is a pair [a b], where b is a collection. I need to filter b by some pred, then map over what's left (i.e. keep), using a to compute the new value. In that case I can only think of assoc'ing a onto each value in b inside mapcat so that I can then keep, but I don't think that's a better solution. Otherwise I need to nest keep in mapcat, so now I have this:

(into data
      (mapcat (fn [[a b]]
                (into {}
                      (keep (fn [v]
                              (when (perd? v)
                                (map-fn v a))))
                      b)))
      other-data)
It is better (in both readability and performance) but I still couldn't avoid the intermediate realization

p-himik09:02:31

Yeah, I'd nest. But you don't need that (into {} ...) (unless you need to keep only unique keys) - you can just use keep by itself. As an alternative, I think you can use things like loopr from dom-top or probably for from xforms.
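
A rough sketch of what dropping the inner (into {} ...) could look like. The input pairs, the pos? predicate, and this map-fn are invented for illustration and only stand in for the real ones.

```clojure
;; Hypothetical mapping fn: returns a [key value] pair computed from v and a.
(defn map-fn [v a] [v (* v a)])

(def result
  (into {}
        (mapcat (fn [[a b]]
                  ;; The two-argument keep returns a lazy seq; cat reduces it
                  ;; straight into the outer result, so no inner map is built.
                  (keep (fn [v]
                          (when (pos? v)
                            (map-fn v a)))
                        b)))
        [[10 [1 -2 3]] [100 [-4 5]]]))
;; result => {1 10, 3 30, 5 500}
```

A lazy seq is still allocated per pair here, but the intermediate map is gone.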

onionpancakes04:02:56

Use eduction inside a mapcat to avoid intermediate colls. I would try replacing the into {} with eduction.

(into data
      (mapcat (fn [[a b]]
                (eduction (keep (fn [v]
                                  (when (perd? v)
                                    (map-fn v a))))
                          b)))
      other-data)

onionpancakes04:02:34

cat, and by extension mapcat, implements concatenation by reducing each inner coll. Eduction is a reducible type, so whatever is in the eduction is only drawn out when cat begins reducing it.

👍 1
🙏 1
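
A small sketch of that behaviour, using a call counter to show that nothing runs until the eduction is reduced. The counter, the sample data, odd? as the predicate, and this map-fn are all assumptions made for the demonstration.

```clojure
;; Counts how many times the (hypothetical) mapping fn actually runs.
(def calls (atom 0))

(defn map-fn [v a]
  (swap! calls inc)
  [v (+ v a)])

;; Building the eduction does no work yet.
(def ed
  (eduction (keep (fn [v]
                    (when (odd? v)
                      (map-fn v 100))))
            [1 2 3 4 5]))

(def calls-before @calls)    ; still 0: nothing has been drawn out

;; Reducing it (which is what cat does to each inner coll) drives the work.
(def result (into {} ed))    ; {1 101, 3 103, 5 105}
(def calls-after @calls)     ; 3: one call per odd element

;; The same idea inside mapcat, feeding the outer into directly.
(def out
  (into {:seed 0}
        (mapcat (fn [[a b]]
                  (eduction (keep (fn [v]
                                    (when (odd? v)
                                      (map-fn v a))))
                            b)))
        [[100 [1 2 3]] [200 [4 5]]]))
;; out => {:seed 0, 1 101, 3 103, 5 205}
```

One thing to keep in mind is that an eduction reruns its transformation every time it is reduced, so it suits this consume-once use but not repeated traversal.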