#clojure-dev
2021-06-17
Ben Sless06:06:10

Not sure this is the proper channel, and maybe it should go on ask.clojure later, but it seems like apply and transducers have some friction between them. Ideally, I would not want to hold on to the resulting sequence at all, not even chunks, just pull items out of an iterator then invoke, but I don't see how that's possible given the current implementation.

delaguardo08:06:42

Could you explain more what is the problem? Some example would be nice. I'm not sure there is a direct connection between apply and transducers

Ben Sless08:06:42

I've seen this pattern more than once, of (apply f (map g coll)) . Why should I want to hold on to the original collection? Why allocate it at all? I think this generalizes to transducers, too.

Ben Sless08:06:58

Particularly to the case of Eduction which seems to serve exactly the purpose of avoiding allocation

borkdude09:06:07

@UK0810AQ2 Interesting point. You really just want to have f see g applied to all its args, no need to allocate anything. Perhaps (trans-fn f g) or so could work ;)

Ben Sless09:06:06

Sort of. I went through an iterator:

(defn- consume
  [^java.util.Iterator it]
  (loop [coll []]
    (if (.hasNext it)
      (recur (conj coll (.next it)))
      coll)))

(defmacro ^:private -apply-to-it
  ([f it]
   `(-apply-to-it ~f ~it [] 0 20))
  ([f it args depth max-depth]
   (if (= depth max-depth)
     `(if (.hasNext ~it)
        (apply ~f ~@args (consume ~it))
        (~f ~@args))
     (let [g (gensym)]
       `(if (.hasNext ~it)
          (let [~g (.next ~it)]
            (-apply-to-it ~f ~it ~(conj args g) ~(inc depth) ~max-depth))
          (~f ~@args))))))

(defn apply-to-it
  "Like apply but does no intermediate allocations, consumes its argument
  as an iterable."
  [f ^Iterable it]
  (let [^java.util.Iterator it (.iterator it)]
    (-apply-to-it f it)))
bit yucky, though

borkdude09:06:45

(defrecord TransFn [f arg-fn]
  clojure.lang.IFn
  (invoke [_] (f))
  (invoke [_ a1] (f (arg-fn a1)))
  (invoke [_ a1 a2] (f (arg-fn a1) (arg-fn a2))))

((->TransFn + inc) 1 2) ;;=> 5

Ben Sless09:06:58

That's when you know how many args you're getting, doesn't help with the restFn case

hiredman15:06:18

That's not apply, that is reduce

noisesmith16:06:04

right, apply doesn't force or hold on to anything

user=> (apply (fn [x y & _] (+ x y)) (range))
1

hiredman16:06:08

the complaint here is that a lot of functions used with apply are really binary associative operations with a varargs case added and handled via reduce, and that internal fold is not exposed, so you can't fuse other operations into it (transducers)

hiredman16:06:24

which is, of course, a bad complaint

hiredman16:06:42

it conflates apply and reduce, which are not at all the same thing

hiredman16:06:09

and if you want the fold over the binary associative operation exposed, then just don't use the varargs case, do your own reduce

hiredman16:06:42

which is what the code above basically does, it is a reduce over an iterator, annoyingly called apply, and of course reduce already has a fast path for iterators
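
A minimal sketch of the distinction being drawn here, assuming the function in question is a binary associative operation such as `+` (the names `coll` and the sample values are made up for illustration). All three forms give the same result; only the second and third fuse the mapping step into the fold instead of building a lazy sequence first.

```clojure
(def coll [1 2 3 4])

;; apply: (map inc coll) builds a lazy seq, then + folds over its varargs
(apply + (map inc coll))                 ;=> 14

;; the same fold done explicitly, with the mapping fused in as a transducer
(transduce (map inc) + 0 coll)           ;=> 14

;; reduce over an eduction takes the reducible path, no seq is built
(reduce + 0 (eduction (map inc) coll))   ;=> 14
```

This only helps when the function really is a fold; it does not address the case of an arbitrary function of unknown arity.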

hiredman16:06:37

fruit of the poison tree, the root of the poison tree being conflating reduce and apply

Ben Sless16:06:41

Thank you for assuming the wrong use case. I know to pick reduce when it's appropriate.

Ben Sless16:06:20

The case I had the misfortune to come across is exactly that of unknown functions which can take any number of arguments

Ben Sless16:06:00

You could argue the code is bad and you'd be right, but please don't make assumptions

noisesmith16:06:36

I might not understand what you are saying about apply here - how does apply hold onto a collection?

noisesmith16:06:53

also I'm not seeing the connection between apply and transducers here at all

hiredman17:06:29

yes, there are no transducers at all in the examples given

Ben Sless17:06:41

Generally, if we look at (apply g (map f coll)) it would be nice if I could (apply g (->Eduction (map f) coll)) without allocating an intermediate sequence, instead pull the elements directly out of the iterable. The way map is implemented it wouldn't be possible, but with Eduction it should be
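
A small sketch of the situation described above, using the public `eduction` function rather than the `->Eduction` constructor (the name `ed` and the sample values are illustrative). `apply` accepts an eduction, but it calls `seq` on its last argument first, so a sequence is still created; the eduction's iterator can be consumed directly without one.

```clojure
(def ed (eduction (map inc) [1 2 3]))

;; apply works, but only after turning the eduction into a seq
(apply + ed)   ;=> 9
(seq ed)       ;=> (2 3 4)

;; pulling items straight out of the iterator, with no seq in between
(let [it (.iterator ^Iterable ed)]
  [(.next it) (.next it) (.next it)])   ;=> [2 3 4]
```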

hiredman17:06:04

that still isn't about transducers, that is about iterators, and replacing clojure varargs which pass the collected varargs as a seq, as an iterator instead

hiredman17:06:02

it still bakes in assumptions about the usage of varargs inside the functions being called, the assumption being the function unrolls arguments, and doesn't just do something with the args as a seq

Ben Sless17:06:05

as far as I know most functions used in this system that way are not varargs functions, it's just that at runtime their arity is unknown (it's a badly written interpreter)

seancorfield22:06:51

Question about select-keys before I spend time writing this up on http://ask.clojure.org — I have a custom type that behaves like a hash map (it’s an extension to APersistentMap that allows keys to be strings or keywords and also case-insensitive — for non-Clojure language interop reasons). When I call select-keys on it, I get a regular Clojure hash map whose keys are the “original” keys from my custom hash map (in this case, they’re uppercase strings) and that makes the result pretty useless in code that follows and it’s because select-keys explicitly uses {}. If select-keys instead used (or (empty map) {}) — or some optimized form of that — then it would preserve the underlying custom hash map type (which would be super-convenient). Is such a proposed change likely to be considered? (I can understand the answer of “no” here on the grounds that this is an edge case that almost no one is going to run into — and I have a workaround: don’t use select-keys on this custom type 🙂 )

👍 3
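
To make the behavior concrete, here is a sketch that uses a sorted map as a stand-in for the custom map type (which is not shown in the conversation). `select-keys-preserving` is a hypothetical helper written for this illustration, not part of clojure.core; it starts from `(or (empty m) {})` as suggested above.

```clojure
(def sm (sorted-map :b 2 :a 1 :c 3))

;; core select-keys starts from {} and so drops the source map's type
(sorted? (select-keys sm [:a :b]))   ;=> false

;; hypothetical variant that starts from an empty map of the same type
(defn select-keys-preserving
  [m ks]
  (reduce (fn [acc k]
            (if-let [entry (find m k)]
              (conj acc entry)
              acc))
          (or (empty m) {})
          ks))

(sorted? (select-keys-preserving sm [:a :b]))   ;=> true
(select-keys-preserving sm [:a :b])             ;=> {:a 1, :b 2}
```
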
Alex Miller (Clojure team)22:06:43

I think there might be a ticket about this

seancorfield22:06:45

Oh, sorry, I should have looked before I “leaped”…

Alex Miller (Clojure team)22:06:18

I would worry that this would be a breaking change for the cases where someone is using select-keys specifically to lose the special map type-ness of the source

seancorfield22:06:01

An interesting take — and a valid concern, yes.

Alex Miller (Clojure team)22:06:06

was closed as won't fix

seancorfield22:06:14

Probably why I couldn’t find it on ask. Fair enough. I’ll tackle this a different way then.

seancorfield22:06:59

Thanks. It turns out my custom hash map type doesn’t implement IObj so that would be another breakage when using (empty my-map) in select-keys. So it’s clearly a terrible idea! 😐

hiredman23:06:54

I am actively using select-keys to get around the partition thing

seancorfield23:06:02

“the partition thing”?

hiredman23:06:42

the thing where if you transduce over a plan, and the transduce does a partition-by, the final partition gets reduced after plan has closed everything

seancorfield23:06:13

Ah, gotcha. And right now the select-keys approach lets you extract columns without realizing the row into a hash map in full — and you still get a hash map back. Yes, makes my change even more of a bad idea 🙂

rickmoynihan06:06:42

Was a conclusion ever reached on this issue? Is the mistake here reifying a map facade over a mutable connection that is then combined with resource-managing reducers? Or is it inherent to transducers and reducibles?

rickmoynihan06:06:35

I’m asking because I’d like to offer similar affordances to next.jdbc but also avoid recreating this issue in a similar API I’m building; one that works over RDF data sources, not JDBC ones.

seancorfield06:06:06

What do you mean by "this issue" cause we covered quite a bit of ground 😊

seancorfield06:06:29

There was clear consensus that my proposed change to select-keys was a bad idea - for several reasons in the end.

seancorfield06:06:02

And my discussion was nothing to do with next.jdbc by the way.

rickmoynihan10:06:44

Apologies, I meant this (from the next.jdbc docs):
> Note: you need to be careful when using stateful transducers, such as partition-by, when reducing over the result of plan. Since plan returns an IReduceInit, the resource management (around the ResultSet) only applies to the reduce operation: many stateful transducers have a completing function that will access elements of the result sequence -- and this will usually fail after the reduction has cleaned up the resources. This is an inherent problem with stateful transducers over resource-managing reductions with no good solution.
Which is what I thought @U0NCTKEV8 was referring to here: https://clojurians.slack.com/archives/C06E3HYPR/p1623971622083700?thread_ts=1623971154.083300&cid=C06E3HYPR
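
The failure mode can be reproduced without a database. The sketch below is an assumption-laden stand-in, not next.jdbc code: `managed-rows` is a hypothetical reducible whose rows are only readable while its reduce is running, and `names-by-id` and `try-names` are made-up helpers. Because `transduce` runs the completion step after the reduce has returned, partition-by flushes its final partition once the "resource" is already closed.

```clojure
(defn managed-rows
  "Hypothetical resource-managing reducible: rows can only be read
  while the reduce is in progress."
  [rows]
  (reify clojure.lang.IReduceInit
    (reduce [_ f init]
      (let [open? (atom true)
            guard (fn [m k not-found]
                    (if @open?
                      (get m k not-found)
                      (throw (IllegalStateException. "resource closed"))))
            row   (fn [m]
                    (reify clojure.lang.ILookup
                      (valAt [_ k] (guard m k nil))
                      (valAt [_ k not-found] (guard m k not-found))))]
        (try
          (reduce f init (map row rows))
          (finally (reset! open? false)))))))

(def rows [{:id 1 :name "a"} {:id 1 :name "b"} {:id 2 :name "c"}])

(def names-by-id
  (comp (partition-by :id)
        (map (fn [part] (mapv :name part)))))

(defn try-names [xform]
  (try
    (transduce xform conj [] (managed-rows rows))
    (catch IllegalStateException _ :failed-after-close)))

;; the last partition is flushed in the completion step, after close
(try-names names-by-id)   ;=> :failed-after-close

;; copying the needed keys into a plain map first avoids the problem
(try-names (comp (map (fn [r] {:id (:id r) :name (:name r)}))
                 names-by-id))   ;=> [["a" "b"] ["c"]]
```

Copying the needed columns into a plain map before the stateful step plays the same role here as the select-keys workaround mentioned earlier in the thread.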

seancorfield15:06:30

Yes, he was referring to that, but that wasn't what my discussion in #clojure-dev was about -- and his comment was just pointing out a use case that would actually be broken by my proposal for select-keys.

seancorfield15:06:56

As for your question, I don't think there's anything more to add over what is in the next.jdbc docs: "This is an inherent problem with stateful transducers over resource-managing reductions with no good solution."