beginners

2025-10-23T12:11:10.330399Z

Let's say I have a collection of players, where each player has a position, a name, and a ranking. If I wanted to transform that collection to a map of position to name of the player with the highest ranking, in Java I could do something like (not near an ide, so forgive any not-quite-right code):

players
  .stream()
  .collect(groupingBy(
      Player::position,
      collectingAndThen(
         // maxBy takes a Comparator and yields an Optional<Player>
         maxBy(comparing(Player::rank)),
         best -> best.orElseThrow().name())));
In Clojure, I could do multiple passes to massage the values into the shape I want:
(->> players
     (group-by :position)
     (update-vals
         #(apply max-key :rank %)))
     (update-vals :name)
But is there a way in Clojure to do something more like the Java version, where I just compose the end result I want and then feed the players through?

emccue 2025-10-25T14:58:34.652949Z

(-> players
    (List/.stream)
    (Stream/.collect 
      (Collectors/groupingBy
        :position
        ...)))

emccue 2025-10-25T14:59:07.757529Z

and so on - i do think doing this in normal clojure is the way to go, but if you feel a burning desire I think interop is in a good enough place to just use it

Roman Liutikov 2025-10-23T12:18:15.455299Z

You probably want a transducer, but afaik there are no ready made functions in stdlib similar to the ones you have in Java streams

Roman Liutikov 2025-10-23T12:18:31.893149Z

note to self: modern Java looks great

πŸ‘ 1
πŸ‘πŸ» 1
Harold 2025-10-23T12:38:17.018119Z

Here's a one-pass one:

user> (def data
        (for [[i r] (map-indexed vector (shuffle (range 20)))]
          {:position (rand-nth [:a :b :c])
           :name (str "player-" i)
           :rank r}))
#'user/data
user> data
({:position :c, :name "player-0", :rank 6}
 {:position :c, :name "player-1", :rank 5}
 {:position :a, :name "player-2", :rank 0}
 {:position :b, :name "player-3", :rank 14}
 {:position :a, :name "player-4", :rank 9}
 {:position :c, :name "player-5", :rank 17}
 {:position :a, :name "player-6", :rank 16}
 {:position :a, :name "player-7", :rank 8}
 {:position :b, :name "player-8", :rank 13}
 {:position :c, :name "player-9", :rank 19}
 {:position :b, :name "player-10", :rank 18}
 {:position :c, :name "player-11", :rank 4}
 {:position :c, :name "player-12", :rank 11}
 {:position :a, :name "player-13", :rank 15}
 {:position :a, :name "player-14", :rank 1}
 {:position :c, :name "player-15", :rank 12}
 {:position :b, :name "player-16", :rank 7}
 {:position :b, :name "player-17", :rank 10}
 {:position :a, :name "player-18", :rank 3}
 {:position :b, :name "player-19", :rank 2})
user> (->> data
           (reduce (fn [eax {:keys [position name rank]}]
                     (let [highest (get-in eax [:r position])]
                       (if (or (nil? highest)
                               (> rank highest))
                         (-> (assoc-in eax [:out position] name)
                             (assoc-in [:r position] rank))
                         eax)))
                   {})
           (:out))
{:c "player-9", :a "player-6", :b "player-10"}
As Roman indicated, a transducer could also be concocted, and it would be a good exercise to do so, though I think in the end it wouldn't read as well as the reduce. If the data were large, and performance actually mattered, clojure.core might not be the best tool for the job. We have an open-source namespace of high-performance primitives for this kind of work: https://techascent.github.io/tech.ml.dataset/tech.v3.dataset.reductions.html

Bob B 2025-10-23T12:38:48.398319Z

comp for just combining the :name and the max-key

πŸ‘πŸ» 1
2025-10-23T14:00:53.321709Z

@roman01la yeah, the streams API is quite nice, and when you bring records and sealed types into the mix, data oriented programming is pretty good. Clojure still wins on the dynamic front ("bag of data" maps are really handy) and guaranteed immutability, but there's a lot to like with the direction Java's gone (and the amount of Clojure learnings that can be ported back now)

2025-10-23T14:04:56.263549Z

@hhausman thanks for the one-pass example. My main motivation for this question was that, while a one-pass is always possible with reduce, it tends to end up bespoke to the particular problem. That's what I really like about the Java approach: collectors are nicely composable, and so the solution ends up being snapping pieces together rather than writing the transformation directly. There are things in Clojure that work that way that I wish I could do in Java, so it's pretty interesting to find one going the other way, where Java actually wins on the out-of-the-box composability

πŸ‘ 1
Alex Miller (Clojure team) 2025-10-23T14:14:22.952049Z

You can also go hybrid and use the Java stream api with Clojure functions (since 1.12)

πŸ‘πŸ» 1
Harold 2025-10-23T15:13:24.776959Z

Good points - I think the deeper thought here is really about 'sequence of maps' vs. 'stream of objects'. It sounds like you've got the right idea about why maps are better than objects; the tradeoffs between streams and sequences are perhaps more subtle (but also worth consideration). I do think a stateful transducer max-by would be another good exercise, and it would allow some generalization of the one-pass solution. In real life, with small data, the Clojure you started with is actually amazing: all of the intermediate results in the ->> are useful in their own right (and are also immutable (!), and also share structure (!)), and hanging on to them to solve other problems would be a very common use case. Neat question! 🙂

πŸ‘πŸ» 1
2025-10-23T16:53:14.982019Z

@alexmiller you mean something like this?

(-> (.stream players)
    (.collect
      (Collectors/groupingBy
        :position
        (Collectors/collectingAndThen
          ;; maxBy takes a Comparator and yields an Optional
          (Collectors/maxBy (Comparator/comparing :rank))
          #(:name (.get %))))))
I thought it would be an abomination, but it doesn't look half bad 😄

Roman Liutikov 2025-10-23T16:55:35.207699Z

This looks good indeed. I actually prefer going raw interop for simple or one time constructs instead of using wrappers, this is also a good opportunity to learn something from underlying platform. Whether it's java or js

πŸ‘πŸ» 1
Alex Miller (Clojure team) 2025-10-23T17:29:24.509779Z

Might not be useful here, but can also trivially parallelize the stream version, especially if players is a vector, which has a good Spliterator impl now
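
For reference, a minimal plain-Java sketch of what parallelizing the stream version might look like. The Player record, the ParallelBest class, and the synthetic data are made up for illustration; the only change from a sequential pipeline is parallelStream() in place of stream(). Whether it pays off depends on data size, so treat it as a shape to try rather than a recommendation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelBest {
    // Hypothetical record standing in for the thread's player data.
    record Player(String position, String name, int rank) {}

    static Map<String, String> bestNameByPosition(List<Player> players) {
        // parallelStream() splits the list across worker threads; the collector
        // merges the per-thread partial maps back into one result.
        return players.parallelStream()
            .collect(Collectors.groupingBy(
                Player::position,
                Collectors.collectingAndThen(
                    Collectors.maxBy(Comparator.comparingInt(Player::rank)),
                    best -> best.orElseThrow().name())));
    }

    public static void main(String[] args) {
        // Synthetic data (an assumption): 10,000 players, three positions, unique ranks.
        List<Player> players = IntStream.range(0, 10_000)
            .mapToObj(i -> new Player("pos-" + (i % 3), "player-" + i, i))
            .toList();
        System.out.println(bestNameByPosition(players));
    }
}
```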

πŸ‘πŸ» 1
Bob B 2025-10-23T19:01:14.260989Z

and I could be mistaken here, but the name collectingAndThen gives me the impression that the java version does the grouping first and then the maxing

2025-10-23T20:24:17.402329Z

collectingAndThen is kind of like Clojure's completing: it takes a collector and produces a new collector that applies the given finishing function. So in this case, the collector reduces the vals to a single max by rank, and then, once it's done that, it applies the finishing function to extract the name. groupingBy takes a categorising function and a downstream collector that it will use to reduce the values of each map entry (in this case into a single value of the highest ranked player, which is then finished by extracting the name)
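
To make that concrete, here is a self-contained sketch of the composed collector in plain Java. The Player record and the BestByPosition class are hypothetical stand-ins for the thread's data. Note that maxBy wants a Comparator and produces an Optional, so the finishing function has to unwrap it before taking the name.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BestByPosition {
    // Hypothetical record standing in for the thread's player data.
    record Player(String position, String name, int rank) {}

    static Map<String, String> bestNameByPosition(List<Player> players) {
        return players.stream().collect(
            Collectors.groupingBy(
                Player::position,                      // categorising function
                Collectors.collectingAndThen(          // downstream collector per group
                    Collectors.maxBy(Comparator.comparingInt(Player::rank)),
                    // Finisher: the Optional is never empty here, because a
                    // group only exists once it has received a player.
                    best -> best.map(Player::name).orElseThrow())));
    }

    public static void main(String[] args) {
        List<Player> players = List.of(
            new Player("a", "player-2", 0),
            new Player("a", "player-6", 16),
            new Player("b", "player-10", 18),
            new Player("b", "player-3", 14),
            new Player("c", "player-9", 19));
        System.out.println(bestNameByPosition(players));
    }
}
```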

πŸ‘ 1
2025-10-23T22:03:59.608849Z

All that to say that (and I haven't looked at the implementation) it could conceivably be determining the max as it builds the groups
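
One way to check this without reading the JDK source is to swap in a hand-written downstream collector that logs every element it is given. This is only a sketch: IncrementalMax, Best, and loggingBestName are made-up names, and the collector is a stand-in for collectingAndThen(maxBy(...), name), not how the JDK implements it. The per-group state is a single "best so far" slot, so groupingBy has nowhere to stash a list of players first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class IncrementalMax {
    // Hypothetical record standing in for the thread's player data.
    record Player(String position, String name, int rank) {}

    // Per-group state: only the best player seen so far, never a list.
    static final class Best {
        Player player;
    }

    // Combiner, used only when partial results are merged (parallel streams).
    static Best better(Best left, Best right) {
        if (left.player == null) return right;
        if (right.player == null) return left;
        return right.player.rank() > left.player.rank() ? right : left;
    }

    // Appends each player's name to `log` at the moment the accumulator sees it.
    static Collector<Player, Best, String> loggingBestName(List<String> log) {
        return Collector.of(
            Best::new,
            (best, p) -> {
                log.add(p.name());
                if (best.player == null || p.rank() > best.player.rank()) {
                    best.player = p;
                }
            },
            IncrementalMax::better,
            best -> best.player.name());
    }

    static Map<String, String> bestNameByPosition(List<Player> players, List<String> log) {
        return players.stream()
            .collect(Collectors.groupingBy(Player::position, loggingBestName(log)));
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Player> players = List.of(
            new Player("a", "player-2", 0),
            new Player("b", "player-3", 14),
            new Player("a", "player-6", 16));
        System.out.println(bestNameByPosition(players, log));
        System.out.println(log);
    }
}
```

With a sequential stream, the log comes out in the order the players appear in the list, with the two positions interleaved, which suggests each group's max is updated as elements arrive rather than after the grouping is finished.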

James Amberger 2025-10-27T16:28:26.624199Z

here's my take

(loop [a input, b nil]
    (let [i (first a)]
      (if-not i
        (update-vals b :name)
        (recur
         (next a)
         (assoc b
          (:position i)
          (if ((fnil > 0 0)
               (get-in b [(:position i) :rank])
               (:rank i))
            ((:position i) b)
            i))))))

jumar 2025-10-24T07:22:54.159139Z

I wouldn't really complicate this any further:

(def players ...)

(-> (group-by :position players)
    (update-vals (fn [position-players]
                   (:name (apply max-key :rank position-players)))))
;; => {:c "player-0", :b "player-4", :a "player-2"}
Is there anything wrong with that? Note that your original solution is using ->> which won't work with update-vals

Harold 2025-10-24T12:38:08.743879Z

@jumar - that one goes over each player twice (one time for group-by, and one time for max-key). Going over the sequence twice is probably not a big problem in practice (and it's easy to address when it is), but the question of how to most elegantly only go over the sequence once remains an interesting one. I like the easiness of what you've done.