sql 2023-10-30 | Slack Archive

cddr12:10:03

I have a query that returns events in a particular order allowing me to (partition-by :match-id) to get the events in nice workable batches. Previously, this query was running against athena and the results were generated by an iteration and I was able to generate a lazy sequence and give that to the partition-by call. Now I’m trying to do the same thing with jdbc.next. plan gives you an IReducible right? Is there a way I can take that reducible, and return a lazy sequence of the rows in batches, partitioned by some column in each row (assuming the rows are ordered by that column in the query).

cddr13:10:43

Hm, I thought this might work…

(transduce
   (partition-by :match_id)
   (fn
     ([] nil)
     ([acc] acc)
     ([acc m]
      (prn :processing-match (:match_id (first m))
           :first-row (first m))))
   (jdbc/plan ds ["select match_id, match_level_id
                     from read_parquet('/tmp/arqam-01.parquet')
                    where match_id in (1119060, 1119640)
                    order by match_id, match_level_id"]))

…but it prints this…

:processing-match 1119640 :first-row {:match_id 1119640, :match_level_id 1}
:processing-match nil :first-row {row} from `plan` -- missing `map` or `reduce`?

cddr13:10:25

Hm, this actually works as expected if I pass a literal vector as the reducible.

(transduce
   (partition-by :match_id)
   (fn
     ([] nil)
     ([acc] acc)
     ([acc m]
      (let [events (map #(select-keys % [:match_id :match_level_id]) m)]
        (prn :processing-match (:match_id (first events))
             :first-row (first events)))))
   [{:match_id 1 :match_level_id 1}
    {:match_id 1 :match_level_id 2}
    {:match_id 2 :match_level_id 1}
    {:match_id 2 :match_level_id 2}])

:processing-match 1 :first-row {:match_id 1, :match_level_id 1}
:processing-match 2 :first-row {:match_id 2, :match_level_id 1}
:processing-match 1 :first-row {:match_id 1, :match_level_id 1}
:processing-match 2 :first-row {:match_id 2, :match_level_id 1}

hiredman15:10:08

This if you add a map transducer which does the select keys before the partition it will work

cddr15:10:09

sweet! that worked. cheers

hiredman15:10:15

There is a Note at the end of this section about it https://github.com/seancorfield/next-jdbc/blob/develop/doc/getting-started.md#plan--reducing-result-sets

2023-10-30

Channels