#data-science
2022-01-29
SK17:01:51

Hi! In tablecloth, how can I access the last N rows when calculating a new column? E.g., how can I modify the code below so that it sums the values from the past 2 rows instead of the values from the current row?

(-> [[:a [1 2 3]] [:b [6 7 8]]]
    (tablecloth.api/dataset)
    (tablecloth.api/add-column :sum #(map + (:a %) (:b %))))
Is there a way to access the "current index" and have the last N rows passed to my function as a dataset?

jsa-aerial21:01:27

I don't really understand what you are asking, but you will have much more luck here: https://clojurians.zulipchat.com/#narrow/stream/151924-data-science/topic/tech.2Eml.2Edataset

jsa-aerial21:01:02

Do you mean something like this?

(-> [[:a [1 2 3]] [:b [6 7 8]]]
    tc/dataset
    (tc/add-column
     :sum (fn[ds]
            (let [avals (->> :a ds (coll/sliding-take 2)
                             (mapv (fn[[x y]] [x (or y 0)])))
                  bvals (->> :b ds (coll/sliding-take 2)
                             (mapv (fn[[x y]] [x (or y 0)])))]
              (mapv (fn [a b]
                      (->> b (concat a) vec (apply +)))
                    avals bvals)))))

=> _unnamed [3 3]:

| :a | :b | :sum |
|---:|---:|-----:|
|  1 |  6 |   16 |
|  2 |  7 |   20 |
|  3 |  8 |   11 |

jsa-aerial21:01:43

Hmmm, can't figure out slack here - code formatting and such is also far better in Zulip than slack

SK21:01:46

@U06C63VL4 yeah I did something similar, I mean take all "a" and "b" values and do some regular sequence processing via clojure and add the column as a whole

SK21:01:23

however what I'm asking is, if there is any way of adding a column, that is more "row" oriented

SK21:01:52

so for this newly added column, the values are not supplied as a whole; instead, a function is passed that receives the last N rows relative to the current row and calculates the current value from them

SK22:01:55

eg this would be an easier and more intuitive way to calculate a moving average
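A minimal sketch of the row-oriented idea described above, in plain Clojure only. The helpers `trailing-windows` and `moving-average` are hypothetical names introduced for illustration; they are not part of tablecloth or tech.ml.dataset. The assumption here is that "the last N rows" means the current row plus the N-1 rows before it, with shorter windows at the start of the column.

```clojure
;; Hypothetical helper, not a tablecloth function.
;; For each position i in xs, returns a vector holding up to n values
;; that end at position i (shorter windows at the start).
(defn trailing-windows
  [n xs]
  (let [v (vec xs)]
    (mapv (fn [i] (subvec v (max 0 (- (inc i) n)) (inc i)))
          (range (count v)))))

;; Hypothetical helper: averages each trailing window.
(defn moving-average
  [n xs]
  (mapv (fn [w] (/ (reduce + w) (count w)))
        (trailing-windows n xs)))

(trailing-windows 2 [1 2 3]) ;; => [[1] [1 2] [2 3]]
(moving-average 2 [6 7 8])   ;; => [6 13/2 15/2]
```

Following the `add-column` pattern from the question, something like `(tablecloth.api/add-column ds :avg #(moving-average 2 (:a %)))` should attach the result as a column, since the function still returns one value per row.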

jsa-aerial22:01:58

That's what the code I posted does - for the last 2 rows

SK22:01:28

then I probably don't understand something

SK22:01:38

where is the sliding-take defined?

jsa-aerial22:01:53

But really, you should be over on Zulip as there may be other ideas and options and that is where data science clojure hangs out - NOT on slack

jsa-aerial22:01:17

Sliding take is in one of my utils libraries
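For readers without that utility library, here is a rough stand-in built on `clojure.core/partition-all`. The name `sliding-take-sketch` is hypothetical, and its behavior (step of 1, short windows kept at the end) is an assumption inferred from the result table posted earlier, not taken from the library's source.

```clojure
;; Hypothetical stand-in for coll/sliding-take; behavior is assumed,
;; inferred from the :sum column shown in the earlier result table.
;; Produces windows of up to n items, advancing one item at a time.
(defn sliding-take-sketch
  [n coll]
  (mapv vec (partition-all n 1 coll)))

(sliding-take-sketch 2 [1 2 3]) ;; => [[1 2] [2 3] [3]]

;; Summing the paired windows of both columns gives the same values
;; as the :sum column in the earlier table.
(mapv (fn [a b] (reduce + (concat a b)))
      (sliding-take-sketch 2 [1 2 3])
      (sliding-take-sketch 2 [6 7 8])) ;; => [16 20 11]
```

Note that these windows run from the current row forward, so each sum covers the current and the following row.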

jsa-aerial16:01:28

Right - this is cleaner and likely way faster

(-> [[:a [1 2 3]] [:b [6 7 8]]]
    tc/dataset
    (drl/rolling {:window-type :fixed
                  :window-size 2
                  :relative-window-position :right
                  :edge-mode :zero}
                 {:suma (drl/sum :a) :sumb (drl/sum :b)})
    (tc/add-column
     :sum (fn[ds]
            (dsc/column-map
             (fn[a b] (+ a b))
             :long
             (ds :suma) (ds :sumb)))))

@UDRJMEFSN Is there a way to pass multiple columns to a reducer? So, we would have simply {:sum (drl/sum :a :b)} sort of thing.

chrisn19:01:33

I haven't implemented a multi-column reducer for rolling at this time, but I agree it is useful.

👍 1
chrisn15:02:15

multi-column reducers for rolling will be in once java-api is merged.

👍 1
sova-soars-the-sora18:01:46

Has anyone used wav2vec with Clojure? I'm interested in making a simple program to transcribe audio clips but it looks like the world is Python-centric 🐍