What is an idiomatic way to express this pandas instruction with tech.ml.dataset?
ds[ds["Column A"]<5] = 42
Do you actually wish to mutate your dataset in place, as in pandas? Or to create a new dataset with the values less than 5 replaced by 42?
In the spirit of immutability: Get a new dataset 🙂
Good question, I am realizing I don't know what might be the most efficient way. The following does work:
(require '[tech.v3.dataset :as ds])
(def ds
  (ds/->dataset {"Column A" (range 10)}))
(ds/column-map ds
               "Column A"
               #(if (< % 5) 42 %))
_unnamed [10 1]:
| Column A |
|---------:|
| 42 |
| 42 |
| 42 |
| 42 |
| 42 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
But there may be more efficient ways if your dataset is big.
If performance is not an issue, then I think the above is just fine.
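For anyone who wants to experiment with a possibly cheaper variant, here is a minimal sketch, not benchmarked, that assumes the column holds integers. It uses ds/update-column together with tech.v3.datatype/emap and declares the result datatype (:int64) up front; whether this is actually faster than column-map on a large dataset is an assumption that would need measuring.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.datatype :as dtype])

(def ds
  (ds/->dataset {"Column A" (range 10)}))

;; Returns a new dataset; the original `ds` is left unchanged.
;; Assumption: passing :int64 explicitly spares a scan to infer the result type.
(ds/update-column ds
                  "Column A"
                  (fn [col]
                    (dtype/emap #(if (< % 5) 42 %) :int64 col)))
```

Like the column-map version, this only rewrites the values in "Column A" and returns a new dataset rather than mutating in place.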
It would be helpful to bring this discussion to the Clojurians Zulip chat: https://scicloj.github.io/docs/community/chat/ Some people there will benefit from this question and can probably help as well.
Thanks for your help. In the future, I will ask those kinds of questions on Zulip.