#data-science
2023-06-22
Nick 00:06:44

Does anyone know how to append one tech.ml.dataset dataset to another? I have the following dataset:

(def ds (ds/->dataset [{"DestinationName" "CZ_1", "ProductName" "FG_1", "Quantity" 100, "Allocated-Qty" 0, "SourceName" "DC_1", :ratio 0.5} {"DestinationName" "CZ_1", "ProductName" "FG_1", "Quantity" 100, "Allocated-Qty" 0, "SourceName" "DC_2", :ratio 0.5}]))
If I call
(ds/concat [ds ds])
I get two datasets back. To illustrate, the code below produces two vectors of maps instead of one vector of maps.
(map ds/mapseq-reader (ds/concat [ds ds]))
I would like them "unioned", "appended", or "merged" so that I have a single dataset. I need this for a recursive function that expects a single dataset, and I can't seem to figure out the most idiomatic way to accomplish this.

Nick 02:06:31

This is what I have working, but I don't know if there is a better way:

(def combined-ds (ds/concat [ds ds]))

(defn merge-vectors-of-maps [vecs]
  (ds/->>dataset (reduce into [] vecs)))

(def merged-ds (merge-vectors-of-maps [(ds/mapseq-reader (first combined-ds)) (ds/mapseq-reader (second combined-ds))]))

genmeblog 06:06:55

(ds/concat ds ds) should work; don't wrap the datasets in a vector.

Nick 18:06:29

(ds/concat ds ds) doesn't work, unfortunately. It creates two datasets in the same entity. The code I wrote above does work, so I'll stick with it. I don't know what the use case for ds/concat is, but I'll let it go, as the functions above get me through.

genmeblog 19:06:30

Strange. ds/concat is exactly what you need. And I have never encountered any problem with it.

seb231 07:06:38

@U04A1LVBBPU have you tried (ds/concat-copying ds ds)? It worked locally for me. Note that no vector is required. However, if you need the datasets in a vector, try (apply ds/concat-copying [ds ds]). I'm not sure why ds/concat didn't work for you, but as I understand concat-copying, it works like a reducer.

genmeblog 08:06:41

I verified it on 7.000-beta-38: it works and gives the same result as your merged-ds. By the way, when you call (ds/concat [ds ds]), it just returns [ds ds].

(def ds (ds/->dataset [{"DestinationName" "CZ_1", "ProductName" "FG_1", "Quantity" 100, "Allocated-Qty" 0, "SourceName" "DC_1", :ratio 0.5} {"DestinationName" "CZ_1", "ProductName" "FG_1", "Quantity" 100, "Allocated-Qty" 0, "SourceName" "DC_2", :ratio 0.5}]))

;; => _unnamed [2 6]:
;;    | DestinationName | ProductName | Quantity | Allocated-Qty | SourceName | :ratio |
;;    |-----------------|-------------|---------:|--------------:|------------|-------:|
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_1 |    0.5 |
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_2 |    0.5 |

(ds/concat ds ds)

;; => _unnamed [4 6]:
;;    | DestinationName | ProductName | Quantity | Allocated-Qty | SourceName | :ratio |
;;    |-----------------|-------------|---------:|--------------:|------------|-------:|
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_1 |    0.5 |
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_2 |    0.5 |
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_1 |    0.5 |
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_2 |    0.5 |

(ds/concat-copying ds ds)

;; => _unnamed [4 6]:
;;    | DestinationName | ProductName | Quantity | Allocated-Qty | SourceName | :ratio |
;;    |-----------------|-------------|---------:|--------------:|------------|-------:|
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_1 |    0.5 |
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_2 |    0.5 |
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_1 |    0.5 |
;;    |            CZ_1 |        FG_1 |      100 |             0 |       DC_2 |    0.5 |

Nick 01:06:49

Thanks for taking the time to look at this, guys. I am sorry, it was my fault for putting in (ds/concat [ds ds]) instead of (ds/concat ds ds). genmeblog, you are correct: (ds/concat ds ds) works as expected.

👍 4
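
Note: for the recursive use case from the start of the thread, where the datasets to combine arrive in a collection rather than as separate arguments, a minimal sketch might look like the following. It assumes the tech.v3.dataset namespace aliased as ds (the thread never shows its require form), and the names parts and combined are hypothetical.

```clojure
;; Assumption: `ds` is an alias for tech.v3.dataset; the thread does not show the require.
(require '[tech.v3.dataset :as ds])

;; Same two-row dataset as in the thread.
(def ds (ds/->dataset [{"DestinationName" "CZ_1", "ProductName" "FG_1", "Quantity" 100, "Allocated-Qty" 0, "SourceName" "DC_1", :ratio 0.5}
                       {"DestinationName" "CZ_1", "ProductName" "FG_1", "Quantity" 100, "Allocated-Qty" 0, "SourceName" "DC_2", :ratio 0.5}]))

;; Hypothetical collection of datasets, e.g. built up by a recursive function.
(def parts [ds ds ds])

;; ds/concat takes the datasets as separate arguments, so spread the
;; collection with apply instead of passing the vector itself.
(def combined (apply ds/concat parts))

;; Three two-row datasets give a single six-row dataset.
(ds/row-count combined)
```

The only difference from the call that caused the confusion is apply, which turns the vector's elements into individual arguments.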