#data-science
2022-07-07
Benjamin 08:07:06

I have a CSV where one column looks like this: "foo,bar,baz". Do you have a tip for unrolling this into rows? I see that io.string-row-parser has the option to provide custom readers.

genmeblog 08:07:12

There are a bunch of functions in tablecloth to perform such operations. Looks like you need separate: https://scicloj.github.io/tablecloth/index.html#Separate
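
The goal Benjamin describes, turning one row holding "foo,bar,baz" into three rows, can be illustrated without any dataset library. Below is a minimal sketch in plain Clojure, assuming the rows are available as maps. The :id and :tags keys, the sample data, and the unroll-column helper are hypothetical names chosen for illustration; they are not part of tablecloth or tech.ml.dataset.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample rows; :tags holds a comma-separated string.
(def rows
  [{:id 1 :tags "foo,bar,baz"}
   {:id 2 :tags "qux"}])

(defn unroll-column
  "Returns one row per comma-separated value found under key k.
  All other keys are copied unchanged into every resulting row."
  [k rows]
  (mapcat (fn [row]
            (for [v (str/split (get row k) #",")]
              (assoc row k v)))
          rows))

(unroll-column :tags rows)
;; => ({:id 1, :tags "foo"} {:id 1, :tags "bar"} {:id 1, :tags "baz"} {:id 2, :tags "qux"})
```

The sketch does not handle missing values: a row whose value under k is nil would throw in str/split, so real data may need a guard.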

genmeblog 08:07:38

Pivoting operations may also be interesting for you: https://scicloj.github.io/tablecloth/index.html#Reshape

genmeblog 08:07:59

I want to encourage you to skim this documentation 🙂 It's full of snippets for probably most of the functionality covered by TC. It also includes partially translated documentation from dplyr, tidyr and data.table.

Benjamin 08:07:34

yea it's well documented. Thanks a lot

Benjamin 08:07:41

it's like getting a piece of datascience cake 😄

Benjamin 13:07:08

(ds-reduce/aggregate
 {:n-elems (ds-reduce/row-count)
  :cos_id_count (ds-reduce/count-distinct :cos_id)
  :genre_count (ds-reduce/count-distinct :genre)}
 [my-ds])

            reductions.clj:  144  tech.v3.dataset.reductions.SetConsumer/combine
            Consumers.java:   39  tech.v3.datatype.Consumers$StagedConsumer/combineList
            reductions.clj:  151  tech.v3.dataset.reductions/distinct/reify
                  impl.clj:  235  tech.v3.dataset.reductions.impl/aggregate-reducer/reify
any idea why this can happen?

Benjamin 13:07:37

|                  :genre |
|-------------------------|
|                 Finance |
|           Music & Audio |
|             Photography |
| Video Players & Editors |
|               Education |
|             Photography |
|                  Puzzle |
|                   Tools |
|                  Casual |
|             Photography |
|            Art & Design |
|                  Casual |
|               Lifestyle |
|       Books & Reference |
|            Productivity |
| Video Players & Editors |
| Video Players & Editors |
|                  Social |
| Video Players & Editors |
|         Auto & Vehicles |
It is something about this data: when I take 10 of these it still works. I guess the combine path is different.
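
Benjamin's guess about a separate combine path matches how batched distinct counts work in general: each batch builds its own set, and a combine step merges the sets, so the combine step only does real work once there is more than one batch. The sketch below shows that general pattern in plain Clojure. It is a hypothetical illustration, not the tech.v3.dataset.reductions implementation, and it does not diagnose the error above; the count-distinct-batched helper and the sample values are assumptions.

```clojure
(require '[clojure.set :as set])

;; Hypothetical genre values, loosely based on the table above.
(def genres ["Finance" "Photography" "Casual" "Photography" "Casual" "Tools"])

(defn count-distinct-batched
  "Counts distinct values by building one set per batch, then merging the sets."
  [batch-size values]
  (->> (partition-all batch-size values)
       (map set)               ; per-batch step: one set per batch
       (reduce set/union #{})  ; combine step: merges the per-batch sets
       count))

(count-distinct-batched 100 genres) ; one batch, the combine step is trivial
(count-distinct-batched 2 genres)   ; three batches, the combine step merges them
;; both calls return 4
```

With a small input everything fits in one batch, which is one way a bug confined to the combine step could stay hidden until the data grows.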

Benjamin 13:07:56

Update: I don't think there is anything special about the data I posted.

chrisn 16:07:47

That looks like an issue - can you file an issue with code and data?

chrisn 17:07:24

Thanks - 👍