#data-science
2018-04-02
stardiviner05:04:24

How can I read only a small part of a big CSV file with Incanter's read-dataset?

metasoarous18:04:58

@stardiviner I don't know that incanter has lazy CSV reading capabilities (though someone please correct me if I'm wrong). You could take a look at this thing I wrote: https://github.com/metasoarous/semantic-csv

👍 4
stardiviner13:04:45

Thanks for your library, this is great.

🙏 4
aaelony22:04:50

way easier to use unix head command to get a small part of it
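
A minimal sketch of the `head` approach, assuming a CSV with a single header line and no quoted fields that span multiple lines (`head` counts physical lines, not CSV records). The file names `big.csv` and `small.csv` are hypothetical, and the first command only creates a tiny stand-in for the real file so the example runs on its own:

```shell
# Hypothetical stand-in for the large file: one header line plus four data rows.
printf 'id,value\n1,10\n2,20\n3,30\n4,40\n' > big.csv

# Keep the header plus the first 2 data rows (3 lines in total).
head -n 3 big.csv > small.csv
```

The smaller file can then be read the usual way, for example with Incanter's `read-dataset` and `:header true`.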

metasoarous23:04:45

@aaelony Well, it depends on what you need. If you need the first N rows, then sure. But if, say, you need to do a filter or some aggregation, semantic-csv lets you consume large CSV files lazily.

metasoarous23:04:28

After a filter, you could even stream the rows into an Incanter dataset data structure if you like.

aaelony23:04:20

How can I read only a small part of a big CSV file with Incanter's `read-dataset`?

aaelony23:04:50

that's how I'd do it. Make the file small, read it in. Perhaps incanter couldn't read his entire file? dunno