data-science 2020-08-12 | Slack Archive

chrisn 11:08:11

We have a new post up exploring memory mapping and the new Apache Arrow data format via Clojure and https://github.com/techascent/tech.ml.dataset. A few times in my career I have used memory mapping and found it both simpler and faster than stream-based IO. Using it we can 'load' datasets far larger than physical RAM or load only 1 column/row out of many in a dataset without loading the rest. We hope you enjoy this simple demonstration! https://techascent.com/blog/memory-mapping-arrow.html

👍 37

🚀 6

💯 3
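
A minimal sketch of the memory-mapping idea from the message above, written in Python with only the standard library. It is not the Arrow format and not the tech.ml.dataset API: the two-column binary layout, the file name, and the helpers `write_columns` and `read_b` are assumptions made up for illustration. The point it tries to show is that a mapped file lets you read one value from one column by offset, and the OS only pages in the bytes you touch.

```python
import mmap
import os
import struct
import tempfile

ROWS = 1000  # hypothetical row count for the demo file


def write_columns(path, rows):
    """Write a made-up column-major file: `rows` int64 values for
    column "a", followed by `rows` float64 values for column "b"."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{rows}q", *range(rows)))
        f.write(struct.pack(f"<{rows}d", *(i * 0.5 for i in range(rows))))


def read_b(path, rows, index):
    """Read a single value of column "b" through a read-only memory map.

    Nothing is copied up front; the 8 bytes at the computed offset are
    decoded directly from the mapped region."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            offset = rows * 8 + index * 8  # skip column "a", then index into "b"
            (value,) = struct.unpack_from("<d", mm, offset)
            return value


path = os.path.join(tempfile.mkdtemp(), "columns.bin")
write_columns(path, ROWS)
value = read_b(path, ROWS, 10)
print(value)
```

The same offset arithmetic works unchanged on a file much larger than RAM, since mapping the file does not read it; a stream-based reader would have to consume or seek past column "a" explicitly.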

David Pham 11:08:33

Thanks!

