#aws
2019-05-18
kulminaator 10:05:28

just a report from the field. i used cognitect-labs/aws-api for some reasonably heavyweight stuff yesterday for the first time and the thing just worked. very nice & thank you to the authors of that code 🙂. long version of what i did: we have a big set of json files stored on s3 (every file holds multiple s3 objects) and i needed a quick indexing engine for it that can scale beyond what i want to host on actual hard disks, so i figured i'd store the indices on s3 as well. i implemented a proof of concept in our one-day hacking marathon. it indexed 800 megabytes of json records (just a day of sample data from our datastore): downloaded them with aws-api, parsed them with data.json, hashed the records, fitted them into index nodes (which resulted in 1.5 million tiny index files) and pushed the index nodes back into aws (for reasonable performance i used 16 parallel threads for that last step). the only issues i met along the way were my own typos (oh boy, typos are annoying when you splash 60 megabytes of data into your terminal 😄). finished the proof of concept in one working day, good stuff.
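
for anyone who wants to picture the pipeline above, here is a minimal sketch of the same idea: parse json records, hash a field, group the entries into small index nodes, and push the nodes out with 16 parallel workers. it is not the author's code. the original was clojure on aws-api and data.json, while this is plain python with an in-memory dict standing in for s3. the names `node_key`, `build_index`, `upload_nodes`, `lookup`, the `index/` key prefix, the 4-character hash prefix, and the assumption that each file holds a json list of records are all made up for illustration.

```python
import hashlib
import json
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def node_key(value, prefix_len=4):
    """Hash a field value; the first hex characters name the index node (assumed scheme)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).hexdigest()
    return digest[:prefix_len]


def build_index(files, field):
    """Group records by hashed field value.

    files: mapping of file name -> JSON text holding a list of records (an assumption).
    Returns a mapping of node name -> list of index entries.
    """
    nodes = defaultdict(list)
    for name, text in files.items():
        for pos, record in enumerate(json.loads(text)):
            entry = {"file": name, "pos": pos, "value": record[field]}
            nodes[node_key(record[field])].append(entry)
    return nodes


def upload_nodes(nodes, put, workers=16):
    """Write every index node through put(key, body) using a pool of worker threads."""
    def upload(item):
        name, entries = item
        put("index/" + name + ".json", json.dumps(entries))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so any upload error is raised here
        list(pool.map(upload, nodes.items()))


def lookup(get, field_value):
    """Fetch the one node a value hashes to and keep only the exact matches."""
    entries = json.loads(get("index/" + node_key(field_value) + ".json"))
    return [e for e in entries if e["value"] == field_value]


# usage: a dict stands in for the bucket; real code would call the s3 client here
store = {}
files = {"day1.json": json.dumps([{"id": "a1"}, {"id": "b2"}])}
upload_nodes(build_index(files, "id"), store.__setitem__)
print(lookup(store.__getitem__, "b2"))
```

the point of the hash prefix is that a lookup only has to fetch one small object, which is why the index ends up as lots of tiny files. a longer prefix gives more, smaller nodes.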
