
well, the yada docs were in Markdown, but we are in the process of moving to AsciiDoc for all the great reasons here: - having had some good experiences with DocBook at various points in my career, I think it's a great choice - I'm only gradually beginning to exploit the benefits of AsciiDoc so far, but for me there's no turning back to Markdown - thanks to @dominicm for introducing me to it


yes, Leanpub authoring is Markdown-only (and I've run into problems with Chinese characters in one of my yada examples, which is still an outstanding issue) - but fortunately Leanpub lets you upload any generated PDF, so you can still sell books without buying into their Markdown build pipeline. So eventually I'll upload a PDF generated from a DocBook pipeline (and it will be a much better-looking document than what is possible with their Markdown stuff)


agile_geek: re your ETL question - I'd go with plain Clojure and then S3 for the data lake. You can use Spark to process the data in S3 (as an HDFS-like store) if you have some big data processing to do. You can run all the Spark jobs on EMR, which you can control from amazonica. Data in and out of S3 can be handled by amazonica too.


@otfrom: that option had crossed my mind too. It's a bit like the code I did for Mastodon C last year. I've found out a little more about the data and systems, and I suspect you're right. However, the work is to figure out the landscape and the problems to solve first... assuming I win it.