
Parsing academic articles has the side effect of teaching me computer science history. The latest thing to blow up the parser is &#xB;, the vertical tab character.

> Vertical tab was used to speed up printer vertical movement. Some printers used special tab belts with various tab spots. This helped align content on forms. VT to header space, fill in header, VT to body area, fill in lines, VT to form footer. Generally it was coded in the program as a character constant. From the keyboard, it would be CTRL-K.

Of course, it's inserted in random places, such as in the middle of a university name, where it makes no sense. So unfortunately it teaches me very little about its use in academic articles, other than that (in this case) the article information was probably stored in an old version of MS SQL.
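One way to keep a stray vertical tab from breaking a parser is to strip it from the raw text before parsing. The following is a minimal Clojure sketch under that assumption. `strip-vertical-tabs` is a hypothetical helper, not part of the original pipeline, and dropping the character outright (rather than replacing it with a space) is a guess about what the data needs.

```clojure
(require '[clojure.string :as str])

(defn strip-vertical-tabs
  "Hypothetical helper: removes both the literal character reference
  &#xB; and the raw U+000B (vertical tab) character from s."
  [s]
  (-> s
      (str/replace "&#xB;" "")     ; escaped form, as seen in the markup
      (str/replace "\u000B" "")))  ; raw control character

;; Example: a vertical tab stuck in the middle of a name.
(strip-vertical-tabs "Example Univer\u000Bsity")
```

If the character turns out to stand in for a word break in some records, replacing it with a space instead would be the safer choice.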


(csvs-chan) is supposed to represent all the entries presented by a bunch of csv files


it makes the cpu melt though

Alex Miller (Clojure team) 18:06:14

it's going to basically open up 8 files at a time and churn through them

Alex Miller (Clojure team) 18:06:20

if anything blocks inside, it's going to burn one of those pool threads too

Alex Miller (Clojure team) 18:06:32

really, this is what pipeline-blocking is designed for


also it is doing io inside a go block which makes me sad


the thread pool that go blocks run on is limited and shared, so you should avoid clogging it up with blocking io


if you need to do blocking io, the easiest thing to do is to use thread instead of go

💡 4
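As a rough illustration of the `thread`-instead-of-`go` advice: the sketch below reads a file on a dedicated thread and returns a channel carrying the result. `read-lines-chan` is a hypothetical name, and reading the whole file into one vector is a simplifying assumption, not what the original `csvs-chan` necessarily does.

```clojure
(require '[clojure.core.async :as async]
         '[clojure.java.io :as io])

(defn read-lines-chan
  "Hypothetical helper: returns a channel that will receive a vector of
  all lines in the file at path. The blocking read runs on its own
  thread via async/thread, so it does not occupy a go-block pool thread."
  [path]
  (async/thread
    (with-open [rdr (io/reader path)]
      ;; vec forces the lazy line-seq before the reader is closed
      (vec (line-seq rdr)))))
```

A caller outside a go block can take the result with `(async/<!! (read-lines-chan path))`; inside a go block, `<!` parks instead of blocking.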

pipeline-blocking is a good idea there, but one difference is that merge will basically shuffle your csv lines, while pipeline-blocking will preserve order

✔️ 4
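A small sketch of the ordering point, assuming a core.async version that provides `to-chan!`. `slow-parse` and `parse-all` are hypothetical stand-ins for the real per-file work. Each item takes a random amount of time, yet results come out in input order.

```clojure
(require '[clojure.core.async :as async])

(defn slow-parse
  "Hypothetical stand-in for blocking work such as parsing one file."
  [n]
  (Thread/sleep (long (rand-int 20)))  ; simulate uneven blocking time
  (* n 10))

(defn parse-all
  "Runs slow-parse over inputs with the given parallelism and returns
  the results as a vector, blocking until the pipeline finishes."
  [parallelism inputs]
  (let [from (async/to-chan! inputs)
        to   (async/chan parallelism)]
    ;; pipeline-blocking uses dedicated threads and closes `to`
    ;; once `from` is exhausted
    (async/pipeline-blocking parallelism to (map slow-parse) from)
    (async/<!! (async/into [] to))))

(parse-all 4 (range 10))
```

With `merge` over one channel per file, by contrast, values arrive in whatever order the sources happen to produce them.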

how does one estimate the n (parallelism factor)?

Alex Miller (Clojure team) 18:06:15

how many cores do you have?

Alex Miller (Clojure team) 18:06:06

or maybe even less as you're i/o bound

✔️ 4
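A starting point for picking `n` along these lines might look like the sketch below. Both function names are hypothetical, and halving the core count for I/O-bound work is an arbitrary assumption to be tuned by measurement, not a rule from the discussion.

```clojure
(defn core-count
  "Number of processors the JVM reports as available."
  []
  (.availableProcessors (Runtime/getRuntime)))

(defn io-bound-parallelism
  "Hypothetical heuristic: half the cores, but never fewer than one."
  []
  (max 1 (quot (core-count) 2)))
```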