I have a query about the parallelism of dataset/sort-by-column and dataset/dataset. If this is off-topic, kindly point me in the right direction.
I am using these fn's in a server backend that is also doing a lot of other "kafka things", so predictable resource consumption is important for this application. I've found that these fn's consume the CPU completely, which starves the other threads to the point of brokenness.
Has anybody else encountered this behaviour as a problem? Is there any advice available for customizing this?
I've been reading the code and think I was wrong about dataset/dataset. Also, it seems like sort-by-column (https://cnuernber.github.io/tmdjs/tech.v3.dataset.html#var-sort-by-column) has a :parallel? flag. I'll try this in the meantime.
Sounds interesting. If no one replies here, you can also try asking on the Zulip, which is another place for such discussions: https://clojurians.zulipchat.com/#narrow/channel/236259-tech.2Eml.2Edataset.2Edev
I've confirmed that it works as expected: setting :parallel? to false (the default is true) does indeed serialize the sorting. It is documented in the docstring of those fn's, just not in the published docs that I linked to. Thanks for the link to the Zulip.
Without looking (and I could be wrong), my guess would be that the TMD parallelization bottoms out at the fork-join commonPool - if other code in the project does the same, they can cooperate, and make good use of available system resources.
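If that guess holds (it is not verified here), the JVM offers two standard knobs for keeping fork-join work from taking every core. The common pool's size can be capped at startup with the documented system property `java.util.concurrent.ForkJoinPool.common.parallelism`, and work submitted to a dedicated `ForkJoinPool` forks its subtasks into that pool instead of the common one. A minimal Java sketch of the second approach, using `Arrays.parallelSort` as a stand-in for the sort; the class and method names are made up for illustration and nothing here is TMD's API:

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper, for illustration only.
final class BoundedSort {

    // Sorts a copy of `data` from inside a dedicated pool with the given
    // parallelism. Tasks forked from a ForkJoinPool worker thread are
    // scheduled in that worker's pool, so the sort's subtasks stay in the
    // dedicated pool rather than going to the common pool.
    static long[] sortBounded(long[] data, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            long[] copy = data.clone();
            // join() waits for completion without checked exceptions.
            pool.submit(() -> Arrays.parallelSort(copy)).join();
            return copy;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Reports the size the common pool was configured with; it can be
        // lowered with -Djava.util.concurrent.ForkJoinPool.common.parallelism=N
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());

        long[] sorted = sortBounded(new long[] {5, 3, 9, 1}, 2);
        System.out.println(Arrays.toString(sorted));
    }
}
```

Whether either knob actually affects sort-by-column depends on TMD really using the common pool, which would need checking against its source. The :parallel? flag confirmed above remains the directly supported option.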