#clojure-europe
2020-11-19
dharrigan07:11:58

Good Morning!

ordnungswidrig08:11:27

Good moriningnging!

jasonbell10:11:45

Hello Cleveland!

otfrom10:11:14

@borkdude that transit flush issue is interesting. I'm sidestepping it by having a small number of large objects I'm writing, so I'm dodging that issue

otfrom10:11:43

I suppose the question is "what disk format should I be using?"

otfrom10:11:58

I wasn't happy with nippy or csv

otfrom10:11:29

If csv works as a format I could go for arrow and http://tech.ml.dataset, but it would be weird to then drop the dataset aspects of it

borkdude10:11:35

@otfrom I'm side-stepping it by first writing into an in-memory bytearray
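A minimal Java sketch of that workaround, under stated assumptions: `encodeWithEagerFlush` is a hypothetical stand-in for any encoder that calls `flush()` after every value (it is not transit's API). Encoding into a `ByteArrayOutputStream` first means those flushes do nothing, and the real destination sees one write and one flush.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class InMemoryThenWrite {
    /** Hypothetical encoder that flushes after every value (the behaviour being worked around). */
    static void encodeWithEagerFlush(Iterable<String> values, OutputStream out) throws IOException {
        for (String v : values) {
            out.write(v.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
            out.flush();
        }
    }

    /** Encode into memory first, then hand the destination a single write and a single flush. */
    static void writeViaBuffer(Iterable<String> values, OutputStream destination) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            encodeWithEagerFlush(values, buffer); // flush() on ByteArrayOutputStream is a no-op
            buffer.writeTo(destination);
            destination.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        writeViaBuffer(List.of("a", "b", "c"), System.out);
    }
}
```

The cost is that the whole payload sits in memory before anything reaches the destination, so this only suits outputs that are comfortably small.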

otfrom10:11:02

at least your solution is on purpose rather than by accident like mine 🙂

ordnungswidrig13:11:16

Can you mitigate it by wrapping the actual writer in a buffering writer?

otfrom13:11:45

@ordnungswidrig I'm not currently having a problem with it, but I seem to be hitting the sweet spot of only writing a few big objects to a file

borkdude13:11:13

I'm using this in clj-kondo where the files I'm writing aren't that big, so all in memory is fine

dominicm14:11:05

@ordnungswidrig nope. Buffering output streams follow flushing rules too.
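A small Java sketch of why buffering alone does not help: `BufferedOutputStream.flush()` empties its own buffer and then flushes the wrapped stream. `CountingSink` is a hypothetical helper that only counts what reaches it.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class FlushPropagationDemo {
    /** Hypothetical sink that only counts the write and flush calls that reach it. */
    static final class CountingSink extends OutputStream {
        int writes = 0;
        int flushes = 0;
        @Override public void write(int b) { writes++; }
        @Override public void write(byte[] b, int off, int len) { writes++; }
        @Override public void flush() { flushes++; }
    }

    /** Writes `values` single bytes through a BufferedOutputStream, flushing after each one. */
    static CountingSink run(int values) {
        CountingSink sink = new CountingSink();
        try {
            // The stream is deliberately left open: close() would add one more flush.
            OutputStream out = new BufferedOutputStream(sink, 8192);
            for (int i = 0; i < values; i++) {
                out.write('x'); // stays in the 8 KiB buffer ...
                out.flush();    // ... until flush pushes it down and flushes the sink too
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink;
    }

    public static void main(String[] args) {
        CountingSink sink = run(5);
        System.out.println("writes=" + sink.writes + " flushes=" + sink.flushes);
    }
}
```

Every flush from the writer side reaches the sink as one small write plus one flush, so the buffer never gets a chance to batch anything.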

ordnungswidrig14:11:53

That would require a custom stream that reconciles the flushes?

dominicm18:11:47

@ordnungswidrig Yep. I tried to write one, it's very difficult because you need to sometimes flush in order to actually stream things to the browser to reduce overall latency (otherwise you're basically just doing a ByteArrayOutputStream).

ordnungswidrig18:11:18

it’s an interesting question and I think the “flush management” must be a separate concern from the wire encoding. Actually you need a combination of time- and size-based decisions. But honestly I do not understand how this can have an impact on chunking, which is a transport encoding.
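One way to picture a combined time- and size-based flush policy is a wrapper stream that forwards a `flush()` only when enough bytes are pending or enough time has passed. This is a rough Java sketch, not an existing library class: the class name, thresholds, and injected clock are all assumptions made for illustration. It also only re-evaluates the policy when someone calls `flush()`; a real implementation would likely need a timer as well.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.function.LongSupplier;

/** Hypothetical wrapper: forwards flush() only when a size or time threshold is met. */
class ThrottledFlushOutputStream extends FilterOutputStream {
    private final long minBytes;
    private final long maxDelayNanos;
    private final LongSupplier nanoClock; // injected so the policy can be exercised deterministically
    private long pendingBytes = 0;
    private long lastFlushNanos;

    ThrottledFlushOutputStream(OutputStream out, long minBytes, long maxDelayNanos, LongSupplier nanoClock) {
        super(out);
        this.minBytes = minBytes;
        this.maxDelayNanos = maxDelayNanos;
        this.nanoClock = nanoClock;
        this.lastFlushNanos = nanoClock.getAsLong();
    }

    @Override public void write(int b) throws IOException {
        out.write(b);
        pendingBytes++;
    }

    @Override public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        pendingBytes += len;
    }

    @Override public void flush() throws IOException {
        long now = nanoClock.getAsLong();
        boolean enoughBytes = pendingBytes >= minBytes;
        boolean waitedLongEnough = now - lastFlushNanos >= maxDelayNanos;
        if (pendingBytes > 0 && (enoughBytes || waitedLongEnough)) {
            out.flush();
            pendingBytes = 0;
            lastFlushNanos = now;
        } // otherwise the flush request is swallowed
    }

    @Override public void close() throws IOException {
        try { out.flush(); } finally { out.close(); } // never strand data on close
    }

    /** Runs four flush requests against a fake clock and returns how many were forwarded. */
    static int demo() {
        long[] now = {0L};
        int[] forwarded = {0};
        OutputStream sink = new OutputStream() {
            @Override public void write(int b) { }
            @Override public void flush() { forwarded[0]++; }
        };
        try {
            ThrottledFlushOutputStream out =
                new ThrottledFlushOutputStream(sink, 10, 1_000_000_000L, () -> now[0]);
            byte[] four = new byte[4];
            out.write(four); out.flush(); // 4 bytes pending, no time passed: swallowed
            out.write(four); out.flush(); // 8 bytes pending: swallowed
            out.write(four); out.flush(); // 12 bytes pending: forwarded (size rule)
            out.write(1);
            now[0] += 2_000_000_000L;     // pretend two seconds passed
            out.flush();                  // 1 byte pending but overdue: forwarded (time rule)
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return forwarded[0];
    }

    public static void main(String[] args) {
        System.out.println("forwarded flushes: " + demo());
    }
}
```

Keeping the policy in its own stream is what separates "flush management" from the encoder: the encoder can flush as often as it likes, and the wrapper decides what actually goes downstream.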

dominicm18:11:57

@ordnungswidrig Well, when you flush, that causes a cascade of flushes, which means the GZIPOutputStream just flushes out the chunk it has right now, which the HTTP chunking output stream then delivers.
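The cascade can be sketched in Java as follows. Two assumptions: `ChunkRecorder` is a hypothetical stand-in for an HTTP chunking stream that ends a chunk on every flush, and the `GZIPOutputStream` is created with `syncFlush = true`. With the JDK default of `false`, `flush()` still reaches the underlying stream but does not force out the compressor's pending data.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

class FlushCascadeDemo {
    /** Hypothetical stand-in for an HTTP chunking stream: each flush (or close) ends a chunk. */
    static final class ChunkRecorder extends OutputStream {
        final List<Integer> chunkSizes = new ArrayList<>();
        private int pending = 0;
        @Override public void write(int b) { pending++; }
        @Override public void write(byte[] b, int off, int len) { pending += len; }
        @Override public void flush() { endChunk(); }
        @Override public void close() { endChunk(); }
        private void endChunk() {
            if (pending > 0) {
                chunkSizes.add(pending);
                pending = 0;
            }
        }
    }

    /** Returns the size of every chunk the recorder saw while `values` records were written. */
    static List<Integer> chunkSizes(int values, boolean flushEachValue) {
        ChunkRecorder recorder = new ChunkRecorder();
        byte[] value = "{\"id\":1}\n".getBytes(StandardCharsets.UTF_8);
        // syncFlush = true: flush() pushes the compressor's pending output downstream.
        try (GZIPOutputStream gzip = new GZIPOutputStream(recorder, true)) {
            for (int i = 0; i < values; i++) {
                gzip.write(value);
                if (flushEachValue) {
                    gzip.flush(); // cascades: deflater -> recorder.flush() -> one chunk
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return recorder.chunkSizes;
    }

    public static void main(String[] args) {
        System.out.println("flush per value: " + chunkSizes(20, true).size() + " chunks");
        System.out.println("no explicit flush: " + chunkSizes(20, false).size() + " chunks");
    }
}
```

Flushing after every value yields one chunk per value plus a final one for the gzip trailer, while leaving the flushing to `close()` yields a single chunk.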

dominicm18:11:06

Transit shouldn't be calling flush, period.

👍 3