So, there's currently a bug in all the methods of that are copying from an input-stream.
They all perform some variation of reading from the input stream, and continuing to do so until the read method returns a non-positive number.
However, according to the javadoc for , this is incorrect, it should continue until it returns a negative number:
> the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
Some implementations of InputStream are expected to return 0 bytes from read and still need to be read from, as is made clear by the choice of implementation of https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/io/InputStream.java#L795
> while ((read = this.read(buffer, 0, DEFAULT_BUFFER_SIZE)) >= 0) {
I know that the current process for "simple fixes" is anything but trivial with getting patches into Clojure, but I'm curious if this would be something that would be worth the effort of me trying to go through the process to become a contributor to submit patches for?
There is a jira for this and I’ve spent quite some time on it. It is not quite as obvious as you may think
I would appreciate a link to the ticket
On a bus, hard for me to search on my phone sorry
ah, all good! thanks anyway
I seem to recall that I made the obvious change and things I did not expect broke
Well that has some unfortunate implications for the users of 😅
In particular, I think not all streams necessarily behave the way the javadoc says
and the originating ask: https://ask.clojure.org/index.php/8450/strange-behavior-with-clojure-java-io-copy
Anyways, I would be happy for work on it and would get it into the 1.13 list
I'm just not sure what the actual resolution on it would be if the conclusion was "things break when you use it the way the JDK intends" and that's enough reason not to make the obvious change.
I'd be very curious to hear what exactly it is that would break with this. @nbtheduke you have a pretty extensive test suite for changes to clojure, right? Do you happen to have a link on that so I could try out the change against that to see some of what might break under the change?
Consider that fud from my memory u til you verify
If you can review the patch and the comments there and see if that patch and report back on if that is good or something else is needed, that would be helpful
after the core team got https://github.com/clojure/test.regression set up, i stopped working on my core_regression library because i considered its actual purpose (nudge the core team into setting up such a thing) complete lol
I can run a patch through our regression tester and get much wider coverage on it
Sounds good
but in case you wanna try it out: https://github.com/NoahTheDuke/core_regression/
Interesting that the Apache folks admitted their library code was broken -- and they fixed it -- but they also cautioned that there are other non-compliant stream implementations out there, that might (incorrectly) return 0...
(I had forgotten about this issue -- but then saw that I'd answered the Ask, and that led to a bug report against Apache commons-compress by Juha)
I feel like the JDK accepting returning 0 as expected is an endorsement of that behavior
waits for someone to scream "undefined behavior"... 😉
yeah, arguably
Perhaps the "right" approach for Clojure is to stop at 0 by default but provide an option to override that for dealing with badly-behaved InputStream implementations?
but also I don't actually think that it's undefined. The first paragraph says data is read, and the second says that data is attempted to be read, and that if none is actually read and the reason is because it's the end of stream, it will return -1.
That sounds to me like it's defined that if the application blocks until data is available, attempts to read bytes, but reads 0 bytes, returning 0 is valid.
I wrote my thoughts up on Ask so I won't repeat them here 🙂