#tools-deps
2023-06-12
borkdude15:06:31

deps.clj downloads the tools jar if it's not installed yet on a system, but 1 in 100 (or fewer) times I get complaints that deps.clj returns:

Error: Could not find or load main class clojure.main
Caused by: java.lang.ClassNotFoundException: clojure.main
because of an invalid downloaded jar which is then solved by deleting ~/.deps.clj so it forces a re-download. I can repro this by messing with the tools jar:
echo '' >  /Users/borkdude/.deps.clj/1.11.1.1347/ClojureTools/clojure-tools-1.11.1.1347.jar
One way to solve this would be for me to mirror the tools.zip in the deps.clj GitHub releases along with a .sha256 file, so I can verify that the download was successful. Or perhaps http://clojure.org could provide a checksum file (that I can verify in Clojure/Java).
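
For illustration, the .sha256 idea above could look roughly like this in Java. This is a minimal sketch, not deps.clj code: the class `Sha256Check` and its methods are hypothetical names, and it assumes the checksum file contains either the bare hex digest or the common `<hex>  <filename>` layout.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of deps.clj.
class Sha256Check {
    /** Lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * True if the downloaded bytes match the published checksum.
     * Assumption: the .sha256 file holds "<hex>" or "<hex>  <filename>".
     */
    static boolean matches(byte[] downloaded, String sha256FileContents) {
        String expected = sha256FileContents.trim().split("\\s+")[0];
        return expected.equalsIgnoreCase(sha256Hex(downloaded));
    }
}
```

A caller would read the downloaded zip and the .sha256 file, and only unzip when `matches` returns true.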

borkdude15:06:02

Is there anything against me mirroring the tools.zip on github releases of deps.clj? Or would you be in for solution 2, then I could just avoid doing so

dominicm15:06:16

Why is the jar invalid?

dominicm15:06:20

How do you download the jar?

Alex Miller (Clojure team)15:06:13

there is a checksum file in maven already

dominicm15:06:38

Are the clojure tools uploaded to maven?

Alex Miller (Clojure team)15:06:48

oh nvm, you're talking uber jar here

borkdude15:06:50

yeah the zip file

borkdude15:06:22

for example

borkdude15:06:30

for zip and tar.gz files for other projects I now upload a .sha256 file for validation, e.g. see here: https://github.com/clj-kondo/clj-kondo/releases/tag/v2023.05.26 I directly copied this idea from graalvm: https://github.com/graalvm/graalvm-ce-builds/releases/tag/vm-22.3.2

dominicm15:06:35

If you checked the jar for clojure.main (or something) after downloading, that would serve the same purpose, right?

dominicm15:06:48

I’m still wondering why the jar is invalid without you knowing, though.

borkdude15:06:03

yeah I think so, although there are more files in that .zip file

Alex Miller (Clojure team)15:06:24

the .tar.gz sha is in https://download.clojure.org/install/stable.properties but we don't currently make or publish a zip sha

dominicm15:06:38

Wait, the zip unzips but the files that come out aren’t valid?

borkdude15:06:00

yeah that's weird. so it could also be an unzip problem perhaps

dominicm15:06:25

Depending on what you’re using, I would have thought the checksumming in the zip would have prevented this.

dominicm15:06:29

Something smells fishy here.

Alex Miller (Clojure team)15:06:31

maybe it's unzipping a partially downloaded zip?

borkdude16:06:08

or the user aborts while unzipping? dunno, but it sometimes happens

dominicm16:06:21

Do you keep the zip file around?

Alex Miller (Clojure team)16:06:22

like the first half of the zip. I don't know anything about zip but maybe it supports this so you can unzip while still downloading

borkdude16:06:37

deps.clj first downloads, then unzips

dominicm16:06:40

I vaguely recall that zip stores the file listing at the end just to be annoying.

borkdude16:06:46

the .jar file does exist on disk, next time this happens for someone I should ask them to send this .jar file to me, but it's hard to know what exactly goes wrong. Getting rid of ~/.deps.clj solves the problem though. Perhaps I could also try/catch and check for clojure.main in the error message and offer a better suggestion

Alex Miller (Clojure team)16:06:59

seems like it still needs a bit more examination (is there somewhere you can detect the problem and save off the badness when it happens) but happy to publish additional sha files if that's helpful

borkdude16:06:01

I guess I could add to the error message: please go to the #CLX41ASCS channel and post your .jar for examination ;)

borkdude16:06:43

Re-downloading automatically seems risky as if I make a mistake somewhere you could get into a loop

dominicm16:06:52

If you kept the zip around, you should be able to re-validate the jars against the zip.

borkdude16:06:50

yeah, good idea, I'll keep the .zip around. I currently delete it

dominicm16:06:27

If you wanted to check, you could .getCrc on your ZipEntry and calculate the Crc for the file on disk & compare them.

dominicm16:06:50

Not sure how fast that would be, but maybe worth doing on failure to launch or something. Crc32 is pretty fast.
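
A rough sketch of that suggestion in Java, assuming the zip is kept on disk next to the extracted files. The class `ExtractedFileCheck`, its method names, and the entry name passed in are hypothetical. It uses `ZipFile`, which reads the archive's central directory, so `getCrc` is available without reading the entry data first.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of deps.clj.
class ExtractedFileCheck {
    /** CRC-32 of a byte array, as an unsigned 32-bit value held in a long. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /**
     * True if the file on disk has the same CRC-32 as the named entry
     * recorded in the zip. Returns false if the zip cannot be opened,
     * the entry is missing, or the extracted file does not exist.
     */
    static boolean matchesZipEntry(Path zip, String entryName, Path extracted) {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            ZipEntry entry = zf.getEntry(entryName);
            if (entry == null || !Files.isRegularFile(extracted)) {
                return false;
            }
            return entry.getCrc() == crc32(Files.readAllBytes(extracted));
        } catch (IOException e) {
            return false;
        }
    }
}
```

Running this only after a failed launch keeps the cost off the normal startup path.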

borkdude16:06:44

oh that might be a good check, I'll try it out

borkdude19:06:34

I'm getting -1 on all entries from the tools.jar zip

borkdude19:06:47

which means "unknown"

borkdude19:06:18

ah, the crc is known after you read it

borkdude08:06:09

I wonder how I can purposely damage a .zip file to test this, because I'm still not 100% sure that what I did makes any sense. I'm getting the crc32 of the entry, then reading the entry through a CheckedInputStream, and then comparing the two crc32 values. But who says those aren't always the same, even for a weirdly downloaded zip file?

borkdude09:06:50

When I truncate some of the last bytes, I can't even unzip it:

dd if=clojure-tools-1.11.1.1347.zip of=tools-corrupted.zip bs=1 count=17999800
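
The same truncation experiment can be sketched in Java with a tiny in-memory archive standing in for the real tools zip. The class `TruncatedZipDemo` and its methods are hypothetical. Because the zip format keeps its central directory at the end of the file, `java.util.zip.ZipFile` is expected to refuse an archive whose tail is missing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of deps.clj.
class TruncatedZipDemo {
    /** Builds a small one-entry zip in memory (stand-in for the tools zip). */
    static byte[] sampleZip() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
                zos.putNextEntry(new ZipEntry("tools.jar"));
                zos.write("pretend jar contents".getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if ZipFile can open the bytes and finds at least one entry. */
    static boolean opensAsZip(byte[] zipBytes) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("zip-check", ".zip");
            Files.write(tmp, zipBytes);
            try (ZipFile zf = new ZipFile(tmp.toFile())) {
                return zf.size() > 0;
            }
        } catch (IOException e) {
            // ZipException (a subclass of IOException) lands here too.
            return false;
        } finally {
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temp file.
                }
            }
        }
    }

    public static void main(String[] args) {
        byte[] good = sampleZip();
        // Simulate a partial download by keeping only the first half.
        byte[] truncated = Arrays.copyOf(good, good.length / 2);
        System.out.println("intact opens:    " + opensAsZip(good));
        System.out.println("truncated opens: " + opensAsZip(truncated));
    }
}
```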

dominicm09:06:42

I would expect that the file either unzips falsely or is "damaged" after the fact.

dominicm09:06:59

If you keep the zip around you can revalidate the files against it.

borkdude09:06:16

how would that work?

borkdude09:06:36

the crc32 codes are -1 when I read them, which means they weren't in the zip file when the file got zipped, right

dominicm09:06:59

I'm not sure I follow

borkdude09:06:02

they only become some number after I've actually processed the entry, which tells me it's lazily computed based on the data that was already there, which is kind of pointless

dominicm09:06:21

You need to consume the stream for the CRC to be calculated, yeah

dominicm09:06:38

So you could just read the whole thing and then check it against the CRC of what's on disk
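
A small Java sketch of the behaviour described here, using an in-memory archive written by `ZipOutputStream` with its default DEFLATED method. The class `LazyCrcDemo` and its methods are hypothetical. In this setup the CRC is stored in a data descriptor after the entry data, so `ZipInputStream` reports -1 until the entry has been read to the end.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of deps.clj.
class LazyCrcDemo {
    /** One-entry zip written with the default (DEFLATED) method. */
    static byte[] zipOf(String name, byte[] data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(data);
                zos.closeEntry();
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {crc before reading, crc after reading} for the first entry. */
    static long[] crcBeforeAndAfter(byte[] zipBytes) {
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry = zis.getNextEntry();
            long before = entry.getCrc();
            byte[] buf = new byte[4096];
            while (zis.read(buf) != -1) {
                // Drain the entry so the stream reaches the data descriptor.
            }
            long after = entry.getCrc();
            return new long[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Opening the same archive with `java.util.zip.ZipFile` instead gives the CRC up front, because that class reads the central directory at the end of the file.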

borkdude09:06:01

I'm checking this:

(= (.getCrc entry) (-> cis (.getChecksum) (.getValue)))
but you are suggesting calculating the crc32 of the file on disk, rather than from the checked input stream which was used to copy to disk?

dominicm09:06:21

File on disk compared with the one in the entry

dominicm09:06:37

Sorry, not both *

borkdude09:06:37

so not based on the checked input stream?

dominicm09:06:48

I think they're both the same value tbh.

borkdude09:06:05

yeah, I think so too, the above comparison is always true, I think, no matter how corrupted

dominicm09:06:20

You can use the file Vs the zip as a checksum

dominicm09:06:45

Although the CRC should be written in the zip somewhere, too

borkdude09:06:48

so comparing to on disk would detect some kind of data write failure?

borkdude09:06:02

yeah, I think you need to produce a zip file with explicit crc32 on

dominicm09:06:07

Or if the file had been modified (eg by antivirus)

dominicm09:06:17

Oh I thought zips had them by default.

dominicm09:06:30

It could be a limitation of the Java zip library, too

borkdude09:06:24

ah right it has crc32 by default

borkdude09:06:46

will keep an eye out until the next failure and will ask for the zip file so I can double-check if this really helps

borkdude09:06:04

I guess if I could manually change the crc code of the tools jar in an existing .zip file, I could test if the current deps.clj would detect this (and give the better error message)

dominicm09:06:38

You could also create a new zip with a different tools jar in ☺️

dominicm12:06:01

The CRC of the tools jar in the zip would be different than the one on disk.

borkdude12:06:00

but not if you unzip that zip with the different tools jar

borkdude12:06:19

I just want a zip file that I can throw at this function and then the function complains

dominicm12:06:56

Ah, then you’re probably into some kind of malicious crc fiddling of the zip file, yeah.

dominicm12:06:00

Time to pull out your hex editor 😄
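
As an alternative to a hex editor, a deliberately broken test archive can be generated in code. The sketch below is hypothetical (class `CorruptZipFixture` and all its names are made up for illustration): it writes one STORED (uncompressed) entry so the payload appears verbatim in the archive, then flips one payload byte while leaving the recorded CRC untouched. Reading the result through `ZipInputStream` should then fail with a `ZipException`, since that class compares the recorded CRC with the data it actually read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical test fixture, not part of deps.clj.
class CorruptZipFixture {
    static final byte[] PAYLOAD =
        "pretend tools jar bytes".getBytes(StandardCharsets.US_ASCII);

    /** Zip with one STORED entry; STORED requires size and CRC up front. */
    static byte[] storedZip() {
        try {
            CRC32 crc = new CRC32();
            crc.update(PAYLOAD, 0, PAYLOAD.length);
            ZipEntry entry = new ZipEntry("tools.jar");
            entry.setMethod(ZipEntry.STORED);
            entry.setSize(PAYLOAD.length);
            entry.setCompressedSize(PAYLOAD.length);
            entry.setCrc(crc.getValue());
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
                zos.putNextEntry(entry);
                zos.write(PAYLOAD);
                zos.closeEntry();
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copy of the archive with one payload byte flipped. */
    static byte[] corrupt(byte[] zipBytes) {
        byte[] copy = zipBytes.clone();
        copy[indexOf(copy, PAYLOAD)] ^= 0x01;
        return copy;
    }

    /** Index of the first occurrence of needle in haystack. */
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        throw new IllegalArgumentException("payload not found in archive");
    }

    /** True if every entry can be read to the end without an I/O error. */
    static boolean readsCleanly(byte[] zipBytes) {
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            byte[] buf = new byte[4096];
            while (zis.getNextEntry() != null) {
                while (zis.read(buf) != -1) {
                    // Drain the entry; the CRC is checked at its end.
                }
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Writing `corrupt(storedZip())` to a file gives a zip that a CRC-checking unzip function should complain about.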
