This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-11-24
Good morning and tgif
Morning!
Hi everyone, I need some help with handling zip files and keeping the extracted entries with their content in memory (in a vector or map or anything) for further processing.
I have this function below (based on https://stackoverflow.com/a/5428265)
This function works fine if called directly but as soon as I want to do anything with the entries from the zip the stream is closed.
To make sure the file stays open I have a with-open, and a doall to make sure the lazy sequence is realized.
What am I missing that could be causing the side effects of the stream closing here? I have tried several things but all seem to come back to either the stream closing prematurely or the zipfile being closed before anything can be done with it.
(defn unzip
  [{:keys [^File file path filename]}]
  (let [zipfile (or file (io/file (str path filename)))]
    (with-open [zf (ZipFile. zipfile)]
      (doall ; initialize the lazy sequence immediately
       (map
        #(assoc {}
                :filename (.getName %)
                :data (map
                       (fn [entry]
                         (string/trim entry))
                       (line-seq (io/reader (.getInputStream zf %)))))
        (enumeration-seq
         (.entries zf)))))))
When I call it like this I can get all entries just fine:
(unzip {:file (io/file "path/to/file.zip")})
As soon as I try anything that creates a new lazy sequence, the stream is closed prematurely.
Switching the let and with-open results in "No matching field found: close for class java.io.File".
Any difference if you only pass the name, not the File object? ... 'cos that should be wrapped in with-open too 🙂
slurp as a replacement for io/file? That results in an exception with the actual zip content bytes being thrown, haha.
Execution error (InvalidPathException) at sun.nio.fs.UnixPath/checkNotNul (UnixPath.java:90).
What is it that you want to do? Open the zipfile, mutate some stuff and then zip it again?
No, open the zip, get the contents (XML files) and send each entry (in memory) as a sequence to another function.
I don't mind either way, as each file (entry) in the zip would be sent off to another function. The XML will be read as a line-seq after getting it from the zip anyway. I now have this inside the unzip.
Something like this should work:
(defn process-zip [zip-file-path f]
  (with-open [zip-file (java.util.zip.ZipFile. zip-file-path)]
    (let [entries (.entries zip-file)]
      (doseq [entry (enumeration-seq entries)]
        (with-open [entry-stream (.getInputStream zip-file entry)]
          (f entry-stream))))))
(process-zip "example.zip" println)
The let didn't add much:
(defn process-zip [zip-file-path f]
  (with-open [zip-file (java.util.zip.ZipFile. zip-file-path)]
    (doseq [entry (enumeration-seq (.entries zip-file))]
      (with-open [entry-stream (.getInputStream zip-file entry)]
        (f entry-stream)))))
This has the side effect of not returning anything, due to using doseq, though. That means f should contain everything that needs fiddling with 🙂 This is something I can work with; I'm just wondering if there is a way to get the data out, to be able to continue processing outside of the doseq.
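One possible way to get the data out instead of relying on side effects is to build the result eagerly while the ZipFile is still open. The sketch below is only an illustration: read-zip and write-sample-zip! are hypothetical names that do not appear in the thread, and it assumes the entries are small enough to slurp into memory.

```clojure
(ns example.zip-entries
  (:require [clojure.string :as string])
  (:import (java.io File FileOutputStream)
           (java.util.zip ZipEntry ZipFile ZipOutputStream)))

(defn read-zip
  "Hypothetical helper: returns a vector of {:filename ... :data [...]} maps,
   one per entry. Everything is realised before with-open closes the ZipFile."
  [zip-file-path]
  (with-open [zip-file (ZipFile. ^String zip-file-path)]
    ;; mapv is eager, so every entry is read while the ZipFile is open
    (mapv (fn [^ZipEntry entry]
            (with-open [entry-stream (.getInputStream zip-file entry)]
              {:filename (.getName entry)
               ;; slurp reads the whole stream at once, so nothing lazy escapes
               :data     (mapv string/trim
                               (string/split-lines (slurp entry-stream)))}))
          (enumeration-seq (.entries zip-file)))))

(defn write-sample-zip!
  "Hypothetical helper for trying this out: writes a two-entry zip to a
   temp file and returns its path."
  []
  (let [tmp (File/createTempFile "sample" ".zip")]
    (.deleteOnExit tmp)
    (with-open [out (ZipOutputStream. (FileOutputStream. tmp))]
      (doseq [[entry-name content] [["a.xml" "  <a/>  \n<b/>\n"]
                                    ["c.xml" "<c/>"]]]
        (.putNextEntry out (ZipEntry. ^String entry-name))
        (.write out (.getBytes ^String content "UTF-8"))
        (.closeEntry out)))
    (.getPath tmp)))

;; Usage: the returned vector can be processed after the zip is closed.
(read-zip (write-sample-zip!))
```

The trade-off is memory: the whole archive's contents are held at once, whereas the doseq version hands each stream to f and lets it go.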
The :data entry is a lazy sequence. doall realizes the first map, but inside the created structure is another map, which stays lazy after the file is closed.
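The effect described here can be reproduced without a zip file. The following is a minimal sketch using an in-memory reader; lazy-lines and eager-lines are hypothetical names made up for this illustration.

```clojure
(require '[clojure.string :as string])
(import '(java.io BufferedReader StringReader))

(defn lazy-lines
  "Returns a lazy seq that leaves with-open before it has been realised."
  [text]
  (with-open [rdr (BufferedReader. (StringReader. text))]
    (map string/trim (line-seq rdr))))

(defn eager-lines
  "Same as lazy-lines, but mapv reads every line while the reader is open."
  [text]
  (with-open [rdr (BufferedReader. (StringReader. text))]
    (mapv string/trim (line-seq rdr))))

;; Works: all lines were read before the reader was closed.
(eager-lines " a \n b ")

;; Fails: walking the seq asks the closed reader for more lines,
;; which throws (typically a java.io.IOException, "Stream closed").
(try
  (doall (lazy-lines " a \n b "))
  (catch Exception e
    (println "Could not realise the lazy seq:" (.getMessage e))))
```

In the unzip function above, the outer doall only walks the sequence of maps; it does not look inside each map, so the :data sequences behave like lazy-lines.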
@U1EP3BZ3Q adding another doall for the :data entry doesn't solve it, though.
@U6T7M9DBR I got it working with this, thanks! Moved everything inside f and added another binding for the entry name.
(defn process-zip [zip-file-path f]
  (with-open [zip-file (java.util.zip.ZipFile. zip-file-path)]
    (doseq [entry (enumeration-seq (.entries zip-file))]
      (with-open [entry-stream (.getInputStream zip-file entry)]
        (f entry entry-stream)))))
I realise that you've got it sorted now, but I didn't spot in the thread where you'd worked out what the cause of the problem was in your original code? If you have, feel free to ignore me 😉 ... but you had a lazy seq of lazy seqs and you were only doall-ing the outer one ...
(defn unzip
  [{:keys [^File file path filename]}]
  (let [zipfile (or file (io/file (str path filename)))]
    (with-open [zf (ZipFile. zipfile)]
      (doall (map ;; <- outer lazy seq that is realised inside the dynamic scope
              #(assoc {}
                      :filename (.getName %)
                      :data (map ;; <- inner lazy seq that isn't realised
                             (fn [entry]
                               (string/trim entry))
                             (line-seq (io/reader (.getInputStream zf %)))))
              (enumeration-seq
               (.entries zf)))))))
you could fix that by changing the second map to mapv, or use slurp there or something? Does that make sense? Or am I talking nonsense?
@U0P0TMEFJ I have indeed solved it using @U6T7M9DBR's example. Haha, can't believe it was that simple :man-facepalming::skin-tone-2: I changed the inner map to mapv and that magically solves the issue. I tried so many different approaches yesterday that I couldn't tell anymore what I had and hadn't tried. However, I did try using a doall for the inner map, but that didn't fix it.
So what's different in mapv from map, other than that it returns a realized vector instead of a lazy sequence? Because doall with map doesn't work.
(defn unzip
  [{:keys [^java.io.File file path filename]}]
  (let [zipfile (or file (clojure.java.io/file (str path filename)))]
    (with-open [zf (java.util.zip.ZipFile. zipfile)]
      (doall
       (map
        #(assoc {}
                :filename (.getName %)
                :data (doall
                       (map
                        (fn [entry]
                          (clojure.string/trim entry))
                        (line-seq (clojure.java.io/reader (.getInputStream zf %))))))
        (enumeration-seq
         (.entries zf)))))))
this works for me ... you need doall on both maps
Thanks @U0P0TMEFJ
yeah ... me too ... search-and-replacing map with mapv works for me in the code you posted ... I tend to use them over things like doall when mixing with IO or dynamically scoped things, to try and avoid these sorts of problems ...
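As a rough illustration of the difference being discussed: map returns an unrealised lazy seq, mapv builds a vector immediately, and doall forces a lazy seq and returns it. The var names below are made up for this sketch.

```clojure
(def lazy-result  (map inc [1 2 3]))         ; nothing computed yet
(def eager-vector (mapv inc [1 2 3]))        ; computed right away
(def forced-seq   (doall (map inc [1 2 3]))) ; lazy seq, walked to the end

(realized? lazy-result)   ; => false
(vector? eager-vector)    ; => true
(realized? forced-seq)    ; => true
(= eager-vector forced-seq) ; => true, same elements either way
```

So inside with-open, either mapv or a doall around the specific seq that touches the stream should have the same effect; the difference is mainly the returned collection type and how easy it is to forget the doall.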
I'll have a look at that doall again after lunch #omnomnom. I usually skip doall as much as I can too, as I also never use for.
IMHO it's slow and ugly :face_with_hand_over_mouth: I replace it with keep 👀 or just reduce
Get well soon @U0P0TMEFJ
yeah ... I tend to only use for when I want inner loops ... I find filter, map etc. easier to read ... thanks @U02CX2V8PJN
... 'nu breekt mijn klomp' (Dutch for "well, I'll be damned"). Seriously though, now it works using doall, wth. I guess I tried so many approaches that I got things mixed up or something.
thanks @U6T7M9DBR