This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-07-22
My first non-trivial contribution to a Clojure tool that I use every day got merged this morning! https://github.com/weavejester/cljfmt/commit/edee90c060ab8ffd92f8ad04610d56d34aac4bb7
Congratulations, very useful one! Personally I am convinced that 100% of hashmaps should be newline-delimited (and 80% of them should be vertically aligned... that's another story)
Thank you! I certainly like to have them newline delimited for my own code.
I wonder if this is a java bug
(ins)user=> (->> (io/file ".") (.listFiles) (filter #(= (str %) "./foo")))
(#object[java.io.File 0x1e6dad8 "./foo"])
(ins)user=> (->> (io/file ".") (.listFiles) (filter #(= (str %) "./foo")) (run! #(.delete %)))
nil
(ins)user=> (->> (io/file ".") (.listFiles) (filter #(= (str %) "./foo")))
()
(ins)user=> (->> (io/file ".") (.listFiles) (filter (comp #(contains? % (char 65533)) set seq str)))
(#object[java.io.File 0x6a9950f1 "./�Ϲ}^"] #object[java.io.File 0x7ad54c55 "./���JY"])
(ins)user=> (->> (io/file ".") (.listFiles) (filter (comp #(contains? % (char 65533)) set seq str)) (run! #(.delete %)))
nil
(ins)user=> (->> (io/file ".") (.listFiles) (filter (comp #(contains? % (char 65533)) set seq str)))
(#object[java.io.File 0xde18f63 "./�Ϲ}^"] #object[java.io.File 0x108bdbd8 "./���JY"])
the sakura terminal emulator creates weird files in my home directory, containing weird characters (and the content is some config), I can't get a shell glob to expand to the file name
java can find and delete other files, but there's something (at the fs layer?) that silently breaks on this input
I was able to delete these files in the past via find with a clever selection predicate, but I'm more interested in the strange phenomenon that I can create a File object from the name / path of a file, but that can't be used to correctly do fs operations on that file
Perhaps it converts the non-utf-valid parts of the filename to unicode "invalid character" on .listFiles?
do the java.nio.file.* functions work? eg. https://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html#delete(java.nio.file.Path)
the bytes as UTF-8
(ins)user=> (->> (io/file ".") (.listFiles) (filter (comp #(contains? % (char 65533)) set seq str)) (map #(seq (.getBytes (str %) "UTF-8"))))
((46 47 -17 -65 -67 -49 -71 125 94) (46 47 -17 -65 -67 -17 -65 -67 -17 -65 -67 74 89))
Well, I can't say if it's a valid "invalid char" unicode or an invalid unicode just by looking at it, so 🙂
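One way to check this, as a small sketch: U+FFFD, the `(char 65533)` used in the filter above, encodes in UTF-8 as the bytes EF BF BD, which Java shows as the signed values -17 -65 -67. The names `replacement-bytes`, `contains-subseq?` and `first-name-bytes` are made up for illustration; the byte list is copied from the REPL output above.

```clojure
;; UTF-8 encoding of U+FFFD REPLACEMENT CHARACTER, as signed Java byte values.
(def replacement-bytes
  (mapv long (.getBytes (str (char 65533)) "UTF-8")))
;; => [-17 -65 -67]

(defn contains-subseq?
  "True if `needle` occurs as a contiguous run inside `haystack`."
  [haystack needle]
  (boolean (some #(= (seq needle) %)
                 (partition (count needle) 1 haystack))))

;; First byte list from the REPL session above.
(def first-name-bytes [46 47 -17 -65 -67 -49 -71 125 94])

(contains-subseq? first-name-bytes replacement-bytes)
;; => true
```

So the String itself is well-formed: it holds a real U+FFFD character. That fits the idea that the original bytes were already replaced before the name reached Clojure.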
you can recreate the exact string using those bytes
user=> (map #(String. (byte-array %) "UTF-8") *1)
("./�Ϲ}^" "./���JY")
Yeah, that's not the problem; I don't remember any specifics about how java represents unicode and about encodings for invalid char seqs
So, does .exists work + return true on one of those java.io.File instances?
(ins)user=> (Paths/get (first names) (into-array String []))
#object[sun.nio.fs.UnixPath 0x5136207f "./�Ϲ}^"]
(ins)user=> (Files/delete *1)
Execution error (NoSuchFileException) at sun.nio.fs.UnixException/translateToIOException (UnixException.java:92).
./�Ϲ}^
is it still broken if you get the path from its parent directory, https://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#list-java.nio.file.Path- ?
ie. instead of creating a path from the name, get the path by calling list on the parent directory of the problematic file?
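A rough sketch of that suggestion, in case it helps. It uses `Files/newDirectoryStream` (a sibling of `Files/list`) so the `Path` objects come straight from the directory read rather than from a String. `delete-matching!` is a made-up helper name. Whether those `Path` objects keep undecodable name bytes intact depends on the JDK and platform, and that is not verified here.

```clojure
(import '(java.nio.file Files Path Paths))

(defn delete-matching!
  "Lists `dir` via Files/newDirectoryStream and deletes every entry whose
  file name (as a String) satisfies `pred`. Returns the deleted names."
  [dir pred]
  (let [dir-path (Paths/get (str dir) (into-array String []))]
    (with-open [stream (Files/newDirectoryStream dir-path)]
      ;; Collect matches first, then delete using the listed Path objects
      ;; themselves, never a Path rebuilt from the name String.
      (let [hits (filterv (fn [^Path p] (pred (str (.getFileName p)))) stream)]
        (run! (fn [^Path p] (Files/delete p)) hits)
        (mapv (fn [^Path p] (str (.getFileName p))) hits)))))

;; Hypothetical usage for the files in question:
;; (delete-matching! "." #(some #{(char 65533)} %))
```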
I will try that
could be a linux filesystem bug
or a corner case the fs never intended to handle
to be clear, here's my hypothesis: 1) filenames are not valid unicode, 2) java tries to represent them as valid unicode on dir read, and ends up having "invalid char" in filenames instead
the filesystem likely doesn't actually care and just stores file names as bytes, not caring about encoding
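The hypothesis can be reproduced without touching the filesystem. This is only a sketch with a made-up name (`raw-name` is hypothetical, not the real bytes of the sakura files): decoding bytes that are not valid UTF-8 into a String and encoding that String back gives different bytes, so a lookup by the re-encoded name would point at a file that does not exist.

```clojure
(import '(java.util Arrays))

(defn round-trip
  "Decodes raw bytes as UTF-8 the lenient way (malformed input becomes U+FFFD)
  and encodes the resulting String back to bytes."
  [^bytes raw]
  (.getBytes (String. raw "UTF-8") "UTF-8"))

;; Hypothetical on-disk name: "./", then 0xff (never valid in UTF-8), then "JY".
(def raw-name (byte-array (map unchecked-byte [0x2e 0x2f 0xff 0x4a 0x59])))

(def recoded (round-trip raw-name))

(seq recoded)
;; => (46 47 -17 -65 -67 74 89)   the 0xff byte came back as EF BF BD

(Arrays/equals ^bytes raw-name ^bytes recoded)
;; => false
```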
yeah, I think sakura overflowed a buffer and created a write-only filename that can't be used for lookup - the thing that actually lets me delete this file is running find, selecting the file based on a predicate, and deleting, and find is smart enough not to do a file -> name -> file translation
but not by prohibiting such names, or at least not always; details vary for different FSes
@noisesmith can't you somehow make a File from a binary string?
I could try, checking the API
so there is a mismatch between file names as bytes and filenames as strings of utf8/utf16 characters
I have only seen those options used for windows and mac filesystems (fat of various flavors and whatever the mac one was that started with h)
@hiredman that's actually a fair point; I have no idea, could be that e.g. ext[234] fs just mandates utf8 or something, indeed. or even not care, actually -- I don't know
Here’s the thing, though: on Unixes, paths are fundamentally bytes. The arguments and return types of the standard Posix OS interfaces open(2) and opendir(2) use C char* strings (because we still live in 1969).
This means that your operating system can, and does, lie about its filesystem encoding. As we discovered in the early days of beets, Linuxes everywhere often say their filesystem encoding is one thing and then give you bytes in a completely different encoding. The OS makes no attempt to avoid handing arbitrary bytes to applications. If you just call fn.decode(sys.getfilesystemencoding()) in an attempt to turn your paths into Unicode text, Python will crash sometimes.
https://beets.io/blog/paths.html not sure if they are authoritative
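For comparison with the Python behaviour quoted above, here is a sketch of a strict decode on the JVM. By default `String.` replaces malformed bytes silently; a `CharsetDecoder` set to `CodingErrorAction/REPORT` throws instead, which makes it possible to detect names that would not survive a round trip. `strict-utf8` is a made-up helper name.

```clojure
(import '(java.nio ByteBuffer)
        '(java.nio.charset CharacterCodingException CharsetDecoder
                           CodingErrorAction StandardCharsets))

(defn strict-utf8
  "Decodes raw bytes as UTF-8 and returns the String, or nil when the bytes
  are not valid UTF-8 (instead of silently substituting U+FFFD)."
  [^bytes raw]
  (let [^CharsetDecoder decoder
        (-> (.newDecoder StandardCharsets/UTF_8)
            (.onMalformedInput CodingErrorAction/REPORT)
            (.onUnmappableCharacter CodingErrorAction/REPORT))]
    (try
      (str (.decode decoder (ByteBuffer/wrap raw)))
      (catch CharacterCodingException _ nil))))

(strict-utf8 (.getBytes "./foo" "UTF-8"))
;; => "./foo"

(strict-utf8 (byte-array (map unchecked-byte [0x2e 0x2f 0xff])))
;; => nil
```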
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=4899439 is better, but all the filesystem bugs are marked as duplicates of that one bug
that looks like precisely what I'm seeing, so it's a weird linux behavior combined with java not being messy enough to accommodate
it would be funny to make a library on top of Unsafe for the sole purpose of accessing impossible files
you could probably use java native access (JNA) to directly call the filesystem apis
I've used https://github.com/Chouser/clojure-jna/ for a few projects and it works well