#clojure
2020-01-12
Cora (she/her)04:01:28

this is more of a java question, but does java ever have access to more than one filesystem? there's FileSystems/getDefault and the docs say "The default file system, obtained by invoking the FileSystems.getDefault method, provides access to the file system that is accessible to the Java virtual machine."

Cora (she/her)04:01:22

and "the file system that is accessible to the Java virtual machine" implies there's one that's accessible

Cora (she/her)04:01:32

and that any others are custom ones you have to manually provide

seancorfield04:01:13

Yeah, it supports a file system abstraction over other things as well, such as a ZIP file, for example. I use that in depstar
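
A minimal sketch of the ZIP case mentioned above, using only JDK classes: it creates a throwaway ZIP archive and opens it as a second FileSystem alongside the default one. The function name `zip-fs-demo` and the temp-directory layout are assumptions made for illustration; this is not depstar's code.

```clojure
(import '(java.net URI)
        '(java.nio.file Files FileSystems)
        '(java.nio.file.attribute FileAttribute))

(defn zip-fs-demo
  "Hypothetical helper: opens a fresh ZIP archive as its own FileSystem
  and compares it with the default one."
  []
  (let [dir      (Files/createTempDirectory "zipfs-demo" (make-array FileAttribute 0))
        zip-path (.resolve dir "demo.zip")
        ;; The JDK's ZIP provider is addressed with a jar: URI.
        uri      (URI/create (str "jar:" (.toUri zip-path)))]
    ;; "create" "true" asks the provider to create the archive if it is missing.
    (with-open [zfs (FileSystems/newFileSystem uri {"create" "true"})]
      {:default-separator (.getSeparator (FileSystems/getDefault))
       :zip-separator     (.getSeparator zfs)
       :same-fs?          (= zfs (FileSystems/getDefault))})))
```

The ZIP file system is a distinct FileSystem instance with its own separator, which is why path joining cannot always assume the default.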

didibus04:01:20

Well, I'm not too sure, but I'd assume there can be many filesystems in use, say you have a USB key plugged in for example, and your hard-drive, that would now be two accessible file systems

Cora (she/her)04:01:29

are those FileStores, though?

Cora (she/her)04:01:42

if it's not mounted, is it even accessible?
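
One way to explore the FileStore question at a REPL is to ask the default FileSystem what it can see. This is a sketch under the assumption that mounted volumes show up as FileStores and roots of the single default FileSystem; `default-fs-summary` is a made-up name, and the output depends entirely on the machine.

```clojure
(import '(java.nio.file FileStore FileSystems))

(defn default-fs-summary
  "Hypothetical helper: summarizes what the default FileSystem exposes."
  []
  (let [fs (FileSystems/getDefault)]
    {:separator (.getSeparator fs)
     ;; Root directories, e.g. "/" on Unix or drive letters on Windows.
     :roots     (mapv str (.getRootDirectories fs))
     ;; Names of the underlying stores (volumes, partitions, mounts).
     :stores    (mapv (fn [^FileStore s] (.name s)) (.getFileStores fs))}))
```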

didibus04:01:03

Hum, ya, that's where I'm not really sure of any details 😛

Cora (she/her)04:01:38

hmmm that's interesting

Cora (she/her)04:01:42

I'm trying to write a clojure filesystem abstraction over java.io.File and java.nio.* -- FileSystems determine a lot of things like file separators and similar

Cora (she/her)04:01:24

and so if the library is being asked to join paths, for instance, you want to know the file system

noisesmith19:01:08

a File object can already do path joining for you in the constructor and static methods as appropriate, doesn't that already resolve the issue?

Cora (she/her)19:01:37

it defaults to the default file system for doing that

noisesmith19:01:55

it should use the fs that the File itself would end up on?

noisesmith19:01:21

I'm probably missing something here

Cora (she/her)19:01:22

it would use the default one, yes. but there are use cases for other file systems

Cora (she/her)19:01:28

like zip, for instance

noisesmith19:01:58

what I'm saying is the constructor for File should know which fs it ends up on based on the path, and use the correct separator

Cora (she/her)19:01:55

you can make your own filesystem with your own custom file separator even

Cora (she/her)19:01:33

so if you want to support these file systems in your file api you need to make it so the file system can be changed

Cora (she/her)19:01:55

but that's a drag since 99.99999% of the time you're just dealing with the default file system

Cora (she/her)19:01:04

so that's what I landed on, just having support for that

Cora (she/her)19:01:02

that's what most things land on, too, it seems, because supporting anything else is a huge pain in the ass

Cora (she/her)04:01:33

I'm thinking I'll use a dynamic var where you can provide another filesystem but default to using FileSystems/getDefault

didibus04:01:59

Paths is one thing I was thinking

didibus04:01:34

Like if a NTFS partition is mounted on linux

didibus04:01:04

Though I'm guessing "mounting" would actually have the OS adapt the NTFS filesystem to the current one

seancorfield04:01:13

@corasaurus-hex dynamic variables are... problematic... start by assuming the filesystem will be passed into all your functions (and provide an extra arity that calls each function with the default filesystem).
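
The extra-arity suggestion could look roughly like this. It is a sketch, not the library's actual API: `join-path` and its argument order are assumptions chosen for illustration.

```clojure
(import '(java.nio.file FileSystem FileSystems))

(defn join-path
  "Hypothetical function: joins path segments using the rules of a file system.
  The one-argument arity falls back to the default file system."
  ([segments]
   (join-path (FileSystems/getDefault) segments))
  ([^FileSystem fs segments]
   ;; FileSystem.getPath takes a first segment plus a String array of the rest.
   (str (.getPath fs
                  (first segments)
                  (into-array String (rest segments))))))
```

Callers who never think about file systems use `(join-path ["a" "b"])`, while code working with a ZIP or test file system passes it explicitly, with no dynamic binding involved.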

seancorfield04:01:52

Where shall I start? 🙂

seancorfield04:01:55

I don't have the time/patience tonight for it to be honest, sorry. Just don't do it 🙂

seancorfield04:01:14

I'll be happy to explain when I'm back at work on Monday tho'...

Cora (she/her)04:01:16

if you think of any writing on it please share

didibus04:01:21

I only know that you have to be careful not to depend on them in a context that runs after the binding is gone.

didibus04:01:39

And thread boundaries

seancorfield04:01:05

It's really problematic if a library uses a dynvar because then you can't have multiple instances in use and, yeah, the threading issues. And it makes testing harder.

seancorfield04:01:18

And it's not considered idiomatic either.

didibus04:01:58

Hum.. isn't the whole point of dynamic vars that they enable multiple instances?

seancorfield04:01:25

@didibus No, they break that use case.

didibus04:01:04

I'm not following that I guess. Each caller has its own binding established no?

seancorfield04:01:38

For example, when I took over clojure.java.jdbc, it relied on a dynvar for the DB and that made it really hard to use in code that needed to talk to multiple databases because you end up being forced to call binding everywhere around every call to the library.

Cora (she/her)04:01:31

in this case I'm using a dynvar for cwd, which has to be used with a (with-cwd new-cwd body) in order to temporarily change it

didibus04:01:50

Ok, I see what you mean in that sense.

Cora (she/her)04:01:59

because you can't change the cwd otherwise in java

Cora (she/her)04:01:28

and I'd do something similar with a filesystem

Cora (she/her)04:01:48

(def ^{:dynamic true
       :doc "The current file system"}
  *file-system*
  (FileSystems/getDefault))

(defn file-system
  []
  *file-system*)

(defn change-file-system
  "must be called within a with-file-system"
  [file-system]
  (set! *file-system* file-system))

(defmacro with-file-system
  [file-system & body]
  `(binding [*file-system* *file-system*]
     (change-file-system ~file-system)
     ~@body))

didibus04:01:01

I normally use dynvars for cross-cutting concerns

didibus04:01:15

Like injecting a specific logger or metric object

Cora (she/her)04:01:40

and then use the *file-system* to resolve paths

didibus04:01:13

They're also better than global configs.

didibus04:01:35

But Sean is right, that for options, having an options map might be better

Cora (she/her)04:01:31

I'm not convinced but am totally open to being convinced

Cora (she/her)04:01:37

this is basically for scripting contexts

Cora (she/her)04:01:03

there's a trade-off to be made here and I want to understand it before I make it

Cora (she/her)04:01:26

I get that it's an implicit argument to every function, essentially

Cora (she/her)04:01:45

and if you pass a handle to another thread then things get funky

Cora (she/her)04:01:56

but I have a hard time imagining someone running into an issue in this context

Cora (she/her)04:01:39

and people would rarely need to ever swap out the filesystem, so making it an option to every function seems like a lot

Cora (she/her)04:01:56

I'll have to think more about this

didibus04:01:47

Well, I'm not against them, and I do use them, like I said, for cross-cutting concerns.

Cora (she/her)04:01:01

yeah, I could see it for logging or debug flags and stuff

Cora (she/her)04:01:11

this is pretty essential stuff

didibus04:01:19

As long as you are careful around lazy-seq and thread boundaries.

didibus04:01:41

But I think what I got from Sean and his java.jdbc example, it is a usability thing for the library

didibus05:01:04

Say I want the cwd to be set to x/y/z for a number of operations

didibus05:01:48

Now I need to either do everything inside one (with-cwd ...) or I need to keep repeating (with-cwd x/y/z) over and over in many places

didibus05:01:20

Whereas it might be better to allow the user to do: (def options {:cwd "x/y/z"})

didibus05:01:12

And then do whatever I want passing those along
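
A small sketch of the options-map style described here, assuming a hypothetical `resolve-path` function that takes the options as its first argument (neither the name nor the `:cwd` default of "." comes from the library under discussion):

```clojure
(import '(java.nio.file Paths))

(defn resolve-path
  "Hypothetical function: resolves `path` against the :cwd in `opts`."
  [{:keys [cwd] :or {cwd "."}} path]
  (let [base (Paths/get cwd (make-array String 0))]
    (str (.resolve base ^String path))))

(def options {:cwd "x/y/z"})

;; The same options value can be reused anywhere, without nesting:
(def in-xyz (resolve-path options "file1.txt"))

;; A one-off variation is just another map:
(def in-abc (resolve-path (assoc options :cwd "a/b/c") "file1.txt"))
```

Because the options are an ordinary value, they also survive lazy sequences and thread hand-offs without any special care.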

Cora (she/her)05:01:13

and I guess if you wanted to change it for a block you could use a let for options

Cora (she/her)05:01:22

instead of a dynvar

seancorfield05:01:57

You need to consider whether folks using your lib are likely to need multiple filesystems or not (in your case).

didibus05:01:07

Ya, and if I really didn't want to repeat myself, I could partial all the functions or something like that too, if I don't like typing options on every call

seancorfield05:01:15

But depstar is pretty simple and it already needs two filesystems.

Cora (she/her)05:01:46

I'm really writing some nice filesystem stuff for use in babashka so my primary concern is the scripting context

Cora (she/her)05:01:05

but if it can be more broadly applicable I'd like to consider it

didibus05:01:06

It can depend if you're going for ultimate convenience or flexibility.

didibus05:01:28

Taking an options map I think is a good middleground between the two.

Cora (she/her)05:01:28

I mean I can add an extra layer for convenience

Cora (she/her)05:01:50

and have a lower level that's option-heavy

Cora (she/her)05:01:13

that might be a good choice

didibus05:01:22

Ya, that can make sense too. Probably easier to add bindings over an API that takes options already than the other way around

Cora (she/her)05:01:29

I already had a plan to add a posix-oriented convenience layer

didibus05:01:16

But, if you're imagining it being command line driven, I could see even going for a global context.

Cora (she/her)05:01:38

there's still threading in babashka

Cora (she/her)05:01:48

and so a global context probably isn't great still

Cora (she/her)05:01:51

(def ^{:doc "The current user's home directory."}
  home
  (str (System/getProperty "user.home")))

(defn file-separator
  [& {:keys [path file-system]}]
  (cond
    file-system (.getSeparator file-system)
    path (-> (as-path path) .getFileSystem .getSeparator)
    :else (File/separator)))

(defn expand-home
  "Takes a path and expands leading reference to `~` to be the current user's home directory."
  [path]
  (if (.startsWith path "~")
    (str home (subs path 1))
    path))

(defn file-separator-regex
  [& {:keys [path file-system]}]
  (re-pattern (str "\\Q"
                   (file-separator :path path
                                   :file-system file-system)
                   "\\E")))

Cora (she/her)05:01:57

I'm already doing some of this

Cora (she/her)05:01:07

with taking a path and a filesystem to get path separators

didibus05:01:41

Well, I mean like, would you ever want two cwd at the same time? If not, the global context can be locked or using an atom under the hood for example

Cora (she/her)05:01:01

I think I'd like to have two cwd, yeah

Cora (she/her)05:01:10

different threads working in different directories

didibus05:01:13

(change-cwd "x/y/x") could swap! some atom for example

didibus05:01:21

Ah okay then

Cora (she/her)05:01:41

hmmm it might be useful to swap out the default for testing

Cora (she/her)05:01:01

because things like File/separator use the default filesystem

Cora (she/her)05:01:16

and if I could swap it out for a fake filesystem it would at least make tests more meaningful

Cora (she/her)05:01:35

(deftest file-separator
  (testing "returns the default file separator"
    (is (= "/" (fs/file-separator))))
  (testing "returns the file separator for a path"
    (is (= "/" (fs/file-separator :path "/"))))
  (testing "returns the file separator for a file system"
    (is (= "/" (fs/file-separator :file-system (FileSystems/getDefault))))))

Cora (she/her)05:01:48

that test is pretty useless, and specific to the current filesystem

Cora (she/her)05:01:21

so a dynvar might be fine but don't give easy access to change it

didibus05:01:17

Dynvar or options map are similar-ish, but with dynvar you need to nest things to use the configured values, where with options you can have things flat and thus more easily reused in varying contexts. I'd say that's the biggest difference in terms of usability.

didibus05:01:51

Do you know if babashka does binding conveyance ?

didibus05:01:55

Like on future?

Cora (she/her)05:01:55

I don't know, no

didibus05:01:47

I tried, it seems it does

didibus05:01:06

user=> (def ^:dynamic *a* 100)
#'user/*a*
user=> (binding [*a* 200] *a*)
200
user=> (binding [*a* 200] @(future *a*))
200

Cora (she/her)05:01:07

I figured it would but was about to test

didibus05:01:30

And this is an example of what to be aware of with dynvars:

user=> @(binding [*a* 200] (delay *a*))
100

didibus05:01:01

So that's something your users would need to be aware of as well

Cora (she/her)05:01:41

the global state isn't great

Cora (she/her)05:01:18

well, I need to think on this

Cora (she/her)05:01:20

and play with it

Cora (she/her)05:01:37

if it's something I'll do it's something that will be optional

didibus05:01:44

user=> (def ^:dynamic cwd "z/y/z")
#'user/cwd
user=> (defn read-file "A mock to demonstrate" [file] (str cwd))
#'user/read-file
user=> (binding [cwd "a/b/c"] (map read-file ["file1.txt" "file2.txt"]))
("z/y/z" "z/y/z")

didibus05:01:07

This is probably the worst offender which will trip up people

Cora (she/her)05:01:47

yeah that's definitely a problem. it only works if you stick to this library's functions

Cora (she/her)05:01:18

and I have to be really really really careful

Cora (she/her)05:01:23

ok I'm convinced

didibus05:01:02

well I'm making up here that read-file is from your library, but because I call it in a lazy seq and it is realized after the binding context has exited, it is using the wrong cwd
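
For completeness, a sketch of the lazy-seq pitfall next to two common workarounds: forcing the sequence inside the binding (here with `mapv`), or capturing the bindings with `bound-fn*`. The `*cwd*` var and `read-file` are the same kind of mock as in the REPL session above, renamed for illustration.

```clojure
(def ^:dynamic *cwd* "z/y/z")

(defn read-file
  "A mock: reports which cwd it would have used for `file`."
  [file]
  (str *cwd* "/" file))

(def files ["file1.txt" "file2.txt"])

;; Lazy: nothing runs until the seq is consumed, after the binding is gone.
(def lazy-result
  (binding [*cwd* "a/b/c"] (map read-file files)))

;; Eager: mapv realizes everything while the binding is still in effect.
(def eager-result
  (binding [*cwd* "a/b/c"] (mapv read-file files)))

;; bound-fn* captures the current bindings and reinstates them on each call.
(def bound-result
  (binding [*cwd* "a/b/c"] (map (bound-fn* read-file) files)))
```

Both workarounds put the burden on the caller, which is the usability cost being discussed.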

didibus05:01:15

haha, ya, I mean again you'd know best how you anticipate your lib to be used and if these edge cases would be common issues faced by users or not

didibus05:01:48

just laying down the edge cases I've faced in the past with dynvars

Cora (she/her)05:01:05

I really appreciate it. I'd have to supply functions that replicate all the core functions that deal with files so that they include consideration for cwd and any other library functions wouldn't work unless the user supplied adapters themselves

Cora (she/her)05:01:28

it's definitely a problem

borkdude08:01:04

@didibus Yes, babashka supports binding conveyance

murtaza5213:01:34

I have a list of symbols (def my-sym '[ a b]). These symbols have not been declared yet. In a function -

(defn abc
  [a b]
 my-sym)
I want the above fn (abc 1 2) => [1 2] however the above returns the unevaluated [a b] symbol names. How do I get the fn to return [1 2]

Premm Krishna Shenoy07:01:51

If I am right, def is evaluated when the file is loaded, not at runtime. http://squirrel.pl/blog/2012/09/13/careful-with-def-in-clojure/

kulminaator13:01:21

with def you define a value

kulminaator13:01:00

why do you expect it to behave like a function afterwards ? 🙂

kulminaator13:01:21

or like a macro .. i'm not even sure what you have meant here

murtaza5213:01:14

There is a list of symbols which I need to reuse in multiple places, I could have written the fn simply to return [a b], but then I would need to keep my multiple fns in sync

kulminaator13:01:54

i think you are expecting def to behave like #define in C

kulminaator13:01:21

like a templating tool ? 🙂 but that's not what it is ...

jumpnbrownweasel13:01:54

It is usually not a good idea to assume local var names when reusing code. But perhaps something like this would help remove redundancy?

(defn abc [& args]
  (vec args))

vlaaad15:01:21

Also known as vector

jumpnbrownweasel18:01:46

Right :-) I had to assume that the example given was a partial snippet, but if that's all that's needed then of course vector will do it.

jumpnbrownweasel13:01:30

Macros can be used to do such things, but like I say it is not a good idea to assume local var names.

murtaza5213:01:08

Can't I make it work without macros? Something like (eval (map symbol my-sym)) (which gives an error)

Mustakim Patvekar13:01:29

Obviously it will give an error, because when you try to

(eval [a b])

it will say
Unable to resolve symbol: a in this context
because a is not defined in that context

Mustakim Patvekar13:01:15

If you want a function which simply returns a vector of args passed to it, you can use the function abc as suggested by @jumpnbrownweasel

jumpnbrownweasel13:01:44

There may be some way to get eval to work for this. But the common practice is that macros and eval should be avoided unless a function really won't meet the need. I don't use eval at all.
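
To make the eval point concrete: `eval` compiles its form against namespace-level vars and never sees a function's local parameters. The sketch below shows that failure being caught, followed by one data-driven alternative that keeps a single shared list without macros. `abc-eval`, `abc-data`, and `my-keys` are hypothetical names, and the example assumes no global `a` or `b` is defined in the current namespace.

```clojure
(def my-sym '[a b])

(defn abc-eval
  "Tries the eval approach; the locals a and b are invisible to eval."
  [a b]
  (try
    (eval my-sym)
    (catch Exception _
      :unresolved)))

;; Alternative: share keys instead of symbols and pass values in a map.
(def my-keys [:a :b])

(defn abc-data
  "Looks up each shared key in the params map, in order."
  [params]
  (mapv params my-keys))
```

With this shape, `(abc-data {:a 1 :b 2})` yields the values in the order given by `my-keys`, and every function that reuses `my-keys` stays in sync automatically.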