boot 2015-10-25 | Slack Archive

yep

yesterday i got a lot done

martinklepsch16:10:06

cool, then I won’t feel as bad for asking stuff on sundays 😉

micha16:10:09

my changes add full support for all boot env vars in boot.properties files

micha16:10:40

you can have boot.properties in the cwd, in a parent directory if it's a git repo, and in BOOT_HOME/boot.properties

micha16:10:13

it processes them in order, merging the properties, BOOT_HOME, then git project, then cwd

micha16:10:21

then it merges env vars

martinklepsch16:10:23

I had this idea yesterday of a higher order task that executes some other tasks with a scoped fileset, e.g. only stuff in boot/worker/

micha16:10:27

then it merges system properties
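
A minimal sketch of that merge order, assuming each source has already been read into a plain map and that a later source overrides an earlier one (as with Clojure's `merge`). The maps and version strings are made up for illustration; this is not boot's loader code.

```clojure
;; Hypothetical config maps, one per source, listed in merge order.
(def boot-home-props {"BOOT_VERSION" "2.3.0" "BOOT_CLOJURE_VERSION" "1.6.0"})
(def git-root-props  {"BOOT_CLOJURE_VERSION" "1.7.0"})
(def cwd-props       {"BOOT_VERSION" "2.4.2"})
(def env-vars        {})
(def system-props    {"BOOT_CLOJURE_VERSION" "1.8.0"})

(defn effective-config
  "Merge the sources left to right; a later source wins on key conflicts."
  [& sources]
  (apply merge sources))

(def config
  (effective-config boot-home-props git-root-props cwd-props
                    env-vars system-props))
```

Under these assumptions the cwd file decides `BOOT_VERSION` and the system property decides `BOOT_CLOJURE_VERSION`.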

micha16:10:59

also boot --version prints all config that starts with BOOT_

for any of those places

micha16:10:18

so when someone pastes the output we can see their whole setup

martinklepsch16:10:22

nice

micha16:10:37

yeah i made a macro for that

micha16:10:09

(with-env [{:dependencies ... :source-paths ...}] ...)

martinklepsch16:10:43

I got pretty far with an approach only using fileset https://gist.github.com/martinklepsch/6ff00508cc49158f270b

micha16:10:45

basically it just does what clojure's binding macro does

yeah you can do the same with the fileset
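
A rough sketch of the "does what binding does" idea, assuming the environment lives in an atom. `env` and `with-temp-env` are hypothetical names for illustration, not the macro micha wrote.

```clojure
;; Hypothetical stand-in for the build environment; not boot's real env atom.
(def env (atom {:dependencies '[[foo "1.0"] [bar "2.0"]]
                :source-paths #{"src"}}))

(defmacro with-temp-env
  "Merge `overrides` into env, run body, then restore the previous value
  even if body throws."
  [overrides & body]
  `(let [saved# @env]
     (swap! env merge ~overrides)
     (try
       ~@body
       (finally (reset! env saved#)))))

;; The override is visible only while the body runs.
(def inside
  (with-temp-env {:source-paths #{"boot/worker"}}
    (:source-paths @env)))
```

Restoring the atom only resets the map; it does not unload anything that was already added to a classloader while the body ran.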

martinklepsch16:10:11

Not sure if that’s a total mess or not but it felt pretty sweet to just derive all from fileset

micha16:10:17

you'll need a macro

martinklepsch16:10:29

what for?

micha16:10:31

yeah i like it

micha16:10:30

what is the "directory in fileset to use as root"?

martinklepsch16:10:17

if I have :source-paths #{"boot"} I could do something like this:

(with-dir
    :dir (-> settings :pod :dir)
    :task (comp (pom) (aot :all true) (jar :file "boot-pod.jar")))

micha16:10:02

what is settings there?

martinklepsch16:10:33

map of stuff, what’s being passed as :dir is "pod/src/"

alandipert16:10:38

i wonder if the "env" should be a fileset instead of an atom

alandipert16:10:56

am i on a different page?

micha16:10:57

there is no point i don't think

micha16:10:04

since you can't undo adding something to the classpath

micha16:10:12

i mean not really

micha16:10:16

you can remove it from the atom

micha16:10:23

but that doesn't remove it from the classloader

micha16:10:44

when you make a pod you pass it an immutable env map

martinklepsch16:10:31

@alandipert: I had that thought too

alandipert16:10:40

what about a pod with a fileset in it

micha16:10:03

tasks need to run in the main thread though

martinklepsch16:10:18

@micha: why?

micha16:10:22

if you're not running it in the main thread then it's just a function in a pod which we already have

micha16:10:58

well you can already run functions in pods, the main thread is the place where you can mutate the state of the build jvm environment

micha16:10:06

which is where tasks are useful

alandipert16:10:33

i guess no matter what there needs to be a calling environment

alandipert16:10:50

we can add more things to the "world" inside the env/fileset, but there still needs to be something that takes that as an argument

micha16:10:40

there is the option of using a shebang boot script

micha16:10:48

and kicking off multiple boots

micha16:10:02

you can look at the boot.App class

micha16:10:11

there is the runBoot() method

micha16:10:14

static method

micha16:10:20

that's what i used in the boot server experiment

micha16:10:41

you can run completely separate boot builds that way

and now that toby fixed our leak problem it should actually work

martinklepsch16:10:03

the issue I have with set-env! is that it feels unintuitive how it affects watching etc. tasks that set-env! some paths don’t compose very well with watching

micha16:10:48

yeah i suspect that shuffling the env around isn't the real solution

martinklepsch16:10:57

yeah

micha16:10:07

the shebang thing though

martinklepsch16:10:12

that’s why I was looking into scoping while fileset is being passed around

micha16:10:16

that could be the way to do a multi-build like boot

micha16:10:44

like actually start multiple builds at once

micha16:10:53

and let them run in their own threads/runtimes

martinklepsch16:10:27

sounds awesome

martinklepsch16:10:38

you mentioned removing things from classpath isn’t possible, can you think of another approach that would allow temporary scoping as it would be needed in the with dirs task?

micha16:10:01

i think you'll end up with weird edge cases

micha16:10:23

when i used it it was just to hide things from tasks that do their work in a pod

micha16:10:37

like the boot-jetty task, for example, i wanted to hide some dependencies from it

micha16:10:49

because when you make a pod you use the existing env plus whatever deps the pod has

micha16:10:08

so i used my macro to temporarily hide some deps from the task, and then replace them

micha16:10:33

it only needed to hide them during the task construction phase

martinklepsch16:10:34

couldn’t you just have filtered deps before creating the pod?

micha16:10:42

it wasn't my pod

micha16:10:14

in general i think it's good how we make pods in tasks

micha16:10:24

by adding pod deps to the global env deps

micha16:10:43

because that gives the user power to hide things they don't want to be in the pod or whatever

micha16:10:50

if they need to

micha16:10:30

that's how you can start a separate boot process in your boot script

micha16:10:58

that reuses the existing worker pod

martinklepsch16:10:34

> well you can already run functions in pods, the main thread is the place where you can mutate the state of the build jvm environment
> which is where tasks are useful

can you elaborate on that? what are the useful bits about mutating the state of the build env? for building a jar I wouldn’t need this right?

micha16:10:39

that makes a new worker

micha16:10:03

well like you said with the watcher for instance

micha16:10:10

there is a lot of state in the main thread

micha16:10:25

that's why boot.core isn't available in pods

micha16:10:55

you definitely can't remove classes from the classpath once they're loaded in a classloader

micha16:10:26

you can remove directories though, but only because of some clever things in the watcher

when you remove a directory from the env it just uses the watcher to remove files from the fixed directories the classloader was created with

micha16:10:00

it doesn't actually remove directories from the classloader

martinklepsch16:10:11

I’m thinking I could make a pod, give it a dir I created as classpath + required deps + fileset and then it gives me a fileset back

micha16:10:32

it could

micha16:10:38

but what's the point?

martinklepsch16:10:53

isolation

micha16:10:58

you could just run your function in a pod like we already do

and have it write to a tempdir, like normal

micha16:10:13

the fileset doesn't give you anything

martinklepsch16:10:35

I want to run tasks in the pod pom, jar etc

micha16:10:47

yeah that's impossible really i think

micha16:10:54

except when you use tasks like functions

micha16:10:12

i mean like the boot cljs compiler functions

micha16:10:14

that run in the pod

martinklepsch16:10:36

aren’t tasks functions once the factory fn is called?

micha16:10:44

yes

but they need a reference to the next task, etc

micha16:10:02

which can't pass between pods of course

martinklepsch16:10:24

I can return the fileset from the pod and call the next handler myself (I think ..)

micha16:10:31

and if the task references anything in boot.core it can't run in a pod

yeah as long as it doesn't use boot.core stuff it could work
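
A plain-function sketch of the shape being discussed, assuming a task is a function that takes the next handler and returns a handler of a fileset. `add-file-task` is a hypothetical factory and the fileset is just a vector of paths; nothing here touches boot.core.

```clojure
(defn add-file-task
  "Hypothetical task factory: returns a task that adds `path` to the fileset
  and then hands the result to the next handler."
  [path]
  (fn [next-handler]
    (fn [fileset]
      (next-handler (conj fileset path)))))

;; Tasks compose with comp; identity stands in for the end of the pipeline.
(def pipeline
  ((comp (add-file-task "pom.xml")
         (add-file-task "lib.jar"))
   identity))

(def result (pipeline []))
```

Each handler closes over `next-handler`, which is the reference that cannot cross a pod boundary. Returning the fileset from the pod and calling the next handler from outside sidesteps that closure.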

micha16:10:20

what problem are we solving?

micha16:10:40

seems like there might be some other ways we could try too

martinklepsch16:10:19

yeah coming to think the same

martinklepsch16:10:06

problem: we have 5 libs. we want to have a boot watch build-all install like thing that builds libs whose files have changed and installs them.

martinklepsch16:10:44

It felt really enabling to program things with values/fileset, I hope we can have more of that and less of set-env!

micha16:10:47

yeah you should try the runBoot() approach

micha16:10:56

you can do that from a task i bet

micha16:10:10

we need the target task thing though

i guess your tasks that you call from the runboot can set-env! :target-path

micha16:10:50

so that would be ok

micha16:10:04

it would just be boot build-all-dev or something

martinklepsch16:10:08

I think I’ll go with the stupid non-watch supporting way for now, wanted to do the fileset patch stuff today too 😄

micha16:10:47

you could have the watch in the individual tasks you call with runboot

micha16:10:03

how about this

where project1,2 are profile type tasks

https://github.com/apenwarr/redo

you could then use the :pipelines arg to spin up multiple runBoots

micha16:10:55

with the given args for each one

micha17:10:01

i guess the problem is when they depend on each other

micha17:10:13

then you really only want one pipeline

martinklepsch17:10:42

@micha: do you have the with-env macro handy?

micha17:10:54

the with-env macro can be simplified, it was just a quick thing for a specific issue

martinklepsch17:10:28

how does boot/boot/project.clj synchronize things if it doesn’t depend on anything?

micha17:10:10

makefile

micha17:10:32

the makefile does all the orchestration

micha17:10:43

it's lame but it mostly works

micha17:10:28

the boot server concept could actually be useful for the multibuild case

micha17:10:46

use it with something like make or the other cooler thing

micha17:10:50

i forget what it's called

onetom17:10:54

micha: cmake 😉

micha17:10:14

no the crazy one

micha17:10:12

whew! i thought i lost it

micha17:10:36

with the boot server in place we can use this to fully orchestrate

micha17:10:38

i think

micha17:10:58

it's a whole philosophy

micha17:10:10

kind of into it

martinklepsch17:10:58

@micha: the barbarywatchservice thing — is that still only for Java 6?

micha17:10:11

it's still the only way to get fsevents in osx

micha17:10:23

it works in java 8

micha17:10:28

that's what we use on osx

micha17:10:33

otherwise you can only have polling

martinklepsch17:10:37

yeah, was just wondering if it’s for java 6 compat

micha17:10:39

which is slow and eats cpu like a mofo

micha17:10:04

no boot doesn't try to support java 6 at all

micha17:10:13

the nio stuff is crucial

martinklepsch17:10:14

ok cool

martinklepsch17:10:52

regarding boot/boot/project.clj again: I see a jar is made but it doesn’t depend on anything, is this really just supposed to be an empty jar?

micha17:10:49

yes

micha17:10:58

i didn't know how to put just apom in maven

micha17:10:03

*a pom

micha17:10:09

it's there for synchronization

micha17:10:15

because clojars isn't transactional

micha17:10:21

that's the last thing that gets pushed

micha17:10:32

and boot looks for the version of that when it wants to update

micha17:10:39

that way it doesn't do a partial update

martinklepsch17:10:40

[#"^(?!boot\.repl-server).*$"]: this means all but boot.repl-server, right?

micha17:10:47

yeah
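
A quick check of what that negative lookahead accepts, using only `re-matches` from clojure.core. The helper name `aot-candidate?` is made up for the example.

```clojure
(def not-repl-server #"^(?!boot\.repl-server).*$")

(defn aot-candidate?
  "True when the namespace name does not start with boot.repl-server."
  [ns-str]
  (boolean (re-matches not-repl-server ns-str)))

(def checks
  (mapv aot-candidate?
        ["boot.core" "boot.repl-server" "boot.repl-server.extra" "boot.repl"]))
```

The lookahead rejects anything that starts with `boot.repl-server`, so a sub-namespace such as `boot.repl-server.extra` is excluded as well, while `boot.repl` still matches.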

martinklepsch17:10:01

we need that for the aot task 😉

micha17:10:10

we have a better thing

micha17:10:12

clojure

micha17:10:20

like you pass in the namespaces you want to aot

martinklepsch17:10:33

heard about that but is it any good?

martinklepsch17:10:34

😄

micha17:10:38

like in the project.clj there i have to later remove things that got included that shouldn't have been

martinklepsch17:10:47

but then I need to list all the things

micha17:10:53

you can use a function for that

micha17:10:59

tools.namespace etc

micha17:10:14

that's what we do in the tests and stuff

martinklepsch17:10:28

meh, some option like :exclude would be way more user friendly

micha17:10:45

i mean that regex is not helping anyone understand what's going on

micha17:10:48

or making it easier

micha17:10:57

cause when i put it in there i had to research regexes again

micha17:10:01

and test it

micha17:10:12

if i had a function i could have used filter in a second

martinklepsch17:10:18

agree about regexes but just a set with ns that shouldn’t get aot'd

micha17:10:28

yeah, we could

micha17:10:36

clojure.set/difference is a thing though

micha17:10:55

:exclude would work there though

martinklepsch17:10:08

can’t follow?

micha17:10:40

you can use that i mean

micha17:10:07

:namespaces (set/difference (all-nses) #{'foo.bar})
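
Spelled out as a runnable sketch: `all-nses` here is a hypothetical stand-in that returns a fixed set, since the chat only suggests that something like tools.namespace could provide the real list.

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-in for "every namespace in the project".
(defn all-nses []
  #{'foo.core 'foo.util 'foo.bar})

;; "All but these" as a set difference.
(def aot-namespaces
  (set/difference (all-nses) #{'foo.bar}))

;; The same selection with a predicate, for when the exclusion is a rule
;; rather than a fixed list.
(def aot-namespaces-filtered
  (set (remove #(= 'foo.bar %) (all-nses))))
```

Both forms give the same set; an `:exclude` option would amount to doing this difference inside the task.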

martinklepsch17:10:54

@micha: true but all that requires me to pull in c.t.n and that seems overkill for saying “all but these”

martinklepsch17:10:14

not saying it’s hard or impossible it’s just not what I consider “user-friendly”

micha17:10:22

we could add c.t.n to boot.core

micha17:10:33

a function to give you the namespaces etc

micha17:10:40

but yeah

micha17:10:55

we should do both

micha17:10:57

lol

martinklepsch18:10:09

I’m thinking it would be cool to move more towards things operating on a fileset. Pods could setup isolated envs based on those and some other params and then outside of pods we can combine filesets again and work with them as regular values. There’s not so much stuff in boot.core that would actually be useful if you’re just operating on filesets.

micha18:10:35

agree

micha18:10:45

we can almost pass filesets into pods

martinklepsch18:10:00

we can no?

micha18:10:01

just need to encode File objects

martinklepsch18:10:09

ah, yeah

micha18:10:15

which we can do, of course
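
One way the File encoding could look, assuming the data crosses the pod boundary as printed Clojure data and is read back on the other side. The fileset is modelled as a plain nested map, and `encode-files`, `decode-files`, and the `:file-path` tag are hypothetical names.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.walk :as walk])

(defn encode-files
  "Replace every java.io.File in nested data with a tagged map."
  [data]
  (walk/postwalk
    (fn [x]
      (if (instance? java.io.File x)
        {:file-path (.getPath ^java.io.File x)}
        x))
    data))

(defn decode-files
  "Turn the tagged maps produced by encode-files back into File objects."
  [data]
  (walk/postwalk
    (fn [x]
      (if (and (map? x) (= [:file-path] (keys x)))
        (java.io.File. ^String (:file-path x))
        x))
    data))

(defn round-trip
  "Encode, print, read, and decode, as a stand-in for crossing a pod boundary."
  [data]
  (-> data encode-files pr-str edn/read-string decode-files))
```

The tag is a plain map so the printed form stays ordinary EDN; a real implementation would need a tag that cannot collide with user data.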

martinklepsch18:10:15

@micha: in that branch I just pushed to, boot build-lib fails because nrepl-server gets aot’d, but when I just do boot core it works fine

martinklepsch18:10:49

not sure what’s going on there

martinklepsch18:10:51

I assume set-env! doesn’t reset classpath and some other build step has :aot all

martinklepsch18:10:25

@micha: do you see a way around unnecessary fs operations without blob ids of previous file? you think it’s ok to assume if lastmod of blobfile equals lastmod of tmpfile they’re the same?

(commit! [this]
    (util/with-let [{:keys [dirs tree blob]} this]
      (apply file/empty-dir! (map file dirs))
      (doseq [[p tmpf] tree]
        (let [srcf (io/file blob (id tmpf))]
          (file/copy-with-lastmod srcf (file tmpf))))))
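
The lastmod question can be phrased as a small predicate. This is only a sketch of the heuristic being asked about, not something taken from boot; `same-lastmod?` is a made-up name.

```clojure
(defn same-lastmod?
  "True when both files report the same last-modified timestamp.
  Equal timestamps are a hint, not proof, that the contents are equal."
  [^java.io.File a ^java.io.File b]
  (= (.lastModified a) (.lastModified b)))
```

A commit step could skip the copy when this returns true, at the cost of trusting timestamps.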

martinklepsch18:10:55

Also what do you think about storing blob ids instead of last id. With something like this we could very easily lookup previous versions, potentially we could even store the same ids and have “generations”.

martinklepsch18:10:56

With that we could super easily create a previous fileset from the current. (at least for files that are still present)

micha18:10:58

i wasn't suggesting storing "last id", but rather storing the current state of the filesystem

micha18:10:02

which is necessarily a singletom

micha18:10:09

-ton

micha18:10:21

lol singletom == onetom?

martinklepsch18:10:15

@micha: I know what you were suggesting but I don’t see much drawbacks with storing that information in the fileset itself. Also where exactly would we store the filesystem state? How would we have access to it when commit! is called?

micha18:10:52

storing it in the fileset itself would defeat the purpose, because the fileset is immutable, and the underlying filesystem that we want to keep a record of is not

micha18:10:06

we'd store the filesystem state in an atom in the tmpdir namespace

micha18:10:11

since it's a singleton

micha18:10:17

and we update it each time we commit!

micha18:10:29

to reflect the current state of the filesystem

micha18:10:57

so when commit! runs it compares the cached state against the fileset, does a diff to get the patch, and then applies the changes

micha18:10:13

and then it updates the cache, which is the fileset

micha18:10:37

basically the tmpdir namespace needs to hold onto the fileset that was used in the latest commit! call

micha18:10:55

then when commit! is called again with a different fileset it can compare the two and make a path

micha18:10:58

*patch

martinklepsch18:10:09

@micha: it would only store state of the blob store part of the filesystem which could be considered immutable to some extent?

micha18:10:21

it would store the fileset object

micha18:10:28

it has all the information it needs there

micha18:10:43

and you can make a patch with the diff function in that namespace very easily

micha18:10:01

diff the two filesets and you have your patch

micha18:10:12

then you update those things that changed, were added, or were removed

micha18:10:24

and reset the atom to contain the new fileset
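
The steps above can be sketched as follows, assuming a fileset is modelled as a map of path to content id. `last-committed`, `fileset-diff`, and `commit-patch!` are hypothetical names; the real diff function in the tmpdir namespace may look different.

```clojure
(require '[clojure.set :as set])

;; The singleton cache: the fileset used in the latest commit.
(def last-committed (atom {}))

(defn fileset-diff
  "Paths to add, update, and remove to turn `before` into `after`."
  [before after]
  (let [old-paths (set (keys before))
        new-paths (set (keys after))]
    {:added   (set/difference new-paths old-paths)
     :removed (set/difference old-paths new-paths)
     :changed (set (filter #(not= (get before %) (get after %))
                           (set/intersection old-paths new-paths)))}))

(defn commit-patch!
  "Diff the cached fileset against `fileset`, cache the new one, and return
  the patch. A real version would apply the patch to the disk here."
  [fileset]
  (let [patch (fileset-diff @last-committed fileset)]
    (reset! last-committed fileset)
    patch))
```

Only the paths named in the patch would need any filesystem work; everything else is left alone.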

martinklepsch19:10:30

It would certainly be a straightforward approach I’m just thinking embedding this information in the fileset keeps more doors open (e.g. parallel tasks)

micha19:10:47

it would defeat the purpose though

martinklepsch19:10:52

What do you think about my point that blob storage is (mostly) immutable?

micha19:10:04

it is, but the working set is not

micha19:10:22

blob storage is append-only, which is good

micha19:10:47

but if you keep the information about the hard links that are supposed to be on the disk in the fileset it would be useless

micha19:10:03

because if you call commit! a few times it would no longer reflect anything useful

martinklepsch19:10:15

I’m really only suggesting to make TmpFiles contain ids instead of id — nothing else added

micha19:10:21

maybe i misunderstand the objective

micha19:10:46

i don't see the purpose of that

we're trying to optimize the commit! process, right?

martinklepsch19:10:25

yes

micha19:10:38

by diffing what's on the disk with what's in the fileset object, right?

micha19:10:47

without scanning the whole disk etc

martinklepsch19:10:08

for one it would be useful in that step, we could just check if last two ids of tmpfile match.

micha19:10:22

define "last two"

micha19:10:33

so here's the situation basically

martinklepsch19:10:44

two most recent

micha19:10:45

task pipeline: A -> B -> C -> D

micha19:10:56

so A does things and passes fileset to B

micha19:10:58

and so on

micha19:10:06

each one adds history to the fileset

micha19:10:10

via :ids

now suppose that B is the watch task

micha19:10:30

so C and D have called commit! on the fileset

micha19:10:44

so they've added files that B never saw in its fileset or the history

micha19:10:00

now B calls commit! on the fileset it saved, the one given to it by A

micha19:10:10

now the working set is corrupted

micha19:10:22

because all the files that C and D added weren't removed

micha19:10:35

because they don't appear in the history of the fileset that A gave to B
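
A toy model of that failure, assuming the working set is a mutable set of paths and that a commit may only remove paths found in the history of the fileset it is given. All names and paths are made up.

```clojure
(require '[clojure.set :as set])

;; What is currently on disk, modelled as a mutable set of paths.
(def working-set (atom #{}))

(defn commit-with-history!
  "Hypothetical commit that removes only the paths this fileset's own
  history knows about. `fileset` is {:paths #{...} :history #{...}}."
  [{:keys [paths history]}]
  (swap! working-set
         (fn [on-disk]
           (set/union (set/difference on-disk (set/difference history paths))
                      paths))))

;; A hands this fileset to B, the watcher, which saves it.
(def from-a   {:paths #{"src/a.clj"} :history #{"src/a.clj"}})
;; C and D later commit a fileset with extra files B never saw.
(def after-cd {:paths   #{"src/a.clj" "out/c.js" "out/d.jar"}
               :history #{"src/a.clj" "out/c.js" "out/d.jar"}})

(commit-with-history! after-cd)
(commit-with-history! from-a)   ; B starts a new cycle with its saved fileset
```

After the second commit the working set still holds `out/c.js` and `out/d.jar`, because nothing in the history of `from-a` mentions them.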

micha19:10:44

if you want to keep a history you can, because you can hold onto a fileset if you want to, and put them in a vector or whatever

micha19:10:52

a vector of filesets

micha19:10:04

but that won't help the commit! optimization things

martinklepsch19:10:19

for removing stale things we would still use file-seq and delete the ones not in fileset
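
A sketch of that listing step, assuming the fileset can be reduced to a set of paths relative to the directory, written with `/` separators as on Linux or macOS. `stale-files` is a hypothetical helper.

```clojure
(require '[clojure.java.io :as io])

(defn stale-files
  "Files under `dir` whose path relative to `dir` is not in `wanted-paths`."
  [dir wanted-paths]
  (let [root (.toPath (io/file dir))]
    (->> (file-seq (io/file dir))
         (filter #(.isFile ^java.io.File %))
         (remove #(contains? wanted-paths
                             (str (.relativize root
                                               (.toPath ^java.io.File %))))))))
```

This walks the whole directory on every call, which is the extra IO the reply below objects to.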

micha19:10:21

commit! needs to compare the mutable filesystem to the immutable fileset

micha19:10:44

that would be more IO

martinklepsch19:10:55

listing files is very fast

micha19:10:03

so is deleting hard links

micha19:10:06

which is what we do

micha19:10:24

i don't see any advantage

martinklepsch19:10:29

but making them is not (cheap). also deleting is probably still more expensive than ls.

micha19:10:07

if we want to make commit! more efficient by patching then there is one way to do that

micha19:10:31

save the fileset in tmpdir atom after every commit! and patch against that

martinklepsch19:10:47

the advantage would solely be to have one datastructure that provides a datastructure->filesystem abstraction that could be used in any context and does not rely on global state

micha19:10:00

but it does rely on global state

micha19:10:08

the filesystem is a globally shared resource

micha19:10:17

there is no way to get around that

martinklepsch19:10:37

but it only stores hashes/just strings

micha19:10:57

the filesystem stores files and directories

micha19:10:12

i'm talking about the working set, not the blob dir

micha19:10:38

take react for example

micha19:10:50

you don't store a history of dom state in the virtual dom

micha19:10:05

you have a virtual dom and the current real dom

you diff those

then apply patch

same thing here

the key element is that the actual classpath (which will be populated with hard links) is global singleton and mutable

so it makes sense that we rely on global state to manage it

micha19:10:38

because the whole point of it is to do global stateful things

martinklepsch19:10:59

in the context of boot the point is to do global stateful things but the general idea of fileset has nothing to do with classpath, that’s just how boot uses it

right, that's where commit! comes in

micha19:10:29

commit! is classpath

micha19:10:40

that's where immutable becomes mutable

micha19:10:53

and global

martinklepsch19:10:03

commit! is classpath only because dirs are on classpath, not because it has inherently something to do with classpath?

micha19:10:27

ok filesystem then

martinklepsch19:10:35

anyways, I’ll try to come up with something more tangible and then we can discuss further

micha19:10:42

filesystem is globally stateful

micha19:10:49

excellent!

martinklepsch19:10:39

I think my main point is: fileset is so awesome it should be usable as a library too. global state to make patches makes that harder

martinklepsch19:10:52

thanks for the discussion ✌️