This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2019-08-28
I'm wondering how I can test running clj for the 1st time in a deps.edn project, without rm -r ~/.m2
I remember in the past I could just do something like env M2_HOME=$(realpath .)/m2 boot repl
or something like that, but now this is the only way I found to affect that path:
mvn -Dmaven.repo.local=$(realpath .)/m2 help:evaluate -Dexpression=settings.localRepository
but if I try something similar with clj, it still seems to use the default location (since it's not downloading anything):
rm -r .cpcache; time clj -J-Dmaven.repo.local=$(realpath .)/m2 -e 1
And, btw, I'm still seeing abysmal performance when accessing Maven Central from Hong Kong.
After rm -r ~/.m2, it took 9 minutes to download 25MB of jars (just clojure, netty, aleph, compojure).
While my connection is shit and hardly ever goes over 2MByte/s and it's below 1MB/s mostly, 9minutes is unreasonable.
It seems to me that the main reason is the number of requests, which are also made serially (and might not even use keep-alive?)
So I don't think only S3 is slow.
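As a rough check on the numbers above, the average transfer rate can be worked out from the download size and the elapsed time. This is only a sketch: avg_kib_per_s is a hypothetical helper name, and it assumes the "25MB" figure means 25 MiB.

```shell
# Average transfer rate in KiB/s for a download of $1 MiB that took $2 seconds.
# (avg_kib_per_s is an illustrative helper, not part of any tool.)
avg_kib_per_s() {
  awk -v mib="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", mib * 1024 / secs }'
}

# 25 MiB of jars in 9 minutes (540 seconds)
avg_kib_per_s 25 540   # prints 47.4
```

That is about a twentieth of the ~1MB/s the connection is said to sustain, which fits the suspicion that per-request overhead, rather than raw bandwidth, dominates.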
I've also checked the related DNS chain:
. 1735 IN CNAME .
. 21599 IN CNAME .
. 29 IN A 151.101.196.215
and that IP points to San Jose.
I had similar issues with the nixos binary caches. When they switched from AWS CloudFront to Fastly (which is significantly cheaper, plus they actually sponsor them), then the performance from Hong Kong dropped drastically.
you can set :mvn/local-repo in your deps.edn to use a different (presumably empty) local repo dir
you can clj -Sdeps '{:mvn/local-repo "foo"}'
so that's command line
the local repo is the maven cache directory
usually it's ~/.m2/repository
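Putting the two suggestions above together, a cold-cache test could be wrapped up roughly as follows. This is a sketch, not an official recipe: local_repo_override and cold_resolve are hypothetical helper names, the functions are only defined here (nothing is downloaded until cold_resolve is called), and it assumes the directory path contains no double quotes.

```shell
# Print the EDN override that points tools.deps at the given local repo directory.
local_repo_override() {
  printf '{:mvn/local-repo "%s"}' "$1"
}

# Time a dependency resolve against a fresh, empty local repo, leaving ~/.m2 untouched.
# -Sforce recomputes the classpath so a stale .cpcache does not hide the downloads.
cold_resolve() {
  repo_dir=$(mktemp -d)
  time clj -Sforce -Sdeps "$(local_repo_override "$repo_dir")" -e 1
  du -hs "$repo_dir"
}
```

Running cold_resolve from a project directory should then report the elapsed time and the size of the freshly populated repo.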
are you using latest clj?
1.10.1.469
clj -Sverbose will tell you
we are making use of maven session caches as of that version, so would be interested in whether that makes a difference for you
it should mostly cut down on repeated download of metadata files though, which are pretty tiny
it would make sense to use a more common option for this, like -v; i keep forgetting it 😕
jars are serially downloaded
there are ways to change that, but, it's somewhat involved
is there an off-the-shelf solution for setting up a maven repo manager specifically for clojure usage? the https://maven.apache.org/repository-management.html page mentions a lot of options. a few years ago i tried a few of them but i was struggling to make them work. it should be practically a caching http proxy... i was really hoping to find some small, turn-key solution, which i would run in the cloud, so i can share it between the office and home
there are several turn-key products for this, not sure if any qualify as "small"
but there is also JFrog and Artifactory
sorry, those are the same
Artifactory is by JFrog
you can just use a caching http proxy too, you just don't get the nice management interfaces, and the ability to merge sources, etc
latest nixpkgs contains 1.10.1.469 on its master branch. downloading it now:
uhu:multiboxx onetom$ nix-shell -I nixpkgs=~/nixpkgs/ -p clojure
these paths will be fetched (307.02 MiB download, 990.58 MiB unpacked):
/nix/store/0d8mpzq2dah05xqd6i1c9g02blvsvcnj-bash-interactive-4.4-p23-man
looks like it will take about 10minutes.
i will try a clean download of the same 25MB dependencies afterwards
there are some benefits to using a maven-aware product I think
I guess one question I have is why you care what the perf is if you do it just once and then cache it forever?
@hiredman and what would you recommend as a caching http proxy? would it work with https? so for example datomic can be cached too?
@alexmiller these things never happen only once. i have multiple personal computers, multiple office computers, more and more servers and hoping to add more colleagues too. these caches should be primed on all those filesystems and having a bad 1st experience doesn't help...
I think it could be done, but it is a trade-off: if you really want a lean caching solution, a caching proxy would be that, but you will have to invest time getting it working
also on CI systems it's good to be able to build stuff afresh from time to time at least
I'm not trying to be flip about it, obviously perf matters, just trying to probe a little closer
the structure of maven is designed so that you don't need to refresh - these are immutable, uniquely versioned artifacts
CI is one use case where we see this come up
more specifically, i have 2 clojure projects im working on at the moment and i just wanted to know how big their dependencies are
at my last job we had a nexus setup that aggregated a number of different repos, and our builds were setup to only check our nexus, and that worked pretty well
but most CI allow you to keep a maven cache alive
i could have built an uberjar, but my knowledge of how to do that with t.d is outdated; im just getting back to using clojure after a ~2yr break (when i was working with ethereum and js...)
https://github.com/clojure/tools.deps.alpha/wiki/Tools#packaging for making uberjars with t.d.a.
there might also be better maven central mirrors you could use from HK, not sure
clj (latest) supports Maven mirrors
i looked into the mirrors but most of them don't work. there is a UK one which is not offline, but to do a performance test i wanted to know how i can relocate the m2 cache, since it's not fun to delete my ~/.m2 directory 🙂
there's one on google storage I know
<settings>
<mirrors>
<mirror>
<id>google-maven-central</id>
<name>Google Maven Central</name>
<url>https://maven-central.storage.googleapis.com</url>
<mirrorOf>central</mirrorOf>
</mirror>
</mirrors>
</settings>
looks like they have an asia-pacific one there too
i was googling for hong kong maven central proxy
and came across these:
• http://repo.maven.apache.org/maven2/.meta/repository-metadata.xml
• https://maven.apache.org/guides/mini/guide-mirror-settings.html
• and one more which was also listing ibiblio.{net,org}, which are down
thanks a lot, @alexmiller ! if these work, it will really help with years of frustration 🙂
please report back, would be very interested to hear if those are any better
hmm... it took one minute to download 18MB, but only the 1st dependency was printed
[nix-shell:/Volumes/Data/lab/multiboxx]$ clj -Sverbose
version = 1.10.1.469
install_dir = /nix/store/0k1f3llrx4aggbgx7rhh70mrardqvrgz-clojure-1.10.1.469-prefix
config_dir = /Users/onetom/.clojure
config_paths = /nix/store/0k1f3llrx4aggbgx7rhh70mrardqvrgz-clojure-1.10.1.469-prefix/deps.edn /Users/onetom/.clojure/deps.edn deps.edn
cache_dir = .cpcache
cp_file = .cpcache/1871601414.cp
Refreshing classpath
[nix-shell:/Volumes/Data/lab/multiboxx]$ time clj -e 1
Downloading: riddley/riddley/0.1.12/riddley-0.1.12.pom from
1
real 1m18.496s
user 0m13.404s
sys 0m0.742s
[nix-shell:/Volumes/Data/lab/multiboxx]$ ls m2/
cheshire/ clj-tuple/ commons-io/ me/ riddley/ tigris/
clj-http/ com/ commons-logging/ org/ seancorfield/
clj-soup/ commons-codec/ etaoin/ potemkin/ slingshot/
2nd run is printing all the downloaded deps:
[nix-shell:/Volumes/Data/lab/multiboxx]$ rm -r m2 .cpcache/; time clj -e 1; du -hsc m2
Downloading: org/clojure/clojure/1.10.1/clojure-1.10.1.pom from
...
Downloading: org/apache/httpcomponents/httpmime/4.5.2/httpmime-4.5.2.jar from
Downloading: com/fasterxml/jackson/dataformat/jackson-dataformat-smile/2.7.5/jackson-dataformat-smile-2.7.5.jar from
Downloading: org/clojure/data.codec/0.1.0/data.codec-0.1.0.jar from
1
real 1m30.756s
user 0m16.289s
sys 0m1.102s
18M m2
18M total
this is magnitudes better!
i guess i should make a counter test with the old version just to confirm im not just experiencing some change in network conditions
using -Sforce will let you force a classpath recompute (don't need to nuke your .cpcache then)
is that difference with just the new clj version or the new clj version + mirror?
this is my deps.edn:
{:paths ["src" "rsc"]
:deps {org.clojure/clojure {:mvn/version "1.10.1"}
org.clojure/data.csv {:mvn/version "0.1.4"}
clj-soup/clojure-soup {:mvn/version "0.1.3"}
me.raynes/fs {:mvn/version "1.4.6"}
org.clojure/java.jdbc {:mvn/version "0.7.9"}
org.xerial/sqlite-jdbc {:mvn/version "3.28.0"}
com.microsoft.sqlserver/mssql-jdbc {:mvn/version "7.2.2.jre8"}
seancorfield/next.jdbc {:mvn/version "1.0.5"}
etaoin {:mvn/version "0.3.5"}}
:aliases {:test {:extra-paths ["test"]}}
:mvn/local-repo "m2"
:mvn/repos {
"central" {:url " "}
;"uk" {:url " "}
"clojars" {:url " "}
}}
which i guess, i can simply do by replacing the :url for the "central" entry with the google mirror's url, right?
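For a quick comparison without editing deps.edn, the "central" :url could presumably also be swapped from the command line through -Sdeps. A minimal sketch, assuming that a :mvn/repos entry passed via -Sdeps overrides the project's "central" entry; central_override is a hypothetical helper name, and the mirror URL is the one used in the test runs further down this log.

```shell
# Print an EDN override that replaces the "central" repo URL with the given mirror.
# (central_override is an illustrative helper, not part of the clj tooling.)
central_override() {
  printf '{:mvn/repos {"central" {:url "%s"}}}' "$1"
}

MIRROR_URL="https://maven-central-asia.storage-download.googleapis.com/repos/central/data/"

# Usage (not run here, since it would hit the network):
#   clj -Sforce -Sdeps "$(central_override "$MIRROR_URL")" -e 1
```

If the override does not take effect that way, editing the :url in deps.edn as described above remains the straightforward route.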
but currently im seeing 7-30kbyte/s download rates with the old version, so it will take awhile...
nope, it took only ~3mins:
uhu:multiboxx onetom$ clojure -Sverbose
version = 1.10.1.466
install_dir = /nix/store/m2v1xd7cybi544jyl77w3mzjw1xklw41-clojure-1.10.1.466-prefix
config_dir = /Users/onetom/.clojure
config_paths = /nix/store/m2v1xd7cybi544jyl77w3mzjw1xklw41-clojure-1.10.1.466-prefix/deps.edn /Users/onetom/.clojure/deps.edn deps.edn
cache_dir = .cpcache
cp_file = .cpcache/1871601414.cp
Refreshing classpath
^Cuhu:multiboxx onetom$ rm -r m2 .cpcache/; time clj -e 1; du -hsc m2
rm: cannot remove '.cpcache/': No such file or directory
...
Downloading: org/clojure/data.codec/0.1.0/data.codec-0.1.0.jar from
1
real 3m21.764s
user 0m18.844s
sys 0m1.207s
18M m2
18M total
still, it's quite a significant difference, so thanks a lot for telling me about this enhancement and for implementing these improvements!
clj 1.10.1.466
mirror: https://maven-central-asia.storage-download.googleapis.com/repos/central/data/
uhu:multiboxx onetom$ rm -r m2; time clj -Sforce -e 1; du -hsc m2
...
real 1m40.781s
user 0m21.119s
sys 0m1.290s
18M m2
18M total
2/3 of the dependencies come from maven central and the rest is clojars:
uhu:multiboxx onetom$ rg -c maven-central-asia 466-deps.txt
71
uhu:multiboxx onetom$ rg -c 466-deps.txt
22
uhu:multiboxx onetom$ wc -l 466-deps.txt
92 466-deps.txt
(it almost adds up :)
clj 1.10.1.469
mirror: https://maven-central-asia.storage-download.googleapis.com/repos/central/data/
uhu:multiboxx onetom$ rm -r m2; time clj -Sforce -e 1; du -hsc m2
...
real 0m44.863s
user 0m13.960s
sys 0m1.059s
18M m2
18M total
pretty big combined change! :)
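To put numbers on the combined change, the "real" times of the three 18M runs above can be turned into speedup factors. This is a sketch with hypothetical helper names (to_seconds, speedup); it assumes the 3m21.764s run used 1.10.1.466 without the Asia mirror, which the log suggests but does not state outright.

```shell
# Convert a `time` value such as 3m21.764s into seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Speedup factor of the second run relative to the first (first / second).
speedup() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", a / b }'
}

speedup 3m21.764s 1m40.781s   # 466 -> 466 + mirror: prints 2.0
speedup 1m40.781s 0m44.863s   # 466 + mirror -> 469 + mirror: prints 2.2
speedup 3m21.764s 0m44.863s   # overall: prints 4.5
```

These are single runs on a variable connection, so the factors are indicative rather than a benchmark.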
I don't believe there is anything region specific. they do have a cdn mirror though
I have no idea of difference in perf though
this google mirror understands HTTP2:
uhu:reap onetom$ curl -vso /dev/null --http2 2>&1 | rg -i http2
* Using HTTP2, server supports multi-use
is the HTTP client in clj talking HTTP2?
that's several layers below the code I'm in, so don't know
it would not surprise me if the answer was no
the maven resolver uses org.apache.httpcomponents/httpclient and the version it's pinned on does not look like it supports http 2 (the newest series does though)
so my strong guess would be: no, but it potentially could
in my other project; 24MB of deps:
clj 1.10.1.469
mirror: https://maven-central-asia.storage-download.googleapis.com/repos/central/data/
[nix-shell:~/lab/reap]$ rm -r m2; time clj -Sforce -e 1; du -hsc m2
...
real 2m53.788s
user 0m24.007s
sys 0m1.858s
24M m2
24M total
clj 1.10.1.466
mirror: https://maven-central-asia.storage-download.googleapis.com/repos/central/data/
[nix-shell:~/lab/reap]$ rm -r m2; time clj -Sforce -e 1; du -hsc m2
...
real 3m33.490s
user 0m26.609s
sys 0m1.819s
24M m2
24M total
that's less of a difference, but still quite noticeable (~20%)
anyway, enough testing for tonight; it's 4am in HK, so i shall sleep.
thanks again for your attention and advice!