This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-06-05
Channels
- # babashka (14)
- # beginners (62)
- # calva (1)
- # cider (54)
- # clj-kondo (3)
- # cljdoc (15)
- # cljs-dev (2)
- # clojure (180)
- # clojure-europe (5)
- # clojure-italy (4)
- # clojure-losangeles (1)
- # clojure-nl (2)
- # clojure-spec (10)
- # clojure-uk (39)
- # clojurescript (85)
- # core-async (9)
- # core-logic (1)
- # core-typed (5)
- # data-science (27)
- # datomic (2)
- # emacs (15)
- # figwheel-main (98)
- # fulcro (26)
- # graphql (15)
- # helix (1)
- # jobs-discuss (26)
- # kaocha (1)
- # off-topic (54)
- # other-lisps (1)
- # re-frame (21)
- # reagent (1)
- # reitit (3)
- # shadow-cljs (49)
- # spacemacs (12)
- # specter (5)
- # xtdb (2)
I can check for keys :on, :on-interface, :method-builders etc. but I'm not sure if these are implementation-specific
I am not aware of any other way than checking for keys like that, and if you want to tighten it up further, do some checks on the values associated with those keys to verify their type and the types of the elements within sub-collections. Those things are implementation-specific, but the details have not changed for quite a few Clojure versions. I have no idea whether the details are similar for ClojureScript, if that is even a concern of yours.
It would be a good idea for you to restrict that implementation-specific checking to as small a portion of your code as you can, e.g. a single function or two.
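For what it's worth, the key check discussed above can be isolated into a single small predicate (assumption: `:on`, `:on-interface`, and `:method-builders` remain present in the map a `defprotocol` var holds, which has been stable for recent Clojure versions but is an implementation detail):

```clojure
(defn protocol-map?
  "Heuristic: does x look like the map a defprotocol var holds?
  Relies on implementation-specific keys, so keep its use contained."
  [x]
  (and (map? x)
       (every? #(contains? x %) [:on :on-interface :method-builders])))

(defprotocol Example
  (an-op [this]))

(protocol-map? Example)         ;; => true
(protocol-map? {:on 'whatever}) ;; => false
```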
Thanks! This is in the context of language tooling - I find that working with protocol-heavy libraries causes a lot of friction when looking up method implementations, which can be scattered around in many different files
if you have a var you can also look up the source code for that var from the metadata. this may be an alternative way to determine if the var refers to a protocol map
The problem I actually have is finding the source location for method implementations - eg.
(defprotocol Proto
(my-method [x]))
;; perhaps in another file:
(extend-protocol Proto
String
(my-method [x] "string!"))
Then given #'my-method and String, I want to know where the method body is defined
if this is for dev tooling, I might consider just enumerating all the namespaces and building an index
yeah, I might follow up in #cider later on - though I'm surprised this isn't a more common pain point
Is there a way to bit-shift-right bigintegers? I don't want to have to re-write my algorithm, but it relies on bitshifting and I didn't realize I'd be dealing with numbers outside of int range and that bigintegers wouldn't support it.
I think bigintegers are just java.math.BigInteger, so https://docs.oracle.com/javase/8/docs/api/java/math/BigInteger.html#shiftRight-int- should work
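Interop makes this a one-liner; a quick sketch:

```clojure
;; clojure.core/bit-shift-right only works on longs:
(bit-shift-right 1024 3) ;; => 128

;; for BigIntegers, call the underlying Java method directly:
(.shiftRight (biginteger 1024) 3) ;; => 128

;; and it keeps working past long range:
(.shiftRight (biginteger (*' 1234567890 1234567890 1234567890)) 64)
```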
Babashka is getting so much praise in this thread https://news.ycombinator.com/item?id=23418699 I am really eager to try it out as I have yet to.
Babashka is awesome. It really complements Clojure and ClojureScript well. For scripting use cases, or anything you don't need good performance for, it is great, since it starts so fast, and all you need to run Clojure code is one bb binary.
I regret not using it for something like advent of code. I was so comfortable just using Leiningen that I didn’t bother to try anything else.
You should try your code as is running on bb. If you're mostly only using clojure.core, clojure.set, etc., it should work as is
oh shoot! i’ll try that i didn’t even think i could do that
Also, for advent of code, normal Clojure might still be good, since you'll probably work connected at the REPL anyways. Though bb now supports nREPL as well, but not all of its middlewares
but yes the code just uses clojure core, no dependencies or anything
gotcha, makes sense
@christian.gonzalez Welcome to join #babashka if you have any questions.
Quick spec question. I want to match a variety of instructions preceded by a line number. For ex:
10 START
20 MOV A B
Right now, each of my instruction’s spec is preceded by the line number
Instead, I’d like to generically say that an instruction is any of START/MOV/CMP preceded by a line number
The thing you're spec'ing is a string? If so then you can spec with string? and a regex.
No, each line is a vector of symbols and the whole program is a vector of those vectors
(s/def ::mov-inst (s/cat :ln ::line-num :m ::mov :r1 ::register :r2 ::register))
That’s what I have. But ideally I’d like to have the move instruction not include the line number as a spec, but generally specify it outside
As of now, I am having to specify the line number in every instruction I define
can I spec a list where the first element meets a certain spec and the rest of the elements meet another?
Haven't tried this (and a bit new to spec myself) but, maybe something like:
(s/def ::mov-inst (s/cat :m ::mov :r1 ::register :r2 ::register)
(s/def ::instruction (s/or :mov ::mov-inst :oth ::other-inst)
(s/def ::line (s/cat :ln ::line-num :intr ::instruction)
Yeah I did that. But that doesn't work because the cat implies that mov is a vector of some sort
so it would work if I gave, e.g.
10 [MOV A 10]
But not
10 MOV A 10
Oh, yeah. See what you mean.
I'm clearly missing something, because that seems to be at odds with the https://clojure.org/guides/spec#_sequences. It says, "When regex ops are combined, they describe a single sequence. If you need to spec a nested sequential collection, you must use an explicit call to `spec` to start a new nested regex context." and it has examples of nested and un-nested sequence specs.
Right, me too.
It gives the exact counterexample as above
Afraid I'll have to defer to somebody else at this point.
spec-xperts
To be fair, s/or isn’t a regex form
it isn’t a regex op
you're using s/or above, which is not a regex op
Worked like a charm! Thanks
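For anyone reading along, the fix that "worked like a charm" presumably swaps `s/or` for `s/alt`, the regex-op alternative form, so the instruction stays spliced into the same sequence as the line number. A sketch (the spec names follow the thread; the instruction shapes are guesses):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::line-num pos-int?)
(s/def ::register simple-symbol?)
(s/def ::mov-inst   (s/cat :m #{'MOV} :r1 ::register :r2 ::register))
(s/def ::start-inst (s/cat :m #{'START}))

;; s/alt, unlike s/or, is a regex op, so the alternatives match inline
;; in the surrounding sequence instead of expecting a nested collection:
(s/def ::line (s/cat :ln   ::line-num
                     :inst (s/alt :mov ::mov-inst :start ::start-inst)))

(s/valid? ::line '[10 MOV A B]) ;; => true
(s/valid? ::line '[20 START])   ;; => true
```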
Sorry for the naïve question, but I’m wondering if there is a reason why clojure.core/keep only has single-sequence and transducer arities, please?
I can’t tell if it’s just because that function hasn’t had a lot of love, or if there was some reason for it
No idea if this is the real reason. But if keep had the multiple-sequence arities, then why not every sequence function? remove/`filter` etc.? In general, just going from sequence->sequence is simpler than also supporting sequences->sequence. And maybe map is just the special case, for historical reasons or because it supports the most common usage? And it sort of covers those other cases, because you can go from sequences->sequence with map and then use any of the sequence functions on the resulting sequence
The core libraries already have lots of redundancy. That’s clear simply from the implementation of many functions (that are implemented using other core functions). But those extra functions are useful.
You can always use (comp (map …) (remove nil?))
but I like keep and find that I use it a lot. Indeed, I have to use that remove trick a lot because keep doesn’t do what map does, despite being an almost identical implementation for the 2 arities that it has.
The thing is, every function in core is like a mini-dsl, and you could keep enhancing it with more and more options and arities. Where do you draw the line? So I think it is just it didn't get the love, or more that, the line has to be drawn somewhere. What if one day there is a better use case for taking more arities, and we can't because we've used it for this?
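The workaround mentioned above (go sequences->sequence with map, then drop nils) might look like:

```clojure
;; keep has no multi-collection arity, but map does, so combining
;; them gives a multi-collection "keep":
(keep #(when (odd? %) (* % 10)) [1 2 3]) ;; => (10 30)

;; multi-collection version: map across the collections, drop nils
(remove nil? (map (fn [a b] (when (odd? a) (+ a b)))
                  [1 2 3] [10 20 30]))   ;; => (11 33)
```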
who’s streaming clojure content during the pandemic?
How do people feel about keywords mapping to functions in maps as an alternative means of polymorphism? I'm currently using :extend-via-metadata with protocols, but I'm thinking this alternative approach might be a bit easier ergonomically. Is there a reason not to do this? Assuming I will control all the functions I want to implement on these things?
(defn start [{::keys [start-fn] :as this}]
  (start-fn this))
(defn stop [{::keys [stop-fn] :as this}]
  (stop-fn this))
(defn new-thing []
  (let [running? (atom false)]
    {::start-fn (fn [this] (reset! running? true))
     ::stop-fn  (fn [this] (reset! running? false))}))
For the ergonomic part of metadata-based protocol extension, I created https://github.com/nedap/utils.modular#nedaputilsmodularapiimplement which is really lightweight, simple and team-tested.
Clojure's polymorphism should be faster than hand-rolled polymorphism.
Although, metadata-based extension is the last thing searched in the chain: first go direct implementations (like those defined with defrecord), second things extended with the extend* family of functions, and metadata-based extension last.
I think it's fine. Maybe a little "surprising" to another Clojure dev who would stumble upon it, but I've definitely used it before.
@U45T93RA6 I believe metadata-based implementations are checked second, after direct implementations but before external extensions.
I recall what I said sharply because Alex Miller stated it over #clojure , and metadata-based is something I particularly care about. Might be found in the https://clojurians.zulipchat.com/ archive
See https://clojure.org/reference/protocols#_extend_via_metadata: it lists the order, unless that is wrong.
I’m only learning about extend-via-metadata myself. Do you know of any comparison between it and, say, reify? It seems to me that extend-via-metadata is far more flexible, at a very small performance cost.
downsides of reify vs it are, in my view:
- reify instances aren't vanilla maps that you can use clojure.core upon
- reify method definitions are nested, whereas plain defns are at the top level (= increased readability+reusability), and those defns can use something like https://github.com/jeaye/orchestra (or your favorite equivalent lib)
Yes, I can only see benefits to extend-via-metadata. I’m wondering if I should go as far as saying it is preferable to reify.
https://clojure.atlassian.net/browse/CLJ-2426 is a big gotcha. One has to write a wrapper for performing that additional check
But I don't know about using reify on protocols personally. Reify always felt like an interop construct to me, and it works on protocols because under the hood they create an interface. But I feel it's not really pure Clojure. Maybe I'm wrong, but I've never even thought of using it for protocols before.
Though I guess, you could imagine that reify could in theory return an extended by metadata version. So maybe reify is more like an interface to instance polymorphism
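For reference, a minimal sketch of metadata-based extension (the `Startable` protocol and `worker` map here are hypothetical, just to show the shape):

```clojure
(defprotocol Startable
  :extend-via-metadata true
  (start [this]))

;; a plain map gains the protocol implementation through metadata;
;; the metadata key is the fully qualified method symbol
(def thing
  (with-meta {:name "worker"}
    {`start (fn [this] (str "starting " (:name this)))}))

(start thing) ;; => "starting worker"
```

The instance stays a vanilla map, so all of clojure.core still works on it, which is one of the upsides over reify mentioned above.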
@jjttjj I have a clojure.spec port which moved from protocols to a map based approach here: https://github.com/borkdude/spartan.spec works great 🙂
Cool, just the type of affirmation I need to proceed 🙂 In skimming the source quickly I see an example in specize where you replace the protocol dispatch with a cond. Are there other cases in here where it invokes a function inside a map, to still enable outside extension for example?
Don't remember exactly. The reason the port works like this is so it can be executed with babashka, which doesn't have protocols (yet!)
I have an interpreter that maps parsed symbols to the functions that implement those operations. I’m a big fan of maps with function values, especially since get (and also map-as-a-function) accepts a not-found value, which can be a default function, or just a function that throws an ex-info
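That pattern can be sketched like this (the `ops` table and `run-op` helper are hypothetical, not from the interpreter being described):

```clojure
;; dispatch table: parsed operator symbols -> implementations
(def ops
  {'ADD +
   'SUB -})

;; get's not-found argument supplies the fallback handler
(defn run-op [sym & args]
  (apply (get ops sym
              (fn [& _]
                (throw (ex-info "unknown op" {:op sym}))))
         args))

(run-op 'ADD 1 2) ;; => 3
```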
Any idea how I can give a parameter name to the language prefix here in bidi router:
(def handler
(make-handler ["/" {#{"en" "fr" "es"} home-handler
true not-found-handler}]))
I tried [#{"en" "fr" "es"} :lang] but I am getting: No implementation of method: :segment-regex-group of protocol: #'bidi.bidi/PatternSegment found for class: clojure.lang.PersistentHashSet
I am using it with liberator's defresource macro and with the above attempt it's not named so I can only see the URI in the context
Is this idiomatic for a library? (Use of pmap) https://github.com/juxt/crux/blob/master/crux-rdf/src/crux/rdf.clj#L191-L202
My problem is that if pmap can appear in many places in your call stack, wouldn't it just cause overhead? So maybe it is better if it is only used in places that are guaranteed to be at the edge of your app (so not in libraries)
True, but the part I pointed to was just a utility library for parsing rdf files (it is a separate package)
even using it once imposes coordination overhead (inside the vm bookkeeping, plus kernel thread context switches)
that's unavoidable with anything parallel
even once might be worse than using a single thread, it's very situation specific
For example, if I have a list of files, and I pmap some parse job, but the parse function itself also pmaps...seems bad, no?
Only computation. If you only do compute, and you nest pmaps in one another... well, it might degrade a bit since you start having way more threads than there are CPU cores.
Now, sometimes you can do IO with it, and it can speed things up as well, but it's like the super lazy way to parallelize your IO, which will yield only a minimal performance boost.
bad how?
also, with many file systems, parallel access is worse than one at a time, and the fs layer is likely the biggest bottleneck
In what sense? Parallel access to the same file, or across different files? (the former I agree with, the latter not so much)
no device is truly random access - high performance frameworks get huge speedups by eliminating fs seeks
Sure, but parallelising multi-file IO gives the OS’s IO scheduler more to work with (to optimise seeks on devices that have them i.e. “spinning rust”), and some devices (e.g. RAID arrays, most NAS devices, SSDs etc) also support parallel access all the way to hardware.
linear access on one device can be faster than ram lookup
Yes I know. But if you have multiple files to process, parallelising that access can in many instances be faster than linearly reading them (for the reasons I outlined above).
of course one would want to measure these things for a use case
and I wasn't claiming that parallel was always slower - just noting that it isn't always faster
Yes, and I have. I worked in content management for 16 years, and most of the “cost” of content management operations is in schlepping bits from files.
Some of the articles linked from https://github.com/pmonks/alfresco-bulk-import explain this in more detail, though IIRC some of them are now sadly dead. 😞
Though this one still works: https://www.slideshare.net/alfresco/taking-your-bulk-content-ingestions-to-the-next-level
Slide 25 is the one I was looking for - it shows the improvement in overall throughput as multi-threading is increased.
You’ll note that the speedup is more pronounced for the test environment that had traditional HDDs (i.e. increased IO parallelism allows the OS to do more seek optimisation). In the test environment with the SSD, multithreading didn’t improve things much.
:thumbsup: - it's great to see this depth of insight, thanks
np! File I/O is kind of fascinating when you get into it. 😉
@U0MDMDYR3 thanks for the link. I briefly looked at the slides and the results don't look that promising. The improvement wasn't big with an increasing number of threads. However, those tests were performed only on two laptops and are now quite outdated, I think. Moreover, here in the conclusion they seem to state that multithreading only helps in CPU-bound scenarios
Yes I know - I wrote that presentation (almost 10 years ago, now). 😉
bad in the sense that it is less likely to cause more productive cpu utilization than just having it at the top level (or one level, at least)
sure - I would be suspicious of library usage of threads period, including pmap
the app can make choices that the library author can't know enough to make
pmap is reasonable for compute-heavy tasks, not where context switching dominates or side effects
My experience has been that multi-threading (whether using pmap or Threads or whatever) can help for IO-heavy workloads, in part because it gives the IO scheduler more to work with (to optimise seeks on traditional disk drives, for example).
I'd prefer threads over pmap for that, since with pmap you relinquish control over the number of threads or backpressure. I wrote a post about this: https://bsless.github.io/mapping-parallel-side-effects/
And context switching isn’t really an issue, since (from the CPU’s perspective) those threads spend a very large percentage of their time parked waiting for IO to complete.
(which is “cheap” - don’t have to context switch to check whether a thread is still waiting or not)
Of course it depends on the use case. Going multi-threaded to try to read a single file faster probably isn’t going to help.
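The "threads over pmap" point above can be sketched with a plain ExecutorService, which makes the thread count an explicit parameter instead of pmap's fixed heuristic (the `bounded-pmap` name is made up for illustration):

```clojure
(import '(java.util.concurrent Executors))

;; unlike pmap, the pool size is under the caller's control, and
;; shutdown is explicit rather than tied to lazy-seq realization
(defn bounded-pmap [n f coll]
  (let [pool    (Executors/newFixedThreadPool n)
        futures (mapv (fn [x] (.submit pool ^Callable (fn [] (f x)))) coll)]
    (try
      (mapv (fn [fut] (.get fut)) futures)
      (finally
        (.shutdown pool)))))

(bounded-pmap 4 inc [1 2 3 4 5]) ;; => [2 3 4 5 6]
```

This is still only a sketch; a real version would also want timeouts and error propagation, which is part of why explicit tooling beats pmap for IO-heavy work.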
This would be fun to do in Clojure!!
So I had a spare moment, and came up with this as a “stupid” first attempt at wc in Clojure/Babashka:
#!/usr/bin/env bb
(let [file-name (first *command-line-args*)
content (slurp file-name)]
(println (format " %7d %7d %7d %s"
(count (re-seq #"\r?\n" content)) ; Line count
(count (re-seq #"\S+" content)) ; Word count
(count content) ; Character count
file-name)))
Interestingly, it doesn’t do too badly, probably because we don’t read the file 3X the way the first Haskell version did:
$ ls -l big.txt
-rw-r--r--@ 1 pmonks staff 6.2M Jun 5 14:50 big.txt
$ /usr/bin/time -l wc big.txt
128457 1095695 6488666 big.txt
0.03 real 0.03 user 0.00 sys
1810432 maximum resident set size
[snip]
$ /usr/bin/time -l ./naive.clj big.txt
128457 1095695 6488666 big.txt
0.59 real 0.50 user 0.07 sys
295354368 maximum resident set size
[snip]
Clearly memory usage is just a tad different though. 😜
Yeah it gets complicated because a “file” isn’t really a thing the hardware knows about - they deal with sectors or cells or whatever (so above that is the line where parallelism could help).
I'm seeing an absolutely bananas bug that's driving me up the wall.
When using cider-jack-in, and only on GraalVM (the regular JVM, not native-image or anything), in a project that has a lot of stuff on the classpath (specifically the full AWS SDK), (clojure.java.classpath/classpath) will only include Graal's src.zip, and nothing else. I do the same thing from lein repl, everything's fine. OpenJDK, everything's fine. I'm working on getting a reproducer working in GitHub Actions, but if anyone has any idea where to even start looking for debugging this, I'm all ears.
look in *Messages* to see the exact command cider is using and then start doing it from the command line. remove cider-nrepl and see if that fixes it.
well, clojure.java.classpath uses the classloader hierarchy, so my guess would be the hierarchy is different.
which version of java.classpath and Java you are using is also relevant as there have been changes
So, given clojure.java.classpath 1.0.0 + JDK 11, I was expecting clojure.java.classpath to fall back to using the java.class.path property, but, even in the "bad" setup (in emacs, with graal, and an almost empty (cp/classpath)), I get:
(count (str/split (System/getProperty "java.class.path") #":")) ;; => 353
Aha!
(cp/classpath (clojure.lang.RT/baseLoader))
;; =>
(#object[java.io.File 0x4fd2d0bb "/home/lvh/.local/graalvm-ce-java11-20.1.0/lib/src.zip"])
(and c.j.classpath assumes that if there's anything on that classpath, it's good to go; otherwise it reconstitutes from java.class.path: https://github.com/clojure/java.classpath/blob/master/src/main/clojure/clojure/java/classpath.clj#L85-L87)
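The fallback branch being described can be sketched like this (a simplified stand-in, not the actual library code; see the link above for the real implementation):

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; rebuild the classpath from the java.class.path system property,
;; which is what c.j.classpath falls back to when walking the
;; classloader hierarchy yields nothing
(defn system-classpath []
  (map io/file
       (str/split (System/getProperty "java.class.path")
                  (re-pattern (java.util.regex.Pattern/quote
                               java.io.File/pathSeparator)))))

(count (system-classpath)) ;; at least 1 entry on any running JVM
```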
OK, progress. Still haven't delved to the bottom of it, but: running:
(->> (clojure.lang.RT/baseLoader)
(iterate #(.getParent ^ClassLoader %))
(take-while identity)
(mapv (juxt identity cp/loader-classpath)))
... shows two extra clojure.lang.DynamicClassLoaders that only exist in my editor env, and the second of which introduces /home/lvh/.local/graalvm-ce-java11-20.1.0/lib/src.zip to the classpath. The two extra classloaders only appear when using lein: tools.deps does not exhibit the issue. However, using
lein update-in :dependencies conj \[nrepl\ \"0.7.0\"\] -- update-in :plugins conj \[refactor-nrepl\ \"2.5.0\"\] -- update-in :plugins conj \[cider/cider-nrepl\ \"0.25.0\"\] -- run
(a command line I copied mostly from Messages in Emacs as previously suggested) also does not exhibit the issue. I've put the not-quite-a-reproducer up here: https://github.com/latacora/graal-empty-classpath-reproducer
but lein ... -- repl does. So, TL;DR: when using lein repl a new DynamicClassLoader gets introduced that adds only src.zip, which in turn confuses cp/classpath into thinking it's in an environment where it should trust the output from walking the classloader hierarchy
compare env in a terminal where you're running the copied command and env from eshell? perhaps there are differences in environment when running the same command in an emacs process versus from the terminal
sorry, spoke too soon before: per last message, I can reproduce, lein repl and lein run were just giving me different results
ah gotcha. i thought there was different behavior between what emacs was running and the terminal with the same lein command
yeah sorry I was running lein run because I figured it'd be easier to stuff into automation, but of course the one thing I tweaked made all the difference 🙂
Oh, and one further: only when introducing nrepl deps. (I'm writing all of this up for the README in that repo)
This feels very similar to the issue @mikerod brought up in #clojure-dev on Wednesday...
the nrepl deps likely add a classloader to ensure that they can use it to inject or whatever that feature is called
(that was related specifically to using Pomegranate to load new deps in various repl setups, but one of them was lein repl doing something different to the class loader hierarchy compared to plain ol' clj's REPL)
if it's pomegranate related / magic classpath injection I would narrow it down to the CIDER nrepl middleware, rather than nrepl itself
FWIW just adding nrepl doesn't trigger the bug, you need all 3 (well, maybe not technically all 3: some combinations don't make sense)
also, maybe keep an eye on nrepl 0.6.0 vs 0.7.0. I believe there are some new features coming to allow injecting more stuff, so perhaps it's related to these changes?
echo "(-main)" | lein update-in :dependencies conj \[nrepl\ \"0.7.0\"\] -- update-in :plugins conj \[cider/cider-nrepl\ \"0.25.0\"\] -- run
does not trigger the bug, but:
echo "(-main)" | lein update-in :dependencies conj \[nrepl\ \"0.7.0\"\] -- update-in :plugins conj \[refactor-nrepl\ \"2.5.0\"\] -- update-in :plugins conj \[cider/cider-nrepl\ \"0.25.0\"\] -- repl
does
the classloader hierarchy could be anything, and clojure.java.classpath depends on it being one thing
Just need to be sure you share a loader with what you dynamically add. If that’s what’s being attempted here? Missed that part.
Is there an extra step for using encrypted maven authenticated repos? Everything works fine when I use a plaintext password but as soon as I encrypt it, it doesn’t work.
Using mvn directly from the command line works fine with an encrypted password, so I’m wondering about clojure..
If you’re talking about from clj with encrypted passwords in settings.xml, that’s not currently supported
I have not looked into what would be required but would be happy to do so
That’s exactly what I’m looking for. I’d appreciate any help or pointing in the right direction as I’m curious to learn myself. In the meantime I’ll see what I change on the maven side. Thanks!
I went down this rabbit hole this weekend. I found the util/maven.clj setting repo credentials easily enough. Maven even has a settings.crypto package with the necessary classes. However, they’re using something called Plexus for dependency injection to actually set the cipher class in the decrypter that I haven’t been able to figure out how to replicate in clojure yet.
Usually you can just do normal Java stuff to replicate whatever plexus would do
It’s setting a protected field in a class. I tried using proxy and gen-class to extend the original class after I couldn’t set! the field directly, but I’m not very familiar with those and wasn’t able to get it working.
there are sometimes workarounds if you do enough spelunking in the maven libs
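One "normal Java stuff" route for that protected-field problem is plain reflection. A hedged sketch (not Maven/Plexus-specific; `set-field!` is a made-up helper, and note that getDeclaredField only sees fields declared directly on that class, not inherited ones):

```clojure
;; force-set a field that set! can't reach from Clojure
(defn set-field! [obj field-name value]
  (doto (.getDeclaredField (class obj) field-name)
    (.setAccessible true)
    (.set obj value)))

;; demo on java.awt.Point's int field; the boxed value must match the
;; field's primitive type, hence the (int ...) cast
(def p (java.awt.Point. 1 2))
(set-field! p "x" (int 99))
(.-x p) ;; => 99
```

Whether the JDK's module system permits this on Maven's internals depends on the JDK version and `--add-opens` flags, so treat it as a starting point for the spelunking mentioned above.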
OK, I've written up everything I know about the bug here: https://github.com/latacora/graal-empty-classpath-reproducer I'm unsure where to report it, though. CIDER?
If that helps, back in Feb I opened a similar reproducer as a PR: https://github.com/clojure-emacs/cider-nrepl/pull/668; its partially red build demonstrated its point. As you can see, it was welcome