This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-07-06
Channels
- # aws-lambda (6)
- # babashka (1)
- # beginners (204)
- # calva (10)
- # chlorine-clover (17)
- # cider (57)
- # cljs-dev (3)
- # cljsrn (3)
- # clojure (148)
- # clojure-bangladesh (1)
- # clojure-berlin (3)
- # clojure-europe (30)
- # clojure-france (1)
- # clojure-italy (4)
- # clojure-nl (5)
- # clojure-spec (4)
- # clojure-uk (14)
- # clojurescript (15)
- # code-reviews (8)
- # conjure (27)
- # data-science (9)
- # datomic (38)
- # duct (6)
- # figwheel-main (11)
- # fulcro (78)
- # helix (11)
- # jobs (1)
- # malli (18)
- # meander (22)
- # mount (4)
- # nrepl (3)
- # off-topic (93)
- # pathom (2)
- # pedestal (4)
- # re-frame (5)
- # reagent (6)
- # reitit (1)
- # ring-swagger (1)
- # sci (1)
- # shadow-cljs (19)
- # spacemacs (1)
- # sql (1)
- # tools-deps (76)
- # unrepl (1)
- # vim (5)
- # xtdb (8)
I've got a problem where I need to construct a directed graph, and I'm curious what the clojure approach to this is - everything I learned in school was very mutable. Is there any good reading on the topic?
That's similar to what I tried first, but I'm going to have different nodes with the same value - I started going about generating uuids for each value before I thought, I'll just ask first
correct! I'm building a graph representation of a sequence, where the nodes are the members of the sequence and the edges are the "observed-before" relations.
So, the sequence (1 2 3)
is broken into {:mem [1 2 3], :ord [[1 2] [1 3] [2 3]]}
I need to construct a directed graph with that map, and then topologically sort it to get the original sequence back
In the pathological case I'll have a sequence like (1 1 1)
that maps to {:mem [1 1 1], :ord [[1 1] [1 1] [1 1]]}
but that would be a pain if you need an invariant like the graph of (next coll)
to be a subset of the graph of coll
I just found https://github.com/weavejester/dependency, so I'm going to read some source code and see what I can purloin
ok, that one has the same problem I mentioned above, but I think I can get around it by creating a map of uuid->element, and then passing the collection of uuids into the graph, topo-sorting it, and then once it comes back mapping the uuids back to the original values
Whether you use a uuid or something else unique and convenient (e.g. the node's index in the original order, perhaps paired with the value: [0 1] [1 1] [2 1] for your three-nodes-with-value-1 example), I think you will be much saner if you pick some representation of nodes where they are all guaranteed to be unique.
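For what it's worth, a minimal sketch of that index-paired representation (`seq->graph` is a made-up helper name):

```clojure
;; Sketch: pair each element with its index so duplicate values
;; become distinct nodes, then record every "observed-before" edge.
(defn seq->graph [coll]
  (let [nodes (vec (map-indexed vector coll))]
    {:mem nodes
     :ord (vec (for [[i :as a] nodes
                     [j :as b] nodes
                     :when (< i j)]
                 [a b]))}))

(seq->graph [1 1 1])
;; => {:mem [[0 1] [1 1] [2 1]],
;;     :ord [[[0 1] [1 1]] [[0 1] [2 1]] [[1 1] [2 1]]]}
```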
I'm hesitant to pair it with the value and complicate the structure, since I hope to be able to apply this work to nested datastructures after I get the simple case down. But you're right, my repl explorations have already proved that a unique id is useful, though now I have to use a special print fn to make them nice.
How expensive are uuids? I'm considering going with integers because uuids everywhere is going to make testing a nightmare
Oh, this is so nice! I think I can throw away my implementations of seq-intersection
seq-difference
and seq-union
and just use clojure.set now
the correct way to do an adjacency list is for every value to be unique (and most often meaningless to the domain)
you can use a second hash map to assign a value to each node
yeah, that's what I've got going now and it simplifies so much! Trying to handle nil
and false
inside of those seqs was the hackiest thing I've done this month
if you don't need the strong guarantees of uuids, a third option is gensym
user=> (zipmap (repeatedly #(gensym "node")) [:a :b :a :c :a :b :a])
{node9 :a, node10 :b, node11 :a, node12 :c, node13 :a, node14 :b, node15 :a}
What's cheaper, the (repeatedly #(gensym "foo"))
or (range)
. I think what I'm going to be doing will be slow for large collections, so I'd like to save where I can
but the gensym is useful if you aren't producing the result all in one go
also, in my experience clojure hits some hard limits with graphs, and to do interesting things you eventually need to write java code or, even worse, very hard-to-read-and-write clojure with very weird bugs
clojure's better at defining reliable high level code, but java is better at defining reliable performant code, and luckily we are allowed to use both together
I have a file.jar
in my project root and in my deps.edn I included it as the following:
{:deps {foo {:local/root "file.jar"}}}
Yet when I try to require it in my repl, I run into a FileNotFoundException
:local/root
requires you to specify the path from the filesystem root, I think. For me, I have /home/me/path/to/file.jar
in my deps.edn
Have you confirmed that the code you're trying to require really is in the .jar
file?
i know the .jar
is definitely in the path and yeah, it's not empty if that's what you're asking
I'm asking that you've confirmed the contents of the JAR are what you expect.
Also, have you confirmed exactly what file is not found?
Yeah I just extracted the contents, and it's all there, maybe the problem is with my naming
I'm trying to require jquran
, but the name of the jar is jqurantree-1.0.0.jar
and has a structure of org.jqurantree.*
when uncompressed
would i have to require it as jqurantree
? Or can I refer to it with any random symbol, such as foo
So the Clojure namespace is org.jqurantree.something
? That's what you have to require.
(this is Clojure code right, not compiled classes?)
You import
Java classes. Not require
.
ahh, ok that makes sense. Would I still be able to import it under foo
? Or do i need to follow the namespace structure that I find when I extract the .jar
The group/artifact name in deps.edn
is purely for tracking dependencies -- it has no relationship to code at all.
ok cool, I think i got it now:
user=> (import org.jqurantree.core.io.FileWriter)
org.jqurantree.core.io.FileWriter
user=> FileWriter
There you go!
Most Java libraries are up on Maven and you just specify the coordinates in deps.edn
. You rarely have to download the JAR.
haha yeah, this particular project isn't on Maven: http://corpus.quran.com/java/overview.jsp
Ah, OK. Just wanted to check since you didn't know about import
vs require
so I wasn't sure where you were on your journey.
Yeah, I'm still in the early stages for sure! But I'm able to get surprisingly productive already. Clojure is a gem.
ok, I definitely need different collections to have different values, so now I'm thinking I'm back to uuids. How safe is gensym?
ok, I don't think this approach will work at all - I need to compare values in different collections, and when they are uuids of course that doesn't work - they are never going to be equal. Back to the drawing board on this one
well, I guess I could still use it for the graph creation, I just can't use it for all the other stuff I'm doing
you'd need some kind of indexing - it ends up being very similar to using sql - you have an index that doesn't carry any value on its own, then you use it to cross-correlate
the whole point of gensym is that it's safe within one execution (though might not be safe if you load symbols from another execution of your program, eg. from a file - that's when you really want UUIDs)
I think it's even worse than that. I'm trying to implement Mergeable Replicated Datatypes according to this paper: https://www.cs.purdue.edu/homes/suresh/papers/oopsla19-mrdt.pdf
if this is really about a datatype, you can do what clojure did for its hash-map and vector impls and just use mutable definitions with an immutable interface
Then I do a three-way merge of those relations with replica1 (r1), replica2 (r2), and the lowest common ancestor between them (l), according to this formula:
:ord
: Rob(v) ⊇ (Rob(l) ∩ Rob(v1) ∩ Rob(v2) ∪ Rob(v1) − Rob(l) ∪ Rob(v2) − Rob(l)) ∩ (Rmem(v) × Rmem(v))
:mem
: Rmem(v) ⊇ (Rmem(l) ∩ Rmem(v1) ∩ Rmem(v2) ∪ Rmem(v1) − Rmem(l) ∪ Rmem(v2) − Rmem(l))
Once I'm on the other side I use :mem
and :ord
to produce a new sequence. So I churn them all together and get something completely new, and all the mappings are destroyed in the process
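The merge rule above is plain set algebra, so clojure.set can express it directly; a sketch, treating each relation as a set (for :ord the paper additionally intersects the result with Rmem(v) × Rmem(v), i.e. keeps only pairs whose members survived the :mem merge):

```clojure
(require '[clojure.set :as set])

;; Three-way merge of one relation: keep what the ancestor l and both
;; replicas agree on, plus whatever either replica added since l.
(defn merge-rel [l v1 v2]
  (set/union (set/intersection l v1 v2)
             (set/difference v1 l)
             (set/difference v2 l)))

;; ancestor {1 2 3}; v1 dropped 3 and added 4; v2 dropped 1 and added 5
(merge-rel #{1 2 3} #{1 2 4} #{2 3 5})
;; => #{2 4 5}
```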
I may go that way, but right now I'm stuck on the last step, reproducing a sequence given a list of its members and the "observed before" relation for each member.
In this: {:mem [1 3 2], :ord [[1 3] [1 2]]}
, how do I relate the ord stuff to the mem stuff? I mean, in this case it is obvious but what about when my collection is all 1s?
you need a representation that separates a reliable index from the value held, I think
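Once the nodes are unique (say, [index value] pairs), rebuilding the sequence is a topological sort; a minimal Kahn-style sketch with no cycle detection:

```clojure
;; Repeatedly emit a node that no remaining edge points *to*,
;; breaking ties by the node's original index. Assumes :mem entries
;; are unique, e.g. [index value] pairs, and the graph is acyclic.
(defn topo-sort [{:keys [mem ord]}]
  (loop [remaining (set mem)
         edges (set ord)
         out []]
    (if (empty? remaining)
      out
      (let [blocked (set (map second edges))
            n (first (sort (remove blocked remaining)))]
        (recur (disj remaining n)
               (set (remove #(= n (first %)) edges))
               (conj out n))))))

(map second
     (topo-sort {:mem [[0 1] [1 3] [2 2]]
                 :ord [[[0 1] [1 3]] [[0 1] [2 2]]]}))
;; => (1 3 2)
```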
is it possible to update the defmulti dispatch function without removing the current namespace or restarting the REPL?
I believe the common trick is to do something like this:
(defn my-dispatch-fn [& args] ,,,)
(defmulti foo #'my-dispatch-fn)
You could also define the multimethod var as something other than a multimethod (for example (def foo nil)
) then re-evaluate the defmulti
and all the relevant defmethod
s (since redefining it to nil
will remove all of the methods) ... that way you can change it to use the var-quote form without restarting the repl
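Putting the var-indirection trick together (hypothetical names):

```clojure
;; Dispatch through the var, not the function value, so redefining
;; my-dispatch-fn takes effect without re-creating the multimethod.
(defn my-dispatch-fn [x] (:type x))
(defmulti handle #'my-dispatch-fn)
(defmethod handle :a [_] "got an a")

(handle {:type :a})   ;; => "got an a"

;; later at the REPL: change the dispatch logic in place
(defn my-dispatch-fn [x] (:kind x))
(handle {:kind :a})   ;; => "got an a"
```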
@U0P0TMEFJ It works!
Hi everyone 😄 I'm looking for code analyzers to detect concurrency issues (like races and deadlocks) in clojure. Does anyone have any suggestions?
I've mostly done this sort of thing through runtime detection and automated testing rather than static analysis but ymmv ... maybe have a look at test.check (https://clojure.org/guides/test_check_beginner) ??
Clojure's approach is making high-level thread-safe primitives which are thread-safe even when composed. Since it works, I'd imagine people haven't felt very compelled to author such analyzers. Still it'd be possible to write linters ensuring that refs, atoms, etc are used properly... not many "rules of thumb" come to mind right now, but there certainly are. My hunch is that it's a better investment to thoroughly understand all primitives, and ensure their proper usage through code review.
data races occur when you have uncoordinated threads racing to write the same value. None of the concurrency primitives in Clojure allow this. (there are some corners you can use to exploit this via interop w/ mutable arrays, or mutable deftypes etc but usually people doing that stuff know what they're doing)
deadlocks occur when you have a resource/lock cycle. most of the concurrency primitives have guards against this - atoms use spin/lock retries so you don't "hold" the resource, agents are async, refs can conceptually deadlock but has timeout/retry built in.
Ah I see. Thanks a lot for the information @U45T93RA6, @alexmiller That really helps. 🙂 I will also check out test.check as @U0P0TMEFJ has suggested.
It is easy to misuse things like atoms, by calling deref
, keeping the value in a let, and using swap!
to write a new value in, which will result in data races. Don't do things like
(def thing (atom 1))
(defn make-not-odd! []
(let [t @thing]
(when (odd? t)
(swap! thing inc))))
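The fix is to move the decision inside the function passed to swap!, so the check and the update happen atomically:

```clojure
(def thing (atom 1))

;; Safe: swap! retries on contention, and the odd? check is part of
;; the same compare-and-set, so no other thread can interleave
;; between reading the value and writing the new one.
(defn make-not-odd! []
  (swap! thing #(if (odd? %) (inc %) %)))
```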
I’d like to write a function mywalk
that is similar to `clojure.walk/prewalk` with the possibility of leaving part of the structure untouched (for instance when the metadata contains `:skip true`)
For instance,
(mywalk #(if (number? %)
(inc %)
%)
{:a {:b 1}
:c ^{:skip true} {:d 1}})
should return
{:a {:b 2}
:c ^{:skip true} {:d 1}}
Any idea?
You can either just copy the code of walk
and prewalk
and make all the necessary changes there (there's not a lot of code), or you can just wrap and unwrap the marked values into something that walk
doesn't recognize. E.g. in this case I used atom
. But it would be better to create a custom type, of course.
;; assumes (require '[clojure.walk :refer [walk]])
(defn filtering-prewalk [f form]
(let [f (fn [v]
(if (:skip (meta v))
(atom v)
(f v)))
outer (fn [v]
(cond-> v (instance? clojure.lang.IDeref v) deref))]
(walk (partial filtering-prewalk f) outer (f form))))
Actually, the implementation of walk
is hardly longer, so in my code I would definitely just copy and modify its code rather than using the above example.
Eventually I solved it this way
;; assumes (require '[clojure.walk :refer [walk]])
(defn prewalk-skip
  "Like `clojure.walk/prewalk` but leaves as is parts of the form (and their children)
  that satisfy the `skip` predicate"
  [f skip form]
  (if (skip form)
    form
    (walk (partial prewalk-skip f skip)
          identity
          (f form))))
Hi all, we currently use https://github.com/lettuce-io/lettuce-core to interface with Redis. We’re facing a slightly weird issue where the push to Redis is “uneven.” In other words, lpush
only works sometimes. I’m wondering if someone has faced a similar issue before? We’re using 6.0.5. Would be great if someone could provide some pointers on where to look. We have debugged the surrounding code and found that the issue lies specifically with the push itself.
Has any one ever used xhprof here? Is there anything like that for Clojure? I'm not great with reading flame graphs
The good profiling is going to come from the VM (I'm assuming the JVM for Clojure). The java tool hprof
dumps data, there's various tools for loading and exploring that data. I've used visualvm and yourkit, yourkit is definitely nicer, but visualvm is free.
The biggest gotcha with analyzing performance with clojure is that jvm tools expect the relevant context info about execution to be the class whose method is invoked. For clojure that means you get a lot of info about methods in PersistentHashMap and PersistentVector and LazySeq, but not necessarily info about the code that made those things execute.
I've even gone so far as replace hash-maps / functions with records implementing protocols, just so I could get better profiling info
luckily that conversion is trivial
is it possible to rename keyed parameters in an anonymous function?
(fn [{bar :foo}] (prn bar))
and expect it to be called like so
(wrapper {:foo "baz"})
of course: (fn [{foo :bar baz :quux}] ...)
- perhaps I misunderstand you though
that’s what i tried but maybe i’m doing it wrong, i’ll go back and dig a little more. mainly wanted to make sure i was taking the right approach
thanks @U051SS2EU!
your example works on a minimal test case
user=> ((fn [{bar :foo}] (prn bar)) {:foo 42})
42
nil
you can also do the same destructuring inside a let block, which can improve clarity if the argument list starts to get noisy
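For the record, the same rename done in a let might look like:

```clojure
;; Destructure (and rename) in a let instead of the arg vector:
;; the :foo value is bound to the local name bar.
(defn wrapper [m]
  (let [{bar :foo} m]
    (prn bar)))

(wrapper {:foo "baz"})  ;; prints "baz"
```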
Is anyone using Ogre or other gremlin-based API in Clojure for a graph database in a large scale production environment? (You get to define “large” 🙂 ) There is little in the way of examples or clojure-specific documentation that I could find.
How do I know how much memory each namespace uses?
when I start clj
, it uses 13Mb
After (require 'app.main)
, it goes to 116Mb
I need to debug why this footprint is so high
Both memory usages are from Used metaspace
in visualvm
the problem here is that objects don't belong to namespaces, if two namespaces hold references to one object, there's no definitive way to say who is responsible for the space usage
you could compare how much space is used recursively by the objects under the fields in the ns, but, for example, all of clojure.core would count as references held by every ns
so many things would be "counted" multiple times
Is there some methodology to find which namespaces are impacting my "initial memory" usage?
Context: It's a tiny application that uses almost no resources "on request", but it doesn't fit in tiny EC2 machines
also keep in mind that that size probably includes garbage that will be garbage collected if you are approaching the max heap size
so you could probably set -Xmx64m and it would use less than that (because it would hit the limit and gc)
@souenzzo I can’t recall if EC2 instances are considered containers or not, but you might get some benefit from using -XX:+UseContainerSupport
as a JVM parameter (note: requires JVM v9+). Amongst other things, that tells the JVM to check the container’s memory configuration and adjust things like memory limits appropriately.
It's running on EC2/OpenJDK8. EC2 isn't a container
I found some deps that require cljs.analyzer
and other cljs namespaces on the classpath, and removed them
The "metaspace usage" footprint went from 120Mb to 80Mb and now it works way better
metaspace correlates with classloading. if your app.main loads a metric ton of dependencies, you'll see metaspace inflate
oh yeah, sorry I didn't catch that above
but even so, metaspace can be collected if needed too now right?
I think so, but likely won't have much collectable if it's the initial load of the application
is it really the case that metaspace is that much larger than regular heap usage? (the root question is how to use a small enough amount of RAM to fit in a specific container, right?)
@souenzzo If you're looking for something with a low memory footprint, take a look at GraalVM native-image. Babashka, a scripting environment compiled with GraalVM, can run with as low as 16Mb or something
A question about accretive vs breaking change: if I have a protocol in a library and some implementations of it, and I add a new method to the protocol and add the definition to each of the library's implementations and that new method is only used in a new feature that is added to the library at the same time, should that be considered purely accretive? Concern: existing users of the library may have their own implementations of the protocol, without this new method... but as long as they don't use the new feature (a single new function), they would be unaffected by the change in the protocol. Only users who have their own implementations that want them to participate in the new feature would need to make changes -- to add an implementation of the new protocol method.
I'm interested in anyone's / everyone's opinion but especially interested in @ghadi and @alexmiller I guess 🙂
seems like a gray area but maybe ok. alternate would be to make a new protocol
in general, I have often regretted making protocols bigger, but never regretted making them smaller :)
protocols emit an interface right? any chance someone has used the underlying interface from java for speed? that would break there right?
adding a method to an interface is not a breaking binary change in java
sorry, I guess it would be for implementors
was the original protocol public? @seancorfield
you do not have to, @dpsutton; you'll get an AbstractMethodError if you call something not implemented
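A minimal illustration of that behavior (protocol and record names are made up):

```clojure
;; An implementation written before new-method existed keeps working
;; until someone actually calls the missing method on it.
(defprotocol RowThing
  (old-method [this])
  (new-method [this]))   ;; added in a later release

(defrecord OldImpl []
  RowThing
  (old-method [_] :ok))  ;; new-method intentionally not implemented

(old-method (->OldImpl))  ;; => :ok
(new-method (->OldImpl))  ;; throws AbstractMethodError
```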
@ghadi Yes, I'm talking about next.jdbc.result-set/RowBuilder
which is both public and documented -- although relatively unlikely to be used (except perhaps by @ikitommi for high-performance extensions to next.jdbc
?). And the new method would only be used by the new feature I'm planning to add -- so the only "breaking" impact would be in terms of people starting to use the new feature on an existing 3rd party extension to the protocol that hasn't been updated with the new method.
@alexmiller Even if I add a new protocol for this, the new feature is going to require it to be implemented so the new feature wouldn't be usable on 3rd party things that extend the current protocol but not the new one -- so the end result would be "the same" although the error message would be different (no implementation of <new protocol> vs AbstractMethodError calling the new method that was added to the existing protocol).
So the change won't break anyone's existing code -- only new code, leveraging the new feature, and only for 3rd party implementations of the current protocol (which won't have the new method). Hence my asking here for input because it feels like it isn't technically a breaking change but might cause surprise for someone who tries the new feature on someone else's next.jdbc
extension. The new method would allow for some refactoring of existing implementations to reduce boilerplate/repetition, if those authors wished.
According to http://grep.app the only use I can find of that protocol (outside of next.jdbc
itself) is metosin/porsas
so I'm leaning to going ahead with the change and then opening an issue against that single project that has an implementation of RowBuilder
...
Before (now): https://github.com/seancorfield/next-jdbc/blob/develop/src/next/jdbc/result_set.clj#L122-L134
The change would add a variant of with-column
that allows for the column reading function to be passed in. Name TBD. The intention is to make it easier for people to write their own builders that can adapt/wrap other builders while controlling how columns are read/converted. And a new, more generic, builder-adapter would be added to next.jdbc
itself that leveraged this machinery.
Currently you have to have an adapter for each type of underlying builder because implementing with-column
requires knowledge of how new columns are added to the row being built -- and that's the only place you can intercept/control how columns are read.
(so it's ultimately a design bug in terms of how I allowed for future extension that I want to fix by adding a new method)
Current adapter (for map row builders): https://github.com/seancorfield/next-jdbc/blob/develop/src/next/jdbc/result_set.clj#L237-L242 -- and there has to be another adapter for array row builders too, since that requires conj!
. And it exposes the use of transients which is unfortunate.
(a new type of column reader will be supported that is passed the ResultSet
, the column index, and the entire builder object -- rather than just the ResultSetMetaData
-- so that column readers can access (:cols mrsb)
to get the actual Clojure keys associated with columns, rather than having to rely on JDBC interop to get SQL-level labels etc from the metadata object; and it will be up to the new column reader whether it calls read-column-by-index
which is based on the ReadableColumn
protocol, which I'll make extensible via metadata)
(it isn't extensible via metadata right now because its use was essentially "closed" inside the existing builders -- and I should have made that change when I added the adapters, which is lines 237-242 above)
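For reference, metadata extension is opt-in per protocol since Clojure 1.10; a tiny sketch with a made-up protocol name:

```clojure
(defprotocol ReadableThing
  :extend-via-metadata true
  (read-thing [this]))

;; Any value can now supply an implementation via metadata,
;; keyed by the fully-qualified method symbol.
(read-thing (with-meta {:x 1}
              {`read-thing (fn [m] (:x m))}))
;; => 1
```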
@ghadi Does that change any aspect of your initial feedback on the change (being accretive vs breaking)?