This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-03-04
Channels
- # aleph (8)
- # aws (14)
- # babashka (37)
- # beginners (30)
- # calva (5)
- # cider (4)
- # clj-kondo (21)
- # cljsrn (4)
- # clojure (234)
- # clojure-denmark (1)
- # clojure-europe (10)
- # clojure-france (10)
- # clojure-italy (4)
- # clojure-nl (17)
- # clojure-sanfrancisco (1)
- # clojure-spec (8)
- # clojure-uk (44)
- # clojurescript (20)
- # cursive (9)
- # datascript (2)
- # datomic (5)
- # emacs (9)
- # fulcro (50)
- # graalvm (32)
- # jackdaw (18)
- # leiningen (1)
- # malli (10)
- # meander (10)
- # nrepl (10)
- # off-topic (15)
- # pathom (20)
- # re-frame (14)
- # reagent (37)
- # reitit (7)
- # ring (1)
- # shadow-cljs (102)
- # test-check (6)
- # tree-sitter (15)
- # vim (4)
- # xtdb (2)
- # yada (1)
Proxy question 🙂
Can I access the this object in the methods provided in a proxy? ie:
(defn my-instance [args]
  (proxy [MyAbstractClass] [args]
    (myMethod [this] (...))))
My intuition is yes, but when I pass this proxy to an object that uses it, when the method myMethod is called I get an arity exception
that's the only anaphoric macro in clojure (that invents a special symbol)
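For instance (a sketch, assuming a hypothetical MyAbstractClass whose myMethod takes no arguments of its own): this is already bound inside proxy method bodies, so the fix is to drop it from the parameter vector:
(defn my-instance [args]
  (proxy [MyAbstractClass] [args]
    (myMethod []                ; no explicit this parameter
      ;; this refers to the proxy instance here
      (.toString this))))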
Possible (random) insight: by preferring mapv to map, one fails faster (in case something goes wrong), and importantly with clearer stacktraces
...stacktraces where laziness is involved tend to involve a lot more Clojure internals, burying the actual culprit. e.g. maybe the defn at fault lives in a different namespace than the namespace that threw the exception.
wdyt?
by preferring (into [] ...) to mapv you get all the perf benefits of transducers (and transients)
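e.g. (a quick sketch):
;; same result for a single step:
(mapv inc (range 5))                    ;=> [1 2 3 4 5]
(into [] (map inc) (range 5))           ;=> [1 2 3 4 5]
;; stacking is where transducers pay off -- no intermediate lazy seqs:
(into [] (comp (map inc) (filter even?)) (range 10))
;=> [2 4 6 8 10]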
Nice! I would have imagined the impl of mapv would have been upgraded after the introduction of transducers
mapv/filterv are vestigial functions now that transducers exist
Nor did I! Perhaps clj-kondo should highlight it?
I wouldn't put it as strongly as deprecate
mapv is certainly more concise, and in cases where you know the data is small, you're not stacking transformations, etc., I still use it
Ok, thanks.
mapv and filterv were yet another small step on the "why do we keep building the same family of functional transformations in new contexts" journey
In cases where I read from a database where the results might be consumed much later in a wildly different context, I have a thin wrapper that just calls first to force at least the first chunk of the sequence. This way any database errors will be thrown in the context where I can log the query, the user running it, etc.
That's interesting. It preserves the laziness, up to a point. One could call it semi-laziness :)
Isn't that wrapper just equivalent to calling seq? Given that it checks that the collection is non-empty?
👀 Got you, one could (->> xs (map f) (seq)) to stay lazy while failing fast.
Maybe sequence would be better, for avoiding nil values.
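A sketch of such a thin wrapper (run-query and log/error are hypothetical):
(defn query! [db q]
  (try
    ;; seq realises the first element (and, with chunked seqs, the whole
    ;; first chunk), so driver errors throw here where the query is still
    ;; known; note it returns nil for an empty result:
    (seq (run-query db q))
    (catch Exception e
      (log/error e {:query q})
      (throw e))))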
Is there a way to extend keyword lookup to some random Java class that implements Map? So I can do (:foo x) and behind the scenes that would call (.get x (name :foo))
Probably not, as clojure.lang.ILookup is an interface (as opposed to a protocol, which can be extend-ed)
the lookup behavior of Keywords is in the java implementation (implementing IFn or AFn), so there's no good way to shadow or replace it
it's not a syntactic feature of clojure that keywords do lookup, it's just that clojure tries to invoke things and IFn tells you what happens if you do that
I think it's possible by wrapping it with a deftype but I'd like to avoid it -- this is mainly for performance reasons and to lower GC pressure.
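For reference, that wrapper route looks roughly like this (a sketch; MapWrapper is a made-up name):
(deftype MapWrapper [^java.util.Map m]
  clojure.lang.ILookup
  (valAt [_ k] (.get m (name k)))
  (valAt [_ k not-found]
    (if (.containsKey m (name k))
      (.get m (name k))
      not-found)))

(:foo (->MapWrapper (doto (java.util.HashMap.) (.put "foo" 1))))
;=> 1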
What's up with this error? Cannot cast javafx.scene.input.MouseButton to [Ljavafx.scene.input.MouseButton;
Why the [L?
ah thanks
Yeah, it’s a strange Java class file format internal detail that leaks into Clojure.
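e.g. (using String so it's easy to try at a REPL):
user=> (class (into-array String ["a" "b"]))
[Ljava.lang.String;
[L<classname>; is the JVM's binary name for an object array, so the error means something expected a MouseButton[] (often a varargs parameter) but received a single MouseButton.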
Hello! I've got a big Java application at work that I'd like to poke around in using Clojure. I've never "called Clojure from Java" or used a Clojure project without Leiningen / tools-deps or similar before. Is there a recommended resource/guide for this use case? Would-be-nice if:
- Simple to get a REPL up and connected to CIDER
- If I could use tools-deps (or lein) to specify additional dependencies
- If I could do this with minimal impact to the Java app
https://github.com/stuarthalloway/clojure-from-java/blob/master/src/java/example/Main.java
I’m not sure about nrepl, but clojure supports socket repls out of the box via system properties, this is as close to “minimal impact” as it gets
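e.g. starting the app with this JVM option (the port number is arbitrary):
-Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}"
then from another terminal: nc localhost 5555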
does that java app use something like maven or gradle? clojure is “just a jar” that you can add using your build tool’s way to specify dependencies
depending on the jdk, one of the following may work: https://github.com/dkz/liverepl (this version has some nrepl support iiuc) the original is: https://github.com/djpowell/liverepl -- may need some tweaking
the original does not provide nrepl support and i think it predates the official socket repl (it provides its own method iiuc)
i think they are more likely to work out of the box with jdk <= 8. i think i got one of them working with a more recent jdk, but i'd have to dig a bit to confirm -- i don't tend to do nrepl-related things much though, so even if this worked it would be via a socket repl.
Hey @U47G49KHQ and @UG1C3AD5Z -- thanks for the replies. Sorry it took me a while to get back to you. We're using Gradle on OpenJDK 11. Thanks for the links. If I understand correctly, my options are:
1. Link Clojure's jar, add a system property to the process invocation, and connect via socket REPL
2. Link Clojure's jar, nREPL, and CIDER, and add Java code to start a CIDER-enabled nREPL server in the existing codebase
3. Use Liverepl, which if I understand correctly allows the Clojure code to live in a normal Clojure project but "remote connects" to the Java process. It's not clear to me how this would work, if there's synchronization needed, or whether it eventually "joins" Java processes.
I appreciate your insight! I feel like I have a bit more to go on now.
iirc, option 3 injects clojure into the running java process -- the java process then runs a socket repl of sorts and one connects to that. if you are just poking around, this can be a short-term option. however, the liverepl option is a bit dated and getting it working with recent jdks may require some tweaking. also if the java application is using java 9 modules or something "different" regarding the classpath, it may introduce additional complications. if you have aspirations of eventually using clojure within the application, i think vlaaad's approach makes sense to start with. another reason to go that way is that the application is using jdk > 8.
I'm having an issue with transit-clj where it seems that transit is somehow causing very small packets of data to be flushed. This is causing me massive overheads in transfer size. I haven't been able to find the cause/config for this though, any pointers?
Okay. May have found it, but someone's confirmation would be valuable. https://github.com/cognitect/transit-java/blob/cff7111c2081fc8415cd9bd6c6b2ba518680d660/src/main/java/com/cognitect/transit/impl/AbstractEmitter.java#L189 This appears to be the problem. Every time something is emitted, it's flushed... But that's going to cause immediate writing. I have no idea why one would want this.
Possibly related: https://github.com/cognitect/transit-clj/issues/43
I wanted to do that, but I'm still figuring out how to put something together exactly 🙂. I'd need to create a "slow" output stream or something in order to demonstrate the problem, I think.
Will be doing that today. Although I have definitely tracked this down to flushing. Would doing this in terms of showing frequency of flushing be sufficient?
I think the complementary use case is when you want to see each message on the stream as soon as its ready
these inherently fight in flushing behavior
and maybe the best option is to make autoflushing a policy choice
if you're not autoflushing, who's responsible? at what size/frequency? that's basically a flow control problem.
Isn't that the default for streams? As long as I'm using a java.io.OutputStream and not opting into using a java.io.BufferedOutputStream
I found that first writing to a byte array output stream and then to a file was much much faster than directly to the file: https://github.com/borkdude/clj-kondo/blob/99dc96c5359ee5420390049cb3ad30eb934a4d5b/src/clj_kondo/impl/cache.clj#L41
what happens if you do use a bufferedoutputstream?
@dominicm i'm not disputing flushing -- a cursory skim of the code indicates that it's flushing after every entry in a map, not after full objects
I'm just saying the ticket should be something like "I'm using this thing in an ordinary way, and it's slow" later -- is it flushing after every key?
@alexmiller The .flush forces the BufferedOutputStream to immediately clear its buffer.
I suppose my exact use case would require bringing in something like jetty/pedestal to demonstrate the overheads involved from http. I could make an output stream which prints some newlines & length when each chunk writes to output.
hypothesis: you should see a performance difference between writing to BAOS and writing to jio/output-stream wrapping anything. Does reality reflect that?
IMHO no need to make a special OutputStream... on-label problems stronger than off-label problems
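For reference, testing that hypothesis could look like this (a sketch; the data here is just a stand-in for whatever is being encoded):
(require '[clojure.java.io :as jio]
         '[cognitect.transit :as transit])

(let [data (vec (range 1e6))]
  ;; in-memory target: flushes are nearly free
  (time (let [baos (java.io.ByteArrayOutputStream.)]
          (transit/write (transit/writer baos :json) data)))
  ;; file-backed target: every flush reaches the OS
  (time (with-open [os (jio/output-stream "/tmp/transit-test.json")]
          (transit/write (transit/writer os :json) data))))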
I wonder why in the first form gc kicks in. While in the second it cannot. The decompiled java code looks the same.
(decompile (let [r (range 139)]
             (first r)
             (last r)))
;; heap space OOM:
(decompile (let [r (range 1e9)]
             (last r)
             (first r)))
you're holding on to the head of r, and then building a 1e9 long sequence to get the last value
so the whole sequence is instantiated in memory
I don't know what decompile is
user=> (doc decompile)
-------------------------
clj-java-decompiler.core/decompile
([form])
Macro
  Decompile the form into Java and print it to stdout. Form shouldn't be quoted.
nil
My understanding is that it is the r that holds the sequence. So both forms should OOM. However, the first form doesn't.
well the first form is only 139 elements, which is not much memory?
vs 1000000000
in the first one, the last that walks the sequence is in tail position - the Clojure compiler can clear the r reference before evaluating it
and it does
https://bpaste.net/PQ5Q It is the decompiled Java code. Is it more the runtime Java GC, or due to the Clojure compiler? I don't see much difference between the decompiled code of the first and second form.
I'm not having a performance issue though. My problem is directly with flushes against my output streams causing me to send very small buffers over http 🙂. I can't see anything particularly like this in Java itself.
isn't that a performance problem?
I find your problem description to make sense
on solution side, seems like a custom stream impl could selectively ignore flushes
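For instance (a minimal sketch; ignore-flushes is a made-up helper, with the caveat that follows):
(defn ignore-flushes
  "Wraps out in a stream that drops intermediate flushes; the
  underlying stream is still flushed once, on close."
  ^java.io.OutputStream [^java.io.OutputStream out]
  (proxy [java.io.OutputStream] []
    (write
      ([b] (if (bytes? b) (.write out ^bytes b) (.write out (int b))))
      ([b off len] (.write out ^bytes b (int off) (int len))))
    (flush [])                          ; swallow intermediate flushes
    (close [] (.flush out) (.close out))))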
This is more workaround than solution. Ignoring flushes is potentially dangerous. I think I can get this right for my particular use case, but if you're not careful about flushing at the right time, the end of your response may be cut off. I'll likely be reading the source for the output stream I'm receiving to ensure that its .close method doesn't utilize .flush.
It makes sense to me too, but a clearer ticket is easier to act upon for someone with less context than us
I'm happy to reorder the paragraphs and change the title. But an actual repro for my exact problem seems like it would be too large to be useful? Maybe I've been trained to make bad repros?
The description change is made. I'm happy to make a repro in the way that is most useful to someone 🙂
@ghadi It's no problem. The previous ordering was entirely my attempt at being as helpful as possible, I apologize that it had the opposite effect.
@dominicm you're great! I've caught myself lately leading with solutions and it's burned me around understanding problems clearly.
I don't think so. I think that pedestal's http.clj transit integration takes care of it automatically. I would assume that as the API returns a function taking an outputstream, it can't know the length and has to automatically fall back to chunked-transfer. One potential thing to note is that I happen to know this is Jetty behavior, but I'm unclear about whether other web servers would consume the whole outputstream and then calculate the length. It's unlikely, but possible.
is there a "nice" way to get map-indexed to accept multiple lists? e.g. (map-indexed (fn [idx foo bar baz] ...) foo-list bar-list baz-list)
combine the lists with concat?
has the benefit of not having to remember where the index is. i always forget if it's first or last
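fwiw the usual trick is plain map with (range) supplying the index (a sketch; the vectors are placeholders), which also lets you put the index wherever you like in the argument list:
(map (fn [idx foo bar] [idx foo bar])
     (range)
     [:a :b :c]
     [:x :y :z])
;=> ([0 :a :x] [1 :b :y] [2 :c :z])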
clj-kondo reads type annotations and uses them for linting. this resulted in a false positive for byte:
(defn byte
  "Coerce to byte"
  {:inline (fn [x] `(. clojure.lang.RT (~(if *unchecked-math* 'uncheckedByteCast 'byteCast) ~x)))
   :added "1.0"}
  [^Number x] (clojure.lang.RT/byteCast x))
So now clj-kondo thinks that byte always takes a number or nil (due to the nullability of objects in Java). But actually it takes a lot more when you look at the impl of byteCast, like characters. So is the type annotation wrong? False positive:
$ clj -A:clj-kondo --lint - <<< "(byte \a)"
<stdin>:1:7: warning: Expected: number or nil, received: character.
characters are numbers
in certain ways
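indeed:
user=> (byte \a)
97
user=> (int \a)
97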
yeah, I agree
if this isn't important, I'll just ignore it, won't make a JIRA issue and override the type config in clj-kondo
not something I'm concerned about
thanks @ghadi. Ions are too involved (AWS lock-in etc). Re: shipping uberjars, what starts your jar/app on the destination machine?
preferably I'd use a git push (similar to dokku/heroku) and have everything else run automatically. the complexity of dokku is frightening though.
Complexity of using it, or of reading Dokku's codebase? Dokku has a good "git push" story.
It handles many operations which we'd otherwise have to deal with manually; won't an alternative with the same features have a similar size?
The Dokku codebase is largely composed of bash scripts.
What alternatives to Dokku do you recommend?
So, do you prefer bash scripts over these solutions?
it's git push, and your code runs inside the database. disclaimer: I work for Cognitect but not on Datomic. I do a bunch of AWS work
I bet there is some tutorial of how to deploy an app to a VPS, e.g. https://www.digitalocean.com/community/tutorials/how-to-deploy-a-clojure-web-application-on-ubuntu-14-04
I've been running (hobby) Clojure apps like that since 2012, not digital ocean but similar
Elastic Beanstalk is dead easy on AWS and you can deploy from the command line. Not sure what you’re gaining by running it all yourself.
@denik https://gumroad.com/l/aws-good-parts has a tried and true methodology of deploying on ec2. I highly recommend it.
the uberjar approach is tried and true, but eventually leads to complaints about the size of artifacts. there is potential to do much better with tools.deps, deploying each dependency only if it isn't already deployed, and then just shipping a single text file containing a classpath
That's what I want to do. Done w/ lein
in a production environment you want to minimize possible failures and differences between servers; introducing lein and dependency resolution into the mix, where that happens on each server independently, is no good
so ideally you resolve dependencies once, and each copy of your server gets those exact dependencies instead of resolving again independently
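For instance (a sketch; myapp.core is a placeholder, and -Spath is tools.deps' flag for printing the computed classpath):
$ clj -Spath > classpath.txt        # resolve deps once, at build time
$ java -cp "$(cat classpath.txt)" clojure.main -m myapp.core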
> the uberjar approach is tried and true, but eventually leads to complaints about the size of artifacts. Is this a real issue for people? We have a large enterprise application with quite a large dependency tree, and the artifact size has never been an issue for us. We depend on some large native libraries (e.g., intel mkl) but those are pre-installed onto the image we deploy to. The jar is always under 100mb which can be downloaded in a couple seconds.
It can be; it depends to some degree on how you are doing builds, how many builds, and how fat the pipe is between where you do builds and where you deploy
At my last job it was just slightly annoying, only one real artifact and fast pipes, but at my current job we have multiple uberjars with a lot of overlap in dependencies (so for a new dep the total jar size grows by dep size * number of uberjars) and a small pipe between builds and deploys
if you're building and deploying (trunk-based dev) on every commit to a distributed bunch of machines for a project with 100 devs that's gunna add up
Ah true. We're only doing 10s of deploys a day at the moment. You're referring to data transfer fees adding up, right?
I'm trying to read a large csv file (~1.1gb) using data.csv. data.csv is supposed to read in data lazily to minimize memory usage. I am seeing the java proc go up to 20gb of memory while reading the csv like below. I'm not sure why it would go up to 20gb if it is processing the csv lazily. Am I doing something wrong here or is this size of memory expected when reading a large csv file?
(with-open [rdr (io/reader "data.csv")]
  (let [rows (csv/read-csv rdr)
        header (nth rows 0)
        data-rows (rest rows)]
    (vec (drop-while (constantly true) data-rows))))
drop-while will walk over the whole sequence, realising it, while data-rows is still retaining the head
yes sorry, the call to vec is realising the lazy seq created by drop-while, which contains the lazy seq from data.csv
that could also just mean they haven't been gc'ed yet; how large is your max heap, and have you actually seen an out of memory error?
jvm will basically use whatever you give it, so by giving it such a large heap, it may delay gc'ing until it has allocated close to 30gb
Forcing a gc does clean up all that. To be clear, vec is realizing the lazy seq but data-rows is not holding on to the entire csv.
keep in mind that as soon as you remove the artificial drop-while argument, your call to vec becomes a liability
This does not OOM:
(with-open [rdr (io/reader "data.csv")]
  (vec (drop-while (constantly true) (csv/read-csv rdr))))
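A fully streaming variant keeps only the aggregate in memory (a sketch; the reducing fn is a stand-in for real per-row work):
(with-open [rdr (io/reader "data.csv")]
  (reduce (fn [n row] (+ n (count row)))   ; e.g. total field count
          0
          (rest (csv/read-csv rdr))))      ; rest skips the header row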
There have been changes to Clojure in the past several years that eliminate some cases of locals clearing being done in cases where it did not used to, to avoid some 'holding onto head' cases similar to this. It isn't clear to me whether this is a case of "should hold onto head" or "locals clearing should prevent that".
Hmm, rereading what I wrote, I meant to say "that add (not eliminate) some cases of locals clearing"
I don't spend as much time looking at the compiler as I have in the past, but I think the way it determines when to clear locals might be the most inscrutable part of it. I have a toy re-implementation of core.async's go macro that does some pretty standard dataflow analysis, which results in always clearing locals right after last use, but is iterative (I think I read somewhere that if you arrange things right the algorithm will be linear, which I have not done).
what the clojure compiler does is supposed to be better in some way (faster? simpler?) and it generates a tree of usages and compares paths in that tree, and it always seems to need tinkering with
public static Object invokeStatic() {
    final Object rdr = ((IFn)user$fn__287.const__0.getRawRoot()).invoke("data.csv");
    Object invoke;
    try {
        final Object rows = ((IFn)user$fn__287.const__1.getRawRoot()).invoke(rdr);
        final Object header = RT.nth(rows, RT.intCast(0L));
        final Object data_rows = ((IFn)user$fn__287.const__4.getRawRoot()).invoke(rows);
        invoke = ((IFn)user$fn__287.const__5.getRawRoot()).invoke(((IFn)user$fn__287.const__6.getRawRoot()).invoke(((IFn)user$fn__287.const__7.getRawRoot()).invoke(Boolean.TRUE), data_rows));
    }
    finally {
        ((Reader)rdr).close();
    }
    return invoke;
}
would locals clearing, if it were happening, look like a setting of those variables to null before one of those steps in invoke?
I dunno exactly what it would look like in whatever you are using to turn the bytecode into Java, maybe something like rows = null; after the last usage
> To make the output clearer, clj-java-decompiler by default disables locals clearing (see https://clojuredocs.org/clojure.core/*compiler-options*) for the code it compiles. You can re-enable it by setting this compiler option to false explicitly, like this:
Object rdr = ((IFn)user$fn__196.const__0.getRawRoot()).invoke("data.csv");
Object invoke2;
try {
    Object rows = ((IFn)user$fn__196.const__0.getRawRoot()).invoke(rdr);
    RT.nth(rows, RT.intCast(0L));
    final IFn fn = (IFn)user$fn__196.const__3.getRawRoot();
    final Object o = rows;
    rows = null;
    Object data_rows = fn.invoke(o);
    final IFn fn2 = (IFn)user$fn__196.const__4.getRawRoot();
    final IFn fn3 = (IFn)user$fn__196.const__5.getRawRoot();
    final Object invoke = ((IFn)user$fn__196.const__6.getRawRoot()).invoke(Boolean.TRUE);
    final Object o2 = data_rows;
    data_rows = null;
    invoke2 = fn2.invoke(fn3.invoke(invoke, o2));
}
finally {
    final Object target = rdr;
    rdr = null;
    Reflector.invokeNoArgInstanceMember(target, "close", false);
}
return invoke2;
is what it looks like if you don't turn off locals clearing (and in fact locals are being cleared)
so my guess is @U083D6HK9 is using some tooling that turns locals clearing off
it is, the bytecode that is emitted loads the value onto the stack (no direct representation in Java source) from the local and then nulls out the local
Typing this in my repl does indeed suggest that is true
*compiler-options*
=> {:disable-locals-clearing true}
no core.match channel, but wondering if anyone knows why core.match regex matching is built on re-matches rather than re-find
re-find is unanchored, re-matches is whole-string (anchored) (off the top of my head)
user=> (re-matches #"foo" "foobar")
nil
user=> (re-find #"foo" "foobar")
"foo"
@dpsutton (a little bit meta: finding re-find in an app called re-find: https://borkdude.github.io/re-find.web/?args=%23%22foo%22%20%22foobar%22&ret=%22foo%22 )