This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-02-27
when it's a long day of reproducing and fixing regressions that were in master and you are feeling a bit punchy
(let [huge-payload (apply str (repeat 100 (System/getProperty "java.class.path")))]
  ...)
I had to take some zeros off the repeat count, the first version crashed the integration test vm with an OOM
Wow. Do you recall how big (count (System/getProperty "java.class.path")) was?
haha - I can go back and check - of course this was a corporate clojure service with a bunch of apache libs, the classpath was massive :D
21031 outside the test runner, the test runner adds more
Not too short 🙂 Certainly repeating 10^6 of those in memory seems likely to be an OOM, but I'm surprised 10^3 would be, unless the process was already very low on memory.
I meant I took zeroes off to end up with what I shared
Understood. Engineer here geeking out on the details, looking for performance problems/fixes 🙂
You could use a promise-chan
And ops on it in a go block
@michael.e.loughlin > I'm looking for Lisp implementations written in Clojure. Maybe sci is another one to look at. https://github.com/borkdude/sci/
It uses edamame to parse Clojure code, which is in turn based on tools.reader.edn
What would the idiomatic way of breaking out of nested structure be in Clojure? As someone who also uses Scheme, my first thought was "continuation", and exceptions are a kind of continuation, but "abusing" an exception as a continuation feels a bit wrong.
Yes, exceptions are usually used for errors. I think simply returning (terminating the expression) is normally used, which means it has to be done at all four levels. I realize that doesn't answer your question.
In Clojure, it is more like,
(when (pred x)
  ;; Do something
  )
If pred becomes false, when returns nil. You can propagate this nil upwards if you wanted to.
@U2APCNHCN would you care to share what your code does? Is it data transformation? Some state changes, or async calls?
It is data transformation so to speak, yes. It's a web crawler. The problem is that one function relies on getting nil as content at some point, or it'll enter an infinite loop. But another now has to figure out if a URL is invalid, and that is where I want to jump out of the stack
Can you structure this problem as multiple pools of workers? Like, one set of URL validators send valid URLs across a channel. Other pool actually fetches the content. I think Thom is trying to say the same thing.
So… it seems like you’re designing some bigger process. Clojure doesn’t tend to jump out of the stack. If passing data from function to function is not enough, core.async is, most often, the way to go to design processes.
I think that would be a fine use of core.async. It’s just multiple queues of work (the list to be scraped, the found content, the list of potentially invalid URLs) you can separate out and which don’t then need to care much about each other.
Well, I also have to be able to fetch pages during crawling, for paginated things for example
The other approach will be to tag the invalid URLs and make the outer function ignore them.
The way it works now is that I have a crawler definition for each crawler that gets read by a macro. And that definition says crawl this webpage using these rules, and when there are multiple pages, here's how to fetch them
That just sounds like an inner loop finding more URLs to scrape, or scraping all pages at once and finding content of interest. Either way the output can just be pushed to a channel.
A crawler is basically a chain of main blocks, and each of these main blocks contain instructions on how to operate on the data
So I think you have a fair few options here. Obviously you can use channels and just return some structured value that indicates it bailed out early. I actually don’t think exceptions would be inappropriate here though, given that you presumably mostly expect crawls to succeed. You can add extra info about errors or URLs encountered (using ex-info or whatever).
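A minimal sketch of the ex-info idea above, assuming a crawler shaped roughly like the one described. The names validate-url, crawl-page and crawl-all are hypothetical, and the fetch is faked; only ex-info and ex-data are real clojure.core functions.

```clojure
;; Hypothetical helpers for illustration only; not the poster's crawler.
(defn validate-url
  "Returns url unchanged, or throws an ex-info describing why it was rejected."
  [url]
  (if (re-matches #"https?://\S+" url)
    url
    (throw (ex-info "Invalid URL" {:type ::invalid-url, :url url}))))

(defn crawl-page
  "Stand-in for real fetching: validates the URL and returns a fake page."
  [url]
  {:url (validate-url url), :content "<html/>"})

(defn crawl-all
  "Crawls each url, collecting pages and the URLs that bailed out early."
  [urls]
  (reduce (fn [acc url]
            (try
              (update acc :pages conj (crawl-page url))
              (catch clojure.lang.ExceptionInfo e
                ;; Only swallow the bail-out we know about; rethrow the rest.
                (if (= ::invalid-url (:type (ex-data e)))
                  (update acc :skipped conj (:url (ex-data e)))
                  (throw e)))))
          {:pages [], :skipped []}
          urls))
```

The catch checks the :type key before handling, so an unrelated ExceptionInfo from deeper in the stack still propagates.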
I know what you mean. There are some threading constructs like some-> and some->> that can help
but I must confess, I have used exceptions to break out. It does feel like abuse. I think the lack of continuations in clojure is a design decision by Rich. He mentions it in his interview with Erik Meijer.
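For reference, a small sketch of how some-> stops at the first nil instead of throwing. The page map and first-link-host are made up for illustration.

```clojure
;; some-> threads the value through each step and returns nil as soon as
;; any step yields nil, so later steps never see a nil argument.
(defn first-link-host
  "Host of the first link on the page, or nil if anything is missing."
  [page]
  (some-> page
          :body
          :links
          first
          (java.net.URI.)
          (.getHost)))

(first-link-host {:body {:links ["https://example.com/a"]}}) ;; => "example.com"
(first-link-host {:body {}})                                 ;; => nil
```

With plain -> the second call would pass nil to the URI constructor and throw.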
IIRC he doesn't say much. Just that he "doesn't believe in them" https://channel9.msdn.com/posts/Expert-to-Expert-Erik-Meijer-and-Rich-Hickey-Clojure-and-Datomic
I think I also heard how the JVM simply doesn't have any kind of support for continuations and that would make it very hard to implement them
If it is "very hard" to implement them then that may be the reason, what with Rich's focus on simplicity.
Dynamic vars are the solution for communicating across a stack. Dnolen has a continuation library. Not exactly idiomatic though.
What would be the need for continuations here though? Is there spectacularly complicated retry logic or something? Surely if there's a DSL you can push the orchestration wherever you like.
Do you think this could be relevant to your problem? It'll only get into Clojure 1.11 but you can just copy the function https://clojure.atlassian.net/browse/CLJ-2555 Also, while the JVM doesn't support TCO, you can cheat with trampoline https://clojuredocs.org/clojure.core/trampoline
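A quick sketch of the trampoline trick mentioned above, using the usual mutual-recursion toy (my-even? and my-odd? are illustrative names). Each function returns a thunk instead of calling the other directly, and trampoline keeps invoking the result until it is no longer a function, so the stack stays flat.

```clojure
(declare my-odd?)

(defn my-even? [n]
  (if (zero? n)
    true
    #(my-odd? (dec n))))   ; return a thunk rather than recursing

(defn my-odd? [n]
  (if (zero? n)
    false
    #(my-even? (dec n))))

;; Direct mutual recursion this deep would likely overflow the stack.
(trampoline my-even? 100000) ;; => true
```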
core.match uses exceptions for control flow also: https://github.com/clojure/core.match/blob/master/src/main/clojure/clojure/core/match.clj#L432-L438
This doesn't apply to most situations but I feel like I should mention it anyway: For performance critical code, using exceptions for control flow is a bad idea because the creation of the stack trace is relatively slow. Most code is not performance critical, so this doesn't apply and you should probably not worry about it. But in the Java world when writing performance critical code, this is a no-no. For situations where the exception is always caught and handled in a restricted scope, a workaround is to throw a static pre-created exception. This is an advanced performance topic and I can't imagine it being done in Clojure except in rare situations, but that is the story from the Java performance perspective.
I thought the stackframes were reasonably cheap and that was from the early java days?
Maybe relatively is a better word. I guess they're expensive in real-time networking or something :)
Yes, relative is the key word. And my experience is in performance tuning for very high throughput operations. But even in Clojure, I wouldn't use exceptions for control flow primitives that might be used in high throughput code.
In fact core.match is using a pre-created exception: https://github.com/clojure/core.match/blob/master/src/main/clojure/clojure/core/match.clj#L75
@UBRMX7MT7 do you know if it is still a perf problem if you have pre-created exceptions?
Sorry, I misread that. The try/catch mechanism is not slow. It is the exception creation that is slow. So with pre-created exceptions it is not a significant performance issue.
But it's important to use the pre-created exceptions in very limited scope. You don't ever want that exception to escape and be caught by something that doesn't understand it and handle it. If you do, the stack trace will be printed and it will be nonsense, which can be very confusing when troubleshooting. In Java I only do this with checked exceptions, to be sure they don't escape to the upper levels. In Clojure you'd just have to be very careful that you have a try/catch at a higher level that can handle it correctly.
I would never throw such an exception from a library API, for example.
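A rough sketch of the pre-created exception technique described above, kept inside one function so it cannot escape. The names stop-signal and first-negative are hypothetical. The stack trace is captured once, when the exception object is constructed, which is why it would be meaningless if it ever leaked.

```clojure
;; Created once at load time; throwing it later allocates nothing new.
(def ^:private stop-signal
  (Exception. "stop-signal (control flow only, must not escape)"))

(defn first-negative
  "Returns the first negative number in a seq of seqs, or nil if none."
  [nested]
  (let [found (volatile! nil)]
    (try
      (doseq [row nested, x row]
        (when (neg? x)
          (vreset! found x)
          (throw stop-signal)))   ; jump out of both loop levels
      nil
      (catch Exception e
        ;; Handle only our own signal; anything else is a real error.
        (if (identical? e stop-signal)
          @found
          (throw e))))))

(first-negative [[1 2] [4 -3 5] [-9]]) ;; => -3
```

The identical? check is what keeps the scope restricted: a genuine exception thrown from the body is rethrown untouched.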
@U3JH98J4R I get the general idea, but more specifically do you know of any monad-like things done in clojure that would work for situations like this? Or any examples that are close?
;; ----------------------------------------------------------------------------
(def return
  (fn [_]
    (throw (UnsupportedOperationException.
            "return function used outside of supporting macro context"))))
;; ----------------------------------------------------------------------------
(defmacro allow-early-return
  "lets you write the given block of code with unnamed early returns
  in the form of a `return` function.
  Because the return mechanism uses Exceptions under the hood,
  catching the generic Exception or Throwable class will lead to undefined
  behaviour if done within a block that may call the return function.
  Ex. (allow-early-return
        (when (> x 10)
          (return false))
        (... some longer computation ...))"
  [& code]
  (let [block-name (gensym)
        ret-name (symbol (str "return-from-" block-name))]
    `(block ~block-name
       (let [~'return (fn [a#] (~ret-name a#))]
         ~@code))))
oh, it uses exceptions.
i'm done using hidden parameters of any kind or tricks, i'll just stick with arguments and return values.
compared to Java, it is just so easy to return a tuple (vector) from a function, or a map if the number of variants is more complex. i'm very happy with that.
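A small sketch of the "just return a map" style described above. parse-port is a made-up example function; the :ok? / :value / :error keys are one possible convention, not a standard.

```clojure
(defn parse-port
  "Returns {:ok? true :value n} or {:ok? false :error kw}; never throws."
  [s]
  (if-not (re-matches #"\d{1,5}" s)
    {:ok? false, :error :not-a-number}
    (let [n (Long/parseLong s)]
      (if (<= 1 n 65535)
        {:ok? true, :value n}
        {:ok? false, :error :out-of-range}))))

;; Callers destructure the result and branch on plain data.
(let [{:keys [ok? value error]} (parse-port "8080")]
  (if ok? value error)) ;; => 8080
```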
Oh, that reminds me of this: https://github.com/jepsen-io/jepsen/blob/87908dd0eb546f7ca167999bc352e6b404e83697/jepsen/src/jepsen/util.clj#L442-L512 No exceptions
or is another way to avoid nesting while creating a "pipeline" of bindings, using the example for letr:
(let [res    (network-call)
      err    (when-not (:ok? res) :failed-network-call)
      people (or err (:people (:body res)))
      err    (or err (when (zero? (count people)) :no-people))
      res2   (or err (network-call-2 people))]
  res2)
I know it's a little strange but to me it's clear.
I'd like some input on this issue: is try without a catch (and/or finally) probably a silly mistake or are there legit uses of this? https://github.com/borkdude/clj-kondo/issues/773
no legit use I'm aware of
it is an implicit do, but do would be preferred in that case
Given java doesn't allow it, my only other idea is to read the source and see what clojure generates.
try itself doesn't really result in anything - it's the catch + finally that create exception handlers etc
there are some impacts on like recur in try, but removing the try only opens possibilities
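To make the "implicit do" point concrete, a tiny sketch (with-try and with-do are illustrative names): a try with no catch or finally just evaluates its body in order and returns the last value, same as do.

```clojure
(def with-try (try (+ 1 1) (* 2 21)))
(def with-do  (do  (+ 1 1) (* 2 21)))

(= with-try with-do) ;; => true, both are 42
```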
Well, that seems pretty much impossible to track down as to where I am supposed to catch that
It seems like the message comes from deep within glibc because of some possibly incorrect usage of snprintf. And seems like you cannot catch it or even suppress it.
Sounds like some underlying dependency has this issue, maybe JVM itself.
Hi, i’m trying to use create-ns to make ns’es like foo.bar mainly so that i can use them with namespaced keywords like ::fb/baz without creating actual files. Not sure of the best way to create aliases for them. require/use :as .. don’t work as those guys are trying to load libs. do I have to use (alias ) explicitly or is there a way to do this via the ns macro
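A minimal sketch of the explicit create-ns plus alias route asked about above, using the foo.bar and fb names from the question. read-string is used here only so the snippet does not depend on the order in which forms are read; in a file or at the REPL, ::fb/baz can be written directly in any top-level form that comes after the alias call.

```clojure
;; Create the namespace in memory (no file needed) and alias it in *ns*.
(create-ns 'foo.bar)
(alias 'fb 'foo.bar)

;; The reader resolves ::fb/... through the current namespace's aliases.
(read-string "::fb/baz") ;; => :foo.bar/baz
```

As far as I know the ns macro had no option for this at the time of the conversation; Clojure 1.11 later added :as-alias to require, which aliases a namespace without loading it.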
Found this: https://stackoverflow.com/questions/3779278/problem-with-n-n-in-writable-segment-detected-c-i-qt You don't embed values into your SQL queries by hand, do you? :)
No, I don't do that, @p-himik, I use HugSQL and next.jdbc. @U11BV7MTK no stack trace at all
Does it print that message and the code continues to run, or does it abort the computation?
Any native libraries in your class path? I do not know much about them, but if you mention it is on the way to the database, then likely you are using some JDBC driver library, which might be partly implemented in Java, and partly in native code?
With enough logging in your code, with appropriate flushing so that all of them get flushed out to a log file/service, and not lost in an in-JVM-process buffer when the JVM crashes, you could probably isolate it to which database call is the cause.
I just managed to reproduce the crash in a simple test app, dump the core, and debug it. I couldn't get Java symbols within GDB (please tell me if that's possible), but at least I can see the first C function that Java calls that ends up crashing the app.
TBH I went with this one far beyond my comfort zone. :) I've never even used GDB before.
I'll have to look tomorrow what ulimit -c would output. No native libraries directly, but I am using the PostgreSQL JDBC library
Is it something you can reproduce 100% of the time, e.g. running a particular test? Or is it some "happens after weeks of running in production" kinds of things?
@U2APCNHCN Here's a good workflow for tomorrow, as far as I can tell. Sorry if it's too verbose - I just did it myself for the first time, and I have no idea what you might know already.
1. Run ulimit -c and see if it outputs 0
2. If it does not, search online for where core dumps are located on your particular system - there should be one
3. If it is '0', run ulimit -c unlimited - it will only work within current shell, so stay in this shell
4. Run your application within that shell
5. Reproduce the situation where the core is dumped
6. Run gdb <path-to-java> <path-to-core-dump-file>
7. Within gdb, run where
At this point, you should see the native stack trace with a bunch of ?? after it. The question marks are Java calls, it's hard to get them to be displayed. But the native stack trace should at least give you an idea where to look further.
Hope it helps.
ulimit already was unlimited, and there are no core dumps, even though it now claims to write one (or is it in some special location?)
The other place I have an error that causes the VM to segfault, it creates a core dump in the same directory I am in when I call the program.
Oh, OK. This one is strange. Does the first process have writing permission to the CWD?
Yes. It is the same program in both cases, just the input it gets is different (it's a web crawler, and I give it pages to crawl). All other pages work. But my output from one page causes this "%n in writable segment detected" and makes the whole thing stop, and one other page causes a SIGSEGV.
Hmm. So in your particular case, the error somehow does not generate the core dump, but it does on my end. Albeit with a different program.
Just to check yet another thing. What does the /proc/sys/kernel/core_pattern file contain? If you have it at all.
In any case, since you know what input generates the error, can you try to find the %n string anywhere in that input or in anything that that input can produce?
One alternative - rename that file temporarily. It may be the case that systemd-coredump does not write the dump to disk in that particular case for some reason.
No, but me and my coworker looked at the coredump file of the other error. We threw out runejuhl/clj-journal and that resolved the problem.
Seems very relevant: https://github.com/runejuhl/clj-journal/blob/master/src/clj_journal/log.clj#L7-L11
So maybe you can plug that library back in, set *strict* to true, and find out the stacktrace and where the bad values come from.
I have a clojure.data.xml question. Is there a way to forgo xml namespace prefixes when emitting xml? I have a situation where the xmlns has to be present, but the remote parser does not support prefixes. The only way I have found is to (with-redefs [
you can set a default ns at the root - have you tried that?
if all of the elements are also in that ns, then I think the emitter should emit them w/o prefix
are all of your elements in the same ns?
if not, then I think the answer to your first question is no - you're asking to make an invalid document
if yes, then set the default ns and put all your elements in that ns and the emitter shouldn't need to prefix them
They are all in the same ns. It's emitting bare tags, but the xmlns attribute is still getting the prefix.
so something like "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a xmlns=\"my:ns\"><b>whoa</b></a>"
works, but it's outputting "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a xmlns:a=\"my:ns\"><b>whoa</b></a>"
(with the xmlns:a=)