This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-12-17
Channels
- # adventofcode (25)
- # announcements (2)
- # babashka (16)
- # babashka-sci-dev (16)
- # beginners (213)
- # calva (15)
- # clj-kondo (126)
- # clj-on-windows (1)
- # cljdoc (5)
- # cljfx (1)
- # cljs-dev (6)
- # clojure (230)
- # clojure-europe (38)
- # clojure-nl (3)
- # clojure-uk (3)
- # conjure (10)
- # core-async (15)
- # cursive (33)
- # fulcro (58)
- # hyperfiddle (4)
- # jobs-discuss (1)
- # kaocha (5)
- # lsp (46)
- # meander (3)
- # off-topic (30)
- # polylith (10)
- # portal (9)
- # re-frame (5)
- # reitit (7)
- # releases (2)
- # ring (17)
- # sci (8)
- # shadow-cljs (6)
- # specter (1)
- # sql (1)
- # testing (9)
- # tools-deps (4)
- # vim (12)
Hoping to find a sound way to use apache thrift with Clojure. Anyone have experience with Thrift in Clojure recently?
Thrift outside of FB isn’t exactly super well supported in general. What is your use case?
and my first guess would be that the best approach would be to use the java tools for this and just get data out of results with the appropriate methods
I'm really confused on the order of execution in comp when you have both transducers and functions in the comp
I've relied on the intuition that functions execute last to first and transducers execute first to last in comp
but mixing them means I need to actually understand why the case is reversed for transducers
It isn't that the functions aren't applied from right to left, they are; but those functions aren't the pipeline itself, they're transformations that get resolved when supplied a reducing function
I didn't see the workshop, no
https://github.com/bsless/2021-nov-transducers-workshop -- not sure if there's a video too?
thanks, sean
I can't find a video link for it.
that's alright, I can look through the presentation
Rich Hickey's talk introducing transducers also touches on this (linked at the proper timestamp): https://youtu.be/6mTbuzafcII?t=1530
The key thing to understand is that composition works the same way you expect, and the first transducer you list is the outermost wrapper and sits on top of the call stack – but values traverse the transducer stack from the bottom, being generated by the step function and then successively transformed by each transducer as they get returned up the stack.
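A minimal REPL sketch of that ordering (not from the thread above, just a generic illustration):

```clojure
;; Plain function comp: rightmost runs first.
((comp inc #(* 2 %)) 5)
;; => 11  (double first, then inc)

;; Transducer comp: the first transducer wraps outermost, so elements
;; hit it first -- (map inc) runs before (filter even?).
(into [] (comp (map inc) (filter even?)) (range 5))
;; => [2 4]  (0..4 -> inc -> 1 2 3 4 5 -> keep evens -> 2 4)
```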
can you give an example? I've never used comp where transducers were interspersed with other functions so struggling to understand a motivating example
(->> (re-seq #"\w+" line)
(map set)
(partition-all 10))
how would you make that a comp?
the answer is
(comp #(partition-all 10 %)
#(map set %)
#(re-seq #"\w+" %))
but that's not using it as transducers
I don't think it's possible?
re-seq gives you the collection you want to transduce; that is not part of the comp
you'd use something like (into [] xf (re-seq #"\w+" line)) where xf is the result of the comp
right, you're saying you can't combine them
(map (comp #(partition-all 10 %)
#(map set %)
#(re-seq #"\w+" %))
vec-of-strings)
(->> coll
(map set)
(partition-all 10))
That's what you are starting with...
(into []
      (comp (map set)
            (partition-all 10))
      coll)
That's what that becomes.
yes, assuming I have already run the re-seq on the string
and that I write it using transducers like that
but my question was really how to combine them with the re-seq in comp
and it doesn't sound like that's possible
and I'm probably missing something in my understanding that would make that obvious
Right, I'm just trying to draw the distinction between "I do something to produce a collection" and "I do a sequence of operations on a collection"
right, but one of the sequence of operations is to split it
it's part of a pipeline
but it's not really how transducers link up
so meh
If you had a collection of lines and you were mapcat'ing a word-splitter over that collection, it would go in the comp.
I'm really tired
I should just go to bed and look at this in the morning when it'll probably be obvious to me
thanks, y'all
user=> (def lines ["the quick brown fox" "jumped over the lazy dog"])
#'user/lines
user=> (into [] (comp (mapcat #(re-seq #"\w+" %))
#_=> (map set)
#_=> (partition-all 10))
#_=> lines)
[[#{\e \h \t} #{\c \i \k \q \u} #{\b \n \o \r \w} #{\f \o \x} #{\d \e \j \m \p \u} #{\e \o \r \v} #{\e \h \t} #{\a \l \y \z} #{\d \g \o}]]
oh, nice
I got reminded that I once spent a lot of time trying to sieve prime numbers using some different Clojure options. Submitted my fastest solution to a repository where they collect sieves made in many languages. https://github.com/PlummersSoftwareLLC/Primes/blob/drag-race/PrimeClojure/solution_3/sieve.clj This also reminds me that I am puzzled by the tiny gains I harvest from using futures to parallelize things. For a sieve of all primes up to 1 million, I only gain some 20%. Thinking maybe I am doing something wrong. I actually don’t see much activity over the cores of my CPU when running the benchmark.
Here’s the leaderboard, btw. https://plummerssoftwarellc.github.io/PrimeView/report?id=davepl-1639302015.json&hi=False&hf=False&hp=True&fi=&fp=&fa=&ff=&fb=&tp=True&sc=ps&sd=True
One super puzzling thing there is with the top three solutions. They are in the range of 10,000 X faster than the rest of the pack. The fastest solution is in Common Lisp, and it is 5000 times faster than the fastest Rust solution. (But I’m derailing my own question, please ignore this tangent 😄).
but... you're not actually parallelizing the sieving? just the extraction of the primes from the array?
I bet what is dominating the runtime is constructing the result vector & the intermediate mapcat list in
(into [2]
(mapcat deref)
futures)
the rules seem to say
> • Your benchmarked code returns either a list of primes or the is_prime array, containing the result of the sieve.
so why not just return the array then?
I don't think there's anything to be gained by clojure-specific stuff here. The optimum on JVM will be a tight loop over an array.
maybe I just can't read the results, but I don't see clojure solution #3 in the results, only #2
Yeah, the array building (not parallel) is all the work of the sieve; all that faffing about with futures is nonsensical
Right, so when I was experimenting with this getting the values out of the sieve was actually taking up a significant part of the time. It depended a bit on which storage I had. But, yes, it is still a very little work per access, so maybe that’s why the futures gain me so little?
yeah, you're effectively parallelizing the agets, which are cheap, while not parallelizing the construction of the result vector and adding some intermediate transients
but the fastest way to do something is to not do it at all, so I'd just skip the result vector creation
I'm probably sounding pretty harsh, sorry for that, I'm grumpy today for some reason
The reason the common lisp one is at the top is it does the whole thing at read or macro expansion time
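The same trick is possible in Clojure; a hedged sketch of moving the work to macroexpansion time (primes-upto is a made-up macro name, and this naive trial division is just for illustration):

```clojure
;; The sieving work happens while the macro expands (effectively at
;; compile time); the expansion is a literal vector, so the runtime
;; "cost" measured by a benchmark harness would be essentially zero.
(defmacro primes-upto [n]
  (vec (filter (fn [p]
                 (and (> p 1)
                      (not-any? #(zero? (rem p %)) (range 2 p))))
               (range (inc n)))))

(primes-upto 10)
;; => [2 3 5 7]
```

Which is exactly why the contribution rules try (somewhat vaguely) to forbid it.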
yeah the benchmark competition feels pretty pointless, but the journey can be educational
I dunno, I mean, the common lisp impl does it and sits at the top of the board, it may be fine with the rules
My solution was not written for this “competition”. It was just something I spent time on some year ago. But I will test with returning the array and see where that lands me.
https://github.com/PlummersSoftwareLLC/Primes/blob/drag-race/CONTRIBUTING.md#faithfulness
it's kinda implying you need to actually do the computation at runtime, but not really stating it
> So, to summarize: mayerrobert-cl-hashdot does not use compile-time evaluation, the evaluation happens before compile-time.
I don’t think it makes sense contributing a solution that just moves the work to when the clock is not running.
I haven’t run the benchmark but I think my solution would rank somewhere around 34-35 in this list https://plummerssoftwarellc.github.io/PrimeView/report?id=davepl-1639302015.json&hi=False&hf=False&hp=True&fi=&fp=&fa=&ff=uf&fb=&tp=True&sc=ps&sd=True
And Clojure solution 2 is not realizing the lazy sequences, so I don’t think we can really count that one.
but yeah a doall in the sieve function would be more honest, but probably wouldn't affect the timing that much
If you compare to the runner in solution 3 you’ll see the difference (not that it is needed in my solution, but anyway).
If I just return primes, it runs 2-3X slower.
(defn sieve [^long n]
(let [primes (boolean-array (inc n) true)
sqrt-n (int (Math/ceil (Math/sqrt n)))]
(if (< n 2)
'()
(loop [p 3]
(if (< sqrt-n p)
primes
(do
(when (aget primes p)
(loop [i (* p p)]
(when (<= i n)
(aset primes i false)
(recur (+ i p p)))))
(recur (+ p 2))))))))
I must revisit this during the weekend. 😃
You mean what you just pasted is slower than what's in the first message of this thread? I wonder if counting the primes could be the bottleneck then. But that's outside the sieve function of course, so I'm wondering if you even really need to do it... :)
I guess the counting could be parallelized if needed. Is your implementation now loop-aget-inc?
Whatever, it's a silly game with ill-defined rules. The tight array loop will be the same speed in pretty much all languages, but the varying ceremonies around it might cost a lot or nothing at all.
see also https://benchmarksgame-team.pages.debian.net/benchmarksgame/index.html (edit: fixed link)
Clojure used to be published in the computer language benchmarks game, and I and Alex Miller and a few others spent some time tweaking Clojure programs for it to get it close to Java. Beating the Java solution is pretty unlikely, unless the Java solution did something inefficiently, but there are enough players that the Java solutions were not leaving spare cycles sitting around.
And yes, there were discussions about whether the solutions for some problems were following the rules or not, although I don't recall any that precomputed the answer at compile time 🙂
@US1LTFF6D Those measurements were with criterium
(quick-bench (doall (sieve 1000000)))
So no counting. Results:
clj꞉sieve꞉>
#'sieve/sieve
clj꞉sieve꞉>
Evaluation count : 54 in 6 samples of 9 calls.
Execution time mean : 13.230384 ms
Execution time std-deviation : 559.934025 µs
Execution time lower quantile : 12.437373 ms ( 2.5%)
Execution time upper quantile : 13.805512 ms (97.5%)
Overhead used : 6.866165 ns
nil
clj꞉sieve꞉>
#'sieve/sieve
clj꞉sieve꞉>
Evaluation count : 168 in 6 samples of 28 calls.
Execution time mean : 4.281483 ms
Execution time std-deviation : 218.415891 µs
Execution time lower quantile : 4.055185 ms ( 2.5%)
Execution time upper quantile : 4.511662 ms (97.5%)
Overhead used : 6.866165 ns
nil
clj꞉sieve꞉>
(The first one is from just returning the boolean-array, the second from doing all those silly futures…)
@U0CMVHBL2 thanks for sharing! This particular leaderboard is pretty young. It is a followup on this video: https://www.youtube.com/watch?v=D3h62rgewZM Which hasn’t been around for very long. Mar 25 2021.
no need for that since you're returning a vector or an array, so there's no laziness
but of course the doall for the boolean array is slower because it has more elements to walk through
yeah I guess doall could just short-circuit on non-LazySeq arguments, but it doesn't
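That short-circuit is easy to sketch as a user-level helper (doall* is a made-up name, not a core function):

```clojure
;; Only walk the argument when it is actually a lazy seq; hand anything
;; else (vectors, arrays, ...) straight back without touching it.
(defn doall*
  [xs]
  (if (instance? clojure.lang.LazySeq xs)
    (doall xs)
    xs))

(doall* (map inc [1 2 3])) ; realizes the lazy seq => (2 3 4)
(doall* [1 2 3])           ; returned untouched, no walk
```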
Just returning the array:
Execution time mean : 1.119677 ms
Returning the sieve
Execution time mean : 3.050549 ms
If I recall correctly I got faster results when updating a BitSet, but it was slower to collect the results. So just maybe I can shave off some little more within the rules of being allowed to skip collecting.
with spec is there a succinct way to say a map needs both ::a and ::b present, or none of them? I.e. including just ::a would be invalid.
Do you have the option of bundling ::a and ::b into a composite value and making that optional?
no, porting a legacy system and need to be api compatible. that’s definitely the way to go if this was not the case. pretty happy with what I ended up with though. thanks for the input!
> or none of them
spec will resist this, in general, because it is intended to describe open maps
but you can probably do it with a top-level s/or and an arbitrary predicate asserting that neither one is in there
you mean make two specs, ::without and ::with-a-and-b, and or those? that's my fallback solution if I can't come up with anything cleaner. Ideally I would put them in a nested map, but I can't break the api.
spec is Turing-complete, if you write your own predicate functions that return true for things you consider correct and false for things you don't.
No, I mean
(s/or (s/keys :req [::a ::b])
      #(not (or (contains? % ::a)
                (contains? % ::b))))
aha, because the conformed value is passed to the lambda, i see. Still learning spec, thanks!
Hm, would the conformed value be passed there? I wouldn't expect that, but surprise conformations still catch me sometimes. I didn't test what I just wrote, just giving the general idea.
no wait, that’s with s/and. these are just separate functions
(s/and (s/keys :req-un [::x] :opt-un [::a ::b])
#(= (contains? % :a)
(contains? % :b)))
this works. thanks!
s/valid? checks the conformed value which might not be what you want in the case of s/or
But that's not the correct syntax for s/or anyway @dorandraco. You need to have the names for every path as well
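For reference, a sketch of the same idea with the tagged s/or paths filled in (the spec names here are made up for illustration):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::a int?)
(s/def ::b int?)

;; s/or requires a keyword name for each alternative branch.
(s/def ::both-or-neither
  (s/or :with-both    (s/keys :req [::a ::b])
        :with-neither (s/and map?
                             #(not (or (contains? % ::a)
                                       (contains? % ::b))))))

(s/valid? ::both-or-neither {::a 1 ::b 2}) ; => true
(s/valid? ::both-or-neither {})            ; => true
(s/valid? ::both-or-neither {::a 1})       ; => false (only one of the pair)
```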
Using cursive for a bit in Intellij… is there a way to get C-c C-c and C-c C-e to work like in Emacs (for sending form under point and before point to REPL, respectively)?
there’s a #cursive channel. I know those are configurable key commands so just need to figure out where that is
Cool… I’ll ask there. 🙂
It can cause problems when used for a library by forcing users to use timbre as their logging implementation as well.
Or at least that seemed to be true last time I used it for a library. I've since switched that project to clojure.tools.logging
It's also hard to get line information and similar from clojure.tools.logging when it's routed through timbre.
We didn't like how many other taoensso libs it brought in, and they all seemed to be constantly updating. I think there were other concerns for us as well. Maybe @hiredman has opinions on it too?
We'd originally started out with c.t.l but wanted more configurability (or so we thought) but after switching to timbre and using it for a while, we found we were not leveraging those extra features, so we switched back to c.t.l (and log4j2).
Timbre has a bridge for slf4j, I think? And I think you have to bridge other logging libs to slf4j in order to control them all with timbre? Can't remember details any more.
There's also a lot to be said for simply using "standard Java approaches" for stuff that cross-cuts all the (Java) libs you depend on.
the slf4j bridge is, if I recall, written in clojure and distributed as an aot compiled library
Yeah https://github.com/fzakaria/slf4j-timbre/blob/master/project.clj#L20-L26 -- it tries to only put its compiled code into the JAR but that still means binary compatibility could be an issue across Clojure versions?
if you do want to use timbre, https://github.com/stuartsierra/stacktrace.raw is nice to have in your back pocket
The main benefit i want is 1. be out of the log4j game and 2. be able to log maps and move away from string-ey logs
we had an unrelated outage and very quickly discovered our logs aren’t the most pleasant to search through atm
> We’d originally started out with c.t.l but wanted more configurability (or so we thought) but after switching to timbre and using it for a while, we found we were not leveraging those extra features, so we switched back to c.t.l (and log4j2). Was the switch back driven by any particular experience, or just realizing that it wasn’t giving a value add
No real value add and we try to keep dependencies small/focused. That's also what has led us to move from Cheshire to data.json and from clj-http to http-kit (initially) and then to Hato (a wrapper for Java's HttpClient).
We've made several library switches over the years in a quest to prune back dependencies (and reduce deployment artifact size).
We switched from Redisson to Jedis (and wrote our own connection pool code) to get away from the mass of dependencies (and bloat) that Redisson brought in.
i dont think our org is there yet, and i’ll have to look at its full tree to make a call
> clojure -X:deps list :aliases '[:dev :everything]' | fgrep -v /Developer | wc -l
232
(the fgrep -v removes all the :local/root deps within our monorepo, so that's 232 external libraries)
Adding test deps is another 30 libs. Adding build is another 55. Adding poly is another 50. But the 232 above is just production code.
for structured logging, there are benefits to using json - there's a lot of tooling for parsing json logs
im not against mulog, and originally that is what i wanted to do, but i couldn’t get it figured out and kept getting deadlocks locally so i kinda passed it over
In my projects, I go for clojure tools logging hooked into logback. Works well for me 🙂
i’m hoping a log4j-lite comes out that is api compatible with log4j and with no jndi and no expansion. just simple logging
If environment values need to be logged, then the code that calls the logger should be getting those vars and supplying them to the logger. It’s a better approach from a composability point of view, and any security issues are the responsibility of the local system. Most people never need to include ${} in their logs. I believe that putting this “convenience” in there for everyone was a mistake in general, but the latest security nightmare would back me up on this point.
also, people put way too much trust in environment variables - there are simple accessible ways to access the list of all key value pairs in the environment for every mainstream language (I think it's a cargo cult thing that stems from the data at rest / data in motion distinction which is also overrated...)
I have such a library @U11BV7MTK . Hoping to open source it
that would be amazing @U050ECB92.
https://github.com/pedestal/pedestal/tree/master/log might be worth a look as well
and, not a complete logging library by any means, I've been using something like https://gist.github.com/hiredman/64bc7ee3e89dbdb3bb2d92c6bddf1ff6 in personal projects
I find it hard to believe that I am very close to hating Java logging for 25 years
I was just about to say, this can only amplify the logging mess on the JVM
You know, how the entire OpenSSL debacle from a while back spawned all sorts of alternatives
brb, claiming domain /s
you would think that logging is a solved problem in java after x years but i guess templating can screw everything up
BTW, is there a comprehensive solution that would: • Log structured data while preserving as much as possible (via EDN or transit or maybe something) • Work with Clojure • Work with ClojureScript • Have a UI with good filtering capabilities (from mundane date ranges and full text search to structured search of the logged data) ?
But they satisfy only the first 3 items, right? Or do they also include logged data viewers?
i don’t think any logging library does that - and the tools that do that seem to like looking at lines of json
Well, I'm not speaking strictly of libraries. It can be a SaaS or something else. But yeah, I haven't seen one either, and JSON is... lackluster, to say the least.
@U065JNAN8 What was your “production” mulog setup, if i can ask?
It was pretty simple. Just included mulog and mulog-cloudwatch in the project.clj and configured as described here https://github.com/BrunoBonacci/mulog/blob/master/doc/publishers/cloudwatch-logs-publisher.md
I currently have a basic implementation. However, each time I send a message, a new connection is made. Is that something I should expect? I only want my connection to be printed once, so any ideas how to do that? Here's my code (defn listener
[port]
(println "Listening on port" port)
(with-open [server-socket (ServerSocket. port)
socket (.accept server-socket)]
(doseq [msg (line-seq (io/reader socket))]
(println "New connection from" socket)
(println msg))))
My connections look like this New connection from #object[java.net.Socket 0x2973e18 Socket[addr=/0:0:0:0:0:0:0:1,port=37194,localport=8383]]
that creates a server socket, accepts one connection, prints out, then closes the connection and server socket
so the only way you can get multiple prints from that is if you have a loop somewhere else that restarts the server socket after you close it
it almost doesn't matter in what language because the structure of socket programs is basically always the same
create a server socket, accept connections, hand off the connection to something that does something with it (often another thread) loop back to accepting connections
https://docs.oracle.com/javase/tutorial/networking/sockets/clientServer.html might be a good read
@hiredman Sorry if I misunderstood the reading - is this fine? I'm not sure if threads are necessary as the socket closes (defn listener
[port]
(println "Listening on port" port)
(with-open [server-socket (ServerSocket. port)
socket (.accept server-socket)]
(while true
(println "New connection from" socket)
(doseq [msg (receive socket)]
(println msg)))))
that will loop forever reading from the one connection (until an exception is thrown) and cannot handle multiple connections
(defn listener
[port]
(let [running (atom true)]
(println "Listening on port" port)
(future
(with-open [server-socket (ServerSocket. port)]
(while (deref running)
(with-open [socket (.accept server-socket)]
(println "New connection from" socket)
(doseq [msg (receive socket)]
(println msg))))))))
and if handle connection doesn't hand the connection off to a thread (if you aren't using nio) then you will only ever handle one connection at a time
I got this, hopefully it does what you described (defn listener
[port]
(let [running (atom true)]
(println "Listening on port" port)
(with-open [server-socket (ServerSocket. port)]
(while (deref running)
(future
(with-open [socket (.accept server-socket)]
(println "New connection from" socket)
(doseq [msg (receive socket)]
(println msg))))))))
(defn listener
[port]
(let [running (atom true)]
(println "Listening on port" port)
(future
(with-open [server-socket (ServerSocket. port)]
(while (deref running)
(future (with-open [socket (.accept server-socket)]
(println "New connection from" (.. socket getInetAddress getHostAddress))
(doseq [msg (receive socket)]
(println msg)))))))))
That's what I thought, doing so makes it stop listening to the port, for some reason
It seems to me that logging in most programming languages suffers from the "if there isn't one clear way that everyone agrees on early enough across most/all libraries, then it is easy enough to roll your own that N independent implementations are created". I know this happens in C. Perhaps logging is an example of a feature that suffers from something like "the Lisp curse" (except it happens repeatedly across many programming languages)? http://www.winestockwebdesign.com/Essays/Lisp_Curse.html
Any recommendations for a websocket client? I've found https://github.com/cch1/http.async.client but want to check if anybody has strong feelings about the alternatives.
If you're on jdk 11+ there is built-in jdk support. I've heard good things about https://github.com/gnarroway/hato as a wrapper
I am not sure what the status of that one in particular is on more recent jdks, but we got bitten by it a few times before we had to disable http2 on the client.