Samuel Rose03:12:32

Hoping to find a sound way to use apache thrift with Clojure. Anyone have experience with Thrift in Clojure recently?


Thrift outside of FB isn’t exactly super well supported in general. What is your use case?


and my first guess would be that the best approach would be to use the java tools for this and just get data out of results with the appropriate methods


maybe in a prep required local dep if you are in the deps.edn world

Cora (she/her)05:12:58

I'm really confused on the order of execution in comp when you have both transducers and functions in the comp

Cora (she/her)05:12:36

I've relied on the intuition that functions execute last to first and transducers execute first to last in comp

Cora (she/her)05:12:59

but mixing them means I need to actually understand why the case is reversed for transducers

Ben Sless06:12:19

Shameless plug but have you watched my transducers workshop from last month?

Ben Sless06:12:52

Tldr transducers transform reducing functions by wrapping them

Ben Sless06:12:13

The outermost wrap is the one you'll go through first

Ben Sless06:12:09

It isn't that the functions aren't applied from right to left; they are. But those functions aren't the pipeline, they're transformations to be resolved when supplied a reducing function

Ben Sless06:12:31

It's like building up a stack. Last in, first out

Cora (she/her)07:12:30

I didn't see the workshop, no


I can't find a video link for it.

Cora (she/her)07:12:14

that's alright, I can look through the presentation 🙂

Ben Sless07:12:01

Maybe the workshop videos haven't been made public yet

Colin P. Hill12:12:25

Rich Hickey's talk introducing transducers also touches on this (linked at the proper timestamp):

Colin P. Hill13:12:03

The key thing to understand is that composition works the same way you expect, and the first transducer you list is the outermost wrapper and sits on top of the call stack – but values traverse the transducer stack from the bottom, being generated by the step function and then successively transformed by each transducer as they get returned up the stack.
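A quick REPL sketch (not from the talk, just an illustration): instrumenting a map and a filter transducer with println shows that values hit the first-listed transducer first, even though comp itself composes right to left.

```clojure
(def xf
  (comp (map    (fn [x] (println "map sees" x) (inc x)))
        (filter (fn [x] (println "filter sees" x) (odd? x)))))

;; map (listed first = outermost wrapper) sees each input before
;; filter sees the mapped result:
(into [] xf [1 2 3])
;; prints: map sees 1, filter sees 2, map sees 2, filter sees 3, ...
;; => [3]
```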


can you give an example? I've never used comp where transducers were interspersed with other functions so struggling to understand a motivating example

Cora (she/her)07:12:31

(->> (re-seq #"\w+" line)
     (map set)
     (partition-all 10))

Cora (she/her)07:12:37

how would you make that a comp?

Cora (she/her)07:12:19

the answer is

(comp #(partition-all 10 %)
      #(map set %)
      #(re-seq #"\w+" %))

Cora (she/her)07:12:27

but that's not using it as transducers

Cora (she/her)07:12:35

I don't think it's possible?


re-seq gives you the collection you want to transduce over; that part is not in the comp


you'd use something like (into [] xf (re-seq #"\w+" line)) where xf is the result of the comp


they execute in an order similar to ->> in that case


(comp (map set) (partition-all 10))?

Cora (she/her)07:12:48

right, you're saying you can't combine them

Cora (she/her)07:12:18

(map (comp #(partition-all 10 %)
           #(map set %)
           #(re-seq #"\w+" %)))


(->> coll
    (map set)
    (partition-all 10))
That's what you are starting with...


(into []
    (comp (map set)
          (partition-all 10))
    coll)
That's what that becomes.

Cora (she/her)07:12:03

yes, assuming I have already run the re-seq on the string

Cora (she/her)07:12:11

and that I write it using transducers like that

Cora (she/her)07:12:26

but my question was really how to combine them with the re-seq in comp

Cora (she/her)07:12:40

and it doesn't sound like that's possible

Cora (she/her)07:12:52

and I'm probably missing something in my understanding that would make that obvious


Right, I'm just trying to draw the distinction between "I do something to produce a collection" and "I do a sequence of operations on a collection"

Cora (she/her)07:12:45

right, but one of the sequence of operations is to split it

Cora (she/her)07:12:52

it's part of a pipeline

Cora (she/her)07:12:01

but it's not really how transducers link up


If you had a collection of lines and you were mapcat'ing a word-splitter over that collection, it would go in the comp.

Cora (she/her)07:12:41

I'm really tired

Cora (she/her)07:12:55

I should just go to bed and look at this in the morning when it'll probably be obvious to me


user=> (def lines ["the quick brown fox" "jumped over the lazy dog"])
user=> (into [] (comp (mapcat #(re-seq #"\w+" %))
  #_=>                (map set)
  #_=>                (partition-all 10))
  #_=>       lines)
[[#{\e \h \t} #{\c \i \k \q \u} #{\b \n \o \r \w} #{\f \o \x} #{\d \e \j \m \p \u} #{\e \o \r \v} #{\e \h \t} #{\a \l \y \z} #{\d \g \o}]]


I was reminded that I once spent a lot of time trying to sieve prime numbers using a few different Clojure approaches. I submitted my fastest solution to a repository where they collect sieves made in many languages. This also reminds me that I am puzzled by the tiny gains I harvest from using futures to parallelize things. For a sieve of all primes up to 1 million, I only gain some 20%. Maybe I am doing something wrong; I actually don't see much activity across the cores of my CPU when running the benchmark.


One super puzzling thing there is with the top three solutions. They are in the range of 10,000 X faster than the rest of the pack. The fastest solution is in Common Lisp, and it is 5000 times faster than the fastest Rust solution. (But I’m derailing my own question, please ignore this tangent 😄).


but... you're not actually parallelizing the sieving? just the extraction of the primes from the array?


I bet what is dominating the runtime is constructing the result vector & the intermediate mapcat list in

(into [2]
                  (mapcat deref)


shouldn't you just return the number of primes from sieve?


There isn't an intermediate mapcat list


oh right yeah sorry


the rules seem to say
> • Your benchmarked code returns either a list of primes or the is_prime array, containing the result of the sieve.
so why not just return the array then?


I don't think there's anything to be gained by clojure-specific stuff here. The optimum on JVM will be a tight loop over an array.


maybe I just can't read the results, but I don't see clojure solution #3 in the results, only #2


Yeah, the array building (not parallel) is all the work of the sieve; all that faffing about with futures is nonsensical


The benchmark was run some days ago.


Right, so when I was experimenting with this, getting the values out of the sieve was actually taking up a significant part of the time. It depended a bit on which storage I had. But yes, it is still very little work per access, so maybe that's why the futures gain me so little?


yeah, you're effectively parallelizing the agets, which are cheap, while not parallelizing the construction of the result vector and adding some intermediate transients


but the fastest way to do something is to not do it at all, so I'd just skip the result vector creation
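For instance, a sketch of that idea (count-primes is a made-up helper name here, assuming the boolean-array produced by the sieve): counting directly over the array avoids building any result collection at all.

```clojure
;; Hypothetical helper: count the true entries in the sieve's boolean
;; array with a primitive loop, never materializing a vector of primes.
(defn count-primes ^long [^booleans is-prime ^long n]
  (loop [i 2, c 0]
    (if (> i n)
      c
      (recur (inc i) (if (aget is-prime i) (inc c) c)))))
```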


I'm probably sounding pretty harsh, sorry for that, I'm grumpy today for some reason


No worries!


The reason the common lisp one is at the top is it does the whole thing at read or macro expansion time

😜 1

Which isn't counted in the execution time


Which is just silly


of course, those wily lispers


The fastest c++ does constant folding, similar kind of deal


So if I moved my sieve to a macro I could achieve something similar?


if you don't care about the rules, sure


I am not considering doing it. 😃


I’m just here to learn.


yeah the benchmark competition feels pretty pointless, but the journey can be educational


I dunno, I mean, the common lisp impl does it and sits at the top of the board, it may be fine with the rules


The readme for the CL program even explains it


My solution was not written for this “competition”. It was just something I spent time on some year ago. But I will test with returning the array and see where that lands me.


the rules seem pretty ripe for rule lawyering


it's kinda implying you need to actually do the computation at runtime, but not really stating it


That is usually how these things go


> So, to summarize: mayerrobert-cl-hashdot does not use compile-time evaluation, the evaluation happens before compile-time.


I don’t think it makes sense contributing a solution that just moves the work to when the clock is not running.


at least they tag it as unfaithful


They tag it as both. 😃


yeah they run it in both modes


The faithful version ranks pretty decently, actually.


And Clojure solution 2 is not realizing the lazy sequences, so I don’t think we can really count that one.


it's realizing them in :valid? (= (count primes) ...)


but yeah a doall in the sieve function would be more honest, but probably wouldn't affect the timing that much


It’s only realizing the last pass.


If you compare to the runner in solution 3 you’ll see the difference (not that it is needed in my solution, but anyway).


If I just return primes, it runs 2-3X slower.

(defn sieve [^long n]
  (let [primes (boolean-array (inc n) true)
        sqrt-n (int (Math/ceil (Math/sqrt n)))]
    (when (>= n 2)
      (loop [p 3]                       ; odd candidates only
        (when (<= p sqrt-n)
          (when (aget primes p)
            (loop [i (* p p)]
              (when (<= i n)
                (aset primes i false)
                (recur (+ i p p)))))
          (recur (+ p 2)))))
    primes))
I must revisit this during the weekend. 😃


You mean what you just pasted is slower than what's in the first message of this thread? I wonder if counting the primes could be the bottleneck then. But that's outside the sieve function of course, so I'm wondering if you even really need to do it... :)


I guess the counting could be parallelized if needed. Is your implementation now loop-aget-inc?


Whatever, it's a silly game with ill-defined rules. The tight array loop will be the same speed in pretty much all languages, but the varying ceremonies around it might cost a lot or nothing at all.


Clojure used to be published in the computer language benchmarks game, and I and Alex Miller and a few others spent some time tweaking Clojure programs for it to get it close to Java. Beating the Java solution is pretty unlikely, unless the Java solution did something inefficiently, but there are enough players that the Java solutions were not leaving spare cycles sitting around.


And yes, there were discussions about whether the solutions for some problems were following the rules or not, although I don't recall any that precomputed the answer at compile time 🙂


@US1LTFF6D Those measurements were with criterium

(quick-bench (doall (sieve 1000000)))
So no counting. Results:
Evaluation count : 54 in 6 samples of 9 calls.
             Execution time mean : 13.230384 ms
    Execution time std-deviation : 559.934025 µs
   Execution time lower quantile : 12.437373 ms ( 2.5%)
   Execution time upper quantile : 13.805512 ms (97.5%)
                   Overhead used : 6.866165 ns
Evaluation count : 168 in 6 samples of 28 calls.
             Execution time mean : 4.281483 ms
    Execution time std-deviation : 218.415891 µs
   Execution time lower quantile : 4.055185 ms ( 2.5%)
   Execution time upper quantile : 4.511662 ms (97.5%)
                   Overhead used : 6.866165 ns
(The first one is from just returning the boolean-array, the second from doing all those silly futures…)


that's really weird, since the first one should be doing strictly less work, right?


ah it's the doall


@U0CMVHBL2 thanks for sharing! This particular leaderboard is pretty young. It is a follow-up to this video, which hasn't been around for very long (Mar 25 2021).


no need for that since you're returning a vector or an array, so there's no laziness


but of course the doall for the boolean array is slower because it has more elements to walk through


Thanks! I didn’t realize it would traverse the array.


yeah I guess doall could just short-circuit on non-LazySeq arguments, but it doesn't


Just returning the array:
             Execution time mean : 1.119677 ms
Returning the sieve
             Execution time mean : 3.050549 ms


There is a lot to learn from fighting with the various ceremonies around things. 😃


If I recall correctly I got faster results when updating a BitSet, but it was slower to collect the results. So just maybe I can shave off some little more within the rules of being allowed to skip collecting.


with spec is there a succinct way to say a map needs both ::a and ::b present or none of them? I.e. including just ::a would be invalid.


Do you have the option of bundling ::a and ::b into a composite value and making that optional?


no, porting a legacy system and need to be api compatible. that’s definitely the way to go if this was not the case. pretty happy with what I ended up with though. thanks for the input!


What did you end up with?


See separate message in main channel

🙌 1
Colin P. Hill13:12:23

> or none of them
spec will resist this, in general, because it is intended to describe open maps

Colin P. Hill13:12:51

but you can probably do it with a top level s/or and an arbitrary predicate asserting that neither one is in there


you mean make two specs, ::without and ::with-a-and-b, and s/or those? That's my fallback solution if I can't come up with anything cleaner. Ideally I would put them in a nested map, but I can't break the api.


spec is Turing-complete, if you write your own predicate functions that return true for things you consider correct and false for things you don't.

Colin P. Hill13:12:29

No, I mean

(s/or (s/keys :req [::a ::b])
      #(not (or (contains? % ::a)
                (contains? % ::b))))


aha, because the conformed value is passed to the lambda, i see. Still learning spec, thanks!

Colin P. Hill13:12:03

Hm, would the conformed value be passed there? I wouldn't expect that, but surprise conformations still catch me sometimes. I didn't test what I just wrote, just giving the general idea.


no wait, that's with s/and. These are just separate functions


(s/and (s/keys :req-un [::x] :opt-un [::a ::b])
       #(= (contains? % :a)
           (contains? % :b)))
this works. thanks!


s/valid? checks the conformed value, which might not be what you want in the case of s/or. But that's not the correct syntax for s/or anyway @dorandraco: you need to have the names for every path as well
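For reference, a labeled version of that idea might look like this (a sketch; the :with/:without tags and the ::a/::b specs are made up here):

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::a int?)
(s/def ::b int?)

;; s/or requires a keyword tag naming each branch
(s/def ::both-or-neither
  (s/or :with    (s/keys :req [::a ::b])
        :without #(not (or (contains? % ::a)
                           (contains? % ::b)))))

(s/valid? ::both-or-neither {::a 1 ::b 2}) ;; => true
(s/valid? ::both-or-neither {::a 1})       ;; => false
```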


Using cursive for a bit in Intellij… is there a way to get C-c C-c and C-c C-e to work like in Emacs (for sending form under point and before point to REPL, respectively)?


there’s a #cursive channel. I know those are configurable key commands so just need to figure out where that is


Cool… I’ll ask there. 🙂


Maybe a dumb question - but what are the downsides to timbre?

Joshua Suskalo19:12:18

It can cause problems when used for a library by forcing users to use timbre as their logging implementation as well.

Joshua Suskalo19:12:47

Or at least that seemed to be true last time I used it for a library. I've since switched that project to

Joshua Suskalo19:12:05

It's also hard to get line information and the like out of logs when they're routed through timbre.


We didn't like how many other taoensso libs it brought in, and they all seemed to be constantly updating. I think there were other concerns for us as well. Maybe @hiredman has opinions on it too?


it messes with stacktraces by default


and what are you going to do about the logging for all your java dependencies?

👆 2

We'd originally started out with c.t.l but wanted more configurability (or so we thought) but after switching to timbre and using it for a while, we found we were not leveraging those extra features, so we switched back to c.t.l (and log4j2).


Timbre has a bridge for slf4j, I think? And I think you have to bridge other logging libs to slf4j in order to control them all with timbre? Can't remember details any more.


There's also a lot to be said for simply using "standard Java approaches" for stuff that cross-cuts all the (Java) libs you depend on.

👍 1

the slf4j bridge is, if I recall, written in clojure and distributed as an aot compiled library


Yeah -- it tries to only put its compiled code into the JAR but that still means binary compatibility could be an issue across Clojure versions?


if you do want to use timbre, is nice to have in your back pocket


The main benefit I want is 1. to be out of the log4j game and 2. to be able to log maps and move away from stringy logs


we had an unrelated outage and very quickly discovered our logs aren’t the most pleasant to search through atm


Perhaps mulog?


> We’d originally started out with c.t.l but wanted more configurability (or so we thought) but after switching to timbre and using it for a while, we found we were not leveraging those extra features, so we switched back to c.t.l (and log4j2).
Was the switch back driven by any particular experience, or just realizing that it wasn’t giving a value add?


No real value add and we try to keep dependencies small/focused. That's also what has led us to move from Cheshire to data.json and from clj-http to http-kit (initially) and then to Hato (a wrapper for Java's HttpClient).


We've made several library switches over the years in a quest to prune back dependencies (and reduce deployment artifact size).


We switched from Redisson to Jedis (and wrote our own connection pool code) to get away from the mass of dependencies (and bloat) that Redisson brought in.


okay so it's just the goal of minimizing deps


i dont think our org is there yet, and i’ll have to look at its full tree to make a call


> clojure -X:deps list :aliases '[:dev :everything]' | fgrep -v /Developer | wc -l
(the fgrep -v removes all the :local/root deps within our monorepo so that's 232 external libraries)


Adding test deps is another 30 libs. Adding build is another 55. Adding poly is another 50. But the 232 above is just production code.


for structured logging, there are benefits to using json - there's a lot of tooling for parsing json logs


im not against mulog, and originally that is what i wanted to do, but i couldn’t get it figured out and kept getting deadlocks locally so i kinda passed it over


Bruno's about on slack, so you could ping him to see if he responds?


In my projects, I go for clojure tools logging hooked into logback. Works well for me 🙂

Alex Miller (Clojure team)19:12:37

(but make sure you update logback for the latest)

👍 2

(and set the prop)


i’m hoping a log4j-lite comes out that is api compatible with log4j and with no jndi and no expansion. just simple logging

👆 2

If environment values need to be logged, then the code that calls the logger should be getting those vars and supplying them to the logger. It’s a better approach from a composability point of view, and any security issues are the responsibility of the local system. Most people never need to include ${} in their logs. I believe that putting this “convenience” in there for everyone was a mistake in general, but the latest security nightmare would back me up on this point.


also, people put way too much trust in environment variables - there are simple accessible ways to access the list of all key value pairs in the environment for every mainstream language (I think it's a cargo cult thing that stems from the data at rest / data in motion distinction which is also overrated...)
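That enumeration really is trivial on the JVM, for example:

```clojure
;; Any code running in the process can read the entire environment:
(System/getenv)         ; => unmodifiable Map of every NAME -> value
(System/getenv "HOME")  ; => a single variable, or nil if unset
```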


I have such a library @U11BV7MTK . Hoping to open source it


that would be amazing @U050ECB92.


and, not a complete logging library by any means, I've been using something like in personal projects


yeah see, i want out^

Alex Miller (Clojure team)21:12:00

I find it hard to believe that I am very close to hating Java logging for 25 years

❤️ 1
Lennart Buit21:12:18

I was just about to say, this can only amplify the logging mess on the JVM

Lennart Buit21:12:10

You know, how the entire OpenSSL debacle from a while back spawned all sorts of alternatives

Cora (she/her)21:12:14

can't wait for librelog4j

😆 1
Lennart Buit21:12:14

brb, claiming domain /s


you would think that logging is a solved problem in java after x years but i guess templating can screw everything up

Cora (she/her)21:12:11

sometimes it's the seemingly simple things that end up maddeningly complex

truestory 1

BTW, is there a comprehensive solution that would:
• Log structured data while preserving as much as possible (via EDN or transit or maybe something)
• Work with Clojure
• Work with ClojureScript
• Have a UI with good filtering capabilities (from mundane date ranges and full text search to structured search of the logged data)?


mulog and timbre are the only ones on my radar


But they satisfy only the first 3 items, right? Or do they also include logged data viewers?


i don’t think any logging library does that - and the tools that do that seem to like looking at lines of json


Well, I'm not speaking strictly of libraries. It can be a SaaS or something else. But yeah, I haven't seen one either, and JSON is... lackluster, to say the least.


When I've used mulog, the UI was cloudwatch insights and it was amazing


Thanks! I'll check it out.


@U065JNAN8 What was your “production” mulog setup, if i can ask?


It was pretty simple. Just included mulog and mulog-cloudwatch in the project.clj and configured as described here


I currently have a basic implementation. However, each time I send a message, a new connection is made. Is that something I should expect? I only want my connection to be printed once, so any ideas to do that? Here's my code

(defn listener [port]
  (println "Listening on port" port)
  (with-open [server-socket (ServerSocket. port)
              socket (.accept server-socket)]
    (doseq [msg (line-seq (io/reader socket))]
      (println "New connection from" socket)
      (println msg))))


My connections look like this New connection from #object[ 0x2973e18 Socket[addr=/0:0:0:0:0:0:0:1,port=37194,localport=8383]]


that creates a server socket, accepts one connection, prints out, then closes the connection and server socket


so the only way you can get multiple prints from that is if you have a loop somewhere else that restarts the server socket after you close it


I would find a nice guide to socket programming somewhere


it almost doesn't matter in what language because the structure of socket programs is basically always the same


create a server socket, accept connections, hand off the connection to something that does something with it (often another thread) loop back to accepting connections


only accepting a single connection, then closing everything is not what you want


be sure not to skip the bit at the very end about multiple clients

👍 1

@hiredman Sorry if I misunderstood the reading - is this fine? I'm not sure if threads are necessary as the socket closes

(defn listener [port]
  (println "Listening on port" port)
  (with-open [server-socket (ServerSocket. port)
              socket (.accept server-socket)]
    (while true
      (println "New connection from" socket)
      (doseq [msg (receive socket)]
        (println msg)))))


receive is just (line-seq (io/reader ))


that will loop forever reading from the the one connection (until an exception is thrown) and cannot handle multiple connections


I guess you just misplaced the call to accept?


I ended up using future , I think this version should work


(defn listener [port]
  (let [running (atom true)]
    (println "Listening on port" port)
    (future
      (with-open [server-socket (ServerSocket. port)]
        (while (deref running)
          (with-open [socket (.accept server-socket)]
            (println "New connection from" socket)
            (doseq [msg (receive socket)]
              (println msg))))))))


Nevermind, apparently it isn't


you keep missing the loop around accept


1. accept connection
2. handle connection
3. goto 1


and if handle connection doesn't hand the connection off to a thread (if you aren't using nio) then you will only ever handle one connection at a time
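Putting those three steps together, a sketch of that shape (receive here is the line-seq helper mentioned earlier in the thread, included so the snippet is self-contained):

```clojure
(require '[clojure.java.io :as io])
(import '(java.net ServerSocket))

(defn receive [socket]
  (line-seq (io/reader socket)))

(defn listener [port]
  (println "Listening on port" port)
  (future
    (with-open [server-socket (ServerSocket. port)]
      (loop []
        ;; 1. accept blocks until a client connects
        (let [socket (.accept server-socket)]
          ;; 2. hand the connection off to another thread
          (future
            (with-open [socket socket]
              (println "New connection from" socket)
              (doseq [msg (receive socket)]
                (println msg)))))
        ;; 3. immediately loop back to accept the next connection
        (recur)))))
```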


I'm using telnet


I got this, hopefully it does what you described

(defn listener [port]
  (let [running (atom true)]
    (println "Listening on port" port)
    (with-open [server-socket (ServerSocket. port)]
      (while (deref running)
        (future
          (with-open [socket (.accept server-socket)]
            (println "New connection from" socket)
            (doseq [msg (receive socket)]
              (println msg))))))))


this will attempt to spin up an infinite number of futures


@hiredman I think that using future again could work? Let me show you


(defn listener [port]
  (let [running (atom true)]
    (println "Listening on port" port)
    (future
      (with-open [server-socket (ServerSocket. port)]
        (while (deref running)
          (future
            (with-open [socket (.accept server-socket)]
              (println "New connection from" (.. socket getInetAddress getHostAddress))
              (doseq [msg (receive socket)]
                (println msg)))))))))

That's what I thought, but doing so makes it stop listening to the port, for some reason


It seems to me that logging in most programming languages suffers from the "if there isn't one clear way that everyone agrees on early enough across most/all libraries, then it is easy enough to roll your own that N independent implementations are created". I know this happens in C. Perhaps logging is an example of a feature that suffers from something like "the Lisp curse" (except it happens repeatedly across many programming languages)?


Any recommendations for a websocket client? I've found but want to check if anybody has strong feelings about the alternatives.


If you're on jdk 11+ there is built-in jdk support. I've heard good things about as a wrapper


I've heard people talk about Sente (haven't used it myself...that may change soon)

Benjamin C21:12:19

Using sente a little bit. I'm impressed.


about jdk11+ http client, be aware it has some nasty quirks


like handling of GOAWAY with http2


as long as you don't use http2 it's fine
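If you do stick with the built-in client, pinning it to HTTP/1.1 is one builder call (interop sketch):

```clojure
(import '(java.net.http HttpClient HttpClient$Version))

;; Build a JDK HttpClient that never negotiates HTTP/2,
;; sidestepping the GOAWAY quirks mentioned above.
(def client
  (-> (HttpClient/newBuilder)
      (.version HttpClient$Version/HTTP_1_1)
      (.build)))
```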


I am not sure what's the status on that one in particular on more recent jdks, but we got bit by it a few times before we had to disable http2 on the client.


I still prefer jetty http client personally for "advanced stuff" and good old clj-http for the rest


also out of the box the jdk http client is missing a few important things, like timeout on response body read, you need to use something like mizosoft/methanol (or bake your own) to add support for basic stuff like this