2018-05-28 clojure | Clojure Slack Archive

clojure 2018-05-28

hawari 2018-05-28T07:02:52.000100Z

Thanks @seancorfield I'll check that out. Though I must admit, being a Java novice myself, I don't really know how to diagnose a performance problem.

hawari 2018-05-28T07:04:19.000008Z

Usually I only need to test something that I can observe (e.g. how much time a function took, where it can be improved)

hawari 2018-05-28T07:05:11.000032Z

But since the problem now is memory consumption, I tend to get overwhelmed by the tools available without knowing how to use them properly.

hawari 2018-05-28T07:31:44.000280Z

I'm using VisualVM now, and I have to praise it for being easy to approach for a novice like me

matan 2018-05-28T12:20:11.000459Z

Hey, any equivalent to python pandas that you'd know of, in the Clojure world? *equivalent in performance I mean, I guess

xtreak29 2018-05-28T13:08:30.000035Z

Quick googling gets me this:
> huri.core a loose set of functions on vanilla Clojure collections that constitute an ad-hoc specification of a data frame; along with some utility and math functions.
https://github.com/sbelak/huri https://cbds.netlify.com/2017/10/12/data-analysis-with-clojure/

matan 2018-05-28T12:20:53.000140Z

I've just finished delivering a small data science project where I had to use python, and it really reminds me why I don't find python to be that great, even for data science. But it has pandas dataframes 🙂

matan 2018-05-28T12:21:03.000296Z

I think Clojure maps would take 100x more space than a dataframe would, for the same data

matan 2018-05-28T12:21:29.000237Z

And all the object constructors involved in Clojure collections would make data crunching 100x slower

matan 2018-05-28T12:22:10.000165Z

Any lean and mean clojure library out there, providing dataframe functionality?
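To make the space and speed concern above concrete, here is a rough sketch (not a benchmark, and not a dataframe library) contrasting a row-oriented vector of maps with a column-oriented layout backed by primitive arrays. The data and the names `rows`, `columns`, `total-rows` and `total-columns` are made up for illustration; only `clojure.core` is used.

```clojure
;; Row-oriented: one map per record, every number stored as a boxed Double.
(def rows
  [{:price 10.0 :qty 2.0}
   {:price 4.5  :qty 4.0}
   {:price 1.0  :qty 10.0}])

;; Column-oriented: one primitive double[] per column, no per-row objects.
(def columns
  {:price (double-array (map :price rows))
   :qty   (double-array (map :qty rows))})

(defn total-rows
  "Sums price * qty by walking the maps."
  [rows]
  (reduce + (map (fn [{:keys [price qty]}] (* price qty)) rows)))

(defn total-columns
  "Sums price * qty by looping over the primitive arrays."
  [{:keys [price qty]}]
  (let [^doubles price price
        ^doubles qty   qty]
    (areduce price i acc 0.0
             (+ acc (* (aget price i) (aget qty i))))))
```

Both functions compute the same total; the columnar version simply avoids allocating a map and boxed numbers per row, which is roughly the trade-off dataframe libraries make.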

alan 2018-05-28T14:16:39.000458Z

@matan actually pandas is not that performant, even according to its creator: http://wesmckinney.com/blog/apache-arrow-pandas-internals/

2018-05-28T14:28:51.000286Z

Well all those criticisms apply and then some for Clojure collections.

2018-05-28T14:30:14.000471Z

Seems like there should be some good FP solution to the problem of data views being fast when mutable.

2018-05-28T14:30:56.000393Z

Stuff like core.matrix is nice but it forces you to manually choose between mutable and immutable ops.

2018-05-28T14:31:24.000045Z

Maybe more escape analysis in the compiler could help here

val_waeselynck 2018-05-28T14:32:26.000061Z

@tbaldridge you mean, having macros do static analysis? What about something like ~~transducers~~ transients instead?

2018-05-28T14:33:57.000259Z

No, I mean something that analyzes Clojure code and swaps in mutable ops when it can be proven that doing so won’t affect the semantics

2018-05-28T14:35:01.000376Z

So that something like (reduce conj [] (range 10)) would automatically use transients.
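For reference, this is roughly what that rewrite looks like when done by hand today. A minimal sketch using only `clojure.core`; the function names are invented for illustration.

```clojure
;; Persistent version: every conj returns a new immutable vector.
(defn build-persistent [n]
  (reduce conj [] (range n)))

;; Transient version: the same fold, but conj! is allowed to mutate in place.
;; The transient never escapes this function, so callers cannot observe the
;; mutation. That is the property an optimizer would have to prove.
(defn build-transient [n]
  (persistent! (reduce conj! (transient []) (range n))))

;; into already takes the transient path for vectors.
(defn build-into [n]
  (into [] (range n)))
```

All three return the same vector; only the intermediate allocations differ.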

mpenet 2018-05-28T14:39:26.000498Z

Has anyone seen Bodil's talk on persistent/immutable data structures in Rust? 🙂 I think she does just that using ownership (e.g. if there is a sole owner, it updates in place, etc.)

mpenet 2018-05-28T14:41:08.000205Z

So basically that becomes easy to just use them without second thought and get best performance when you can afford it without having to do anything.

troglotit 2018-05-28T14:42:14.000278Z

I follow bodil on twitter, so read a little bit about that library. But yeah, that sounds awesome, about mutation in place. I thought it was just like regular clojure-structures with auto-RC on them

mpenet 2018-05-28T14:44:26.000477Z

-> https://bodil.lol/persistence/#0 https://www.youtube.com/watch?time_continue=1&v=Gfe-JKn7G0I

2018-05-28T18:43:34.000044Z

The problem is that RC is really expensive in a multi-core environment. There's a few tricks to make it faster (mostly around turning it into something more like a GC), but that often means giving up the features that make it suitable for transient tracing

2018-05-28T18:44:12.000191Z

So I think some sort of static analysis would need to be involved here, and preferably something that allows lookups to be moved to before modifications.

2018-05-28T18:44:33.000064Z

I wouldn't be surprised if Haskell does some stuff like this, but I haven't seen any examples of it.

2018-05-28T18:45:39.000104Z

For example:

2018-05-28T18:46:54.000024Z

(let [v [1 2 3]] (conj (conj v (inc (nth v 0))) (inc (nth v 1))))

2018-05-28T18:48:11.000125Z

The compiler should be able to figure out that the nths can be reordered and performed before the execution of any calls to conj.

2018-05-28T18:48:21.000211Z

That's something I don't think can be done in a RC-only model
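A hand-written version of the reordering described above might look like the sketch below, assuming the intended expression appends the two incremented values to `v` one at a time. The function name `append-incremented` is hypothetical; the point is that both `nth` lookups happen before any modification, so the two appends can safely go through a transient.

```clojure
(defn append-incremented
  "Reads the first two elements of v up front, then appends their
  increments using a transient."
  [v]
  (let [a (inc (nth v 0))   ; lookups hoisted before any conj
        b (inc (nth v 1))]
    (-> (transient v)
        (conj! a)
        (conj! b)
        persistent!)))
```

The input vector is left untouched, since `transient` does not modify the persistent collection it was created from.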

matan 2018-06-29T11:23:14.000175Z

Well maybe in Clojure 3.4, or a newer language drawing from its ideas

matan 2018-06-27T18:13:16.000282Z

@tbaldridge I believe having features like that in the Clojure compiler would make it more than just a naive interpreter 🙂 while hopefully paving the way (?) for making transducer syntax unnecessary in some cases. Transducers are nice, but their current API is horribly non-intuitive... a hallmark of a feature strapped on top of a ~10-year-old language

2018-06-27T18:21:59.000034Z

Yeah, transducers are really mini processes, but they're not modeled that way in Clojure.

2018-06-27T18:23:25.000010Z

it should be possible to model transducers as immutable actors, and then optimize out the performance regressions

jco 2018-05-28T14:55:23.000283Z

Hi, I'm trying to figure out some basics on worker queues and messaging systems (RabbitMQ, etc). I basically want to create a queue of tasks where a consumer can fetch stuff to do. This should hopefully lead to no other consumers retrieving the same task (I don't know if there are theoretical limitations on what can be achieved in this regard -- in that case I'd probably have to go with at-least-once delivery and make sure my actions are idempotent). I saw immutant used HornetQ for this kind of stuff. Does anyone know of any other cool resources to check out? Currently we're using Postgres DBs for a lot of stuff that might be a better fit for a queue. What do other people out there use?

2018-05-28T15:19:20.000017Z

I’ve used langohr (for rabbitmq) at a few places, worth checking out, docs are pretty decent: http://clojurerabbitmq.info/

jco 2018-05-28T15:31:04.000017Z

Yes, I've looked a bit at that, and it seems pretty nice. Do you use a relational DB for storing tasks/messages as well? Or is RabbitMQ a replacement for that (for storing just the messages/tasks I mean)?

val_waeselynck 2018-05-28T15:42:35.000371Z

Because I deploy on AWS, I use SQS (https://aws.amazon.com/sqs/) as a job queue - very cheap. You should use a Job queue to dispatch ephemeral commands, not durable information. I recommend http://dataintensive.net/ to get your ideas straight about such topics

👍 1

jco 2018-05-28T15:50:45.000255Z

I'm also using AWS so that could be interesting. You haven't experienced a lot of limitations with SQS? Thanks for the link.

jco 2018-05-28T15:59:09.000351Z

Ordered that book by the way, seems worth it.

2018-05-28T15:59:14.000209Z

@jco we do not use a separate layer to store messages, no

2018-05-28T16:00:26.000407Z

RMQ supports a bunch of different paradigms so it’s worth seeing if your architecture maps on to it.

val_waeselynck 2018-05-28T16:04:21.000069Z

@jco no limitations to report really, but we haven't put a lot of load on it, and our use case is fairly basic. Be aware that it's text-based, and that by default you cannot rely on ordering of messages, and the delivery is at-least-once.
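With at-least-once delivery, the same message can arrive twice, so the consumer has to tolerate duplicates. One way to get there is to record which message ids have been handled, as in the sketch below. Everything here is an assumption for illustration: the `:id` key, the function names, and especially the in-memory atom, which a real deployment would replace with a durable store shared by all consumers. No SQS or RabbitMQ client calls are shown.

```clojure
;; Hypothetical in-memory record of handled message ids.
(defonce processed-ids (atom #{}))

(defn claim!
  "Atomically marks id as processed. Returns true only for the first caller."
  [id]
  (let [[old _] (swap-vals! processed-ids conj id)]
    (not (contains? old id))))

(defn handle-once!
  "Runs (f message) unless a message with the same :id was already claimed.
  Returns :handled when f ran, nil for a duplicate."
  [f {:keys [id] :as message}]
  (when (claim! id)
    (f message)
    :handled))
```

Note that this claims the id before running `f`, so a failure inside `f` would need extra handling (for example releasing the claim) if the message should be retried.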

matan 2018-05-28T18:51:30.000279Z

@alan well Apache Arrow looks like the thing (!) I'm looking into which Java / clojure libraries actually implement the Arrow standard now...

matan 2018-05-28T18:52:14.000147Z

> Apache Arrow and the "10 Things I Hate About pandas"
I don't like Python's pandas too much either; the API is downright awkward.

matan 2018-05-28T19:00:32.000028Z

A quick google search shows that Apache Arrow Java implementations are pretty much pre-alpha or so..

troglotit 2018-05-28T20:06:31.000297Z

to be fair, the most recent version is 0.9.0 and java’s version is on par (0.9.0) http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.arrow%22%20AND%20v%3A%220.9.0%22 here’s the github https://github.com/apache/arrow/tree/master/java but otoh, they’re working on binary stability of data for 1.0.0

matan 2018-05-28T19:01:18.000262Z

Apache Arrow is only a specification and an effort toward cross-language integration, with implementations still in the making. It would be great to use those implementations in a few years as it matures!

twashing 2018-05-28T23:27:49.000035Z

I’m trying to create a stateful transducer, join-averages. https://gist.github.com/twashing/2f3ee581d49161fa23533fd4680fbea8#file-joining-transducer

twashing 2018-05-28T23:27:49.000092Z

Running the let block shows me that I’m correctly joining values. But the result output still doesn’t have the joined values.

twashing 2018-05-28T23:27:53.000131Z

Any ideas?
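The gist itself is not reproduced here, so this is not a diagnosis, only a generic sketch of the shape a stateful transducer usually takes. `running-average` is an invented example, unrelated to `join-averages`. One thing worth checking in any such transducer is that the transformed value is actually passed to `rf` and that the result of that call is what the step arity returns; computing a joined value without handing it to `rf` leaves the output unchanged.

```clojure
(defn running-average
  "Stateful transducer that replaces each number with the mean of all
  numbers seen so far."
  []
  (fn [rf]
    ;; State is created per transduction, inside (fn [rf] ...).
    (let [n   (volatile! 0)
          sum (volatile! 0)]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result x]
         (vswap! n inc)
         (vswap! sum + x)
         ;; Pass the new value downstream and return what rf returns.
         (rf result (/ @sum @n)))))))
```

It can be used like any other transducer, e.g. `(into [] (running-average) [2 4 6])`.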

2018-05-28T02:35:55.000140Z

searching for boot code examples able to parse an xml template and fill it with values - anyone?

seancorfield 2018-05-28T02:56:33.000139Z

@michael.heuberger Do you mean "produce an XML document with values substituted in"? I'd look at Selmer for that. If you just want substitutions in a template, you don't need to parse it.

2018-05-28T22:11:19.000159Z

Yeah, Selmer it is. Thanks!
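A minimal sketch of that suggestion, assuming the Selmer dependency is on the classpath. The template and the names `invoice-template` and `render-invoice` are made up for illustration; `selmer.parser/render` takes a template string and a map of values.

```clojure
;; Assumes the selmer library has been added as a project dependency.
(require '[selmer.parser :as selmer])

;; Hypothetical XML template; {{...}} marks the substitution points.
(def invoice-template
  "<invoice><customer>{{customer}}</customer><total>{{total}}</total></invoice>")

(defn render-invoice
  "Fills the template with the given values and returns the XML as a string."
  [values]
  (selmer/render invoice-template values))
```

For example, `(render-invoice {:customer "Ada" :total 42})`. As far as I know Selmer escapes HTML-sensitive characters in substituted values by default, which is usually what you want inside XML text nodes as well, but that is worth verifying against its docs for your data.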

hawari 2018-05-28T05:13:50.000186Z

Hi everyone, is there a way to profile memory usage in a Clojure program?

the2bears 2018-05-28T05:21:24.000157Z

Any JVM profiler should be fine for that.

the2bears 2018-05-28T05:21:56.000018Z

YourKit is supposed to be very good.

hawari 2018-05-28T06:24:54.000266Z

Alright, I was thinking maybe there is something more "repl-esque" than the common profilers

seancorfield 2018-05-28T06:32:26.000081Z

@hawari.rahman17 Do you just want to figure out the memory size of Clojure data structures?

seancorfield 2018-05-28T06:33:47.000326Z

You could look at com.clojure-goes-fast/clj-memory-meter, then you can do

(require '[clj-memory-meter.core :as mm])
(mm/measure (your-expression))

👍 1
