#clojure
2018-05-28
michael.heuberger02:05:55

searching for boot code examples able to parse an xml template and fill it with values - anyone?

seancorfield02:05:33

@michael.heuberger Do you mean "produce an XML document with values substituted in"? I'd look at Selmer for that. If you just want substitutions in a template, you don't need to parse it.

michael.heuberger22:05:19

Yeah, Selmer it is. Thanks!

hawari05:05:50

Hi everyone, is there a way to profile memory usage in a Clojure program?

the2bears05:05:24

Any JVM profiler should be fine for that.

the2bears05:05:56

YourKit is supposed to be very good.

hawari06:05:54

Alright, I was thinking maybe there is something more "repl-esque" than the common profilers

seancorfield06:05:26

@hawari.rahman17 Do you just want to figure out the memory size of Clojure data structures?

seancorfield06:05:47

You could look at com.clojure-goes-fast/clj-memory-meter, then you can do

(require '[clj-memory-meter.core :as mm])
(mm/measure (your-expression))

👍 4
hawari07:05:52

Thanks @seancorfield I'll check that out. Though I must admit, being a Java novice myself, I don't really know how to diagnose a performance problem.

hawari07:05:19

Usually I only need to test something that I can observe (e.g. how much time a function took, where it can be improved)

hawari07:05:11

But since the problem now is memory consumption, I tend to get overwhelmed by the tools available without knowing how to use them properly.

hawari07:05:44

I'm using VisualVM now, and I have to praise it for being easy to approach for a novice like me

matan12:05:11

Hey, any equivalent to python pandas that you'd know of, in the Clojure world? *equivalent in performance I mean, I guess

xtreak2913:05:30

Quick googling gets me this:
> huri.core a loose set of functions on vanilla Clojure collections that constitute an ad-hoc specification of a data frame; along with some utility and math functions.
https://github.com/sbelak/huri
https://cbds.netlify.com/2017/10/12/data-analysis-with-clojure/

matan12:05:53

I've just finished delivering a small data science project where I had to use python, and it really reminds me why I don't find python to be that great, even for data science. But it has pandas dataframes 🙂

matan12:05:03

I think clojure maps would take 100x more space than a dataframe would, for the same data

matan12:05:29

And all the object constructors implicated in clojure collections would make data crunching 100x slower

matan12:05:10

Any lean and mean clojure library out there, providing dataframe functionality?

alan14:05:39

@matan actually pandas is not that performant, even according to its creator: http://wesmckinney.com/blog/apache-arrow-pandas-internals/

tbaldridge14:05:51

Well all those criticisms apply and then some for Clojure collections.

tbaldridge14:05:14

Seems like there should be some good FP solution to the problem of data views being fast when mutable.

tbaldridge14:05:56

Stuff like core.matrix is nice but it forces you to manually choose between mutable and immutable ops.

tbaldridge14:05:24

Maybe more escape analysis in the compiler could help here

val_waeselynck14:05:26

@U07TDTQNL you mean, having macros do static analysis? What about something like transducers transients instead?

tbaldridge14:05:57

No, I mean something that analyzes Clojure code and swaps in mutable ops when it can be proven that doing so won’t affect the semantics

tbaldridge14:05:01

So that something like (reduce conj [] (range 10)) would automatically use transients.
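A minimal sketch of what that rewrite would amount to if done by hand today, using only clojure.core. The function names `build-persistent` and `build-transient` are made up for illustration; nothing here is an existing compiler feature.

```clojure
;; Plain persistent version, as written in the message above.
(defn build-persistent [n]
  (reduce conj [] (range n)))

;; Hand-written transient version: the vector is mutated in place while it is
;; still local to the reduction, then frozen again with persistent!.
;; conj! returns the transient, and reduce threads that return value through.
(defn build-transient [n]
  (persistent! (reduce conj! (transient []) (range n))))

;; Both produce the same value; only the intermediate allocations differ.
(println (= (build-persistent 10) (build-transient 10)))
```

For this particular shape, `(into [] (range 10))` already uses transients internally, so the automatic rewrite would matter mostly for code that does not go through `into`.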

mpenet14:05:26

Did anyone see bodil's talk on persistent/immutable data structures in rust? 🙂 I think she does just that using ownership (e.g. if it is the sole owner, it does the update in place, etc.)

mpenet14:05:08

So basically that becomes easy to just use them without second thought and get best performance when you can afford it without having to do anything.

troglotit14:05:14

I follow bodil on twitter, so read a little bit about that library. But yeah, that sounds awesome, about mutation in place. I thought it was just like regular clojure-structures with auto-RC on them

tbaldridge18:05:34

The problem is that RC is really expensive in a multi-core environment. There's a few tricks to make it faster (mostly around turning it into something more like a GC), but that often means giving up the features that make it suitable for transient tracing

tbaldridge18:05:12

So I think some sort of static analysis would need to be involved here, and preferably something that allows lookups to be moved to before modifications.

tbaldridge18:05:33

I wouldn't be surprised if Haskell does some stuff like this, but I haven't seen any examples of it.

tbaldridge18:05:39

For example:

tbaldridge18:05:54

(let [v [1 2 3]] (conj (conj v (inc (nth v 0))) (inc (nth v 1))))

tbaldridge18:05:11

The compiler should be able to figure out that the nths can be reordered and performed before the execution of any calls to conj.

tbaldridge18:05:21

That's something I don't think can be done in an RC-only model
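A rough sketch of the reordering described for the `let` example, written out by hand and assuming only clojure.core. The names `naive` and `reordered` are hypothetical; this shows the target of such an optimization, not something the Clojure compiler does.

```clojure
;; The example as intended: each nth reads from v between the two conj calls.
(defn naive [v]
  (conj (conj v (inc (nth v 0)))
        (inc (nth v 1))))

;; Reordered by hand: both lookups happen first, so nothing reads from v
;; once building starts, and the two appends can go through a transient.
;; (transient v) does not modify v itself.
(defn reordered [v]
  (let [a (inc (nth v 0))
        b (inc (nth v 1))]
    (-> (transient v)
        (conj! a)
        (conj! b)
        persistent!)))

(println (= (naive [1 2 3]) (reordered [1 2 3])))
```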

matan18:06:16

@U07TDTQNL I believe having features like that in the clojure compiler would make it more than just a naive interpreter 🙂 while hopefully paving the way (?) for making transducer syntax unnecessary in some cases. Transducers are nice, but their current API is horribly non-intuitive... a hallmark of a feature strapped on top of a ~10-year-old language

tbaldridge18:06:59

Yeah, transducers are really mini processes, but they're not modeled that way in Clojure.

tbaldridge18:06:25

it should be possible to model transducers as immutable actors, and then optimize out the performance regressions

matan11:06:14

Well maybe in Clojure 3.4, or a newer language drawing from its ideas

jco14:05:23

Hi, I'm trying to figure out some basics on worker queues and messaging systems (RabbitMQ, etc). I basically want to create a queue of tasks where a consumer can fetch stuff to do. This should hopefully lead to no other consumers retrieving the same task (I don't know if there are theoretical limitations on what can be achieved in this regard -- in that case I'd probably have to go with at-least-once delivery and make sure my actions are idempotent). I saw immutant used HornetQ for this kind of stuff. Does anyone know of any other cool resources to check out? Currently we're using Postgres DBs for a lot of stuff that might be a better fit for a queue. What do other people out there use?

ddellacosta15:05:20

I’ve used langohr (for rabbitmq) at a few places, worth checking out, docs are pretty decent: http://clojurerabbitmq.info/

jco15:05:04

Yes, I've looked a bit at that, and it seems pretty nice. Do you use a relational DB for storing tasks/messages as well? Or is RabbitMQ a replacement for that (for storing just the messages/tasks I mean)?

val_waeselynck15:05:35

Because I deploy on AWS, I use SQS (https://aws.amazon.com/sqs/) as a job queue - very cheap. You should use a Job queue to dispatch ephemeral commands, not durable information. I recommend http://dataintensive.net/ to get your ideas straight about such topics

👍 4
jco15:05:45

I'm also using AWS so that could be interesting. You haven't experienced a lot of limitations with SQS? Thanks for the link.

jco15:05:09

Ordered that book by the way, seems worth it.

ddellacosta15:05:14

@U6GL1FLLF we do not use a separate layer to store messages, no

ddellacosta16:05:26

RMQ supports a bunch of different paradigms so it’s worth seeing if your architecture maps on to it.

val_waeselynck16:05:21

@U6GL1FLLF no limitations to report really, but we haven't put a lot of load on it, and our use case is fairly basic. Be aware that it's text-based, and that by default you cannot rely on ordering of messages, and the delivery is at-least-once.
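With at-least-once delivery the same message can arrive twice, so the consumer has to tolerate duplicates. A minimal in-memory sketch of that idea, assuming every message is a map with a unique `:id` key (an assumption for illustration, not an SQS or RabbitMQ API). `processed-ids` and `handle-once!` are hypothetical names, and a real system would keep the seen ids in durable storage.

```clojure
;; Ids of messages that have already been claimed (in memory only).
(def processed-ids (atom #{}))

(defn handle-once!
  "Runs (handler message) only the first time a given :id is seen.
  Returns true if the handler ran, nil for a duplicate."
  [{:keys [id] :as message} handler]
  ;; swap-vals! (Clojure 1.9+) returns [old new]; checking the old set tells
  ;; us atomically whether this call was the one that claimed the id.
  (let [[seen-before _] (swap-vals! processed-ids conj id)]
    (when-not (contains? seen-before id)
      (handler message)
      true)))

;; The second delivery of the same message is skipped.
(handle-once! {:id "job-1" :body "resize image"} (fn [m] (println "working on" (:body m))))
(handle-once! {:id "job-1" :body "resize image"} (fn [m] (println "working on" (:body m))))
```

Note that this sketch records the id before the handler runs, so a handler that throws would leave the message marked as done; recording it after success trades that for possible duplicate work.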

matan18:05:30

@alan well Apache Arrow looks like the thing (!) I'm looking into which Java / clojure libraries actually implement the Arrow standard now...

matan18:05:14

> Apache Arrow and the "10 Things I Hate About pandas"
I don't like python's pandas too much either.. the API is downright awkward.

matan19:05:32

A quick google search shows that Apache Arrow Java implementations are pretty much pre-alpha or so..

troglotit20:05:31

to be fair, the most recent version is 0.9.0 and java’s version is on par (0.9.0) http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.arrow%22%20AND%20v%3A%220.9.0%22 here’s the github https://github.com/apache/arrow/tree/master/java but otoh, they’re working on binary stability of data for 1.0.0

matan19:05:18

Apache Arrow is only a specification and an effort for cross-language integration, with implementations still in the making. Would be great to use its implementations in a few years as it matures!

twashing23:05:49

Running the let block shows me that I’m correctly joining values. But the result output still doesn’t have the joined values.