#beginners
2016-08-18
oahner01:08:12

@dpsutton: mapv is map but returns a vector instead of a seq, so side effects happen before it returns

dpsutton01:08:58

Well it's because vectors must be realized. There can be no laziness

oahner01:08:14

run! does the same thing as mapv but always returns nil
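A small sketch of the difference described above, assuming a hypothetical side-effecting function `record!` that appends to an atom named `effects` (both names are invented for illustration). It only shows when the side effects run and what each call returns.

```clojure
;; Hypothetical helper: records each argument in an atom as a visible side effect.
(def effects (atom []))

(defn record! [x]
  (swap! effects conj x)
  x)

;; map is lazy: nothing has run yet because the seq is never consumed here.
(def lazy-result (map record! [1 2 3]))
(def effects-after-map @effects)    ; still []

;; mapv builds a vector eagerly, so all side effects have run when it returns.
(def eager-result (mapv record! [1 2 3]))
(def effects-after-mapv @effects)   ; [1 2 3]

;; run! is also eager, but it discards the results and returns nil.
(def run-result (run! record! [4 5]))
(def effects-after-run @effects)    ; [1 2 3 4 5]
```

Note that consuming `lazy-result` later would run `record!` again for each element, which is the kind of surprise that makes lazy sequences a poor fit for side effects.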

richardh14:08:30

@peter.d: This looks great. Two comments: 1) It’s important to think about edge cases. What does your function return when given zero arguments? Probably should either throw an error or return a function that does nothing and returns nil. But right now such a function would return a sequence of its arguments, which is probably not what you want:

((any-number-comp) 5 6 7)
;; => repl output: (5 6 7)

richardh14:08:58

@peter.d 2) When you evaluate (comp f1 f2 f3 f4), only f4 can take multiple arguments while the rest can only take a single argument (because of course in Clojure functions can only return one value):

(def mult-then-inc (comp str inc *))
(mult-then-inc 2 3 5)
;; => repl output: "31"
So you could have an initial step which evaluates the first function (the last in the argument vector before the reversal), and then the loop could evaluate all the other ones, without needing that (if (coll? result) ...) expression. This would also have the added effect of throwing an error if you pass no arguments to any-number-comp, which would be, if not exactly what you want, slightly better than what it’s doing now (see #1 above). Good work, overall!
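One way the suggestion above might look, as a rough sketch. peter.d's original code is not shown in this log, so the body below is an assumed reconstruction, and it uses `reduce` in place of the explicit loop mentioned above. The rightmost function is applied to all the arguments first, and each remaining function then receives a single value.

```clojure
;; Assumed reconstruction of any-number-comp, not the original code.
(defn any-number-comp
  [& fs]
  (let [[f & more] (reverse fs)]        ; f is the rightmost function
    (fn [& args]
      (reduce (fn [result g] (g result)) ; every later function takes one value
              (apply f args)             ; initial step: f may take many arguments
              more))))

(def mult-then-inc (any-number-comp str inc *))
(def example-result (mult-then-inc 2 3 5))
```

With no functions supplied, `f` is nil, so calling the returned function throws a NullPointerException instead of quietly returning its arguments.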

richardh14:08:07

Question for anyone: I want to have a sorted sequence of items in a priority queue, that I conj other items into frequently, and it needs to resort itself every time there is an insertion. What is the most efficient way to do this? Using a sorted-set, or using a vector and running sort after every insertion? Constraints: I will never have more than a few hundred items, and all of the items are guaranteed to be unique, so I don’t need set semantics for avoiding duplication.

surreal.analysis14:08:51

I believe sorted-set would be better

surreal.analysis14:08:36

sort uses Java's Arrays sorting (java.util.Arrays/sort), so in addition to being an n*log(n) operation, you’re also incurring some overhead from converting the collection to an array and back

surreal.analysis14:08:45

sorted-set is just log(n) to add
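A minimal sketch of the two options being compared, assuming the items are plain numbers (the real items in the question are not shown). The names `queue`, `queue2`, and `insert-sorted` are invented for illustration.

```clojure
;; Option 1: a sorted-set stays ordered after every conj.
(def queue (sorted-set 40 10 30))
(def queue2 (conj queue 20))            ; elements are now in order 10 20 30 40
(def next-item (first queue2))          ; smallest element
(def remaining (disj queue2 next-item)) ; "pop" the smallest element

;; Option 2: a vector that is fully re-sorted after every insertion.
(defn insert-sorted [v x]
  (vec (sort (conj v x))))

(def v2 (insert-sorted [10 30 40] 20))
```

If the items were maps ordered by a priority key, `sorted-set-by` with a custom comparator would be needed, and any two items the comparator treats as equal would collapse into one entry.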

shooodooken14:08:25

https://clojurians.slack.com/archives/beginners/p1471461591001261 https://clojurians.slack.com/archives/beginners/p1471476167001268 @dpsutton Depending on your use case, doseq may be more suitable. From https://stuartsierra.com/2015/08/25/clojure-donts-lazy-effects:
"doseq: good default choice, clearly indicates side effects"
"run!: new in Clojure 1.7, can take the place of (dorun (map ...))"
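For comparison, here is a short sketch of the three eager forms named in that quote, each performing the same kind of side effect. The atom `visited` is a hypothetical stand-in for a real effect such as printing or writing to a socket.

```clojure
;; Hypothetical side-effect target.
(def visited (atom []))

;; doseq: reads as "for each x, do this"; returns nil.
(def doseq-result
  (doseq [x [1 2 3]]
    (swap! visited conj x)))

;; run!: takes a function and a collection; returns nil.
(def run-result
  (run! #(swap! visited conj %) [4 5]))

;; The older idiom that run! can replace; dorun forces the lazy seq and returns nil.
(def dorun-result
  (dorun (map #(swap! visited conj %) [6 7])))
```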

val_waeselynck14:08:54

@surreal.analysis: n*log(n) is only worst-case, you get far better results when the input coll is almost sorted already

val_waeselynck14:08:22

@richardh: I do encourage you to benchmark

val_waeselynck14:08:32

but I would also go for sorted-set by default

surreal.analysis14:08:41

Right, but even best case is n, not log(n)

surreal.analysis14:08:51

Assuming the sorting method is still timsort

surreal.analysis14:08:22

But then again, when talking about algorithms the coefficients are normally ignored

surreal.analysis14:08:32

And if you have only 300 or so items

surreal.analysis14:08:42

The extra stuff is what matters, so I’d encourage benchmarking too

surreal.analysis14:08:53

But I’d be surprised if sort won

val_waeselynck14:08:21

yeah maybe only for very small sets

val_waeselynck14:08:33

in which case you may not care anyway

richardh14:08:29

@surreal.analysis @val_waeselynck Thank you both for your help. I will do some benchmarks.

richardh14:08:35

Not that it really matters with such small data sets, as you mention. But I’m curious.
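A rough sketch of what such a benchmark could look like using only clojure.core, assuming 300 unique integers as stand-in items. The function names are invented for illustration, and `time` gives only a coarse wall-clock reading; a dedicated library such as criterium would give more trustworthy numbers.

```clojure
;; Insert items one at a time into a sorted-set.
(defn fill-sorted-set [xs]
  (reduce conj (sorted-set) xs))

;; Insert items one at a time into a vector, re-sorting after each insertion.
(defn fill-sorted-vector [xs]
  (reduce (fn [v x] (vec (sort (conj v x)))) [] xs))

;; Assumed workload: 300 unique integers in random order.
(def items (shuffle (range 300)))

;; Repeat each fill 100 times; time prints the elapsed milliseconds.
(time (dotimes [_ 100] (fill-sorted-set items)))
(time (dotimes [_ 100] (fill-sorted-vector items)))
```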

guy14:08:11

typically what is a large data set?

surreal.analysis14:08:40

It depends on the context

richardh14:08:53

Good question, don’t know. I’m just assuming 300 is small for what I’m doing. There’s a lot of information being passed between network sockets too — I imagine that’s much more likely to be a bottleneck in my case.

surreal.analysis14:08:23

In my opinion - Large data is whenever manipulating the data itself is the bottleneck preventing the software from being better

surreal.analysis14:08:54

So if you’re working with highly performant code that needs to be heavily optimized, large data could be anything in the hundreds of elements

surreal.analysis14:08:18

If you are working with SQL, large data is in the 10M+ range

richardh14:08:53

@guy Yeah, I didn’t actually think the answer to my question mattered practically — I was just curious if there was an obvious, clear answer for all cases. Code clarity is more important than performance optimization, until it’s not.

guy15:08:19

cool cool

richardh15:08:32

@peter.d One more nitpicky style thing: the main argument list for your comp function is [& f]. I would probably make it [& fs] or something, to make clear that you’re binding a collection of functions, not a single function.

guy16:08:32

is there an official clojure coding style guide?

oahner16:08:13

@guy: not official, but it describes a lot of code that's in the wild: https://github.com/bbatsov/clojure-style-guide

peter.d19:08:25

@richardh A lot of good advice! Thank you so much for your feedback. 🙂