Fork me on GitHub
#beginners
<
2020-02-13
>
noisesmith00:02:49

yeah, order shouldn't matter for that, the existing hash function should be perfect since two equal maps hash to the same value

hiredman00:02:16

I wouldn't count on hash returning the same value between jvm runs though

noisesmith00:02:06

in my tests that works, but yeah that could be brittle

andy.fingerhut00:02:51

For mutable things that let the Java identity hash leak through, definitely can vary from JVM run to JVM run. For Clojure immutable values, their hash has remained consistent across JVM runs since Clojure 1.6.0 (it changed from 1.5.1 and earlier versions then). It could change again in a future Clojure version, even to the point of making them vary from one JVM invocation to the next, but it seems unlikely Clojure's implementers would choose that approach (Python does it, which surprised me on first encountering that behavior).

noisesmith00:02:16

randomizing hashes is a defense against certain http attacks, isn't it?

hiredman00:02:00

yes, and that is what I was thinking of, but I think the mitigation for that was actually built into HashMap and not made put of string hashing

ghadi00:02:11

assuming you use a secure hashing function, you have two choices: 1) calculate hash based on in-memory (Clojure) value 2) calculate hash based on representation of (EDN or whatever) value for 1) you can use https://github.com/arachne-framework/valuehash for 2) you can simply hash the bytes of the stuff you're storing

Alex Miller (Clojure team)00:02:05

they undid the randomized hashing in java

Alex Miller (Clojure team)00:02:11

that was a temporary fix

Alex Miller (Clojure team)00:02:53

the problem was the behavior on hash collisions in java maps was a list of matches which got worse at O(n) rate. they added the hashing stuff in later 1.7 releases iirc to avoid collisions

Alex Miller (Clojure team)00:02:25

then in the next jvm release they changed the impl to devolve to an rb-tree or something nlog n instead of linear

ghadi00:02:19

different tradeoffs there for example using 1) it is indifferent to map ordering {:a :b :c :d} vs. {:c :d :a :b}

ghadi00:02:41

neither hashing in Java nor Clojure is considered cryptographically secure

Sam Heaton01:02:28

Thanks to those who helped me out earlier, I eventually solved it by changing my data structure as suggested and writing this function to get the map minus the entry I wanted to delete

(defn map-except-element
  [id maps]
  (reduce (fn [new-maps [key val]]
            (if (not= key (keyword id))
              (assoc new-maps key val)
              new-maps)) {} maps))

deleted05:02:10

maybe this is late night brain but would dissoc work?

👍 8
Sam Heaton12:02:26

Yeah, definitely. I knew about dissoc but my mind didn’t connect the dots that this would apply once I turned the seq into a map. Way easier! Thanks.

👍 4
Ivan Koz05:02:32

(dissoc (into {} maps) (keyword id))

👍 4
pez07:02:32

So, I sense there is some pattern with the threading macros -> and ->>, that I can't explain to myself. It seems that it is either/or. Sometimes it is -> solving the whole pipeline or it is ->>. (Yes, I sometimes need as-> too, but generally speaking.) What is going on here? I am missing something really obvious, but what?

didibus07:02:13

@pez Yes, the rule is, -> is for collection functions, and ->> is for sequence functions

didibus07:02:06

All sequence functions take the sequence as the last argument, thus they are threaded with ->>, and all the collection functions take the collection as the first argument, thus they are threaded with ->

seancorfield17:02:27

@pez it's also worth noting that you can nest ->> (and as->) inside -> but you can't nested those inside ->> so if you ever need a "mixed" pipeline, start with ->:

(-> s
    (->> (map f)
         (filter p)
         (reduce rf init))
    (as-> v (do stuff to v with args))
    (process with some args))
That said, a mixed pipeline should generally be avoided since it indicates the value is "changing shape" so the above would be better broken into smaller pieces, perhaps with a let.

👍 8
pez18:02:15

Thanks! I do tend to stay away from mixing those, even if I guess it makes sense some times.

pez07:02:44

@didibus Thanks! Then this boils back to another of my questions. I don't think I have grokked what is the difference between sequences and collections. You know of somewhere were that is explained well? With that understanding I think I might realize why sequence functions and collections functions have this difference in where the argument goes.

didibus07:02:00

Hum... can't think of a great resource of the top of my head. I'd say going over the chetsheet: https://clojure.org/api/cheatsheet and reading the Collections section, and then the Sequence section will at least make you very aware of the set of functions for each

didibus08:02:03

I can try to super summarize it. Basically, a sequence is an abstraction, or maybe best to think of it as a Protocol in Clojure's terms. Where a sequence, is just any thing which you can call the functions: first, next, more, cons, count, empty, equiv and seq on.

didibus08:02:46

So it turns out, you can build a lot of functions by just using the functions that a Seq provides (which I listed above).

didibus08:02:31

So all sequence functions are those functions who only use the above listed functions for any given type.

didibus08:02:35

With those, you can build filter, map, remove, last, first, take, while, etc.

didibus08:02:32

Now, you can have many things implement the sequence abstraction. If protocols existed, sequence would probably be a Protocol in fact, but protocols were not in Clojure yet, so instead its a Java Interface (which is similar to Clojure Protocols but not quite).

didibus08:02:18

So anything that implements the ISeq interface, is a sequence, and automatically, all sequence functions can be used on it, because those functions only use those methods available on the ISeq interface.

didibus08:02:34

That's all a sequence is.

didibus08:02:47

Now, a collection, is your classic data-structure and algorithm's concept of data-structures. So a HashMap, a HashSet, a LinkedList, a DoublyLinkedList, a BinaryTree, a Queue, a Stack, etc.

didibus08:02:24

While a lot of these collections share certain methods, like you can call get on a HashMap and Vector and a HashSet, they also often have methods unique to each, like a Stack has push and pop for example

didibus08:02:37

And so most of these are concrete types, they aren't an abstraction. Their methods are not implemented in terms of another set of methods from another interface, but instead are directly manipulating the data-structure.

didibus08:02:02

But, all of this is details. In practice, you care about the cheatsheet, and understanding that Sequence are such thing that implement ISeq and sequence functions are those that can work on anything that implements ISeq. And collections are: Map, Set, Vector, List, Queue

hindol08:02:45

The practical distinction for me is seqs can be lazy, i.e. elements are generated on the fly and the count is not known beforehand. They come with a handle with care label, 🙂 Collections are meh. If you are really the adventurous type, create a bunch of infinite sequences and pretty print them.

didibus17:02:17

They can, but that's not unique to them. There are impementations of lazy collections out there as well. But ya, in practical terms, they are the goto construct in Clojure for lazyness

didibus17:02:56

Collections can do things that are beyond the scope of the sequence abstraction though. So its really all up to what you need to do

hindol18:02:57

There's nothing stopping anyone from building an infinite collection but wouldn't they break at least some of the contracts like count? ISeq is a much more lenient contract.

didibus08:02:09

Hope that helped, sorry if I went too much in detail. Going to sleep now 😄

didibus08:02:16

Oh, and I don't really know the rational for making their operand first vs last argument. My guess is that, since all sequence share the exact same set of functions, it is very nice to make all things that work on sequences and return sequences take the sequence last and combine it with ->>, which means those things are always going to work together in a thread without incompatibility issue. And then, any other function that operate on a given type would take the operand first, since those don't all naturally work with each other, they can throw out of balance a threading. So I think this is why it was nice to do something different to isolate sequences out.

solf08:02:07

Any simple/elegant way of inverting a map grouping duplicate values?

{:a :foo :b :foo} => {:foo [:a :b]}
I got something like this, but it's using map-vals which is not a core function:
(->> {:a :foo :b :foo}
          (group-by second)
          (map-vals keys))

yuhan09:02:24

I would use reduce-kv to build up the result:

(reduce-kv
  (fn [acc k v]
    (update acc v conj k))
  {} {:a :foo :b :foo})

yuhan09:02:07

note this relies on (conj nil ...) returning a list

yuhan09:02:02

(fnil conj []) if you want a vector instead

Ben Sless09:02:36

You can also massage the definition of group-by a bit:

(defn invert
  [m]
  (persistent!
   (reduce-kv
    (fn [ret k v]
      (assoc! ret v (conj (get ret v []) k)))
    (transient {})
    m)))

pez08:02:15

@didibus Oh, wow, This was a great read for me. I'll definitely read the cheat sheet carefully now with this mindset, but I think you unstuck me with reasoning about what collections are. You should write about this somewhere, where it can be googled. ClojureVerse, maybe?

Ivan Koz08:02:59

@pez also sequences fit well with varargs and "first - rest" api to be placed last in the function arguments list.

hindol08:02:45

The practical distinction for me is seqs can be lazy, i.e. elements are generated on the fly and the count is not known beforehand. They come with a handle with care label, 🙂 Collections are meh. If you are really the adventurous type, create a bunch of infinite sequences and pretty print them.

Ivan Koz09:02:33

@pez such recursive pattern leads back into lisp history where all we had was a singly linked list of cons cells(nodes) first(head) was denoted as car and rest(tail) as cdr

(f [x xs]
  (f (first xs) (rest xs)))

hindol11:02:30

Is it guaranteed that keys and vals on a map will return the keys and vals in the same order? I don't care about the actual order as long as the key position match with the corresponding val position.

hindol13:02:30

Thanks for the link! Really helpful.

andy.fingerhut15:02:10

The doc strings for keys and Vals both say the order is the same as for (seq map).

mloughlin11:02:17

I don't think so. That's a weird property for a map data structure to have unless you're using a specific sub-type of map, e.g. sorted-map

hindol12:02:18

I wasn't expecting it to be, only curious.

mloughlin12:02:18

testing it in the repl, it does look like they're the same but I'd be hesitant to claim a guarantee unless explicitly noted in the docs

mloughlin12:02:14

for some clojure datastructures the implementation preserves ordering up to an arbitrary (8?) number of entries, then goes all over the shop

leonoel12:02:43

@hindol.adhya @michael.e.loughlin the documentation for keys and vals is very explicit about it, order is guaranteed to be consistent with seq

mloughlin12:02:50

is seq consistent?

leonoel12:02:12

two successive calls to seq on the same map will return the same result

leonoel13:02:14

TBH the documentation is far less explicit about this one

mloughlin12:02:25

e.g. https://clojuredocs.org/clojure.core/keys > I noticed that the keys are not always returned in the same order. Usually they are, but not always. > Map with 8 or more keys order are unexpected.

leonoel12:02:03

same map, as per identical?

mloughlin12:02:13

ah ok, so sequential calls to keys and vals will be equivalent ordering

leonoel12:02:36

what is not guaranteed is (= m1 m2) => (= (seq m1) (seq m2))

hindol12:02:41

So, (mapcat repeat (vals m) (keys m)) is the opposite of frequencies in a way, although ordering will be messed up.

leonoel13:02:14

TBH the documentation is far less explicit about this one

chiniara17:02:21

Hello guys, are there any good beginners material other than clojure brave and true? I really don't like that way of teaching. I should also say that I'm not a beginner programmer also, I already do commercial work in Python/C# and want to expand my mind into functional programming

Michael J Dorian17:02:11

I read clojure for the brave and true with about five years of professional dev experience and some substantial hobby lisp experience, and I found it to be an effective and skim-able introduction. If the goofiness drives you nuts it's might be worth jumping in the deeper end with your first clojure book 🤷

Michael J Dorian17:02:53

90% of my read time on that book was the chapter on state, since I didn't have functional programming experience

Kamuela12:02:12

I've read through Brave and am now working through this one. So far, absolutely fantastic as a learning resource

👍 4
Kamuela01:02:14

Just wanted to add that I’m nearly finished with it and it is an amazing book

seancorfield17:02:27

@pez it's also worth noting that you can nest ->> (and as->) inside -> but you can't nested those inside ->> so if you ever need a "mixed" pipeline, start with ->:

(-> s
    (->> (map f)
         (filter p)
         (reduce rf init))
    (as-> v (do stuff to v with args))
    (process with some args))
That said, a mixed pipeline should generally be avoided since it indicates the value is "changing shape" so the above would be better broken into smaller pieces, perhaps with a let.

👍 8
chiniara17:02:42

Thanks @nate and shawn for the suggestions I'll take a look! @doby162 yeah I don't know it's not working for me, I never had experience with lisps before too so it's both my first lisp and functional lang

Michael J Dorian17:02:38

Getting clojure has better cover art anyway simple_smile

pez18:02:46

That’s awesome, @seancorfield ! Good as a reference to.

hindol19:02:05

Hi, is there a channel for requesting code reviews?

dorab19:02:15

#code-reviews

👏 4
tjb19:02:42

hey everyone! havent been too active as of late but just re-picked up clojure again since my free time has increased since the holidays 🙂 shout out to @seancorfield and his skeleton repo of how to use Component (https://github.com/seancorfield/usermanager-example). it was very insightful and really helped me understand how to set things up!

❤️ 4
kenj19:02:35

That’s an awesome template! Wish I knew about that sooner…

hindol21:02:27

How do I go about optimizing this prime factorization sieve further?

(defn prime-factorize-till
  [n]
  (let [smallest-factors (int-array (range (inc n)))
        sqrt-n           (int (Math/ceil (Math/sqrt n)))
        factors-of       (fn factors-of
                           [x]
                           (loop [x  (int x)
                                  fs []]
                             (let [y (aget smallest-factors x)]
                               (if (= x y)
                                 (conj fs y)
                                 (recur (quot x y) (conj fs y))))))]
    (doseq [i (range 2 (inc sqrt-n))
            j (range (+ i i) (inc n) i)]
      (when (= j (aget smallest-factors j))
        (aset-int smallest-factors j i)))
    (into [[] []] (map factors-of (range 2 (inc n))))))

(time
 (count
  (prime-factorize-till 10000000)))

;; => "Elapsed time: 8382.9381 msecs"

hindol04:02:43

FYI, I missed an easy optimisation. By looping over only odd numbers during the sieve, I could cut the time by almost 25%.

hindol06:02:28

Here's a faster but less readable version,

(defn prime-factorize-till
  [n]
  (let [smallest-factors (int-array (range (inc n)))
        wheel            (cycle [2 4 2 4 6 2 6 4 2 4 6 6 2 6  4  2
                                 6 4 6 8 4 2 4 2 4 8 6 4 6 2  4  6
                                 2 6 6 4 2 4 6 2 6 4 2 4 2 10 2 10])
        xf               (comp
                          (take-while #(<= % n))
                          (filter #(= % (aget smallest-factors %))))
        factors-of       (fn factors-of
                           [x]
                           (loop [x  (long x)
                                  fs []]
                             (let [y (aget smallest-factors x)]
                               (if (= x y)
                                 (conj fs y)
                                 (recur (quot x y) (conj fs y))))))]
    (doseq [i     (concat
                   [2 3 5 7]
                   (sequence xf (reductions + 11 wheel)))
            j     (range (+ i i) (inc n) i)
            :when (= j (aget smallest-factors j))]
      (aset-int smallest-factors j i))
    (let [factors (object-array (inc n))]
      (aset factors 0 [])
      (aset factors 1 [])
      (loop [i 2]
        (if (> i n)
          (vec factors)
          (do
            (aset factors i (factors-of i))
            (recur (inc i))))))))

hindol05:02:56

And here's a more readable idiomatic solution that is only 20% slower.

(defn prime-factors
  [n]
  (let [cache (object-array (repeat (inc n) []))]
    (doseq [i      (range 2 (inc n))
            :when  (empty? (aget cache i)) ; no factors till now => prime!
            j      (iterate #(* i %) i)
            :while (<= j n)
            k      (range j (inc n) j)]
      (aset cache k (conj (aget cache k) i)))
    (map-indexed vector cache)))

hindol13:02:09

The fastest I could reach, using loop/recur all the way,

(defn prime-factors
  [^long n]
  (let [cache (int-array (range (inc n)))]
    ;; Even numbers
    (loop [i 2]
      (when (<= i n)
        (aset-int cache i 2)
        (recur (+ i 2))))
    ;; Odd numbers
    (loop [i 3]
      (when (<= i n)
        (loop [j (long (+ i i))]
          (when (<= j n)
            (when (= j (aget cache j))
              (aset-int cache j i))
            (recur (+ i j))))
        (recur (+ i 2))))
    (concat [[0 []] [1 []]]
            (map (fn [n]
                   (loop [x  (long n)
                          fx []]
                     (let [y (long (aget cache x))]
                       (if (= x y)
                         [n (conj fx y)]
                         (recur (quot x y) (conj fx y))))))
                 (range 2 (inc n))))))

hindol03:02:44

Not finding aget-int in latest Clojure.

Alex Miller (Clojure team)04:02:33

sorry, not sure what I was thinking of...

hindol04:02:29

Your blog post on fast math is the best resource I could find, 🙂. So no worries.

Alex Miller (Clojure team)21:02:03

in some cases, using longs in the loop bindings will actually be better, but it's hard to predict

Alex Miller (Clojure team)21:02:34

usually this is the point where I look at the bytecode for unexpected boxing

Alex Miller (Clojure team)21:02:41

(set! *unchecked-math* :warn-on-boxed) might possibly detect that and/or help