#code-reviews
2020-10-30
stopa 03:10:27

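;; Assumes ICU4J is on the classpath; java.text.BreakIterator exposes the same methods used here.
(import '(com.ibm.icu.text BreakIterator))

(defn ->char-strings [str]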
  (let [iter (doto (BreakIterator/getCharacterInstance)
               (.setText str))]
    (loop [start (.first iter)
           res []]
      (let [end (.next iter)]
        (if (= end BreakIterator/DONE)
          res
          (recur
            end
            (conj res (subs str start end))))))))
Wrote a quick function that “splits” up complex Unicode with Java’s ICU4J. Not sure if the above is the most Clojure-y solution. If you have better ideas, lmk!
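
For a sense of what the function returns, here is a minimal usage sketch. It assumes `java.text.BreakIterator` from the JDK rather than ICU4J (an assumption made only so the snippet runs without extra dependencies; the two classes share the methods used here), and it renames the argument to `s` so that `clojure.core/str` is not shadowed. The sample string is made up for illustration.

```clojure
;; Assumption: the JDK's BreakIterator stands in for ICU4J's here.
(import '(java.text BreakIterator))

(defn ->char-strings [s]
  (let [^BreakIterator iter (doto (BreakIterator/getCharacterInstance)
                              (.setText ^String s))]
    (loop [start (.first iter)
           res []]
      (let [end (.next iter)]
        (if (= end BreakIterator/DONE)
          res
          (recur end (conj res (subs s start end))))))))

;; "e" followed by U+0301 (combining acute accent) is two code points
;; but a single user-perceived character.
(def sample "e\u0301a")

(count sample)           ; 3 UTF-16 code units
(->char-strings sample)  ; two strings: "e\u0301" and "a"
```

A plain `(seq sample)` would hand back three separate chars here, which is the problem the break iterator avoids.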

Ben Sless 13:10:49

Some really minor stuff: see if you can use identical? instead of = in (= end BreakIterator/DONE), or make res transient inside the loop. But these are all optimizations.

❤️ 3
stopa 15:10:41

Thanks Ben. Noob q: what do you mean by make res transient inside the loop?

Ben Sless 15:10:31

like so:

(defn ->char-strings [str]
  (let [iter (doto (BreakIterator/getCharacterInstance)
               (.setText str))]
    (loop [start (.first iter)
           res (transient [])]
      (let [end (.next iter)]
        (if (= end BreakIterator/DONE)
          (persistent! res)
          (recur
           end
           (conj! res (subs str start end))))))))
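
For anyone else meeting transients for the first time, the same pattern in isolation might look like the sketch below. `squares-upto` is a hypothetical helper invented for illustration, not something from this thread. The idea is that the vector is mutated only locally inside the loop and is turned back into an ordinary persistent vector before it escapes.

```clojure
;; Hypothetical helper, for illustration only.
(defn squares-upto [n]
  (loop [i 0
         acc (transient [])]           ; mutable builder, local to this loop
    (if (= i n)
      (persistent! acc)                ; freeze back into a regular vector
      (recur (inc i)
             (conj! acc (* i i))))))   ; always use the value conj! returns

(squares-upto 5) ; => [0 1 4 9 16]
```

Callers never see the transient, so the function still behaves like a pure one from the outside.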

stopa 15:10:24

Aah! I didn’t know about this fn! Thanks, Ben.

👍 3
seancorfield 04:10:25

Looks reasonable to me.

❤️ 3