#clojure
2023-02-09
Brian Beckman03:02:26

just an exclamation: I've been clojuring off and on since 2009. Today I rediscovered defprotocol and defrecord (I had used them in Prod at Amazon 10 years ago, but forgot how good they are). Removed hundreds of lines of defmultis from my code base today. Bravo, again, Clojure!

💯 44
lightsaber 2
jaihindhreddy03:02:03

Your https://www.youtube.com/watch?v=wASCH_gPnDw was one of the things that motivated me (and I'm sure a lot of others) to learn Clojure. Thanks for that, and https://www.youtube.com/watch?v=ZhuHCtR3xq8.

💯 4
JoshLemer00:02:45

What show was this taken from? Looks like a nice program

Thierry12:02:18

Quick question about clojure.java.io/reader: why is it that when I do the following, the reader gets messed up and the csv file changes order?

(let [reader    (io/reader csvfile)
      separator (re-find #"(?:\,|\;|\|)" (first (line-seq reader)))
      parsed    (csv/read-csv reader :separator (.charAt separator 0))]
  (prn (first parsed))) ; print the first line of the csv file, typically the column headers
;=> ["value" "value" "value"]
But when I do it with a second reader it doesn't (which is expected, as it creates a second reader object).
(let [reader    (io/reader csvfile)
      separator (re-find #"(?:\,|\;|\|)" (first (line-seq (io/reader csvfile))))
      parsed    (csv/read-csv reader :separator (.charAt separator 0))]
  (prn (first parsed))) ; print the first line of the csv file, typically the column headers
;=> ["col1" "col2" "col3"]
Is that because the line-seq does something to the reader object and messes up the order?

p-himik12:02:47

Readers have internal positions, they are stateful objects. If you start reading from a reader, all further calls to read will continue reading from the reached position - not from the very start.
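
A minimal sketch of that statefulness, assuming an in-memory StringReader stands in for the csv file (the sample data and the function name first-two-reads are made up for illustration):

```clojure
(require '[clojure.java.io :as io])
(import '(java.io StringReader))

;; Hypothetical in-memory data standing in for the csv file.
(def sample "col1;col2;col3\nvalue;value;value\n")

(defn first-two-reads
  "Takes the first line twice from the same reader and returns both results."
  []
  (with-open [reader (io/reader (StringReader. sample))]
    (let [a (first (line-seq reader))   ; consumes the header line
          b (first (line-seq reader))]  ; continues from the reached position
      [a b])))

(first-two-reads)
;; => ["col1;col2;col3" "value;value;value"]
```

The second call does not start over; it picks up where the first one stopped, which is why the header appears to go missing.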

Thierry12:02:08

okay, that clears that up. didn't know that

Thierry12:02:52

Is there a way to keep the internal position to overcome having to use 2 reader objects?

p-himik12:02:34

The reader should have a .reset method but I've never used it myself and have no clue whether there are instances when it cannot help (assuming it can help at all).

🙌 2
p-himik12:02:47

BTW note that your way to detect the separator won't work if e.g. the separator is ; but some value has , inside. Or if the separator is \tab.

Thierry12:02:43

that's correct, but I made the csv file by hand, so it will work for now. it does indeed need a better way to detect the separator

p-himik12:02:14

Ideally, there should be no need for detection at all. :) IMO if you control the format, it should be fixed.

Thierry12:02:00

indeed 😉

Thierry12:02:42

the .reset method works if you .mark it first

👍 2
delaguardo12:02:29

.reset behaviour depends on reader implementation. In some cases, even if you use .mark , the reader's internal state might be incomplete. To avoid confusion when a change of csvfile value makes .reset unusable, you could rely on interop:

(let [reader    (io/reader csvfile)
      separator (re-find #"(?:\,|\;|\|)" (.readLine reader)) ; read first line
      parsed    (csv/read-csv reader :separator (.charAt separator 0))]
  (prn (first parsed)))

p-himik12:02:24

That has the exact same problem though - the call to prn will receive the second line, not the first one.

p-himik12:02:42

line-seq does not chunk, if that's what you meant. So using interop won't change anything.

Thierry12:02:09

I fixed it like so:

(let [reader    (io/reader csvfile)
      _         (.mark reader 100)  ; remember the current position
      separator (re-find #"(?:\,|\;|\|)" (first (line-seq reader)))
      _         (.reset reader)     ; rewind to the marked position
      parsed    (clojure.data.csv/read-csv reader :separator (.charAt separator 0))]
  (prn (first parsed)))
In the final version I just removed the sep finding

Thierry12:02:16

the number didn't matter

Thierry12:02:20

tried 0, 1 and 100
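
A rough sketch of the mark/reset approach in isolation, assuming a BufferedReader (which is what io/reader returns). The helper name peek-first-line and the limit of 8192 are illustrative choices, not from the thread. The Java documentation describes the argument to .mark as the number of characters that may be read while still preserving the mark, so a limit at least as long as the first line is the safer choice even if smaller values happen to work:

```clojure
(require '[clojure.java.io :as io])
(import '(java.io BufferedReader StringReader))

(defn peek-first-line
  "Returns the first unread line of reader, then rewinds to where it started.
  read-ahead-limit should be at least the length of that line."
  [^BufferedReader reader read-ahead-limit]
  (.mark reader read-ahead-limit)
  (let [line (.readLine reader)]
    (.reset reader)
    line))

(defn mark-reset-demo []
  (with-open [reader (io/reader (StringReader. "col1;col2\nv1;v2\n"))]
    ;; After peeking, the next read sees the header line again.
    [(peek-first-line reader 8192) (.readLine reader)]))

(mark-reset-demo)
;; => ["col1;col2" "col1;col2"]
```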

p-himik13:02:25

Another, more predictable, alternative is to wrap the reader with PushbackReader and then push the first read line back into it.
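
A sketch of the PushbackReader idea, under the assumption that the first line (plus its newline) fits in the pushback buffer. The helper peek-line and the buffer size of 1024 are hypothetical:

```clojure
(require '[clojure.string :as str])
(import '(java.io PushbackReader StringReader))

(defn peek-line
  "Reads one line from rdr, pushes every consumed character back, and
  returns the line without its trailing newline."
  [^PushbackReader rdr]
  (let [sb (StringBuilder.)]
    ;; Read character by character up to and including the first newline.
    (loop []
      (let [c (.read rdr)]
        (when-not (neg? c)
          (.append sb (char c))
          (when-not (= c (int \newline))
            (recur)))))
    (let [consumed (str sb)]
      ;; Push back exactly what was read, so the reader looks untouched.
      (.unread rdr (.toCharArray consumed))
      (str/trim-newline consumed))))

(defn pushback-demo []
  (with-open [rdr (PushbackReader. (StringReader. "col1;col2\nv1;v2\n") 1024)]
    [(peek-line rdr) (slurp rdr)]))

(pushback-demo)
;; => ["col1;col2" "col1;col2\nv1;v2\n"]
```

After peek-line returns, the same reader could be handed to a CSV parser and it would still see the header row.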

p-himik13:02:57

Yet another alternative is to call read-csv twice - once for the header and the second time on the "remaining" reader, then combine the results.

delaguardo13:02:49

(let [reader    (io/reader csvfile)
      first-line (.readLine reader)
      separator (subs (re-find #"(?:\,|\;|\|)" first-line) 0 1)
      parsed    (cons (string/split first-line separator) (csv/read-csv reader :separator (first separator)))]
  (prn (first parsed)))
I posted an example to illustrate the idea. Here is a full solution.

p-himik13:02:46

That's not correct. string/split will fail because separator is not a regex. Using (re-pattern separator) will fail when the separator is |. string/split will return incorrect result when the column names are quoted. Just... don't use string/split for CSV. Use csv/read-csv, even if it's 2 calls instead of 1.

p-himik13:02:57

Also, a bit beside the point - it makes more sense to store the separator as a character from the get go, instead of using subs only to then use first on it.

delaguardo13:02:06

Thanks for the feedback, I didn't have a REPL open to test it. Still, those are manageable problems compared with creating two readers or using an implementation-dependent reset method.

Ed17:02:40

I would prefer creating 2 readers. They're pretty lightweight and lazily read the data. Also, you need to close the reader when you're done with it.

Thierry09:02:37

I've been trying to wrap my head around the with-open part so the reader is closed afterwards. For some reason Stream closed is thrown at the end of my function and this is after the file is parsed and handled. I tried with #dbg to see what happens and I can see it doing the things I want it to but at the end throws stream closed. Why is that?

Thierry10:02:33

The function:

(defn read-matrix-file
  [csvfile]
  (with-open [reader    (io/reader csvfile)]  ;; create a buffered reader from the path
    (let [parsed    (csv/read-csv reader :separator \;)  ;; read the csv file with separator
          header    (first parsed)  ;; get the header
          rows      (rest parsed)  ;; get the rows
          keymap    (fn [map]  ;; accepts a map with strings as k/v
                      (reduce  ;; reduce the provided map
                       (fn [newmap [k v]]  ;; run over the new map and key value pairs
                         (if (.contains k "groep")  ;; if a key contains
                           (update newmap :groups conj v)  ;; combine them in a new key/value pair
                           (assoc newmap (keyword k) v)))  ;; otherwise return the key/value pair
                       {}  ;; the new map
                       map))]  ;; the provided map
      (keep
       (fn [row]
         (->> (zipmap header row)
              (keymap)))
       rows))))

Thierry10:02:15

without the with-open and with the reader inside the let it works just fine, but the reader doesn't get closed, which is expected but unwanted

Thierry10:02:26

It's something to do with the keep function, when I disable that I can get content.

Thierry10:02:10

But it does work when I put #dbg in front of the keep, I can see it doing what it must do

Thierry10:02:29

and even returns the data at the end

Thierry10:02:01

Okay, huh? So it works with mapv but not with map or keep. It works with filterv but of course does not output what it should, and does not work with filter

Thierry10:02:53

And it works if i wrap the keep with (into [] (keep ...

Thierry10:02:04

Why does Clojure do this?

Thierry10:02:22

And if I change the keep into a reduce it works as well

(reduce
              (fn [v row]
                (conj v (->> (zipmap header row)
                             (keymap))))
              []
              rows)

Thierry10:02:02

Even this works into (lazy-seq) while without it, it also returns a lazy sequence

Thierry10:02:23

wait, I forgot I need to use doall when there are side-effects

Thierry10:02:30

:man-facepalming:

Ed12:02:09

Yeah. Mixing side effects and laziness is not recommended ;) ... This is one of the great things about transducers, laziness is decided at time of application rather than construction.
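
A small sketch of the failure mode discussed above, using an in-memory StringReader and plain string splitting instead of a CSV library. The names lazy-rows, eager-rows and realize-or-error are made up for the example:

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])
(import '(java.io StringReader IOException))

(def sample "a;b\n1;2\n3;4\n")

(defn lazy-rows
  "Returns a lazy seq that escapes with-open before its rows are read."
  []
  (with-open [reader (io/reader (StringReader. sample))]
    (map #(str/split % #";") (line-seq reader))))

(defn eager-rows
  "Realizes every row with doall while the reader is still open."
  []
  (with-open [reader (io/reader (StringReader. sample))]
    (doall (map #(str/split % #";") (line-seq reader)))))

(defn realize-or-error
  "Forces the rows and reports :stream-closed if reading fails."
  [rows-fn]
  (try
    (vec (rows-fn))
    (catch IOException _
      :stream-closed)))

(realize-or-error lazy-rows)
;; => :stream-closed
(realize-or-error eager-rows)
;; => [["a" "b"] ["1" "2"] ["3" "4"]]
```

mapv, filterv, reduce and into all behave like the doall version because they consume the whole input before with-open closes the reader.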

Thierry14:02:48

Unless you use doall right?

Ed14:02:29

Like all the occurrences of the word do in Clojure, doall should be considered a warning of "here be side effects" ... It marks a boundary between your functional world where laziness is fine and your side-effecty world where laziness breaks.

Ed14:02:14

transducers don't have this problem because the way you get laziness is by applying the transducer in a lazy context with sequence
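
To illustrate that point with a toy transformation (the xform below is an arbitrary example, not from the thread): the same transducer can be applied eagerly or lazily, and the choice is made by the function that applies it.

```clojure
;; One transformation, defined once, with no laziness decision baked in.
(def xform (comp (map inc) (filter even?)))

;; Eager: into realizes everything and returns a vector.
(def eager-result (into [] xform (range 10)))
;; => [2 4 6 8 10]

;; Lazy: sequence returns a lazy seq, so it can even consume an infinite input.
(def lazy-result (take 3 (sequence xform (range))))
;; => (2 4 6)

;; Eager reduction to a single value.
(def total (transduce xform + (range 10)))
;; => 30
```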

Ed15:02:00

(defn group-keys [m]
  (reduce-kv (fn [newmap k v]
               (if (.contains k "groep")
                 (update newmap :groups conj v)
                 (assoc newmap (keyword k) v)))
             {} m))

(defn process-csv-as-maps [xform rdr]
  (with-open [reader (io/reader rdr)]
    (let [[header & rows] (csv/read-csv reader)]
      (->> rows
           (into [] (comp (map (partial zipmap header))
                          xform))))))

(process-csv-as-maps (map group-keys) (java.io.StringReader. "1,2,3\na,b,c\nd,e,f"))
for example, you can separate out the side-effecty processing of turning a reader into a vector of processed records, passing in something to filter/map every record and perform some transformations ... if that makes sense?

Thierry15:02:14

Yes it does

Ed15:02:20

and if you were still trying to detect the separator, I would probably just create that as a separate function

Thierry15:02:59

Nah, I solved that

Thierry15:02:08

Added as optional binding to the defn

👍 2
Thierry15:02:17

with a default if not supplied

CarnunMP14:02:25

Unexpected destructuring behaviour:

(let [{foo :foo, :or {foo 1}, :as bar} {}]
    [foo bar])
;; => [1 {}], not [1 {:foo 1}]
Is this a bug? :deep_thinking:

jpmonettas14:02:37

bar is bound to the entire map, which is {}; only foo is bound to 1 when :foo is missing, so it is the correct behavior

CarnunMP14:02:22

Hmm, right. Still seems 'wrong' though! At least from the standpoint of ':or defines default values'.

Amit Gold14:02:16

:or works per-key, not on the whole value. for example

(let [{foo :foo bar :bar, :or {foo 1}} {}]
    [foo bar])
;; => [1 nil]

Amit Gold14:02:54

if your mental model is “:or means it replaces the input if the key is missing” then it’s wrong. mental model should be “for each key that is missing, take the value from the :or part”
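
A quick sketch of that per-key model (the function name foo-and-bar is made up). It also shows that the default applies only when the key is missing, not when it is present with a nil value:

```clojure
(defn foo-and-bar
  "Destructures m the same way as the example in the question."
  [m]
  (let [{foo :foo, :or {foo 1}, :as bar} m]
    [foo bar]))

(foo-and-bar {})          ;; => [1 {}]            key missing: default used
(foo-and-bar {:foo nil})  ;; => [nil {:foo nil}]  key present: default ignored
(foo-and-bar {:foo 2})    ;; => [2 {:foo 2}]
```

In every case bar is the map exactly as it was passed in.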

CarnunMP14:02:52

That's helpful, thanks. :))

CarnunMP14:02:44

I still think there's an argument for the behaviour I expected. But I suppose it would complect :or and :as, heh.

Amit Gold14:02:14

yeah i guess it’s a question of which one is happening “first”

jpmonettas14:02:12

I think it is a normal thing to expect, but also the other way around, so :man-shrugging:

delaguardo14:02:17

map argument is immutable, you should not expect any instruction to mutate it. including the combination of :as and :or

CarnunMP14:02:22

> I still think there's an argument for the behaviour I expected...
E.g. think of the case where the RHS is a variable supplied by argument, or the destructuring is happening in a function's arg vector. In that case it would be especially nice to think of :or as (really) supplying default values to bar, as opposed to having to explicitly use foo everywhere it's needed.

jpmonettas14:02:12

it is useful in some cases for sure, but then you don't have a way of having a reference to the original value when using :or

☝️ 2
CarnunMP14:02:18

Hmm no, I guess not. It would be original value + :or defaults...

CarnunMP14:02:33

merged, that is

CarnunMP14:02:55

right to left, lol

jpmonettas14:02:44

which will break the expectation of people wanting to always have :as pointing to the original input, which is also useful, so, choices

☝️ 2
CarnunMP14:02:03

flip a coin?

jpmonettas14:02:48

kind of late

CarnunMP14:02:02

best of three?

🪙 2
Alex Miller (Clojure team)15:02:50

it is helpful to keep in mind that destructuring is at heart about binding locals, not a machine to apply transformation or integration of data (you have lots of Clojure functions for that in your code). :or supplies defaults when you bind

🙏 2
Alex Miller (Clojure team)15:02:05

if you want to develop a better intuition about this, destructuring is backed by an undocumented function clojure.core/destructure and you can apply that yourself to see what it's being translated to. As this is a function called by a macro, it takes the quoted binding vector as input (manually reformatted this to make it look like normal code):

user=> (pprint (destructure '[{foo :foo, :or {foo 1}, :as bar} {}]))
[map__6 {}
 map__6 (if (seq? map__6)
          (if (next map__6)
            (PersistentArrayMap/createAsIfByAssoc (to-array map__6))
            (if (seq map__6)
              (first map__6)
              PersistentArrayMap/EMPTY))
        map__6)
 bar map__6
 foo (get map__6 :foo 1)]

👌 2
pavlosmelissinos15:02:32

> destructuring is at heart about binding locals, not a machine to apply transformation or integration of data
💯, however I too have been baffled by this in the past! I wonder why so many Clojure beginners expect :or to "pollute" :as (myself included, I can't really explain it).

Alex Miller (Clojure team)15:02:36

note that the default 1 just shows up at the end as the default when foo is bound

Alex Miller (Clojure team)15:02:09

clearly this is a good topic to add to the clojure puzzlers book I'm working on! :)

😲 4
❤️ 6
jpmonettas15:02:08

> I wonder why so many Clojure beginners expect :or to "pollute" :as (myself included, I can't really explain it)
I think because it is not a crazy expectation that the :or will shadow the map values, even in the :as binding, since when you use it in a function parameter most of the time you just want that

👍 2
pavlosmelissinos15:02:12

Right, it's not crazy but if it worked like that you'd lose the original map, which is worse because you can't get it back. I'd much rather get rid of :or than have an :as that maybe contains the original map (if I had to choose). Performance aside, (merge {:foo 1} bar) is quite clean if you need that behaviour imho.

💯 2
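
A minimal sketch of the merge alternative mentioned above, for the case where the defaults really should show up in the :as binding. The names defaults and foo-and-merged are illustrative:

```clojure
(def defaults {:foo 1})

(defn foo-and-merged
  "Merges defaults under m first, then destructures the merged map."
  [m]
  (let [{:keys [foo] :as bar} (merge defaults m)]
    [foo bar]))

(foo-and-merged {})              ;; => [1 {:foo 1}]
(foo-and-merged {:foo 2 :x 3})   ;; => [2 {:foo 2, :x 3}]
```

The original argument m is still in scope, so nothing is lost by doing the merge explicitly.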
CarnunMP17:02:07

> clearly this is a good topic to add to the clojure puzzlers book I'm working on! 🙂
Glad I asked @U064X3EF3! :))

travis15:02:40

hey fambly - mapped my shift keys to parens and getting back into clojure, in part because I started using logseq (https://github.com/logseq/logseq/). there have been a bunch of posts about it in various channels, but no dedicated channel to talk about it so I made one at #logseq

🙌 4
p-himik15:02:51

FWIW they have a dedicated community on Discord.

❤️ 2
travis15:02:54

ah didn't know that, thanks! I imagine the channel here will be low traffic, but in case others dislike discord as much as I do maybe it will be a good space to chat about the clojure-specific elements of it

cldwalker21:02:34

Thanks for kicking this off! I'd been meaning to but also didn't know if there was interest. There aren't too many clj(s) hackers in discord so worth a shot

❤️ 2
borkdude15:02:03

Can someone tell me what's going on here?

user=> (= (edn/read-string "\"123\\n456\"") (edn/read-string "\"123\n456\""))
true

Alex Miller (Clojure team)15:02:12

the first string has a literal "\n" and the second has a \n newline character

Alex Miller (Clojure team)15:02:03

and they both read as a string with a literal \n

Alex Miller (Clojure team)15:02:46

I'm just talking through it. did that actually help?

borkdude15:02:55

I think so. The first, literal newline, is read as a newline, but the non-literal newline is really more like:

(edn/read-string "123
456")
right?

borkdude15:02:15

like this:

user=> (edn/read-string "\"123
456\"")
"123\n456"

Alex Miller (Clojure team)15:02:40

might help to println those strings

borkdude15:02:29

that seems to correspond with what I said last

Alex Miller (Clojure team)15:02:03

the first is a 10 character string with the characters \ and n in it, the second is a 9 character string with a \n char in it

Alex Miller (Clojure team)15:02:49

but the reader is turning \ n into the \n char ?

borkdude15:02:14

seems so yes:

user=> (count (edn/read-string "\"123\\n456\"") )
7
user=> (count (edn/read-string "\"123\n456\"") )
7
user=> (println (edn/read-string "\"123\n456\""))
123
456
nil
user=> (println (edn/read-string "\"123\\n456\""))
123
456
nil
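
The same observation as a small sketch that counts characters before and after reading (the var names escaped and raw are made up):

```clojure
(require '[clojure.edn :as edn])

;; EDN text containing a backslash followed by the letter n.
(def escaped "\"123\\n456\"")
;; EDN text containing an actual newline character.
(def raw "\"123\n456\"")

(count escaped)  ;; => 10
(count raw)      ;; => 9

;; Inside a quoted string the reader turns the two-character escape into a
;; newline and keeps a raw newline as it is, so both read to the same value.
(count (edn/read-string escaped))                     ;; => 7
(= (edn/read-string escaped) (edn/read-string raw))   ;; => true
```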

Alex Miller (Clojure team)15:02:26

I mean, pretty obvious in the code

borkdude15:02:27

Maybe this is a good one for the brain teaser book ;)

mpenet15:02:28

only for quoted strings tho -> (= (read-string "\\n") (read-string "\n")) -> false

mpenet15:02:35

(= (read-string "\"\\n\"") (read-string "\"\n\"")) -> true

Alex Miller (Clojure team)15:02:00

well then you're in whitespace, not in Clojure strings

mpenet15:02:25

yeah the \n is just skipped

mpenet15:02:32

in the false example

mpenet15:02:22

the actual code is in ednreader btw, but it's very similar to lispreader

Alex Miller (Clojure team)15:02:03

ednreader is a copy of lispreader so they are in general in sync (but with some features removed)

Benjamin18:02:43

I wonder at what level of application it becomes appropriate to "take in" the program config from a global var. Currently I would pass a config from the main entry point down everywhere as argument

p-himik18:02:32

I'd say with size, it becomes less and less appropriate.

Benjamin18:02:20

Ah you mean in a small project you get away with implicit input and in a big project it should be explicit always?

p-himik18:02:26

Yep.

👍 2
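
A toy sketch of the two styles being compared. The config map and the greet functions are invented for illustration and do not come from any real project:

```clojure
;; Implicit input: the function reaches for a global var.
(def config {:greeting "hello"})

(defn greet-implicit [who]
  (str (:greeting config) ", " who))

;; Explicit input: the config is an argument, so the dependency is visible
;; at every call site and tests can pass any map they like.
(defn greet-explicit [cfg who]
  (str (:greeting cfg) ", " who))

(greet-implicit "world")                      ;; => "hello, world"
(greet-explicit {:greeting "hi"} "world")     ;; => "hi, world"
```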
Ben Sless18:02:20

organize your config like your modules and merge them https://github.com/juxt/aero#using-aero-with-components

Benjamin19:02:02

yea I would pretty much usually do 1 big config (using e.g aero)

Ben Sless19:02:58

Another thing I found works well with that is more components. smaller components. It makes sense in the end

anovick19:02:03

I was watching a talk by Rich Hickey about Reducers (from QCon NY 2012): https://www.youtube.com/watch?v=IjB-IOwGrGE Back then he called this "Reducers library", is it the same thing he later called "Transducers"?

hiredman19:02:52

no, they have some similarities, in some ways the reducer stuff can be seen as a precursor to transducers, but the implementation is different, and the reducers library has some stuff that hasn't been re-implemented for transducers (the parallel fold stuff)

hiredman19:02:44

the reducer api usage is very familiar: clojure.core/map takes a seq and returns a seq, clojure.core.reducers/map takes a reducible collection and returns a reducible collection

❤️ 2
hiredman19:02:09

the transducer api is kind of a radical departure from that, instead you build up a transformation that is applied to the function you use to reduce the reducible collection. reducers sort of do this internally, incrementally, clojure.core.reducers/map returns a reducible collection that when reduced transforms the reducing function by combining it with the function passed to map, then uses the transformed function to reduce the collection passed to map

😕 2
hiredman19:02:48

transducers build the transform in one big step up front and don't make intermediate "wrappers" that carry the transform and the original collection (although you can do that with eductions)
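
A side-by-side sketch of the APIs described above, using a toy sum over a vector (the var names are made up). All four compute the same value; they differ in where the transformation lives:

```clojure
(require '[clojure.core.reducers :as r])

(def nums (vec (range 1 1001)))

;; Reducers: r/map wraps the collection in a reducible that carries the transform.
(def via-reducers (reduce + 0 (r/map inc nums)))

;; Reducers only: fold may reduce partitions of a vector in parallel.
(def via-fold (r/fold + (r/map inc nums)))

;; Transducers: the transform is built separately and applied to the
;; reducing function when transduce runs.
(def via-transducer (transduce (map inc) + 0 nums))

;; eduction pairs a transducer with a collection, closer to the reducers style.
(def via-eduction (reduce + 0 (eduction (map inc) nums)))

[via-reducers via-fold via-transducer via-eduction]
;; => [501500 501500 501500 501500]
```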

Benjamin07:02:10

another way to put it is to say the transducer is the recipe of the transformation and a transducible context is transforming according to the recipe