#clojure
2021-02-14
p-himik06:02:10

@alexmiller The page at https://insideclojure.org/2015/01/02/sequences/ says: > (rest coll) - returns a logical collection of the rest of elements (not necessarily a seq). Never returns nil. However, the docstring of rest says: > Returns a possibly empty seq of the items after the first. Calls seq on its argument. Is that "not necessarily a seq" part of the quote from the article outdated?

didibus07:02:46

I don't have a REPL right now, but isn't it the case that a seq can't be empty? So I think rest can return an empty list or an empty LazySeq as well

didibus07:02:04

So I don't think it's outdated

didibus07:02:40

Though I guess an empty list is a seq, so I don't know

didibus07:02:54

It's just maybe a confusing kind of wording on either side

hiredman08:02:15

It used to be the case that empty seqs didn't exist, but that has not been the case in a long time

hiredman08:02:07

I believe Alex is trying to construct (or reconstruct) a taxonomy of concepts that most people just don't pay much attention to there

👍 3
hiredman08:02:29

I think he is drawing a distinction between a sequence and a seq

hiredman08:02:41

The next section goes into that

didibus17:02:49

Hmm, yeah, it mixes weirdly with the (seq [1 2]) idiom though

didibus17:02:06

Which made sense because you either had a seq with elements in it, or nil

didibus17:02:21

But gets confusing when a seq can be empty

hiredman17:02:49

yes, it used to be the case that (if s ... ...) was safe, but now (for years; the change was pretty close to Clojure 1.0) you need (if (seq s) ... ...)

didibus17:02:30

I guess in the taxonomy, what is the difference between s and what seq returns?

didibus17:02:55

I'm trying to see if like there's a new word for a seq that can't be empty?

hiredman18:02:43

looking at it again, I think he is distinguishing between a seq and a lazy-seq (not a seq and a sequence): a seq is never empty (it is a seq of something, or nil), a lazy-seq can be empty, and both of those are sequences

hiredman18:02:14

but that doesn't help to make sense of the comment about (rest coll)

hiredman18:02:25

and the type hint on rest says it returns an ISeq; the method call it bottoms out on (ISeq.more) also says it returns an ISeq
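
(For reference, a quick REPL check, assuming a recent Clojure such as 1.10, shows the distinction being discussed: rest never returns nil, while next and seq do.)

user=> (rest [1])   ; rest of a one-element coll: an empty seq, never nil
()
user=> (next [1])   ; next returns nil when nothing is left
nil
user=> (seq [])     ; seq on an empty collection is nil
nil
user=> (if (seq (rest [1])) :more :empty)
:empty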

Carlo14:02:51

how does one purge specs from the registry (all of them is fine) while doing a spec refactoring?

Alex Miller (Clojure team)14:02:28

the registry is just an atom holding a map, so you can dump the atom

Alex Miller (Clojure team)14:02:35

(reset! @#'clojure.spec.alpha/registry-ref {})

Alex Miller (Clojure team)14:02:17

do note that spec itself installs at least one internal spec in the registry that is essential to the operation of keys*, though, so doing that may actually break keys* specs

Alex Miller (Clojure team)15:02:30

you could do a more targeted update, or use the mechanism provided by s/def: registering a nil spec will remove it
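
(A small sketch of that removal mechanism; ::email is just an illustrative keyword, not anything from the conversation above.)

(require '[clojure.spec.alpha :as s])

(s/def ::email string?)   ; register a throwaway spec
(s/get-spec ::email)      ;=> the registered spec

(s/def ::email nil)       ; registering nil under the same keyword removes it
(s/get-spec ::email)      ;=> nil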

Nazral15:02:52

I have a loop that creates a bunch of futures that write to a set of files, which of course can create race conditions, and I am trying to figure out what would be the best way to solve this. I feel like the solution would be something like having an agent, and having the futures reset the agent with a structure like {:file-name "...", :data ...} followed by a watcher that automatically writes data to file-name, does that sound reasonable?

vemv16:02:16

is the race condition because of concurrent writes to the same file-name? or something else?

Nazral16:02:46

well, the futures might write to the same file yes

Nazral16:02:53

it's determined as they are executed

vemv16:02:51

I think your solution is reasonable yes. The overall model you seem to be following is "parallel execution, serialized writes" which can be fine. You can have either a single agent for all writes, or have one agent per filename (using some kind of pooling mechanism)

vemv16:02:18

One delicate part though is why concurrent computations can write to the same file. It seems plausible that if futures f1 and f2 are running in parallel, f1's writes will make f2's work useless (depending on the domain it may or may not matter)

Nazral16:02:26

Thank you (I think I have too many files to have one agent per file name, plus the file names vary)

vemv16:02:29

> Thank you (I think I have too many files to have one agent per file name, plus the file names vary) agents are lightweight though, they are just objects (not threads). Their threadpool is decoupled from the agents themselves
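
(A minimal sketch of the one-agent-per-filename idea under discussion, using an atom as the pool; file-agents, agent-for, and write-line! are made-up names for illustration.)

(def file-agents (atom {}))  ; filename -> agent, created on demand

(defn agent-for [fname]
  ;; atomically add an agent for fname if one doesn't exist yet
  (-> (swap! file-agents
             (fn [m] (if (contains? m fname) m (assoc m fname (agent nil)))))
      (get fname)))

(defn write-line! [fname line]
  ;; all writes to a given file go through that file's agent, so they are serialized
  (send-off (agent-for fname)
            (fn [_] (spit fname (str line "\n") :append true))))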

dpsutton18:02:16

not sure how many files you have and how big they are, but perhaps you are in the realm where you can keep all of the changes in memory and then, once all of the changes are done, update the files on disk? I.e., mutate a data structure haphazardly, then flush that. It also allows you to do some cleanup on the data structure: reordering changes, omitting changes where another change clobbers it, choosing which of the clobbering changes you actually want, etc.
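
(One way to sketch that accumulate-then-flush approach, assuming lines can be buffered in a map keyed by filename; pending, record-change!, and flush-all! are illustrative names.)

(require '[clojure.string :as str])

(def pending (atom {}))  ; filename -> vector of lines waiting to be written

(defn record-change! [fname line]
  (swap! pending update fname (fnil conj []) line))

(defn flush-all! []
  ;; write everything out in one pass, then clear the buffer
  (doseq [[fname lines] @pending]
    (spit fname (str (str/join "\n" lines) "\n") :append true))
  (reset! pending {}))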

Nazral02:02:11

@U11BV7MTK I'm crawling some data and generating ~20GB of uncompressed data per day (the watcher appends to the gzipped version of the file directly). There are about 5000 files that I am writing to

noisesmith16:02:36

> too many files to have one agent per file name > have the futures reset an agent...

noisesmith16:02:24

I think we disagree about what agents are about? They are lightweight, in Clojure terms at least; you can have a lot of them. And you send them actions - you wouldn't reset their file, you would close over the file handle and send them a function that makes them write. Serializing writes is exactly the sort of thing agents are good at (as long as you aren't too worried about waiting for writes to finish before you move forward)

noisesmith16:02:01

a quick example:

(require '[clojure.java.io :as io])

(defn writing-agent
  [fname]
  (let [handle (io/file fname)
        writer (io/writer handle)]
    (agent
     {:fresh (.createNewFile handle)
      :handle handle
      :writer writer
      :written 0})))

(defn append-to
  [the-agent the-string]
  (send the-agent
        (fn [{:keys [writer] :as m}]
          (.write writer the-string)
          (.flush writer)
          (update m :written + (count the-string)))))

(defn close
  [the-agent]
  (send the-agent
        (fn [{:keys [writer] :as m}]
          (.close writer)
          (assoc m :handle nil :writer nil))))
user=> (.exists (io/file "foo.out"))
false
user=> (def a (writing-agent "foo.out"))
#'user/a
user=> (pprint @a)
{:fresh false,
 :handle #object[java.io.File 0xff6077 "foo.out"],
 :writer
 #object[java.io.BufferedWriter 0x1280851e "java.io.BufferedWriter@1280851e"],
 :written 0}
nil
user=> (append-to a "hello")
#object[clojure.lang.Agent 0x12a160c2 {:status :ready, :val {:fresh false, :handle #object[java.io.File 0xff6077 "foo.out"], :writer #object[java.io.BufferedWriter 0x1280851e "java.io.BufferedWriter@1280851e"], :written 5}}]
user=> (.exists (io/file "foo.out"))
true
user=> (slurp "foo.out")
"hello"
user=> (append-to a ", world")
#object[clojure.lang.Agent 0x12a160c2 {:status :ready, :val {:fresh false, :handle #object[java.io.File 0xff6077 "foo.out"], :writer #object[java.io.BufferedWriter 0x1280851e "java.io.BufferedWriter@1280851e"], :written 5}}]
user=> (slurp "foo.out")
"hello, world"
user=> (close a)
#object[clojure.lang.Agent 0x12a160c2 {:status :ready, :val {:fresh false, :handle #object[java.io.File 0xff6077 "foo.out"], :writer #object[java.io.BufferedWriter 0x1280851e "java.io.BufferedWriter@1280851e"], :written 12}}]
user=> (pprint @a)
{:fresh false, :handle nil, :writer nil, :written 12}
nil

noisesmith16:02:25

I saw some weird behavior with .createNewFile btw - the docs say it creates a new, empty file only if it doesn't exist yet, but in my experiments it seems to be truncating the existing file before any writes

noisesmith16:02:44

or maybe the writer I create is doing that - I should decouple for testing

dpsutton16:02:08

i think you need {:append true} as options to the writer
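
(I.e., presumably something like this, passing :append true through to clojure.java.io/writer so an existing file is not truncated on open.)

(require '[clojure.java.io :as io])

;; without :append true, opening a writer truncates an existing file
(with-open [w (io/writer "foo.out" :append true)]
  (.write w "more text\n"))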

Nazral04:02:49

I see, ok I think I'll try to use something like that. So far I have one agent that has a watcher that calls

(defn gz-write-line
  "Append data to gzipped target"
  [target content]
  (with-open [w (-> target
                    (clojure.java.io/output-stream :append true)
                    java.util.zip.GZIPOutputStream.
                    clojure.java.io/writer)]
    (binding [*out* w]
      (println content))))
and my different futures are writing stuff like {:target "foo.txt.gz", :data "some data"} with (send my-agent (fn [old] {:target path :data (str/join "\n" data)})), which admittedly sounds a bit wrong. That being said, I am OK with opening and closing the file every time (if only because I might need to access these files at any point, read-only)
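
(Following noisesmith's earlier point about sending the write itself as the action, rather than resetting data onto the agent, a sketch could look like this; gz-writer and append-gz! are made-up names, and gz-write-line is the function above.)

(def gz-writer (agent nil))  ; one agent serializing all gzip appends

(defn append-gz! [target content]
  ;; the action closes over target/content and performs the write on the agent's thread
  (send-off gz-writer
            (fn [_]
              (gz-write-line target content)
              nil)))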