2020-01-06
How do you read in a large EDN file? I've got a large JSON response from a server I want to inspect with the REPL. I've converted it to EDN and put it on disk, but every time I try to load it the REPL locks up and I have to restart.
It's only 740kb
@ludvikgalois yes top level is a map
@grounded_sage 740kb isn't exactly giant though...
Which is why I'm confused as to why it keeps locking up
I'm using VScode and Calva
@grounded_sage One other alternative would be to use jet from the command line with a query
It's how I usually inspect large chunks of EDN on disk. https://github.com/borkdude/jet
This is a good temporary fix. But longer term this needs to be part of an application. I can't figure out why this is freezing my repl.
Are the contents of the EDN file publishable for others to try to reproduce? How long have you let it run? Have you monitored the JVM process to see if it is running out of memory, and tweaked the -Xmx command line option when starting the JVM to give it more memory if so?
Seems all I had to do was to use println
I was sending the data straight to the repl.
Yeah, you do not want to print out that much data in a REPL. def'ing a Var to hold the result is a good approach there.
Yea was a rookie error haha. I guess it runs out of memory because it's holding that info in the repl.
It can depend on whether the REPL is in a terminal, or an editor buffer like in Emacs or similar. The latter tend to behave less well with large output than terminals, although terminals can have trouble, too.
du -h /tmp/file.edn
896K /tmp/file.edn
And then in repl (emacs+CIDER)
(def f (read-string (slurp "/tmp/file.edn")))
(first f)
=> {:key0 0}
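A minimal sketch of a safer variant, assuming the file only contains data: clojure.edn/read never evaluates code (unlike clojure.core/read-string), and reading from a stream avoids building the whole file as one string first. The path is just an example.
(require '[clojure.edn :as edn]
         '[clojure.java.io :as io])
;; edn/read wants a PushbackReader
(def data
  (with-open [r (java.io.PushbackReader. (io/reader "/tmp/file.edn"))]
    (edn/read r)))
(first data) ;; inspect a piece instead of printing the whole thing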
How are you ‘loading the file’ @grounded_sage?
Guys, I have a nested map and I'm trying to remove one of its keys. I'm using re-frame. Which is the best way to dissoc this? I'm using dissoc but it's not passing my test yet.
dissoc works only on top-level keys. If the map is nested you need to use
(update-in db [:nested :map] dissoc :your-key)
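For example, a minimal sketch with a made-up db shape (:nested, :map and :your-key are placeholders):
(def db {:nested {:map {:your-key 1 :other 2}}})
(update-in db [:nested :map] dissoc :your-key)
;; => {:nested {:map {:other 2}}}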
Hey folks, I was wondering how to achieve something like the following in a transformation step:
Let's assume I've a collection of strings (words),
I applied some filter and map on it,
then I want to apply something like takeWhile, but the problem is I need/want to remember all the previous strings, do some operation on them (combining them into a single string), pass that down the pipeline, and continue with the rest of the strings in the same way..
ultimately I'll have a collection of strings (sentences).
It looks like I need state between the pipeline steps, in that takeWhile kind of step. Does it make sense? Any kind of thought-provoking direction will be appreciated. Thanks.
hi @UJRDALZA5, it turned out I can achieve my requirements using reduce. The only difference was that my reduce doesn't reduce the sequence to a single value, but to yet another sequence.
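Roughly the shape it took, as a sketch (the words and the sentence-splitting rule are made up for illustration):
(require '[clojure.string :as str])
(def words ["Hello" "world" "." "How" "are" "you" "."])
;; accumulate words into :current until a period, then flush a sentence
(reduce (fn [{:keys [sentences current]} w]
          (if (= w ".")
            {:sentences (conj sentences (str/join " " current)) :current []}
            {:sentences sentences :current (conj current w)}))
        {:sentences [] :current []}
        words)
;; => {:sentences ["Hello world" "How are you"], :current []}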
You mean a list/vector? I might be wrong but I don't think reduce can generate a sequence.
Yes, vector. Reduce can generate them. 🙂
Think about it this way: it can generate whatever type of value you feed it as the seed.
You want an example? Say you want to double each element in a vector. You might think of map, but reduce can do the same:
user=> (def v1 [1 2 3])
#'user/v1
user=> (reduce (fn [rt x] (conj rt (* 2 x))) [(* 2 (first v1))] (rest v1))
[2 4 6]
user=>
Notice that we pass [(* 2 (first v1))] as the seed, not a bare (* 2 (first v1)). The latter is a value, but the former is a vector holding that value, so conj keeps building a vector.
Does it make sense, @UJRDALZA5?
Sorry, I wasn't clear. I mean reduce can generate a list, vector, set, map etc. but not a (lazy) sequence. It can of course generate a vector.
Oh okay.
and sequences are lazy, right?
Yes, there are no eager sequences in Clojure, but a lazy sequence can be realized eagerly with into, reduce, count, and many more ways.
Basically, any result that needs to scan through the sequence will realize the whole sequence eagerly.
Because all sequences are lazy, you can easily represent infinite sequences in Clojure and realize only as much as is actually needed. For example, (range) returns an infinite sequence from 0 upwards, but this is valid code:
(take 100 (range)) ;; => (0 1 2 ... 99)
While evaluating (range) alone will give you headaches.
Right. Totally makes sense. Thanks @UJRDALZA5!
Thanks @rakyi and @valtteri. I'll look into those functions. :thumbsup:
I've got a request I am making with clj-http and I want to write the JSON response to disk but it is running out of memory when it makes the request. How do I stream it to disk?
Haven't tried it, but you can tell clj-http to return the response as a stream:
;; Return the body as a stream
(client/get "" {:as :stream})
;; Note that the connection to the server will NOT be closed until the
;; stream has been read
@U7S5E44DB yea that's what I was doing, but how do I turn that stream into a string? I'm sure there was a lot about IO/write etc. when I was googling earlier. Going to try and slurp it next time I'm at the computer.
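Haven't run this exact thing, but a minimal sketch for copying the stream straight to disk without ever holding it as a string (the URL and file path are placeholders):
(require '[clj-http.client :as client]
         '[clojure.java.io :as io])
;; :as :stream makes :body an InputStream; io/copy writes it to the file
(with-open [body (:body (client/get "http://example.com/big.json" {:as :stream}))]
  (io/copy body (io/file "/tmp/response.json")))
;; with-open closes the stream, which also releases the connection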
I'd like to update all the values in a map using a function and return a new map with the same keys and updated values. I am sure I have done this lots of times, but my mind has gone blank
{:a 2 :b 3.5 :c 2.5 :d 3}
I can use map with an inline function and update the val, but that only returns the sequence of values and not the updated map:
(map #(- 10 (val %)) {:a 2 :b 3.5 :c 2.5 :d 3})
I thought I could do this with an inline function. I guess I could use a for or into instead. Any suggestions?
Thank you
There's also something like: https://clojuredocs.org/clojure.core/reduce-kv
You could also simply do
(into {} (map (fn [[k v]] [k (myfn v)])) mymap)
This is because maps are also sequences of map entries, and map entries destructure as vectors/seqs of arity 2. At the same time you can conj pairs (vectors of arity two) into maps, e.g.
(conj {} [:a :b])
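Putting both suggestions together on the earlier map, as a sketch (myfn is a stand-in for whatever transformation you want):
(def mymap {:a 2 :b 3.5 :c 2.5 :d 3})
(defn myfn [v] (- 10 v))
;; into with a map transducer over the map entries
(into {} (map (fn [[k v]] [k (myfn v)])) mymap)
;; => {:a 8, :b 6.5, :c 7.5, :d 7}
;; or reduce-kv, building the result map key by key
(reduce-kv (fn [m k v] (assoc m k (myfn v))) {} mymap)
;; => {:a 8, :b 6.5, :c 7.5, :d 7}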
I want to write a small program that has a single writer thread, and multiple reader threads. Is this something that Clojure concurrency primitives are good at or should I be looking at interop?
you can always use java.util.concurrent, or channels from Clojure core.async @michael.e.loughlin
java threads + core.async. possibly managed using a threadpool would be my preferred soln
Because you put immutable data in a mutable ref, multiple concurrent readers can get and read that immutable data without impeding other reads or even stopping writes
1) make a queue or channel 2) pass that queue or channel to a producer 3) pass the same queue or channel to the consumers 4) __ 5) profit!
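A minimal core.async version of that recipe, as a sketch (buffer size and consumer count are arbitrary):
(require '[clojure.core.async :as a])
(def ch (a/chan 16))
;; producer: puts values, then closes the channel
(a/go
  (doseq [i (range 5)]
    (a/>! ch i))
  (a/close! ch))
;; consumers: loop until the channel is closed and drained
(dotimes [n 2]
  (a/go-loop []
    (when-some [v (a/<! ch)]
      (println "consumer" n "got" v)
      (recur))))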
Out of curiosity, what "thing" is it you were hoping to have a single writer, and multiple readers? If the answer is "an arbitrary immutable object, e.g. a Clojure map, or vector", then a Clojure atom enables all of that and more, as long as you are fine with readers getting snapshots of the immutable value, and having to go back and deref the atom again later if you again want the new latest value.
I'm implementing a toy database based on chapter 3 of Designing Data-Intensive Applications. It consists of a hash index that lives in an atom, but the underlying "db" is an append only text file that occasionally gets replaced
If the answer is "some arbitrary mutable thing", then there are all kinds of details about that mutable thing, e.g. thread safety, that are all part of the answer, and will often impose constraints on your code that you will need to check manually yourself as you initially develop, and probably later update, your code.
"occasionally gets replaced"? Meaning the hash index in the atom sometimes gets updated to have a new file name in it, no longer the one it had been using for a while?
If you meant "occasionally instead of appending to the file, the contents are erased to empty and we start appending from that again", that is notably different.
I guess the hash index contains offsets into the file?
when the append-only text file grows to an arbitrary size, I "compact" it by taking the most recent values of all my keys (it's a key-value DB) and dropping them into a new "main file"
It seems like as long as you know the ways to flush/sync the contents of the append operation so that it is visible to all readers, and/or to ensure that a newly created file is guaranteed to be visible to all readers, you can do those operations in the writer, and only when they are complete, update an atom containing the index, assuming the index is an immutable collection like a map.
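A minimal sketch of that shape, assuming the index is an immutable map held in an atom (the entry format is made up):
(def index (atom {}))
;; writer thread, after the append has been flushed to disk:
(swap! index assoc "some-key" {:file "db-0001.log" :offset 1024})
;; reader threads get a consistent snapshot; deref again later for a newer one:
(get @index "some-key")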
this almost sounds like a good use of an agent, since agents act as a queue of operations to perform on a state (thus will ensure that only one thread is using the i/o connection at a time)
so each write or compact would be sent to the agent, and would be free to set any metadata about the file you are abstracting after each operation
the drawback is that operations on agents are non-blocking (this might not be the right semantics)
but I'd be suspicious of using an atom, as i/o and retries don't mix
If there is a single writer, there will be no retries.
but point taken and agreed with
My "no retries" comment assumed the single writer was always from the same dedicated writer thread of the program
yeah - that's also valid, I like the use of agent here because it bakes that assumption into the behavior of the container used, but using discipline to only write from a dedicated thread also works (though you likely end up implementing much of what agent gives you for free, eg. you now need a queue to get input from other threads etc...)
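A hedged sketch of the agent version; the state here is made-up metadata about the log file, and each write is an action, so only one thread touches the file at a time:
(require '[clojure.java.io :as io])
(def db-log (agent {:file "/tmp/db.log" :size 0}))
(defn append-entry [{:keys [file] :as state} line]
  ;; runs on the agent's thread, one action at a time
  (with-open [w (io/writer file :append true)]
    (.write w (str line "\n")))
  (update state :size + (inc (count line))))
(send-off db-log append-entry "k1=v1") ;; send-off, since the action blocks on I/O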
sure. And regarding your comment that operations on agents are non-blocking: I haven't used await before, but it is an option the writer in this scenario could use to block until all pending operations have completed.
the problem there (in my experience) is that (do (send a f) (await a)) can end up waiting on some g sent from another thread
but it's true, await usually works fine
my workaround, instead of await, is something like (let [done (promise)] (send a (fn [st] (let [st' (f st)] (deliver done true) st'))) @done), but that pattern is awkward
In this single-writer, multiple-reader scenario, only the writer would be doing send calls, yes?
Interesting you mention the "waiting on g sent from another thread" thing. The doc string for await seems to imply that it wouldn't do that.
Only imply that, not promise it, I mean. An implementation that waits longer than the minimum necessary implied by the doc string is still meeting what it says, if perhaps prone to misinterpretation on how soon await returns.
waiting on g happens in a data race - because the send and the await are not atomic
await doesn't wait for a specific action, it waits on all running / pending actions
fair point - a watch avoids retry noise