This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-03-16
Channels
- # autochrome-github (17)
- # aws (6)
- # babashka (19)
- # beginners (42)
- # bristol-clojurians (1)
- # calva (1)
- # cider (7)
- # clara (1)
- # clj-kondo (6)
- # cljdoc (17)
- # cljs-dev (5)
- # clojars (23)
- # clojure (93)
- # clojure-europe (20)
- # clojure-italy (28)
- # clojure-nl (13)
- # clojure-sanfrancisco (1)
- # clojure-uk (50)
- # clojuredesign-podcast (5)
- # clojurescript (90)
- # core-async (8)
- # datomic (23)
- # duct (3)
- # emacs (10)
- # figwheel-main (1)
- # fulcro (1)
- # malli (1)
- # meander (22)
- # off-topic (12)
- # pathom (57)
- # reitit (4)
- # remote-jobs (5)
- # shadow-cljs (5)
- # sql (8)
- # tools-deps (3)
@i according to the source, it uses java.util.Arrays/sort after converting the coll to an array with to-array
So whatever that guarantees...
I have a function that takes an optional extra arg. What is more idiomatic Clojure:
(defn foo [req & [opt]])
(defn foo ([req] ... ) ([req opt] ... ))
?
https://stuartsierra.com/2015/06/01/clojure-donts-optional-arguments-with-varargs
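For reference, a minimal sketch of the multi-arity style that post recommends (the :default-opt value is a made-up placeholder):
(defn foo
  ([req] (foo req :default-opt)) ; one-arg arity supplies the default and delegates
  ([req opt]
   {:req req :opt opt}))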
Can I update deps without restarting my repl with tools.deps?
no, you need something like pomegranate https://github.com/clj-commons/pomegranate
I'm not sure how helpful this is, but there's an unreleased feature in tools.deps called add-lib, which you can read about here: https://insideclojure.org/2018/05/04/add-lib/
GitHub branch: https://github.com/clojure/tools.deps.alpha/tree/add-lib
I use the add-lib branch of t.d.a. locally so I can add new dependencies easily without restarting my REPL. You can see how I do this in next.jdbc, with a "Rich Comment Form" containing code to bring the test dependencies in from that repo when I'm working on other projects: https://github.com/seancorfield/next-jdbc/blob/master/test/next/jdbc/test_fixtures.clj#L133-L155
My dot clojure file has an alias for this, and a comment showing how to use it to add a git-based dependency (pulling in master of any project): https://github.com/seancorfield/dot-clojure/blob/master/deps.edn#L147-L160 ^ @grounded_sage
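Based on that blog post, usage on the add-lib branch looks something like this (unreleased API, subject to change; the lib and version here are just examples):
(require '[clojure.tools.deps.alpha.repl :refer [add-lib]])
(add-lib 'org.clojure/data.csv {:mvn/version "0.1.4"}) ; same coordinate map shape as deps.edn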
Friends, how do I call a function that is inside of a defrecord?
(defrecord ActiveMQConnection [url]
c/Lifecycle
(start [this]
(log/info "Creating ActiveMQ connection to" url)
(assoc this :connection
(doto (amq/connect {:url url})
(jms/start!))))
(stop [this]
(when-let [c (:connection this)]
(jms/disconnect! c))
(dissoc this :connection)))
I would like to call this start function
the biggest gotcha in my experience is people expect start to be namespaced to the namespace implementing the record, when it's actually namespaced to wherever the protocol was created
that is the thing: people tend to think of, and it is common to talk about, the functions that are part of protocols as methods
I usually implement my protocols as -start or some other private-looking thing, and then a start function in the namespace that makes sense (like core etc.)
btw, you can call protocol methods directly on record instances implementing them: (.start (->ActiveMQConnection "some-url"))
(not a good idea for Components though, as that doesn't do the dependency injection part)
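To make that concrete, a sketch assuming c is com.stuartsierra.component (which the Lifecycle/start/stop shape suggests) and a placeholder URL:
(require '[com.stuartsierra.component :as c])
(def conn (->ActiveMQConnection "tcp://localhost:61616"))
(c/start conn) ; call through the namespace where the protocol is defined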
It worked, my problem now is a different one. Thank you everyone
How do I write directly to disk from a BufferedInputStream?
user=> (def uid (java.util.UUID/randomUUID))
#'user/uid
user=> uid
#uuid "b8a1753f-cbac-40bf-9b9d-cc9cd5f018c9"
user=> (defn foo-is [] (io/input-stream (java.io.ByteArrayInputStream. (.getBytes (pr-str uid) "UTF-8"))))
#'user/foo-is
user=> (type (foo-is))
java.io.BufferedInputStream
user=> (with-open [o (io/output-stream (io/file "foo-31"))] (io/copy (foo-is) o))
nil
user=> (slurp "foo-31")
"#uuid \"b8a1753f-cbac-40bf-9b9d-cc9cd5f018c9\""
oh - and io/copy is happy to take a File directly, so explicitly calling output-stream was redundant
so this suffices (io/copy promises to close things it opens)
user=> (io/copy (foo-is) (io/file "foo-31"))
nil
Oh so I don’t need with-open
nice!
The main thing to be careful of with an InputStream is to ensure that io/copy pairs it with an OutputStream rather than a Writer. io/copy might do that under the hood already; I guess it wouldn't make much sense if it tried to create a Writer instead of an OutputStream in this case.
@andy.fingerhut good point - I think this is the multimethod dispatch that gets hit, and it would do the right thing https://github.com/clojure/clojure/blob/master/src/clj/clojure/java/io.clj#L319
the inputstream/file combo makes an outputstream out of the file and dispatches to inputstream/outputstream inside with-open
and of course the outputstream implementation is used directly with no danger of a Writer wrapping it
I ran out of memory doing the copy
That seems a bit odd, unless the InputStream was somehow allocating memory as you were reading from it. What is the source of data of the InputStream, or its specific type?
The large file I am copying is fine on its own. But when I do all the files there seems to be some memory leak
Large CSVs from S3
there's an optional arg to io/copy that lets you specify a buffer size
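for reference, that option looks like this (the 1 MB size here is arbitrary):
(io/copy (foo-is) (io/file "foo-31") :buffer-size (* 1024 1024))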
That buffer size is more likely to be an issue if you were running out of memory for a single copy, not across many files. How many files would you estimate are involved?
11 files.
my guess is you are using the cognitect one, which if I recall pulls s3 blobs entirely into memory when downloading
But after each one it would close and release the memory?
oh yeah, you might want to make sure you are reusing a single instance of the s3 client (even if working concurrently), and consuming the stream directly instead of a wrapper that puts everything in an array or string
Yes cognitect one
I do not know whether io/copy closes its inputs when it completes -- maybe not. Explicitly closing them yourself may help, but that seems difficult to imagine causing a memory problem with only 11 files.
io/copy only closes things it itself opened, so yes this is a concern
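so if you want the input closed deterministically, wrap it yourself, e.g.:
(with-open [in (foo-is)] ; io/copy won't close a stream it didn't open
  (io/copy in (io/file "foo-31")))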
(defn S3-CSV->local-disk
[csv-file]
(let [folder (str "partner-data/temp/" (str/upper-case (:partner-id @config)))]
(if (fs/directory? folder)
(io/copy (get-csv csv-file) (io/file (str folder "/" csv-file)))
(do (fs/mkdirs folder)
(S3-CSV->local-disk csv-file)))))
small aside about constructing file objects from parts of a path:
user=> (io/file "foo" "bar/baz.txt")
#object[java.io.File 0x6e0cff20 "foo/bar/baz.txt"]
user=> (io/file "foo/bar" "baz.txt")
#object[java.io.File 0x191a709b "foo/bar/baz.txt"]
which is to say, you don't need that str call
also you could skip the if and self-call, and call mkdirs directly and unconditionally (it's a no-op if the dirs already exist)
(that concatenation also works when appending a file with a string btw, so both str calls can be eliminated)
Not quite sure what you mean by eliminating both str calls
from (io/file (str folder "/" csv-file))
to
(io/file folder csv-file)
from (str "partner-data/temp/" (str/upper-case (:partner-id @config)))
to (io/file "partner-data/temp" (str/upper-case (:partner-id @config)))
and you don't need fs anymore - you can just unconditionally call (.mkdirs folder)
it's not directly related to your question of course, just a driveby bikeshed
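putting those suggestions together, the function might look something like this (a sketch; get-csv and config are from the original snippet):
(defn S3-CSV->local-disk
  [csv-file]
  (let [folder (io/file "partner-data/temp" (str/upper-case (:partner-id @config)))]
    (.mkdirs folder) ; no-op if the directories already exist
    (io/copy (get-csv csv-file) (io/file folder csv-file))))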
the s3 client you are using holds the entire contents of the file in memory before returning your input stream
Hmm that is a shame
I don't recall if the s3 api itself exposes ranged gets, or if you have to generate a signed url and do a range get against that
Sorry could you please point me to an example or where in the source I can find this?
Figured it out :)
the cognitect.http-client is not fit for all aws-api purposes, and we're slowly trying to extricate it from aws-api
This would change my io/copy though right?
I don’t even know where to start here haha. All this IO stuff trips me up
Yes it is. Was trying to avoid it for a first pass
the simplest thing then is to go back to the io/copy version with the with-open around it for the outputstream, but put a loop inside the with-open that downloads each part and copies it into the outputstream
Ok I think I follow
And perhaps bump up your JVM's max heap size, if that is needed for one single large file...
(defn parts
"given the total byte size and desired chunk size,
return [start end] bounds for each chunk"
[total chunk]
  (let [step (fn step
               [i]
               (when (< i total)
                 (cons [i (min (+ i chunk) total)]
                       (step (+ i chunk)))))]
(step 0)))
if you need to request smaller byte ranges @grounded_sage ^
I usually flail at the keyboard until whatever api I am using tells me the chunks are not too large or too small and then walk away
(defn parts
"given the total byte size and desired chunk size,
return [start end] bounds for each chunk"
[total chunk]
(->> (range 0 total chunk)
(map (fn [start]
[start (dec (min (+ start chunk) total))]))))
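for example, evaluating it at the REPL:
user=> (parts 10 4)
([0 3] [4 7] [8 9])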
Is the min necessary here? @ghadi
Oh yea! I see
I was doing a comparison xD
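Putting the pieces together, a rough sketch of the chunked copy (get-range is hypothetical, standing in for whatever returns an InputStream for a byte range of the S3 object, and total-size would come from the object's metadata):
(with-open [out (io/output-stream (io/file folder csv-file))]
  (doseq [[start end] (parts total-size (* 8 1024 1024))]
    ;; get-range is a hypothetical ranged get; each chunk streams into out
    (io/copy (get-range csv-file start end) out)))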