This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-06-14
Channels
- # aws (6)
- # babashka (31)
- # beginners (69)
- # biff (9)
- # boot (9)
- # bristol-clojurians (1)
- # calva (20)
- # chlorine-clover (2)
- # cider (8)
- # cljsrn (24)
- # clojure (25)
- # clojure-norway (4)
- # clojure-spec (29)
- # clojure-uk (7)
- # conjure (23)
- # datahike (5)
- # datomic (39)
- # emacs (4)
- # fulcro (4)
- # graalvm (11)
- # honeysql (1)
- # lambdaisland (1)
- # leiningen (8)
- # liberator (1)
- # libpython-clj (3)
- # malli (6)
- # mxnet (1)
- # off-topic (94)
- # pedestal (13)
- # re-frame (4)
- # releases (2)
- # shadow-cljs (8)
- # spacemacs (22)
- # sql (9)
- # vim (1)
Hi, I want to turn the entries of a set into maps. It's the result of a datomic-esque query using datahike.
;; successfully returns all chats from the room
(println "haxorrr : "
         (d/q '[:find ?authorid ?content ?timestamp ?messageid
                :in $ ?kind ?roomname
                :where
                [?m :message/kind ?kind]
                [?m :message/roomname ?roomname]
                [?m :message/authorid ?authorid]
                [?m :message/content ?content]
                [?m :message/timestamp ?timestamp]
                [?m :message/messageid ?messageid]]
              @conn "chat" "Beginners"))
returns
haxorrr : #{[a.tester Welcome to the beginner's channel. Please post any questions you have here. Feel free to discuss! 1592143964 aed7504f-b1a7] [c.tester Quite a hypothetical, indeed. 1592143964 cf91e354-abab] [b.tester That is a hypothetical. 1592143964 7c96b2a2-ef2b] [a.tester Say I had a question, you'd want to help me figure it out. 1592143964 7c22e8fd-6391] [a.tester Real mature, gems. 1592143964 ad536ace-5a3f]}
So I'd like a vector of maps instead, that way I can sort it by timestamp and generate a page from it. Is there a better way to sort by timestamp? Open to suggestions.
datahike is durable datascript, which does support pull. https://github.com/kristianmandrup/datascript-tutorial/blob/master/pull_data.md
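One way to get there (a sketch, not from the thread — the sample `results` data below is made up to mirror the query output): zipmap each result tuple onto the keys from the :find clause, then sort-by :timestamp.

```clojure
;; Assumed: `results` stands in for the set returned by the d/q call above.
(def results
  #{["a.tester" "Welcome to the beginner's channel." 1592143964 "aed7504f-b1a7"]
    ["b.tester" "That is a hypothetical." 1592143963 "7c96b2a2-ef2b"]})

;; Turn each [authorid content timestamp messageid] tuple into a map,
;; then sort the whole collection by :timestamp.
(def messages
  (->> results
       (map #(zipmap [:authorid :content :timestamp :messageid] %))
       (sort-by :timestamp)
       vec))
```

From here, generating a page is just mapping over `messages` in order.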
Neat. I have not used pull before... I recall it being a formative part of om.next but I still don't really get it
That's a nice page, but I need more examples to make use of it.
How can I use pull syntax to get database results?
(println "pull? " (d/pull @conn '[*] :message/messageid))
gives me the schema def / ident...
pull? #:db{:id 1, :cardinality :db.cardinality/one, :ident :message/messageid, :unique :db.unique/value, :valueType :db.type/string}
but how would I change that pull line to get actual db results? It says the entity identifier (3rd and last arg) must have uniqueness... which is groovy, but then how do I get a meaningful result?
You're pulling everything (the '[*] pattern) for the entity corresponding to the :message/messageid ident; instead you'd likely want to pass in the third position the entity id (or lookup ref) corresponding to whatever entity you're interested in.
Or, more likely, you'll want to have a look at the section Pull and Query combined of the document linked above. Then the pull pattern specified will be applied to all the entities returned by the query.
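Concretely, that "Pull and Query combined" shape might look like the following sketch (adapted from the earlier query in this thread; it needs a live datahike connection `conn` to actually run, so only the query data itself is shown here):

```clojure
;; Replace the :find tuple with a pull expression on ?m; each result
;; row then comes back as a map instead of a positional tuple.
(def chats-query
  '[:find (pull ?m [:message/authorid :message/content
                    :message/timestamp :message/messageid])
    :in $ ?kind ?roomname
    :where
    [?m :message/kind ?kind]
    [?m :message/roomname ?roomname]])

;; With a real connection (assumed, not shown):
;; (d/q chats-query @conn "chat" "Beginners")
```

Each returned entity is already a map, so the zipmap step becomes unnecessary.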
woots. makin' some progress aw yeah B-) thanks y'all
👋 Hi, I'm trying to spec a tuple where I only care about the first value; the second can be any arbitrary type. How can I achieve that?
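One common answer (a sketch; `::tagged` is a made-up spec name): use `any?` for the slot you don't care about.

```clojure
(require '[clojure.spec.alpha :as s])

;; First slot must be a keyword; `any?` accepts every value in the second.
(s/def ::tagged (s/tuple keyword? any?))

(s/valid? ::tagged [:ok 42])      ;; => true
(s/valid? ::tagged [:ok "text"])  ;; => true
(s/valid? ::tagged ["no" 42])     ;; => false
```

`s/tuple` also enforces that the value is a vector of exactly that length.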
I'm looking for suggestions for ways to speed this up. It inserts 100000 simple records into a postgresql table. Right now the best time I've observed is 25 seconds. It uses HikariCP for connection pooling.
(def datasource-options {:auto-commit true
:read-only false
:connection-timeout 30000
:validation-timeout 5000
:idle-timeout 600000
:max-lifetime 1800000
:minimum-idle 10
:maximum-pool-size 10
:pool-name "db-pool"
:adapter "postgresql"
:username "*******"
:password "*******"
:database-name "*******"
:server-name "localhost"
:port-number 5432
:register-mbeans false})
(defonce datasource
(delay (make-datasource datasource-options)))
(defn do-insert [n]
  (jdbc/with-db-connection [conn {:datasource @datasource}]
    (jdbc/execute! conn
                   ["insert into speedy values ( ? , ? )" n "some textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome textsome text"])))
(defn run [& args]
(count (pmap do-insert (range 100000))))
this also has some good tips, https://www.postgresql.org/docs/9.0/populate.html
like disabling auto commit
Thanks!
as well as using COPY in favor of INSERT. Although that depends on where your database is and where your data is
I'll give this a try and post the results
I won't be able to use copy in the solution I'm designing. Some data validation has to happen in code for the real thing.
if you can upload files to the server postgres is running on, you can potentially validate the data, write it to the file, and then run the COPY command. not sure if that would be faster, but potentially
the INSERT statement (https://www.postgresql.org/docs/9.5/sql-insert.html) only returns an oid and count. I suspect that the time spent on processResults is just time spent waiting for a response and not time actually processing results.
ok yeah that makes sense
no indexes
so besides the loading command, the other area where you might achieve a large speedup is network speed. eg. if your database is on aws and your data is your local dev machine, then a large fraction of the time might just be sending data to the database
that will make things a lot faster too, but with the downside of possible data loss 😉
@U7RJTCH6J I set :auto-commit false but no speed up
but I'm thinking I need to batch differently, like do-insert might need to run a sub-batch of items
what’s the network like between where the query is run from and where the server is?
this is all local right now
are they both on the same network?
production will all be in the same aws vpc
so not too worried about latency
Since this is PostgreSQL, you might also need :reWriteBatchedInserts true (per my note in the main channel).
Yes, as @U04V70XH6 mentions, that is worth trying. I have that setting in one of my projects too that does a large amount of inserts.
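For reference, one place that option can live (a sketch — the db name and credentials below are placeholders, and this uses a plain next.jdbc db-spec map, where extra keys are passed through to the PostgreSQL driver as connection properties):

```clojure
;; Hypothetical db-spec; only :reWriteBatchedInserts is the point here.
;; It lets the PostgreSQL JDBC driver rewrite batched inserts into
;; multi-row INSERT statements.
(def db-spec
  {:dbtype "postgresql"
   :dbname "speedy_db"
   :host "localhost"
   :user "..."
   :password "..."
   :reWriteBatchedInserts true})
```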
Thanks @U04V70XH6 I'm reading through the rest of this doc to understand the batch mechanism described and am going to work up an implementation using it.
@U04V70XH6 is there a way to tell clojure.java.jdbc or next not to do anything with the resultset?
next.jdbc, yes. clojure.java.jdbc, no.
gotcha. I have no reason not to use next 🙂 so I guess I'll switch to that
It's all a lot easier in next.jdbc because it deliberately traffics in JDBC objects and exposes a lot more of that to Java interop if you need it. clojure.java.jdbc tries hard to hide all of that -- so you can't easily get at the underlying stuff. In addition, c.j.j auto-wraps everything in a transaction unless you explicitly opt out -- whereas next.jdbc leaves all transactions up to you.
c.j.j is no longer being actively maintained (by me) -- all my focus is on next.jdbc now.
Right on. I've read that in the first sentence of the c.j.j. readme many times now but I'm usually following tutorials/guides that use c.j.j. so opt not to add another unknown to the mix of what I'm trying to do at the time 🙂
Sweet. Got it down to
(time (run))
"Elapsed time: 1088.791338 msecs"
with 1000 batches of 100:
(defn run-batch [i]
  (with-open [conn (nxt/get-connection @datasource)
              ps (nxt/prepare conn ["insert into speedy values (?,'hello')"])]
    (next.jdbc.prepare/execute-batch! ps (into [] (map (fn [n] [n]) (range 100))))))
(defn run [& args]
  (count (pmap run-batch (range 1000))))
Thanks for y'all's help!
@U04V70XH6 this fastest setup I observed was 400,000 inserts in 3554 msecs using:
(defn run-batch [i]
  (with-open [conn (nxt/get-connection @datasource)
              ps (nxt/prepare conn ["insert into speedy values (?,?)"])]
    (next.jdbc.prepare/execute-batch!
     ps
     (into [] (map (fn [n] [n raw-data]) (range 100000)))
     {:batch-size 10000})))
(defn run [& args]
  (count (pmap run-batch (range 4))))
I just moved the (into.. computation to a var outside the function and observed "Elapsed time: 2471.442009 msecs"
Did you also add the rewrite option I pointed you at? And that sounds like pretty good performance: 2.5s for 400k rows.
it looks like it's spending a good deal of time in processResults, which I don't really care about
@skinner89 I know you're using clojure.java.jdbc there, rather than next.jdbc, but read this section of the latter's docs that talks about a possible issue there: https://cljdoc.org/d/seancorfield/next.jdbc/1.0.462/doc/getting-started/prepared-statements#caveats
I find myself wanting to change the value of something I've defined in an outer let binding form when I encounter some condition inside an inner letfn. Realizing that what I introduce in a let binding form cannot be changed, is there something analogous to let that would actually let me change the value of a binding?
@radicalmatt If you want something mutable, there's atom, but you should probably try to rethink your approach. You almost never need mutation for most stuff.
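For completeness, the atom escape hatch looks like this (a minimal sketch; `seen` is a made-up name):

```clojure
;; An atom is a mutable reference; swap! applies a function to its
;; current value and stores the result. Deref with @ to read it.
(def seen (atom []))
(swap! seen conj :a)
(swap! seen conj :b)
@seen ;; => [:a :b]
```

But as noted above, threading the accumulated value through function arguments usually removes the need for this entirely.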
I'm working on a puzzle involving tree traversal, and keeping track of certain nodes I've visited while recursing through the structure.
I'm sure there's a way to encode a vector of what I've visited into a recursive function parameter, but I'm just not finding it
Sounds a bit like zippers... Have you looked at clojure.zip?
Also, we might be getting out of #beginners here, but I’ve used https://github.com/redplanetlabs/specter to do pretty much what you’re describing
Look for the place on the page where it says: “When doing more involved transformations, you often find you lose context when navigating deep within a data structure and need information “up” the data structure to perform the transformation. Specter solves this problem by allowing you to collect values during navigation to use in the transform function.”
Specter is its own beast almost, but allows pretty powerful stuff as far as immutable data transformation is concerned
it's possible I might need to use like, an explicit stack to represent unvisited nodes instead of the recur(left) recur(right) type approach..
I'm simplifying the puzzle to just returning a list of node names that constitute a simple preorder tree traversal haha
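One way to encode the visited vector as a recursive function parameter, as suggested earlier (a sketch under my own assumptions: nodes are maps shaped like {:name ... :children [...]}):

```clojure
;; Preorder traversal that threads the accumulated `visited` vector
;; through the recursion instead of mutating anything.
(defn preorder
  ([node] (preorder node []))
  ([node visited]
   (reduce (fn [acc child] (preorder child acc))
           (conj visited (:name node))   ; visit this node first,
           (:children node))))           ; then recurse into children

(preorder {:name :a
           :children [{:name :b}
                      {:name :c :children [{:name :d}]}]})
;; => [:a :b :c :d]
```

The accumulator in the two-arity version plays the role the mutable binding would have played.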
You can 'shadow' the value of an earlier binding, even within the same let, but also within an inner let. That isn't mutating an existing value, just giving the same name to a new value.
(let [a 1
      a (+ a 2)]
  a) ;; => 3
@radicalmatt any time you find yourself wanting to change something try thinking “well I can just make a NEW thing” 🙂
(let [a 1
      a' (+ a 2)]
  a') ;; => 3
I personally like the “a prime” approach where I don’t shadow bindings; instead I give them similar but different names