#clojure
2015-12-06
borkdude09:12:53

@andrewboltachev: just a tip, having an upload folder as part of your uberjar could be troublesome, I wouldn't do that

val_waeselynck09:12:46

@andrewboltachev try to find the answer in the Java / JVM ecosystem, it will be the same for Clojure for this kind of topic

trancehime09:12:46

#C03RZGPG1 appears to have not seen my query, so I would like to see if I have some input as to where I should look to ask

trancehime09:12:46

It's a pretty simple question.

trancehime09:12:27

Now, assuming I have a vector of maps that are all of the same structure, what would be the best approach to processing that vector so that I can write to database? Ignoring the type of database

trancehime09:12:13

(Not looking for implementation details but more on logical and high-level thinking)

jaen09:12:18

Well, depends on what sort of processing you think you need

agile_geek09:12:04

@trancehime: I know it sounds like @jaen is ducking the question but there are a number of options and it really depends on your use case. For example, is the vector likely to be large, does 'processing' (upsert?) the maps involve considerable latency, does the processing have to be 'synchronous', is it strictly transactional in nature....etc.

trancehime09:12:37

I know there's at least one use-case where the vector has 10 items in it

agile_geek09:12:44

This is actually a wider architectural question.

agile_geek09:12:37

Do you need every map in the vector to be 'processed' as one transaction? I.e. all succeed or all fail?

trancehime09:12:01

Basically yes

jaen09:12:19

@agile_geek: well, that too, but I was thinking of the somewhat narrower architectural question - do you want to translate keys between db and Clojure, only change it to kebab case, or also change names, are you content with Java dates or want JodaTime. But those are good points as well.

trancehime09:12:18

my concerns are more in line with @agile_geek's points

trancehime09:12:44

I mean, there will be cases where I'm going to be working with a sizeable chunk of data at once

agile_geek09:12:32

I have to admit to not having a huge amount of experience doing this myself in Clojure (although huge amounts in other languages at scale).

agile_geek10:12:46

I would guess that generically you need a 'transaction' to close over a doseq
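
A minimal sketch of "a transaction closing over a doseq", assuming the clojure.java.jdbc library is on the classpath. The connection map, the `:statistics` table name and the `save-all!` helper are hypothetical and only serve the illustration.

```clojure
(require '[clojure.java.jdbc :as jdbc])

;; Hypothetical connection details; adjust for your own database.
(def db-spec {:subprotocol "mysql"
              :subname     "//localhost:3306/stats"
              :user        "app"
              :password    "secret"})

(defn save-all!
  "Inserts every map in rows inside a single transaction.
  If any insert throws, the transaction is rolled back, so either
  all rows are written or none are."
  [db rows]
  (jdbc/with-db-transaction [tx db]
    (doseq [row rows]
      (jdbc/insert! tx :statistics row))))
```

Calling `(save-all! db-spec rows)` with a vector of same-shaped maps would then behave as all-or-nothing, provided the underlying table engine supports transactions.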

jaen10:12:52

@trancehime: is this a one time thing like importing something, or you expect to need to do bulk insert often?

trancehime10:12:09

Well, "often"

agile_geek10:12:25

You would use something like doseq (or doall). For the reasons to use these fns instead of 'for' or 'map', see http://stuartsierra.com/2015/08/25/clojure-donts-lazy-effects

trancehime10:12:59

I actually use doseq for a different operation that's basically initializing empty records to be filled in later on

trancehime10:12:23

(I figured this out through trial and error D:)

agile_geek10:12:40

Yes. Although the laziness of Clojure sequences often isn't discovered in experimentation with the REPL as the P (Print) tends to force realisation of the full sequence.
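
A small sketch of the lazy-effects pitfall discussed here, using only clojure.core. The `save-row!` function is a hypothetical stand-in for a database write; it just records the row in an atom so the effect can be observed.

```clojure
(def written (atom []))

(defn save-row!
  "Hypothetical stand-in for a database write: appends the row to an atom."
  [row]
  (swap! written conj row))

(def rows [{:id 1} {:id 2} {:id 3}])

;; map is lazy: binding the result does not run save-row! at all.
(def pending (map save-row! rows))
(realized? pending) ;; => false
(count @written)    ;; => 0

;; doseq is eager and returns nil, which suits side effects.
(doseq [row rows]
  (save-row! row))
(count @written)    ;; => 3
```

Typing `(map save-row! rows)` directly at the REPL would appear to work, because printing the result forces the sequence.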

jaen10:12:04

@trancehime: with postgres if you want to import a lot of data there's copy - http://www.postgresql.org/docs/current/interactive/sql-copy.html - you can just pass over some CSV as file and it'll import it. It's the fastest way, but might not be right for you, hence I was asking how often would you do that and what for.

trancehime10:12:07

purpose is data submission of statistics in some table which will be compared against similar records input by different people

jaen10:12:03

Hmm, if the volume of those statistics is high

jaen10:12:09

This might actually be a sensible choice

jaen10:12:24

The downside is the file needs to be reachable for the DB

jaen10:12:21

MySQL seems to support something similar - https://dev.mysql.com/doc/refman/5.1/en/load-data.html - since I think you mentioned you're using MySQL

agile_geek10:12:49

I think you'll find a 'bulk insert' operation in just about all RDBMS and for large datasets it should be your default option.

trancehime10:12:26

I mean it's kind of like a setup where the data input is occurring during observation of a given event

trancehime11:12:13

and also I have to provide a feedback to the user

jaen11:12:21

Your sample data spoiled that you seem to be doing something like LoL match analysis, so the question is: would data come as things happen (someone is killed, someone gets the buff and so on), or would you import the match wholesale after it ended?

trancehime11:12:08

First of all, the user in this case would do data submission once the individual set/game is over

jaen11:12:51

Ok, so I guess that's a point towards using an import mechanism like copy

trancehime11:12:33

Secondly, the idea is that multiple people will be submitting data for the same game and then there will be some checks made to see if the submitted data is the same

trancehime11:12:42

Now here's the caveat - this all has to happen within a period of time since the data submission

jaen11:12:13

I see. I guess in that case you could have a table for submissions; when a submission happens you do a single insert to create a submission id to associate the data with, then serialise your vector as a CSV file with rows having a foreign key to that submission row, and import it with the copy mechanism.
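
The serialisation step could be sketched roughly as follows. The `rows->csv` helper, the column names and the submission id are hypothetical, and the sketch does no quoting or escaping, so real data would more likely go through a CSV library such as clojure.data.csv.

```clojure
(require '[clojure.string :as string])

(defn rows->csv
  "Turns a vector of same-shaped maps into CSV text. Every line starts
  with submission-id, which plays the role of the foreign key.
  Naive: values are not quoted or escaped."
  [submission-id columns rows]
  (string/join "\n"
               (for [row rows]
                 (string/join "," (cons submission-id (map row columns))))))

(rows->csv 42
           [:player :kills]
           [{:player "a" :kills 3}
            {:player "b" :kills 5}])
;; => "42,a,3\n42,b,5"
```

The resulting text can be written to a temporary file with `spit` and handed to the database's bulk import command.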

jaen11:12:22

Then, and I suppose that depends on how you do the checking exactly

jaen11:12:03

You could just select all valid rows from the db and use copy again to insert them wherever they are needed after you confirmed they are valid.

jaen11:12:47

But that's just a rather rough guess on how that could be done

borkdude11:12:12

in Clojure can you have a mutable (transient) nested vector, or should I look at core.matrix?

trancehime11:12:19

But what if I have multiple people submitting data for different schedules at once? That means I'll just have a ton of CSV files, won't I?

jaen11:12:14

@borkdude: I think I've read somewhere that transience isn't transitive if that's what you mean. That is with (transient [1 2 [3 4]]) only the outer vector is transient. But I'm not 100% sure.

borkdude11:12:15

I tried core.matrix, but it didn't quite do what I expected. When I googled, I saw someone had the same problem with set-selection!

trancehime11:12:24

(Anyway, submission only happens once. It's modifications after that)

borkdude11:12:30

@jaen: that's correct
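
A quick way to see this at the REPL, using only clojure.core: the outer vector becomes transient, while the nested vector stays an ordinary persistent vector and has to be updated and put back explicitly.

```clojure
(def t (transient [1 2 [3 4]]))

;; nth works on transient vectors; the element is still persistent.
(def inner (nth t 2))

(instance? clojure.lang.ITransientCollection t)     ;; => true
(instance? clojure.lang.ITransientCollection inner) ;; => false
(vector? inner)                                     ;; => true

;; To "mutate" the inner vector, build a new one and assoc! it back.
(def result (persistent! (assoc! t 2 (conj inner 5))))
;; result => [1 2 [3 4 5]]
```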

jaen11:12:58

@trancehime: hm, but would modifications be in bulk as well or not?

borkdude11:12:02

@jaen: I could of course use mutable arrays

trancehime11:12:17

It's essentially a "resubmission" so it might be bulk it might not be

jaen11:12:16

@borkdude: hmm, what's wrong with set-selection!? I've been fighting with it today for Advent of Code, curious what you had problems with.

borkdude11:12:42

@jaen: Exactly what I wanted to use it for :)

jaen11:12:08

Well, I can tell it worked out for me.

borkdude11:12:17

@jaen: I played around with it, like this (matrix/set-selection! m [[0 0] [4 4]] (matrix/new-matrix 5 5)) (where m is a :double-array matrix), but I keep getting: RuntimeException Inconsistent count of selection arguments

jaen11:12:01

@trancehime: I see. Well yes, but after you copy the file into the database you can just get rid of those temporary files, so I don't think that's much of a problem?

jaen11:12:12

@borkdude: well, the docs aren't entirely clear to be honest. The coords should be exploded I think for starters, that is (matrix/set-selection! m [0 0] [4 4] (matrix/new-matrix 5 5)) but I'm not sure if that solves everything.

trancehime11:12:39

@jaen Well I suppose I'll just figure something out. Thanks

borkdude11:12:00

@jaen: "RuntimeException Mismatched shapes in assign!" ... I guess I should read the source. The docs are in fact not clear.

borkdude11:12:26

@jaen: I hate to say it, but this is the kind of library where static typing would help 😉

jaen11:12:48

@borkdude: here, have an example from my solution (that works) - (matrix/set-selection! m [[0 0] [4 4]] (matrix/new-matrix 5 5))

jaen11:12:58

I suck at copying & pasting xD

jaen11:12:09

(m/set-selection! grid coords-x coords-y 1)

jaen11:12:45

coords-x (range start-x (inc stop-x))
coords-y (range start-y (inc stop-y))

jaen11:12:12

you basically have to specify all rows that are affected

borkdude11:12:33

I'll try that later then. thanks :)

jaen11:12:35

I don't know if you can do something like you want with setting a matrix

jaen11:12:38

No problem

jaen11:12:08

@borkdude: (m/set-selection (m/zero-matrix 10 10) [4 5 6] [4 5 6] (m/fill (m/zero-matrix 3 3) 125)) seems to work

jaen11:12:02

@borkdude: nice, though I must say I'm surprised you could just assign a keyword like that.

borkdude11:12:22

you can assign any object there right

jaen11:12:55

Maybe depends on the underlying implementation

borkdude11:12:26

if I understood correctly, ndarray stores objects of any type: https://github.com/mikera/core.matrix/wiki/Matrix-implementations

jaen11:12:11

Yeah, I've used vectorz, it can only deal with doubles IIRC.

mikera11:12:29

Yeah it is implementation dependent. There is an implementation that allows complex numbers, for example

borkdude12:12:38

@mikera: could I also initialize an nd-array with a default value?

mikera12:12:12

Hmmmm sure.... you can use assign!

jaen12:12:38

@mikera: oh, since you're here - how would you logically index a matrix to overwrite negative elements? I couldn't exactly figure this out and ended up reshaping to a 1D array and back to 2D, but I guess that's inefficient.

mikera12:12:42

Or if you don't need to mutate it, you can broadcast a single-element NDArray to the shape you want

borkdude12:12:44

@mikera: I mean, now I have:

(defn start-state []
  (-> (m/new-matrix :ndarray 10 10)
      (set-grid! 0 0 9 9 :off)))

jaen12:12:28

Right now I have something like:

(defn clamp-to-zero! [matrix]
  (let [linear (m/reshape matrix [(m/ecount matrix)])
        clamped (ms/set-sel! linear (ms/where-slice neg?) 0)]
    (m/reshape clamped [(m/row-count matrix) (m/column-count matrix)])))
but that's probably not too smart.

mikera12:12:29

just (assign! :off) :)

mikera12:12:17

Clamping is best to use emax I think

mikera12:12:26

sorry clojure.core.matrix.operators/max

mikera12:12:49

(clojure.core.matrix.operators/max [-1 1 -10 10] 0) => [0 1 0 10]

jaen12:12:52

@mikera: seems to work, thanks; I kind of couldn't get selectors to select negative elements in a 2D matrix for some reason.

mikera12:12:03

The selector stuff is a bit tricky for sure. I was thinking about replacing it with something more like specter

martinklepsch12:12:58

is there a thing like interleave that always exhausts the supplied colls? i.e. colls don’t have to be same length?

nowprovision12:12:06

not great (filter identity (interleave (concat [:a :b :c] (repeat nil)) [:1 :2 :3 :4 :5]))
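
An alternative sketch that avoids padding with nil, so it also keeps nil or false elements that the `filter identity` approach would drop. `interleave-all` is a hypothetical helper name, not something in clojure.core.

```clojure
(defn interleave-all
  "Like interleave, but keeps going until every collection is exhausted.
  Exhausted collections are simply skipped in later rounds."
  [& colls]
  (lazy-seq
    (let [non-empty (filter seq colls)]
      (when (seq non-empty)
        (concat (map first non-empty)
                (apply interleave-all (map rest non-empty)))))))

(interleave-all [:a :b :c] [1 2 3 4 5])
;; => (:a 1 :b 2 :c 3 4 5)
```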

ul14:12:18

hi, all! please remind me if any convenient data structure already exists in clojure libs for a many-to-many relation? the naive implementation is to keep two maps of sets and keep them in sync, but i'm sure that somebody has done better
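
For reference, the naive approach described in the question (two maps of sets kept in sync) could look like this. The names `empty-relation` and `relate` are hypothetical; this is only a sketch of the baseline, not a recommendation of a particular library.

```clojure
(def empty-relation
  "A many-to-many relation stored as two maps of sets."
  {:forward {} :backward {}})

(defn relate
  "Records that a is related to b, updating both directions together."
  [rel a b]
  (-> rel
      (update-in [:forward a] (fnil conj #{}) b)
      (update-in [:backward b] (fnil conj #{}) a)))

(def rel
  (-> empty-relation
      (relate :alice :clojure)
      (relate :alice :scala)
      (relate :bob :clojure)))

(get-in rel [:forward :alice])    ;; => #{:clojure :scala}
(get-in rel [:backward :clojure]) ;; => #{:alice :bob}
```

Because both maps live in one immutable value, they cannot drift apart as long as every update goes through `relate`.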

Pablo Fernandez15:12:37

I’m developing a library in the context of an app, how do I make clojure pick the code of the library instead of the installed jar (so I don’t have to re-install it every change I make)?

sveri15:12:51

@pupeno: leiningen has a feature called checkouts. It means you have a checkouts folder where you link to your other project. You still have to install it once and declare it as dependency in your project.clj file: https://github.com/technomancy/leiningen/blob/master/doc/TUTORIAL.md#checkout-dependencies

borkdude16:12:37

why is this so terribly slow for a 1000x1000 matrix? it makes my Emacs become unresponsive:

(defn to-string [matrix]
  (let [row-to-str (fn [row]
                     (apply str (for [e row]
                                  (case e :on "*" " "))))]
    (string/join "\n"
                 (map row-to-str (m/to-nested-vectors matrix)))))

borkdude16:12:37

with a StringBuilder and doseq it's better, probably because 1000 vectors don't fit in memory easily? (it surprised me)

borkdude17:12:52

what's the easiest way to tell how much memory a data structure occupies in Clojure

jaen17:12:20

I suppose it rather has to do with the fact that Java strings are immutable, so you're generating a million throwaway strings and copying the contents each time.

jaen17:12:26

But that's just a hunch
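
The StringBuilder and doseq variant mentioned above might look roughly like this. It is a sketch over plain nested vectors rather than a core.matrix matrix, `grid->string` is a hypothetical name, and whether string building is really the bottleneck is not established in this conversation.

```clojure
(defn grid->string
  "Renders rows of :on/:off cells as text, appending into one
  StringBuilder instead of creating intermediate strings per row.
  Note: unlike a string/join version, every row ends with a newline."
  [rows]
  (let [sb (StringBuilder.)]
    (doseq [row rows]
      (doseq [e row]
        (.append sb (if (= e :on) "*" " ")))
      (.append sb "\n"))
    (str sb)))

(grid->string [[:on :off]
               [:off :on]])
;; => "* \n *\n"
```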

borkdude17:12:19

could be yes

jaen17:12:06

You're trying to visualise the lights exercise?

borkdude17:12:46

@jaen: yeah. Darn, my answer isn't correct, too low :)

jaen17:12:24

I'm not sure if visualising will help, it's a big, big matrix after all

jaen17:12:32

Does it work for the sample data?

jaen17:12:52

I mean, the examples

borkdude17:12:05

@jaen yeah, just tried it for a smaller set, that worked, but big matrix is indeed too big to display

jaen17:12:18

Visualising a smaller matrix, like 10x10

jaen17:12:20

Can be a good idea

jaen17:12:28

And trying if that works well

borkdude17:12:30

turning on all lights works: (count-lights (apply process-line (start-state) (parse-line "turn on 0,0 through 999,999"))) equals a million

jaen17:12:43

Question is how well does toggling work

jaen17:12:58

It's probably the trickiest part

jaen17:12:16

I actually went a bit of a clever way

jaen17:12:30

And had 1 represent on, -1 represent off and I just multiplied by -1 to toggle.
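
The 1 / -1 encoding can be illustrated on plain nested vectors. This sketch flips cells one at a time with `update-in`, which is not how a matrix library would do it; `toggle-region` and `count-on` are hypothetical names.

```clojure
(defn toggle-region
  "Flips every cell in the inclusive rectangle from [x1 y1] to [x2 y2]
  by multiplying it by -1 (1 means on, -1 means off)."
  [grid [x1 y1] [x2 y2]]
  (reduce (fn [g [x y]] (update-in g [x y] * -1))
          grid
          (for [x (range x1 (inc x2))
                y (range y1 (inc y2))]
            [x y])))

(defn count-on
  "Counts the cells that are currently on."
  [grid]
  (count (filter pos? (apply concat grid))))

(def all-off (vec (repeat 3 (vec (repeat 3 -1)))))

(count-on (toggle-region all-off [0 0] [1 1]))
;; => 4
```

Toggling the same region twice returns the grid to its original state, since multiplying by -1 twice is the identity.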

borkdude17:12:45

this equals a 1000: (count-lights (apply process-line (start-state) (parse-line "toggle 0,0 through 999,0")))

jaen17:12:06

I think that with a keyword representation it might be harder to flip it

borkdude17:12:33

not really, I'm doing it like this:

(defn swap-value! [matrix x y f]
  (let [v (m/mget matrix x y)
        v' (f v)]
    (m/mset! matrix x y v')))

borkdude17:12:09

but working with numbers would be a lot faster of course

borkdude17:12:24

since you can then use matrix multiplication

borkdude17:12:41

but that would still yield the same result

jaen17:12:37

Can you make a 10x10 or so matrix, turn on (1,1) to (5,5), then toggle (0,0) to (4,4), and see if that equals, IIRC, 18?

jaen17:12:05

That should probably show you if toggle works ok quite fast

jaen17:12:15

You could also display such a small matrix easily.
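
The expected value of 18 can be checked independently of any matrix library. This sketch models the grid as a set of lit coordinates, which is a swapped-in representation and not the one used by either solution in this conversation; the helper names are hypothetical.

```clojure
(defn cells
  "All coordinates in the inclusive rectangle from [x1 y1] to [x2 y2]."
  [[x1 y1] [x2 y2]]
  (for [x (range x1 (inc x2))
        y (range y1 (inc y2))]
    [x y]))

(defn turn-on
  "Adds every cell of the rectangle to the set of lit cells."
  [lit from to]
  (into lit (cells from to)))

(defn toggle
  "Removes lit cells and adds unlit cells within the rectangle."
  [lit from to]
  (reduce (fn [acc c] (if (acc c) (disj acc c) (conj acc c)))
          lit
          (cells from to)))

(count (-> #{}
           (turn-on [1 1] [5 5])
           (toggle [0 0] [4 4])))
;; => 18
```

The arithmetic agrees: 25 lights go on, the toggle switches off the 16 that overlap and switches on the 9 cells in row 0 and column 0, giving 25 - 16 + 9 = 18.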

jaen17:12:31

Well, yeah, I guess I just didn't want to flip cells one at a time subconsciously

borkdude17:12:47

this equals 999996: (count-lights (apply process-line (m/assign! (start-state) :on) (parse-line "turn off 499,499 through 500,500")))

jaen17:12:50

That seems okay as well

borkdude17:12:21

@jaen: I could try to run over your input and see if I have a different answer than you?

jaen17:12:02

I suppose there's no harm in that

jaen17:12:04

It's 569999 for the first part

borkdude17:12:14

I'm getting 501660

borkdude17:12:32

what do you mean with 'first part'?

jaen17:12:59

Each exercise has two parts

jaen17:12:19

I missed that at first, only noticed it after the first exercise

borkdude17:12:50

@jaen: I've only started this one today :)

weebz19:12:46

On OS X, does anyone know how to get lein to load what I think is a java extension? There's an issue with MIDI on OS X and I've read that this library is required

weebz19:12:16

I've placed it into /Library/Java/Extensions, but I'm getting an error about my classpath in the REPL, and I'm not sure how to instruct lein to load it

weebz19:12:32

Caused by java.lang.UnsatisfiedLinkError no mmj in java.library.path

borkdude19:12:51

@jaen lol, I rewrote in Scala, now the number is too high

borkdude20:12:08

@jaen: ah got it right now, I counted the lights that were off, though. Very fast in Scala too btw

borkdude20:12:19

only 2 seconds

jaen20:12:41

@borkdude: core.matrix + vectorz ~6secs for the first part

jaen20:12:47

For me at least

borkdude20:12:07

@jaen: In Scala I'm using a nested mutable array of Int

borkdude20:12:18

@jaen: thus not thread safe, but I hope Santa will forgive me