This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-04-24
@nathanmarz thanks for getting back to me so fast! This is the library I'm working with: https://github.com/Sophia-Gold/Madhava-v2. It's still pretty rough in terms of functionality I need to build out, but the core of it is computing a large number of partial derivatives all at once and storing them in an atomic hash-map. Since the number of partials grows exponentially with order, and they are stored with keys that match them to the corresponding variables at each order, the maps end up heavily nested. The GitHub README describes how to use it, but I'll paste a snippet of the function that creates the map (which is quite small) below and describe some operations on it I'm having trouble getting done with core functions.
I should also mention I did try using `data.int-map` last night and the increase in speed was barely noticeable, presumably because when generating the map I'm only `assoc`ing data into it and not calling `update`. Let me know if you think `int-map` makes more sense with some of the operations on it I'm thinking of.
So for starters, the reason why I thought of this project with regards to specter is because I initially tried to implement some functions that would combine two maps by applying a given binary operation to all the values with matching keys (meaning all their keys matching down to the deepest level of nesting) and then either throw out the disjoint key-value pairs or merge them as is. I tried implementing that with `postwalk`, but found it too messy to match for just the values (all vectors) since it returns everything in the map.
More realistically, for calculating gradients I'd need to use my `linear-transform` function in this library as the binary function, yet instead of applying it between two different maps, I'd apply it to values stored in adjacent keys with some type of weighting scheme. I don't have that fully specified yet, so I can set it aside, but I think specter could greatly help. I'm thinking something like a zipper except instead of starting from the head, which would be ridiculously slow, I'd want to go straight to the relevant key and then create the zipper right there.
That's a bit abstract, but a simpler example is the set of operations I referred to last night. I was trying to see if I could use partial differentiation to encode laws for cellular automata, so I started playing around with extremely large maps (often verging on too large to print) generated from one very high-dimensional term and going up to many orders, like so: `(diff [[0 0 0 1 0 0 1 1 0 1 1 1 1 1]] diff-map 7)`. That's the map I was trying to search inside of using `some` (to see if the initial term was repeated) and realized it failed searching for any values due to the level of nesting. Similarly, I had trouble using `(into (sorted-map) ...)`, which sorted everything but the final level of nesting, `filter` to eliminate the >50% of it that consisted of empty vectors, and `vals` to try and visualize the entire thing with the keys stripped out entirely.
So that gives me five operations on nested hash-maps I couldn't achieve with core. I'm not sure how involved they'd be to do with specter, but seems like they could be a very good place to start as far as learning the library.
@sophiago partial-diff can be written with specter like:
(let [pidx (peek idx)]
  (select [ALL
           (selected? (keypath pidx) #(-> % zero? not))
           (view
             (fn [expr]
               (-> expr
                   (update 0 * (get expr pidx))
                   (update pidx dec))))]
          p))
maybe a little bit of an improvement
this is a lot of code for me to parse, I can't really help with what are the optimal data structures / algorithms for this purpose
if you show me subproblems you're dealing with, with inputs you have and outputs you want, I can show you how to use specter for it
Interesting. I'm not sure how much performance I can wring out of that tiny function, but it helps me to understand specter.
specter's most useful when you want to change part of a data structure
(and leave the rest unchanged)
when you want to combine everything into a new data structure, that's not what specter is for
though it can be helpful for pieces of it
(`traverse` is useful when you want to combine only parts of a data structure)
Well, I mentioned a few tasks so I'm trying to think where's the best to start. I did have trouble figuring out how to combine two nested maps by applying a function to all values with the same key signature, but you're saying specter isn't ideal for something like that?
it really depends on the problem
So far everything I've mentioned is an operation on an entire map, either like above, or scrubbing the data in some way or other.
if you just had two random maps that you wanted to combine, probably not
but if you wanted to do something like merge a map in one location in a data structure with another map at some other location, specter will be useful
As mentioned, even just sorting and filtering deeply nested maps is not quite working out with core. How would you go about that?
what's a specific example of what you want to do?
input and output
So, as one example, `(some #(= % [[120 3 3 0]]) @diff-map)` returns `nil` even though that value is in the map.
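For context on why that call comes back `nil`: `some` over a map only visits the top-level `[key value]` entries, so a vector nested a few levels down is never compared. A minimal core-only sketch follows, assuming a hypothetical toy map (`toy-map`, with made-up integer keys) in place of the real `@diff-map`; `leaf-vals` is an illustrative helper name, not something from the library.

```clojure
;; Hypothetical toy stand-in for @diff-map: integer keys, vectors at the leaves.
(def toy-map {1 {11 [[120 3 3 0]] 12 []}
              2 {21 [[4 1 0 2]]}})

;; `some` over a map only sees the top-level [key value] entries,
;; so the nested vector is never compared against the target.
(def top-level-hit (some #(= % [[120 3 3 0]]) toy-map))

;; tree-seq walks into every nested map; removing the maps themselves
;; leaves just the non-map values (the vectors).
(defn leaf-vals [m]
  (remove map? (tree-seq map? vals m)))

(def nested-hit (some #(= % [[120 3 3 0]]) (leaf-vals toy-map)))
```

Here `top-level-hit` is `nil` and `nested-hit` is `true`. This walks the whole structure lazily and is only meant to show the cause of the failure, not to be a fast search.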
ok, for this specter can be very helpful
you can define a recursive navigator to the "leaves" of the structure (values which are vectors)
one sec
(def LEAVES
  (recursive-path [] p
    (if-path map?
      [MAP-VALS p]
      STAY)))
then you can do (select-first [LEAVES #(= % [[120 3 3 0]])] @diff-map)
that will return either [[120 3 3 0]] or nil
if you do `select`, then you'll get a sequence of every match
that's very performant too
you can continue navigation at each leaf as well
(select [LEAVES ALL ALL odd?] @diff-map)
gets you all the odd numbers in those vectors
So I can also use `select` like how I want `vals` to work on a structure like this? To strip out keys?
don't think of it as "stripping"
it's just querying the data structure for a sequence of matches
but yea, that will get you all the vectors of numbers as a single sequence
Oh, as a sequence? I thought a big feature was returning the same kind of data structure. What if I want to perform operations on all the values and return a map with the same nesting? For example filtering?
use `transform` for that
(transform [LEAVES ALL ALL even?] inc @diff-map)
that gives you a new map with all the even numbers incremented
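Putting the pieces so far together, a small self-contained sketch might look like the following. It assumes Specter 1.0 or later is on the classpath as `com.rpl.specter`, and it uses a hypothetical toy map (`toy-map`, with made-up integer keys) rather than the real `@diff-map`.

```clojure
;; Assumes the com.rpl.specter dependency (1.0+) is available.
(require '[com.rpl.specter :refer [recursive-path if-path select select-first
                                   transform MAP-VALS STAY ALL]])

;; Same navigator as above: recurse through map values, stop at non-maps.
(def LEAVES
  (recursive-path [] p
    (if-path map?
      [MAP-VALS p]
      STAY)))

;; Hypothetical toy stand-in for @diff-map.
(def toy-map {1 {11 [[120 3 3 0]] 12 []}
              2 {21 [[4 1 0 2]]}})

;; Query: first leaf equal to the target, or nil if there is none.
(def found (select-first [LEAVES #(= % [[120 3 3 0]])] toy-map))

;; Query: every odd number inside every leaf, as one flat sequence.
(def odds (select [LEAVES ALL ALL odd?] toy-map))

;; Transform: same nesting, with only the even numbers incremented.
(def bumped (transform [LEAVES ALL ALL even?] inc toy-map))
```

With this toy input, `found` is `[[120 3 3 0]]`, `odds` holds 3, 3 and 1, and `bumped` is `{1 {11 [[121 3 3 1]] 12 []} 2 {21 [[5 1 1 3]]}}`. The keys and the empty vector pass through untouched because the path never navigates to them.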
(setval [LEAVES ALL AFTER-ELEM] 10 @diff-map)
appends the value 10 to each vector of numbers
For just filtering would you recommend calling `transform` with `identity` as the function?
what's an example of what you want to filter?
most likely it will be something along the lines of (setval [...] NONE @diff-map)
(setval [LEAVES ALL (selected? ALL even?)] NONE @diff-map)
will remove any vector of nums that has an even number within
I was thinking something like (transform [LEAVES ALL ALL #(not= [])] identity @diff-map)
what you wrote will be a no-op
you want to remove empty vectors?
Yes. I'm more interested in working with the vectors as a whole than values inside of them. So that's one example. I can see wanting to apply a function to a whole vector like you did above with `inc` as well, though.
(setval [LEAVES ALL #(= [] %)] NONE @diff-map)
will remove empty vectors
(transform [LEAVES ALL] some-custom-fn @diff-map)
runs an arbitrary function on each vec of nums
the path in transform says what you want to change in the data structure, meaning anything else will remain unchanged
so what you wrote says to transform each non empty vector with identity
This is really great! I don't quite understand how you defined the LEAVES navigator to begin with, but from there everything is incredibly simple.
it's pretty simple
it's just saying how to get to the vec of vecs
`recursive-path` lets you have the path refer to itself (using `p` in this case)
it's saying "if currently at a map, recurse at all the map vals, otherwise already at the target so stay navigated"
example?
Let's say I have two of that map above, so the keys are equal, and I want a new map with the same structure whose values are the result of a binary function applied to the old ones?
I don't want to make it more complicated by adding a long function to multiply or divide two vectors, but you get the idea. I would also ideally like to be able to choose what to do with key signatures that don't match, but that seems more complicated.
ah, yea specter won't help with that
that sounds like a somewhat complicated merge-with
I can't quite remember, but that must have been the first thing I tried for that before `postwalk`. It works with nesting?
your call to `merge-with` will have to be recursive
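A recursive `merge-with` along those lines could be sketched with core functions only. The names `deep-merge-with` and `deep-intersect-with` are hypothetical helpers, the toy maps use made-up keys, and `into` stands in for whatever binary function on two vectors is actually wanted. The first variant keeps disjoint keys as is; the second throws them out.

```clojure
;; Keeps keys present in only one map as is; applies f where both maps
;; reach a non-map value under the same key path.
(defn deep-merge-with [f a b]
  (if (and (map? a) (map? b))
    (merge-with (partial deep-merge-with f) a b)
    (f a b)))

;; Variant that drops any key path not present in both maps.
(defn deep-intersect-with [f a b]
  (if (and (map? a) (map? b))
    (into {} (for [k (keys a) :when (contains? b k)]
               [k (deep-intersect-with f (get a k) (get b k))]))
    (f a b)))

;; Hypothetical toy inputs with partly overlapping key signatures.
(def m1 {1 {11 [[2 1 0]]} 2 [[9 0 0]]})
(def m2 {1 {11 [[3 0 1]] 12 [[5 0 0]]}})

(def merged   (deep-merge-with into m1 m2))
(def matching (deep-intersect-with into m1 m2))
```

Here `merged` is `{1 {11 [[2 1 0] [3 0 1]] 12 [[5 0 0]]} 2 [[9 0 0]]}` and `matching` is `{1 {11 [[2 1 0] [3 0 1]]}}`. Both sketches assume the two maps have the same shape wherever their keys overlap; a map on one side and a vector on the other would be handed straight to `f`.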
i've thought a bit about extending specter to walk multiple structures at once, but I haven't personally had many use cases for it
it would likely be a massive change to how specter works and I'm not sure it can be done while maintaining near-optimal efficiency
Yeah, I'm not taking into account efficiency of using specter for something like that.
those transformations I was showing before will be near-optimal efficiency
I do think being able to precisely define a path and apply `transform` to that will really help me with this library.
will blow away the performance of postwalk
yea, I think this should be a good start for you
I guess the only other thing I was going to ask about was sorting and why `(into (sorted-map) @diff-map)` wasn't changing anything. This one came out perfectly sorted, but when they get really gigantic and hairy, they're not on the deepest level. I designed it that way in case I wanted to use something similar to `pmap`.
that will only sort the first level
you can define a navigator to every map and then do (transform MAP-NODES #(into (sorted-map) %) @diff-map)
(def MAP-NODES
  (recursive-path [] p
    (if-path map?
      (continue-then-stay MAP-VALS p))))
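To see the "only the first level" behaviour without Specter, here is a core-only sketch of the same idea: rebuild every nested map as a sorted map, children first. `deep-sorted` is a hypothetical helper name and `toy-map` is made-up data; this is a plain recursive function, not the navigator-based approach above.

```clojure
;; Hypothetical toy stand-in for @diff-map, with keys deliberately out of order.
(def toy-map {3 {32 [[1]] 31 [[2]]}
              1 {12 [[3]] 11 [[4]]}})

;; Only the outer map becomes sorted; the nested maps are copied over unchanged.
(def shallow (into (sorted-map) toy-map))

;; Rebuild each map as a sorted map, recursing into the values first.
(defn deep-sorted [x]
  (if (map? x)
    (into (sorted-map) (map (fn [[k v]] [k (deep-sorted v)])) x)
    x))

(def sorted-all (deep-sorted toy-map))
```

`(sorted? (get shallow 1))` is `false`, while `(sorted? (get sorted-all 1))` is `true` and `(keys (get sorted-all 3))` is `(31 32)`.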
anyway, it's bedtime for me, I'll be back online tomorrow if you have more questions
Thanks so much! This will definitely get me started. I was also just going to ask whether you find much of a speed increase using `int-map` with these kind of functions?
I've never used `int-map` so I'm not familiar with which operations it speeds up
I'm guessing it's faster for `get` and `assoc`, which these transformations we've been doing aren't using