#beginners
2022-05-27
stagmoose01:05:44

I know what -> and ->> do in code, and I think of these threading macros as a "pipeline" thing. I've seen others write some atomic functions, test them in a REPL to make sure they work, and finally put these funcs in a pipeline using threading macros, like (-> x-thing (atomic-func1 arg1-1 ...) (atomic-func2 arg2-1 ...)). My question is: when designing those atomic functions, what's your mental model for making sure the x-thing is always the first arg (->) or the last arg (->>)? I know one can choose the parameter order when defining a function, but what if you use a thread-first macro and mix functions you wrote with a Clojure built-in function, and that built-in function takes the x-thing as its last arg?

seancorfield02:05:47

In addition, this page https://clojure.org/guides/threading_macros has the following to say:
• By convention, core functions that operate on sequences expect the sequence as their last argument. Accordingly, pipelines containing map, filter, remove, reduce, into, etc usually call for the ->> macro.
• Core functions that operate on data structures, on the other hand, expect the value they work on as their first argument. These include assoc, update, dissoc, get and their -in variants. Pipelines that transform maps using these functions often require the -> macro.
• When calling methods through Java interop, the Java object is passed in as the first argument. In such cases, -> is useful, for example...

👍 3
1
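
One common way to handle the mixed case from the question is to nest ->> inside -> for the sequence steps, or to use as-> to name the threaded value explicitly. A minimal sketch; the order map, the extra item, and the line-total helper are made up for illustration and are not from the thread:

```clojure
;; Hypothetical sample data and helper, for illustration only.
(def order {:items [{:price 10 :qty 2} {:price 5 :qty 1}]})
(def extra {:price 1 :qty 3})

(defn line-total [{:keys [price qty]}]
  (* price qty))

;; -> threads the value as the FIRST argument (update, keyword lookup).
;; Nesting ->> switches to LAST-argument threading for map and reduce.
(def total-nested
  (-> order
      (update :items conj extra)
      :items
      (->> (map line-total)
           (reduce +))))

;; as-> binds the threaded value to a name ($ here), so each step
;; can place it in whatever argument position it needs.
(def total-named
  (as-> order $
    (update $ :items conj extra)
    (:items $)
    (map line-total $)
    (reduce + $)))
```

Both forms compute the same total. The nested ->> works because -> inserts the threaded value as the first argument of the ->> form, which is exactly where ->> expects its starting value.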
dpsutton03:05:44

i thought i remembered alex had a good blog post about collections versus sequences but i cannot find it right now

🙏 1
seancorfield03:05:42

(and collections https://insideclojure.org/2016/03/16/collections/ but that's mostly about internals)

stagmoose03:05:48

Thanks for all the replies from you guys! I'll read that material later and post questions if I have any problems.

Jon Olick17:05:52

question about

(= (pmap inc [1 2 3 4 5]) [2 3 4 5 6])

Jon Olick17:05:10

it seems to me that pmap returns a bunch of futures... and = automatically deref's them?

Jon Olick17:05:26

or rather a lazy-seq of futures

Jon Olick17:05:30

but same difference in this case

Jon Olick17:05:07

The question is, does (= automatically deref things to do comparisons?

Jon Olick17:05:53

or does pmap automatically deref things before handing back the values in the lazy list?

dpsutton17:05:17

connection=> (pmap inc (range 10))
(1 2 3 4 5 6 7 8 9 10)
here’s the return value. it is just a sequence of integers. = has no futures to deal with

Jon Olick17:05:15

you can see that pmap just does a map of futures

dpsutton17:05:58

look at the step function

dpsutton17:05:03

in the source there

Jon Olick17:05:06

ok yeah, so looks like it deref's before it hands it back
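
The conclusion above can be checked with a stripped-down version of the idea. This is a rough sketch, not the real clojure.core/pmap (which is semi-lazy and limits how far ahead it runs); simple-pmap is a hypothetical name used only here:

```clojure
;; Hypothetical simplified pmap: start one future per element,
;; then deref each future before handing values back.
(defn simple-pmap [f coll]
  (let [futs (doall (map #(future (f %)) coll))] ; doall starts all futures now
    (map deref futs)))                           ; results are plain values

;; The caller only ever sees the dereferenced values,
;; so = compares ordinary numbers.
(def result (simple-pmap inc [1 2 3 4 5]))

;; = itself does not deref: a future object is not equal to its value.
(def future-equals-value? (= (future 1) 1))

;; When run as a script, call (shutdown-agents) at the end so the
;; future thread pool does not keep the JVM alive.
```

Here result is equal to [2 3 4 5 6], while future-equals-value? is false, which matches the reading that the deref happens inside pmap rather than inside =.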

Jose Varela17:05:16

Thank you @seancorfield for always responding to posts, you’re prob top 1 aid in advancing my Clojure journey!

🙌 4
❤️ 6
sheluchin19:05:07

Is there any way to get a shape description of a large data structure?

phronmophobic19:05:40

I'd be interested to hear if you have a particular kind of description in mind. My attempts at addressing this problem are https://github.com/phronmophobic/treemap-clj and https://github.com/phronmophobic/viscous. There are also various other generic data exploration tools:
• https://github.com/djblue/portal
• https://vlaaad.github.io/reveal/
• https://docs.cider.mx/cider/debugging/inspector.html
• https://docs.datomic.com/cloud/other-tools/REBL.html

sheluchin20:05:50

@phronmophobic Your projects are quite neat! I don't have a very detailed description in mind. I was hoping others have thought through the details before me 🙂 In general, I'm looking for something that can take a large data structure and provide some sort of summary that says something like "A vector with n keywords followed by a map, where the map is all keywords to numbers, except this one particular keyword which has a map value", and so on... but not in long sentences like the one I provided 😛 Spec is okay - certainly descriptive enough - but Specs don't look anything close to the literal representation of the data they describe. I guess I want something Spec-like, but still resembling the shape of the literal structure rather than describing it functionally as Spec does. I'm imagining something of this sort:

[k...42 {k->n...99 :particular->{...}}]

phronmophobic20:05:17

That sounds interesting. I still think there's lots of room for improvement in this area. Just some random related thoughts:
• One challenge is that the summaries often require more screen space than an example. Examples can be very informative.
• One aspect of your suggestion that I really like is including a way to express and exploit repetition to compress the summary.
• 1d arrays of numbers have all sorts of summary statistics (e.g. mean, median, mode, range, max, min, std deviation, etc). It would be nice to have similar summary statistics for heterogeneous data.

sheluchin20:05:49

Hmm, yeah, that's a good way of putting it - using repetition to compress the summary. Another approach might be to skip the majority of a special notation and just go with what tutorials often do and provide just enough examples to describe the essence of the thing. Translating my previous example:

[:x :y ... {:a 1 :b 2 :particular { ... }}]
Maybe a mix of both approaches would be good: provide sufficient examples to describe the essence but also include a bit of special notation to provide summary stats of heterogeneous data.

👍 1
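
The "use repetition to compress the summary" idea discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: shape is a hypothetical helper, the type tags are made up, and it does not produce the exact notation proposed in the thread:

```clojure
;; Hypothetical helper: replace leaves with a type tag and collapse
;; repeated element shapes in sequential collections into counts.
(defn shape [x]
  (cond
    (nil? x)        :nil
    (map? x)        (into {} (map (fn [[k v]] [k (shape v)])) x)
    (sequential? x) (frequencies (map shape x)) ; shape -> how many times it occurs
    (keyword? x)    :keyword
    (number? x)     :number
    (string? x)     :string
    :else           (symbol (.getSimpleName (class x)))))

(def example
  (shape [:x :y {:a 1 :b 2 :particular {:c "s"}}]))
;; example is a map from each distinct element shape to its count:
;; {:keyword 2
;;  {:a :number, :b :number, :particular {:c :string}} 1}
```

This version loses element order and does not compress map keys (it will not say "all keywords to numbers"), so it is closer to a starting point than to the notation imagined above.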
sheluchin18:05:15

@phronmophobic just one more thought on the subject, since you seem to do a lot of work in the area... I don't know if my approach is unusual, but I often find myself diffing structures in the course of troubleshooting. I guess it must be a common enough angle, since Reveal comes with built-in https://github.com/lambdaisland/deep-diff2 support. Anyway, in a summary with a bunch of statistics like you mention above, it might be helpful to think about making those summaries statistically diff-able as well, i.e. not just the textual difference between two means, but the calculated difference. Just a thought 🙂
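
A "calculated difference" between two summaries might look something like the sketch below. It uses only clojure.core (not deep-diff2), and summarize and diff-summaries are hypothetical names chosen for this example:

```clojure
;; Hypothetical summary of a non-empty collection of numbers.
(defn summarize [xs]
  (let [n (count xs)]
    {:count n
     :min   (apply min xs)
     :max   (apply max xs)
     :mean  (/ (reduce + xs) n)}))

;; Numeric diff: subtract each statistic in `before` from the matching
;; statistic in `after`. Assumes both summaries have the same keys.
(defn diff-summaries [before after]
  (merge-with - after before))

(def delta
  (diff-summaries (summarize [1 2 3])
                  (summarize [2 4 6 8])))
;; delta => {:count 1, :min 1, :max 5, :mean 3}
```

The result says how much each statistic moved, rather than showing two numbers side by side as a textual diff would.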