This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2024-02-06
Channels
- # aleph (43)
- # announcements (11)
- # babashka (35)
- # beginners (70)
- # calva (4)
- # cider (8)
- # clerk (15)
- # clojure (192)
- # clojure-dev (7)
- # clojure-europe (44)
- # clojure-nl (2)
- # clojure-norway (65)
- # clojure-uk (4)
- # code-reviews (4)
- # conjure (1)
- # cursive (41)
- # data-science (1)
- # datomic (8)
- # emacs (7)
- # fulcro (13)
- # humbleui (17)
- # hyperfiddle (53)
- # kaocha (4)
- # malli (7)
- # missionary (17)
- # music (1)
- # obb (1)
- # off-topic (8)
- # polylith (1)
- # portal (3)
- # releases (11)
- # shadow-cljs (36)
- # squint (4)
- # tools-deps (4)
moin moin
Morning!
Morning! Back from FOSDEM '24 ... recording available on the Cyber Resilience Act (CRA) and the Product Liability Directive (PLD) ... - https://fosdem.org/2024/schedule/event/fosdem-2024-3683-the-regulators-are-coming-one-year-on/
And a follow-up panel - https://fosdem.org/2024/schedule/event/fosdem-2024-3676-cra-pld-panel/
I'm wrestling with a deps.edn tree that is determined to download artifacts from maven central, which is not available via the cantankerous corporate firewall.
does clj have a way to generate a consolidated deps.edn file? or a --trace equivalent so I can see where it is picking up the url from?
I can see a list of deps.edn files in clj -Sverbose, but none of them reference maven central...
There is -Strace
https://clojure.org/reference/clojure_cli#opt_strace
-Strace does not seem to survive the "Failed to read artifact descriptor" error, so it does not provide any useful debug info
I might have asked before, but do you prefer:
(into #{} (filter my-pred) my-list)
or
(->> my-list (filter my-pred) distinct)
?
if order is to be kept you can still use
(into [] (comp (filter my-pred) (distinct)) my-list)
and if you want the end result to be lazy then
(sequence (comp (filter my-pred) (distinct)) my-list)
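For concreteness, with a toy my-pred and my-list (hypothetical stand-ins, not from the thread), the variants compare like this:

```clojure
(def my-list [1 2 2 3 3 3])
(def my-pred odd?)

(into #{} (filter my-pred) my-list)                   ; => #{1 3}  (no order)
(->> my-list (filter my-pred) distinct)               ; => (1 3)   (lazy-ish)
(into [] (comp (filter my-pred) (distinct)) my-list)  ; => [1 3]   (eager, ordered)
(sequence (comp (filter my-pred) (distinct)) my-list) ; => (1 3)   (lazy, ordered)
```

Note that the transducer arity of `distinct` is called as `(distinct)`, while in the `->>` threading it's the seq function `distinct` applied directly.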
in this instance it is the distinctness I'm after mostly. And afterwards I'm usually using it in a seq compatible fashion, so either works.
my pref is the transducer (either into, xforms/into, sequence, eduction, transduce, whatever)
If I have a seq pipeline of map/filter/etc. that ends in a reduce, then I go for transduce
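A minimal sketch of that shape, with toy data (not from the thread): the map/filter steps are fused into one pass and reduced with + at the end.

```clojure
;; instead of (->> (range 10) (map inc) (filter even?) (reduce + 0))
(transduce (comp (map inc) (filter even?))
           +
           0
           (range 10))
;; => 30  (2 + 4 + 6 + 8 + 10)
```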
things get a bit more complicated where the data is big enough that I want to go multithreaded; that's when I start to reach for tech.ml.dataset.reductions/group-by-column-agg or ham-fisted.reduce/preduce and friends
Another way to go multithreaded with transducers is core.async pipelines. Same semantics as into
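A rough sketch of what that can look like, assuming core.async is on the classpath (the parallelism of 4 and the channel sizes are arbitrary picks, and the data is a toy example):

```clojure
(require '[clojure.core.async :as a])

(def result
  (let [in  (a/to-chan! (range 100))
        out (a/chan 32)]
    ;; run the transducer across 4 worker threads; pipeline preserves order
    (a/pipeline 4 out (comp (filter odd?) (map #(* % %))) in)
    (a/<!! (a/into [] out))))
;; result is the squares of the odd numbers in 0..99, in order
```

One caveat: pipeline applies the transducer to each element independently, so stateful transducers like `(distinct)` or `(partition-all n)` don't belong here.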
I've done that too. I often found it pretty slow, but I'm thinking that was a PEBCAK problem rather than anything else
what I did like about core.async was being able to create a tree of processing. A lot of the stuff I want to do has a common beginning (and/or common middle steps), and being able to do those things once felt nice even if it didn't give me the wall-clock performance I'm after (most of my constraints are around me sitting in front of a computer waiting for a bit of analysis or projection or modelling to complete)
if someone could point me to how I can really make core.async do batch processing in a fast way that takes advantage of all the cores in front of me, I'd be very grateful
The main bottleneck is that per-element processing isn't very efficient for CPU-bound tasks. You can partition the input, transduce each batch in parallel, then combine the results. Essentially a DIY map-reduce. Play with the batch size and parallelism parameters and you'll probably get significant speedups
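A sketch of that DIY map-reduce, assuming the reducing function f is associative and init is its identity (the function name and parameters here are made up for illustration):

```clojure
(defn batched-transduce
  "Partition coll into batches, transduce each batch on its own
   thread via pmap, then combine the per-batch results with f."
  [xform f init batch-size coll]
  (->> (partition-all batch-size coll)
       (pmap #(transduce xform f init %))
       (reduce f init)))

;; same answer as a single-threaded transduce:
(batched-transduce (map inc) + 0 1000 (range 10000))
;; => 50005000
```

pmap's parallelism is fixed (roughly ncpus + 2) and it's semi-lazy, so for serious use you'd likely swap in futures or an executor, but the partition/transduce/combine shape stays the same.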
yeah, I end up playing with batch size a lot (luckily my data fits well with that as I usually have 100 or 1000 simulations so I can make a reasonable number of batches w/o thinking too hard about it)
For me, I prefer transducers over seqs because they are explicit about the return type (and I'm including laziness in the type). When I started learning Clojure, I tripped over lazy seqs quite a bit, including returning them from dynamically scoped things like db transactions. Also you see people mixing side effects and laziness all the time. transduce
gives you a clear separation between side effects (in the rf) and functional transformation (the xform). The better gc profile and flexible parallelism are, I think, indications that it's a flexible abstraction. I kinda wish they weren't considered an advanced topic. I think lazy seqs have more subtle rules that make them harder to get right when mixed with things like side effects.
@UK0810AQ2 I have used reducers before. I still go for them sometimes, but they're near the bottom of my list.
@U0P0TMEFJ I find using transducers to be really straightforward (most of the time). Tho I do have to admit I still see the rf in the transduce as functional (tho maybe I'm missing something here).
oh ... my bad ... I didn't mean that rf's should be side-effecty, just that they could - and they provide a distinct separation. Transducers are a transformation of a reducing function, and a reduction has a source and a sink. Putting those together into a transducing context means that you can write a little mechanical bit that takes data from a kafka stream and puts data into a db (or whatever), passing in a transducer. It's a great way to separate out the functional transformation (and test it using into) from the clunky side-effecty stuff ... it just seems like a really powerful abstraction to me ...
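A small illustration of that split, where an atom is a hypothetical stand-in for the db/kafka sink:

```clojure
;; the pure transformation - test it on its own with into
(def xform (comp (filter even?) (map #(* % 10))))

(into [] xform (range 10))
;; => [0 20 40 60 80]

;; the same xform driving a side-effecty rf
(def sink (atom []))

(transduce xform
           (fn ([acc] acc)                         ; completion arity
               ([acc v] (swap! sink conj v) acc))  ; stand-in for "write to db"
           nil
           (range 10))

@sink
;; => [0 20 40 60 80]
```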