#clojure
2022-10-26
Daniel Gerson10:10:26

Hey @mkvlr. You wrote that you're keen to move away from core.async in the following thread. As someone who has felt compelled to use this as the only solution in a web-socket 'live' app, I'm wondering if your comments were context dependent or general? Promises/futures are fine for one-time use, but I don't see the alternative. Rather than derail that 🧵 for which the comment wasn't the focus I thought I'd start a new one here. https://clojurians.slack.com/archives/CLX41ASCS/p1666718061315689?thread_ts=1666708739.765979&cid=CLX41ASCS

Daniel Gerson10:10:17

Presumably you see loom as the alternative?

borkdude13:10:44

I expect that core.async will have some loom integration as well

mkvlr13:10:35

@U03B2SRNYTY I’m eager to move away from go blocks, which have the “don’t do IO in them” gotcha and mangle the stack traces.

mkvlr13:10:52

with loom one could presumably only use the blocking parts and it wouldn’t have the same limitations and problems

mkvlr13:10:27

another general problem with core.async is the lack of visibility into what’s going on in the channels

Joshua Suskalo13:10:55

Yeah, with Loom you can do CSP with just virtual threads and a j.u.c.BlockingQueue
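
A minimal sketch of the "virtual threads plus a `BlockingQueue`" idea, assuming JDK 21+ (where `Thread/startVirtualThread` is final). The function name `sum-via-queue` and the `::done` sentinel are illustrative choices, not anything from the thread.

```clojure
(import '(java.util.concurrent ArrayBlockingQueue BlockingQueue))

(defn sum-via-queue
  "Producer puts 0..n-1 on a bounded queue; consumer sums them."
  [n]
  (let [^BlockingQueue q (ArrayBlockingQueue. 8) ; bounded, like a buffered channel
        result (promise)]
    ;; Producer: .put blocks only this virtual thread when the queue is full.
    (Thread/startVirtualThread
     (fn []
       (doseq [i (range n)]
         (.put q i))
       (.put q ::done)))                ; sentinel; a BlockingQueue cannot hold nil
    ;; Consumer: .take blocks until a value is available.
    (Thread/startVirtualThread
     (fn []
       (loop [total 0]
         (let [v (.take q)]
           (if (= v ::done)
             (deliver result total)
             (recur (+ total (long v))))))))
    @result))

(sum-via-queue 5) ; => 10
```

Note that a plain queue gives blocking put/take but nothing like `alts!`, which is the gap raised later in the thread.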

borkdude13:10:41

There is some issue with thread locals + big objects and virtual threads. Since Clojure uses thread locals for dynamic vars, has anyone got more insights on that? The below example is about as fast in bb as in Clojure (100k threads that wait a second, takes about 1-2 seconds to finish the whole thing) https://clojurians.slack.com/archives/CLX41ASCS/p1666722786342499 I tried rewriting the example in Java, but got tired of doing that after the first three lines ;)
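
The linked example is not reproduced here. As a rough stand-in, the sketch below starts `n` virtual threads that each sleep and then waits for all of them, assuming JDK 21+. `run-sleepers` is a hypothetical name, and no timing is claimed.

```clojure
(import '(java.util.concurrent.atomic AtomicLong))

(defn run-sleepers
  "Starts n virtual threads that each sleep for sleep-ms, waits for all of
  them, and returns how many finished."
  [n sleep-ms]
  (let [finished (AtomicLong. 0)
        threads  (mapv (fn [_]
                         (Thread/startVirtualThread
                          (fn []
                            (Thread/sleep (long sleep-ms))
                            (.incrementAndGet finished))))
                       (range n))]
    (run! (fn [^Thread t] (.join t)) threads)
    (.get finished)))

;; To try the scenario described above:
;; (time (run-sleepers 100000 1000))
```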

Daniel Gerson13:10:34

Thanks for the comments. Look forward to comparative approaches blogs. Wonder where clojurescript will sit in all this.

Daniel Gerson13:10:51

@U04V15CAJ Promises by design are one time resolve. If you have an unpredictable stream of data coming in, you have to use something else.
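
A small illustration of the one-time-resolve point, using only `clojure.core` and `java.util.concurrent`. The queue stands in for "something else"; a core.async channel would play the same role.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue))

;; A promise keeps only the first delivered value.
(def p (promise))
(deliver p :first)
(deliver p :second)   ; ignored: the promise is already realized
@p                    ; => :first

;; A queue can carry any number of values over time.
(def events (LinkedBlockingQueue.))
(.put events :first)
(.put events :second)
[(.take events) (.take events)] ; => [:first :second]
```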

borkdude13:10:03

That's true

Ben Sless17:10:11

@U5NCUG8NR I thought that queues are insufficient because of alts?

Joshua Suskalo17:10:52

Yeah personally in a post-loom world I'm going to use core.async as a channel lib

Ben Sless17:10:56

We just need to converge on a tiny library to expose a virtual backed go block (and future?)
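
One possible shape for the "virtual future" half of such a library, mirroring how `clojure.core/future` wraps `future-call`. This is only a sketch assuming JDK 21+; `vthread-executor`, `vfuture-call` and `vfuture` are hypothetical names, and the go-block half is not attempted.

```clojure
(import '(java.util.concurrent Callable ExecutorService Executors))

;; One new virtual thread per submitted task (JDK 21+).
(defonce vthread-executor (Executors/newVirtualThreadPerTaskExecutor))

(defn vfuture-call
  "Runs the no-arg fn f on a virtual thread.
  Returns a java.util.concurrent.Future, which deref/@ understands."
  [f]
  (.submit ^ExecutorService vthread-executor ^Callable f))

(defmacro vfuture
  "Hypothetical counterpart of clojure.core/future backed by virtual threads."
  [& body]
  `(vfuture-call (fn [] ~@body)))

@(vfuture (Thread/sleep 10) (+ 1 2)) ; => 3
```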

didibus17:10:59

Loom still suffers from unfair scheduling, and CPU-heavy compute can hoard the real threads. Hope they address that as well. Otherwise you still need to be careful about CPU-heavy dispatch.

borkdude17:10:10

@UK0810AQ2 Can't the <!! and async/thread stuff be supplied a VirtualTaskPerThreadPool or so? Then there don't have to be any API changes perhaps

Joshua Suskalo17:10:15

@U04V15CAJ what I just posted is my attempt at monkeypatching core.async to use a virtual thread pool for everything

Joshua Suskalo17:10:16

sourcehut srasu is like my testbed for stuff that I'm not expecting to catch on in the community at large

borkdude17:10:36

In bb the core.async go macro is emulated by normal OS threads, and then <! operations are backed by <!! like your monkeypatch. Once bb is built using JDK19 I want to fix that by using virtual threads for the go macro

didibus18:10:01

I think ideally you change a bit more stuff, because core.async itself uses go mechanisms for things like put! and such. So I think ideally you change everything to use Loom. I also think in theory, having the ability to also use real threads would be nice. So for me the best change would be to keep thread for real threads and go for virtual threads and just update all the other ops to work with those. It would mean you don't really need !! or ! anymore since they'd both work regardless of thread type, so you could probably just alias one into the other.

borkdude18:10:29

exactly. what has put! got to do with go though?

borkdude18:10:30

there is a similarity in that you should not use blocking IO in put!

Ben Sless18:10:22

I don't think I'd monkeypatch, just add a virtual/go macro which runs on a virtual thread, only use !!

Joshua Suskalo18:10:02

go is not used internally in put! , which uses the thread pool I created in the monkeypatch. Making thread use OS threads is I think not a bad idea and is something I thought about, but the key reason I didn't do that is that a lot of existing core.async code uses thread to wrap blocking IO for use within an async context, and for that context you want that to be using virtual threads too.

👍 1
Joshua Suskalo18:10:51

@UK0810AQ2 the reason that I want this to be a monkeypatch is so that I can make libraries which don't update use virtual threads in certain contexts. For me personally this is to update a library I wrote which must maintain compatibility with java 1.8 until Clojure drops support for it and which uses core.async for IO.

Ben Sless18:10:08

I see. In that case, I would expose a non-monkeypatched implementation in one namespace and a monkeypatched one in another, so that users have a choice when requiring the library.

Joshua Suskalo18:10:26

Yeah when I think about releasing this I'd probably do that.

Ben Sless18:10:08

I was also thinking about providing virtual future and virtual go block in the same library and compile the core.async code conditionally if the library is resolved

didibus03:10:24

put! dispatches on the same thread pool as go no?

Joshua Suskalo03:10:27

yes, but you can (and my example does) replace that pool with a vthread pool

didibus03:10:10

Ya, I guess I'm more wondering if certain details need to be rethought. For example, do the put and take queues and the limits on pending puts/takes still make sense? Isn't that just the vthreads at that point? When the core.async pool was fixed it made sense to have them wait in a queue prior to dispatch; now should they just park on a vthread? But my memory of core.async is fuzzy, and I feel I'm always a bit confused by its implementation, so maybe I'm off the mark here.

Daniel Gerson09:10:27

Found Ron Pressler in Java One released 5 days ago, and then found this talk which I see has already been talked about here. Posting again for those that haven't seen it. https://clojurians.slack.com/archives/C8NUSGWG6/p1652437205417559?thread_ts=1652433406.842079&cid=C8NUSGWG6 What I like is how much context there is to design decisions from other languages, compared to just talking about the implementation.

pez11:10:01

Has anyone seen a Clojure cljc library for “social” timestamps? That is, for presenting human-readable relative timestamps such as “yesterday at 13:37” or “2 days ago”, i.e. the equivalent of this GWT library https://github.com/PEZ/GWT-Relative-Time (but a bit better maintained 😃 )

pavlosmelissinos11:10:11

I don't think it covers "yesterday at 13:37" though

pez12:10:31

Thanks! Yes, something like that. I think I can at least use it as a start for what I need. What seems to be missing the most is some i18n thing for it.

diego.videco15:10:30

Hello, why is it that casting an int to double in Clojure is different than in ClojureScript?

dpsutton15:10:28

the jvm has distinct integer and double types. The javascript vm has only a single Number type. 3 and 3.0 are distinct in jvm land. In js, they are indistinguishable and exactly identical

diego.videco16:10:04

I see. We are having some problems validating a schema that checks for doubles in the backend, but the cljs side sometimes sends numbers that look like ints. Is there any easy/simple way around this, besides loosening the schema?

winsome16:10:31

Add a coercion step on the jvm?

💯 1
diego.videco16:10:50

Yeah, that’s another option
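
A minimal sketch of such a coercion step, run on the JVM before schema validation. `coerce-doubles` and the `:price`/`:qty` keys are made up for illustration; a schema library's own coercion facilities would be the more usual route.

```clojure
(defn coerce-doubles
  "Returns m with the numeric values under the keys in ks coerced to double.
  Missing or non-numeric values are left untouched."
  [m ks]
  (reduce (fn [acc k]
            (if (number? (get acc k))
              (update acc k double)
              acc))
          m
          ks))

(coerce-doubles {:price 3 :qty 2} [:price]) ; => {:price 3.0, :qty 2}
```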

didibus16:10:04

There's probably something wrong in the parsing on the server side.

didibus16:10:09

Seems like a bug

dpsutton16:10:27

JSON.stringify({"a": 1.00000})
"{\"a\":1}" 

didibus17:10:18

I mean, the server needs to parse the payload from the client, which is most likely encoding all data as strings. So when the server parses that field it probably returns a long, but it should return a double, since that's apparently what the string encoding is supposed to represent.

dpsutton17:10:37

the string encoding will be an integer because the javascript and cljs side cannot represent a double that has exactly 0 decimal component. My example there shows that any string coming from a client cannot send a double 1.0000 using native numbers.

didibus04:10:12

(parse-double "3")
This is a valid encoding for double. The problem is they probably rely on read or something else for parsing, which tries to guess the type from the syntax. If they expect it to be a double and they want to allow 3.0 to be encoded as 3 that's fine, but then the server has to parse the field into a double.
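
To make the difference concrete, here is how a syntax-guessing reader and `parse-double` treat the same text. `parse-double` requires Clojure 1.11 or later; `clojure.edn/read-string` is used as one example of a reader that infers the type from the syntax.

```clojure
(require '[clojure.edn :as edn])

(edn/read-string "3")   ; => 3, a java.lang.Long (type guessed from syntax)
(parse-double "3")      ; => 3.0
(parse-double "3.5")    ; => 3.5
(parse-double "abc")    ; => nil (not parseable as a double)
```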

souenzzo18:10:38

Some time ago I learned that "collection transformers" are composed with ->> and "collection builders" are composed with -> (please read the first comment for the correct explanation). This is a non-strict pattern that clojure.core uses and that many libraries follow, sometimes without even knowing about it. There seems to be one other pattern in libraries: passing the config as the first argument or as the last argument. Configs in the first argument are usually named env or ctx and seem to be more common when qualified keywords are used. Configs in the last argument are usually named opts or config and seem to be more common when the keywords are not qualified.

hiredman19:10:12

never heard of the transformers vs. builder stuff. the explanation I've seen is seq functions take the seq last, collection functions take the collection first, the natural fall out of this is seq functions(map, filter, reduce) compose with ->> and collection functions (get, assoc, update-in, get-in) compose with ->
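
A short illustration of that fallout, using only `clojure.core` functions named in the message:

```clojure
;; Seq functions take the seq last, so they thread with ->>
(->> [1 2 3 4]
     (map inc)        ; (2 3 4 5)
     (filter even?)   ; (2 4)
     (reduce +))      ; => 6

;; Collection functions take the collection first, so they thread with ->
(-> {:a 1}
    (assoc :b 2)      ; {:a 1, :b 2}
    (update :a inc)   ; {:a 2, :b 2}
    (get :a))         ; => 2
```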

souenzzo19:10:43

Is there some place (a blog post or something like that) where this is documented?

didibus04:10:36

It's arguably a slightly annoying design choice, but it was made for good reasons, and the other way around is more of a tradeoff of pros/cons, I feel. Partial/currying is more useful when you wrap the mapping function or predicate and then want to apply the same mapping to multiple collections, which is why functional languages had a tradition of doing this. But it turned out that lambda syntax is often nicer, so partial was never used that much, and a lambda can do a partial that captures anything. And so now we have a bit of a dichotomy, needing to remember between -> and ->> and sometimes contort one into the other. But at the same time, taking it last works better with multi-collection functions like (map + [1 1] [1 1]). You couldn't overload map like this if it took the collection first. I think in the end there's not really a clear winner or an always-perfect rule around when to design the function to take it first or last.

Dallas Surewood21:10:58

What is the accepted way to use spec but without open maps? Are there third party libraries or does Spec itself have a way to massage a map to only the accepted keys?

p-himik21:10:06

https://github.com/metosin/spec-tools has closed-keys. And it uses https://github.com/bhauman/spell-spec which you can use on its own.

Michael Gardner23:10:39

my recommendation is to use Malli 😉

didibus04:10:28

Spec 2 has closed spec as well.

didibus04:10:53

But with spec, I normally just add a predicate that asserts there are no keys outside an allowed set

didibus04:10:12

(s/def ::foo
 (s/and
  ;; the unqualified-key options are spelled :req-un and :opt-un
  (s/keys :req-un [::a ::b] :opt-un [::c])
  #(every? #{:a :b :c} (keys %))))
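
A runnable variant of the predicate approach, with `select-keys` added for the "massage a map to only the accepted keys" part of the question. The spec names and the `allowed-keys` set are illustrative assumptions; only `clojure.spec.alpha` is used.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::a int?)
(s/def ::b int?)
(s/def ::c string?)

(def allowed-keys #{:a :b :c})

(s/def ::closed-foo
  (s/and (s/keys :req-un [::a ::b] :opt-un [::c])
         ;; reject any key outside the allowed set
         #(every? allowed-keys (keys %))))

(s/valid? ::closed-foo {:a 1 :b 2})          ; => true
(s/valid? ::closed-foo {:a 1 :b 2 :d 4})     ; => false

;; Dropping unknown keys instead of rejecting the map:
(select-keys {:a 1 :b 2 :d 4} allowed-keys)  ; => {:a 1, :b 2}
```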

didibus16:11:46

True, I would not use it haha. Malli is probably the spiritual spec2, though I find the original still quite good and use Spec1 normally.

kwladyka21:10:41

Why exactly do transients (https://clojure.org/reference/transients) perform better? I mean a detailed answer as to why. How does Clojure behave differently for transients, in detail?

andy.fingerhut21:10:33

If you create a transient of an immutable collection, it creates a "mutable version" of it, that shares almost all structure with the immutable original.

andy.fingerhut21:10:12

As you do add/remove/etc. operations on this mutable version, it creates new mutable tree nodes as needed, using the path copying approach that is also used for immutable updates.

andy.fingerhut21:10:36

If a transient modify operation finds a mutable node in a tree, it can mutate it in place rather than allocate a new immutable one.

andy.fingerhut21:10:26

Thus if you do a sequence of ! update operations on a transient collection, it depends on exactly what updates you do, but every time it mutates one of these already-mutable tree nodes, you save a memory allocation.

andy.fingerhut21:10:59

If you want more details than that, there is the Java source code 🙂

andy.fingerhut21:10:36

If you are not familiar with what these trees look like, or what "path copying" means, I'd recommend reading this excellent article, and its sequels (which I believe are linked in the first one): https://hypirion.com/musings/understanding-persistent-vector-pt-1

kwladyka22:10:03

> If a transient modify operation finds a mutable node in a tree, it can mutate it in place rather than allocate a new immutable one. What is the difference here? Does the immutable version keep some history of changes, and is this what makes it slower? I would like to learn the exact difference.

kwladyka22:10:42

I thought Clojure make new immutable value by keeping reference to old values and add new values on top of it.

kwladyka22:10:52

Well then it can make a tree of reference in memory hmm

kwladyka22:10:23

I am thinking about this like about C++ * and & maybe it is not how it works

kwladyka22:10:32

I have to read this article, it looks like a long one

kwladyka22:10:04

oh wait, do I understand correctly that Clojure literally copies data to keep it immutable? It doesn't refer to the data but copies it?

kwladyka22:10:09

Ok so my final question is: what are use cases for transients and what are uses cases to not use transients ?

kwladyka22:10:23

Do functions like reduce, recur, loop, etc. use data as transients by default? I don’t think so. Why isn't that the default?

kwladyka22:10:18

actually it starts to look a little like the opposite of other languages. Normally people use const to say a variable is immutable, but we should use transients to say it is mutable.

kwladyka22:10:41

otherwise we lose performance each time

seancorfield00:10:21

@U0WL6FA77 Using transients does have a cost so they're faster when a lot of updates are being made "locally" -- because repeated updates can be faster because of optimization involved (using mutability to reduce the amount of reference copying -- since the old versions no longer need to be retained). But the pattern should always be: 1. make code work 2. if you have a lot of updates on a single data structure in a limited context, try transient/`persistent!` to see whether the faster updates outweigh the conversion to/from transient data.
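
A minimal sketch of that pattern: the same local build-up written once with persistent updates and once with `transient`/`persistent!`. The function names are illustrative, and whether the transient version is actually faster for a given size is something to measure (for example with `time` or a benchmarking library), not something this snippet claims.

```clojure
(defn squares-persistent
  "Builds [0 1 4 ...] with ordinary immutable conj."
  [n]
  (reduce (fn [v i] (conj v (* i i))) [] (range n)))

(defn squares-transient
  "Same result, but the accumulator is a transient that never escapes
  this function; it is made persistent again before returning."
  [n]
  (persistent!
   (reduce (fn [v i] (conj! v (* i i))) (transient []) (range n))))

(= (squares-persistent 1000) (squares-transient 1000)) ; => true
```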

seancorfield00:10:36

If you look at the source of into you'll see that it decides to use transients, depending on the type of the to collection passed in and it assumes that there will be repeated calls to conj! to make that worthwhile (which it will increasingly be for larger from collections).
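
A simplified sketch of that decision, modeled on the two-argument arity of `into` (the real function also has a transducer arity, omitted here). `into-sketch` is a hypothetical name.

```clojure
(defn into-sketch
  "Uses a transient when the target collection supports one,
  otherwise falls back to plain conj."
  [to from]
  (if (instance? clojure.lang.IEditableCollection to)
    (with-meta (persistent! (reduce conj! (transient to) from)) (meta to))
    (reduce conj to from)))

(into-sketch [1 2] [3 4])    ; => [1 2 3 4]   (vector: transient path)
(into-sketch '(1 2) [3 4])   ; => (4 3 1 2)   (list: plain conj path)
```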

andy.fingerhut01:10:34

"I thought Clojure make new immutable value by keeping reference to old values and add new values on top of it." It does, for immutable collections, which requires allocating new memory to create a new collection, but it is NOT a complete copy. Typically O(log N) new memory is allocated to add or remove one element to/from a collection.

andy.fingerhut01:10:36

Using transients to build a vector of N elements starting from an empty vector in Clojure is about as optimized as can be, due to the way transients and vectors work, so into's use of transient vectors is awesome for performance when building a vector.

andy.fingerhut01:10:51

(or appending N elements to the end of an existing vector)

didibus04:10:20

The way I think of it and maybe it's also wrong:

1 -> 2 -> 3
;; Change 3 to 4
1 -> 2 -> 3
       -> 4
;; There's now two versions, but each
;; share 1 -> 2 so those are not copied
;; which means it now forms a tree

;; Old reference will take the top branch
;; and see [1 2 3], but new references
;; will take the bottom branch and see
;; [1 2 4]
;;
;; With transient you would instead mutate
;; So if you make the references taking
;; the bottom branch transient and update
;; 4 to 5
1 -> 2 -> 3
       -> 4
;; Becomes
1 -> 2 -> 3
       -> 5
That means inside the transient context, the transient path is mutated, it doesn't affect other references that are taking a different path. But it's not going to mutate everything, because if you try to update 1 it can't just mutate that without affecting the other references, so in that case it would have to fork again and then from that point on it could mutate on the transient fork. This is why in practice transient is good when you will be mutating the part that is not shared, because that's when it's faster, otherwise it's the same. I also think conversion from/to transient adds a bit of overhead and so maybe for small number of operations you're going to do inside the transient context it's not worth it.

didibus04:10:35

That means transient is really good when you are creating a big collection from scratch in a single local context. Which is why into uses transient for example. Otherwise it could also be good if you're going to make a lot of updates or add a lot more elements to it especially at the end where it's for sure not gonna be shared.

kwladyka07:10:51

Thank you for all explanations

kwladyka09:10:12

Any other functions like into ?

kwladyka09:10:40

Is there a list of all clojure.lang.IEditableCollection type of data?

andy.fingerhut11:10:20

Inside the clojure.core namespace, the only other function I see that takes advantage of transients is update-vals. Others have been proposed to use transients in JIRA tickets, but either not evaluated in detail, or just not applied yet. I think zipmap might have been considered for similar treatment?

andy.fingerhut11:10:18

Oh, never mind, zipmap does use transients for a freshly created collection, always. It just doesn't use IEditableCollection, because it does not update its input collection(s).

andy.fingerhut11:10:05

grep'ing the Clojure source code is the best way, for core collection types.
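
For a quick check at the REPL, the interface can also be tested directly on a value. `editable?` is a hypothetical helper name.

```clojure
(defn editable?
  "True when coll can be turned into a transient."
  [coll]
  (instance? clojure.lang.IEditableCollection coll))

(editable? [])            ; => true
(editable? {})            ; => true
(editable? #{})           ; => true
(editable? '())           ; => false
(editable? (sorted-map))  ; => false
```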

kwladyka11:10:10

It is interesting that mapv uses transients, but map doesn’t

kwladyka11:10:54

well, I understand map is lazy, but it's worth noting the difference

andy.fingerhut13:10:34

Definitely a difference. I am having a difficult time imagining how transients might be helpful for producing lazy results

jaide21:10:08

Can .cljc files wrap target specific code in the reader conditionals?

jaide21:10:19

Like can I refer to js/document.querySelector if wrapped in a reader conditional?

seancorfield21:10:56

The closest example I have (in my own code) is https://github.com/clojure-expectations/clojure-test/blob/develop/src/expectations/clojure/test.cljc#L380-L394 which shows how code can be completely different for Clojure and ClojureScript and can use symbols that don't exist in the other dialect.
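
To see why the other dialect's symbols are harmless, the reader can be asked which branch it keeps. The sketch below reads a conditional from a string on the JVM, so that it also runs outside a `.cljc` file. The `"#app"` selector and the `:no-dom` placeholder are made up for illustration.

```clojure
;; In a .cljc file the conditional would be written directly, e.g.
;;   #?(:cljs (js/document.querySelector "#app") :clj :no-dom)
(def source "#?(:cljs (js/document.querySelector \"#app\") :clj :no-dom)")

;; Read as Clojure: the :cljs branch is skipped, so js/... is never compiled.
(def as-clj (read-string {:read-cond :allow} source))
;; => :no-dom

;; Read with :cljs added to the feature set: the other branch is kept.
;; It comes back as unevaluated data.
(def as-cljs (read-string {:read-cond :allow :features #{:cljs}} source))
;; => (js/document.querySelector "#app")
```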

jaide21:10:54

Thanks! Just confirmed that behavior with a small test case too

jaide22:10:17

Looks like I need to convert a Java file to a byte-array. Was hoping (.getBytes file) would do the trick but that does not seem to work

hiredman22:10:47

a java.io.ByteArrayOutputStream and copy will do it like a baos

😎 1
jaide22:10:11

I found:

(java.nio.file.Files/readAllBytes (java.nio.file.Paths/get (.toURI file)))
Would you say your solution is better?

hiredman22:10:35

you can call .toPath on file
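
Both suggestions side by side, as a sketch. `file->bytes-nio` and `file->bytes-copy` are hypothetical names; the calls themselves are from `java.nio.file.Files`, `java.io` and `clojure.java.io`.

```clojure
(require '[clojure.java.io :as io])
(import '(java.io ByteArrayOutputStream File)
        '(java.nio.file Files))

(defn file->bytes-nio
  "Reads the whole file via java.nio, using .toPath on the File."
  [^File file]
  (Files/readAllBytes (.toPath file)))

(defn file->bytes-copy
  "Copies the file into a ByteArrayOutputStream and returns its bytes."
  [^File file]
  (with-open [in  (io/input-stream file)
              out (ByteArrayOutputStream.)]
    (io/copy in out)
    (.toByteArray out)))
```

The `io/copy` version also works when the source is an arbitrary input stream rather than a file on disk.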

Dallas Surewood23:10:44

Is anyone using integrant? How can I access components without using integrant.repl.state?

wevrem23:10:57

You pass them down through your functions.

Dallas Surewood23:10:27

I'm not sure what you mean. I don't mean referencing them in other integrant components. When defining an init-key for a component, the result is returned to that key for other integrant components to reference. I'm wondering how to reference that component in other parts of the code, outside of system.edn. One solution would be swapping the component to an atom before returning it from init-key, but I was wondering if I was missing something

hiredman23:10:56

I haven't used integrant, but similar to component: if something needs a component, then pass it in; nothing should need access to the entire system

hiredman23:10:38

so like, if function F1 needs component C1, well you find out that F1 is called by F2, which is called by F3, which is called from component C2, then component C2 depends on C1, and passes C1 down through F3, F2, and finally to F1
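
The same chain in plain Clojure, with no lifecycle library involved. Every name here (`f1`, `make-c2`, the map standing in for C1) is a placeholder for illustration.

```clojure
;; c1 stands in for a started component, here just a map acting as a database.
(defn f1 [c1 id] (get c1 id))   ; the function that actually needs C1
(defn f2 [c1 id] (f1 c1 id))    ; only passes it along
(defn f3 [c1 id] (f2 c1 id))    ; only passes it along

(defn make-c2
  "C2 depends on C1 and hands it down the call chain."
  [c1]
  {:lookup (fn [id] (f3 c1 id))})

(def c1 {42 "answer"})
(def c2 (make-c2 c1))

((:lookup c2) 42) ; => "answer"
```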

Dallas Surewood00:10:36

The component is just a function, and the place it's needed is in a route. Integrant is a little different, and I don't think there's an equivalent scenario to what you described

hiredman00:10:16

I don't think it is that different, my preferred approach is to have handlers be components, so they can directly depend on other components as needed. Then there is some kind of route building component that depends on all the handlers and wires them up with routing, then the webserver component depends on the routes

seancorfield00:10:50

There's an #C52HVRVE1 channel but I don't know how active it is. In general, with systems like these (Integrant, Component, Mount) you are building a "system" from its dependent components and you "get at" those subcomponents by pulling them out of the "system" -- and pass what bits you need down the call chain. As hiredman says, another approach for web apps is to actually make your handlers into "components" that have as dependencies exactly those parts they need and then the "system" they have access to is smaller and flat (and uses whatever names the handler's dependencies have declared rather than the generic ones in the overall "system"). And then your overall "system" depends on all those handler "components" so that the lifecycle library can pull it all together. How easy (or hard) that is might depend on what libraries you have in play in your web app...

hiredman00:10:13

That exact thing may not work, because integrant requires you to refer to components by the "global" name on the system, so having multiple instances of the same component parameterized differently doesn't work the same

wevrem00:10:55

A few months ago I went through the process of switching my app from mount to integrant. With mount you define state scattered here and there throughout your namespaces with its defstate macro. Then in your functions and all over your code you just refer to those state variables, like global variables. Switching to integrant (or if you were starting with integrant), I had to refactor my functions so that the state they needed would be passed in by the caller and they no longer rely on any global variables. In my integrant config, I have a system which is the collection of all the components I need, like databases and smtp servers and credential secrets, etc. When I start the http server (also an integrant component — basically the topmost component), it uses system to initialize itself, and passes system to anything else that needs it, like my ring handler.

seancorfield00:10:45

@UTFAPNRPT That sounds very similar to how we use Component in most of our apps at work.

👍 1
Dallas Surewood00:10:07

The component is a HugSQL query function made after attempting a conman connection. I have a function called api-routes. In your example, The route component would depend on the query function, and it would pass the query-fn to api-routes so that it could pass it further down to any controller and any function the controller is calling. Is that correct?

Dallas Surewood00:10:24

Wevrem, not sure how to pass system down to other functions.

wevrem00:10:20

I could show you, if you want. Not sure if there is some sort of “quick video call” feature on slack or not.

seancorfield00:10:59

(it's based on my Component/Compojure-based version)

Dallas Surewood00:10:15

Thanks for the offer, won't be able to take a call right now. I'll look at that link.

wevrem00:10:22

If you are still stumped after looking at that, we can find a time to share screen and discuss.

Dallas Surewood01:10:44

I think I get the gist. I'm using a bootstrap framework, so I'm trying to draw parallels between this and what my project's structure is. But it looks like I missed that this is already doing it halfway. Looks like the component I want is being passed to another component.

Dallas Surewood01:10:58

:reitit.routes/api
 {:base-path "/api"
  :env #ig/ref :system/env
  :query-fn #ig/ref :db.sql/query-fn}

Dallas Surewood01:10:42

The init-key just didn't use it, so I didn't see it. I've changed it now.

(defmethod ig/init-key :reitit.routes/api
  [_ {:keys [base-path query-fn]
      :or   {base-path ""}
      :as   opts}]
  [base-path (route-data opts) (api-routes opts query-fn)])

wevrem01:10:49

That looks right.

wevrem22:08:15

Ten months later and I’ve changed my approach slightly. (Assuming anyone cares to know…) I’m still happily using integrant, but I don’t pass the system down from caller to route-handler, because doing so required me to wrap all my route handlers in what I’ll call “handler-generator-functions” that close over system. Like this:

(defn create-handler [system]
  (fn [request]
    ;; ... refer to system as needed
    response))
And then my route definitions were themselves functions with calls to all these generator functions:
(defn routes [system]
  ["/" {:get (create-handler system)}
   ...])
That was my old way of doing it, and involved, as you can see, a lot of higher-order functions. My new way involves (1) a very important (HOF) middleware that injects the system into the request, (2) reitit’s middleware registry, and then (3) all my handlers can just pick up what they need from the request itself. No wrapping necessary, and my routes can go back to simple defines.
(ns ajax.middleware)

(defn inject-system
  "Generate middleware to inject system into request"
  [system]
  (fn [handler]
    (fn [request]
      (handler (assoc request :ajax/system system)))))

(ns ajax.router
  (:require [reitit.ring :as ring]
            [ajax.middleware :as mw]
            [ajax.api :as api]))

(defn router [system]
  (ring/router [api/routes
                home/routes
                ...]
               {:reitit.middleware/registry
                ,, {:inject-system (mw/inject-system system)}}))

(ns ajax.api)

(def routes
  ["/api" {:middleware [:inject-system ...] 
           :get handler-that-retrieves-system-from-request
           ...}])
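
The injection idea can be tried without reitit at all, since the middleware is just a higher-order function over Ring-style handlers. In the sketch below the system map, `greeting-handler` and `:app-name` are made up for illustration.

```clojure
(defn inject-system
  "Generates middleware that adds system to each request under :ajax/system."
  [system]
  (fn [handler]
    (fn [request]
      (handler (assoc request :ajax/system system)))))

;; A handler needs no wrapping; it picks the system up from the request.
(defn greeting-handler [{:ajax/keys [system]}]
  {:status 200
   :body   (str "Hello from " (:app-name system))})

(def app
  ((inject-system {:app-name "demo"}) greeting-handler))

(app {:request-method :get :uri "/"})
;; => {:status 200, :body "Hello from demo"}
```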