#clojure
2023-12-14
itaied06:12:19

Hey all, trying to wrap my head around spec. I don't know if I'm missing something or it's just not suitable for our needs. I'm looking for a data validation tool, and spec offers much more (I really liked the generate functionality). Point is, our data is a nested map, and I don't want to fully qualify (namespace) the keys, as there's already a lot of code that uses it. Is it a requirement to use spec maps with fully qualified keywords, or can we work around this?

lassemaatta06:12:28

use the :req-un and :opt-un with s/keys

lassemaatta06:12:58

https://clojure.org/guides/spec#_entity_maps > Much existing Clojure code does not use maps with namespaced keys and so keys can also specify :req-un and :opt-un for required and optional unqualified keys. These variants specify namespaced keys used to find their specification, but the map only checks for the unqualified version of the keys.

itaied06:12:47

oh I have missed that then.. thanks! 👍
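
As a rough illustration of the :req-un / :opt-un advice above, here is a minimal sketch for a nested map with unqualified keys. The spec names (::person, ::address, ::street, ::zip, ::name) are hypothetical and chosen only for this example; the specs are registered under namespaced keywords, while the validated map uses the plain keys.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical specs, registered under namespaced names.
(s/def ::street string?)
(s/def ::zip string?)
(s/def ::name string?)

;; :req-un / :opt-un look up ::street etc. in the registry,
;; but check for the unqualified keys :street, :zip in the map.
(s/def ::address (s/keys :req-un [::street] :opt-un [::zip]))
(s/def ::person (s/keys :req-un [::name ::address]))

(def ok-person {:name "Ada" :address {:street "1 Main St"}})
(def bad-person {:name "Ada" :address {:zip "12345"}}) ; :street is missing

(s/valid? ::person ok-person)  ; => true
(s/valid? ::person bad-person) ; => false
```

s/explain can be called with the same arguments to see which nested key failed.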

itaied10:12:32

How can I write a recursive spec? I'm trying to write something like:

{:k1 {}
 :k2 {}
 :children [:n1 v1
            :n2 v2]}
Where v1 and v2 are the same structure of this object (recursive object).

dvingo11:12:21

@U057T406Y15 I would highly recommend using malli for data validation needs; you can easily define recursive schemas (and do much more): https://github.com/metosin/malli#recursive-schemas

lassemaatta11:12:28

perhaps something a bit like (s/def ::my-map (s/keys :opt-un [::k1 ::k2 ::children])) + (s/def ::children (s/map-of keyword? ::my-map)) might work. it's been a while since I've clojure.specced.

itaied11:12:12

ok thanks, ill take a look at malli as well
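
The recursive spec suggested by lassemaatta can be tried out roughly as follows. This is a sketch under assumptions: ::k1 and ::k2 are given a placeholder map? predicate because the original question does not say what they hold, and the children here are a map from keyword to node (the vector form comes up next in the conversation).

```clojure
(require '[clojure.spec.alpha :as s])

;; Placeholder predicates: the real shape of :k1 / :k2 is not given in the chat.
(s/def ::k1 map?)
(s/def ::k2 map?)

;; s/keys only stores the key names, so ::children may be defined afterwards.
(s/def ::my-map (s/keys :opt-un [::k1 ::k2 ::children]))

;; Each child value must itself satisfy ::my-map, which makes the spec recursive.
(s/def ::children (s/map-of keyword? ::my-map))

(def nested {:k1 {}
             :k2 {}
             :children {:n1 {:k1 {}}
                        :n2 {:children {:n3 {}}}}})

(s/valid? ::my-map nested)              ; => true
(s/valid? ::my-map {:children {:n1 42}}) ; => false, 42 is not a map
```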

itaied11:12:53

instead of map-of, how would you validate a vector that behaves like a map? note that the structure is [:k1 :v1 :k2 :v2 ...]

lassemaatta11:12:52

https://clojure.org/guides/spec#_sequences might help, particularly the example about "opts are alternating keywords and booleans"

🙌 1
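
For the vector form [:n1 v1 :n2 v2 ...], one possible sketch uses the regex ops from the sequences section of the guide. The :vec-tree namespace and the map? placeholders for :k1 / :k2 are assumptions made for this example only.

```clojure
(require '[clojure.spec.alpha :as s])

;; Placeholder predicates; the chat does not specify these values.
(s/def :vec-tree/k1 map?)
(s/def :vec-tree/k2 map?)

(s/def :vec-tree/node
  (s/keys :opt-un [:vec-tree/k1 :vec-tree/k2 :vec-tree/children]))

;; Zero or more (keyword, node) pairs, and the whole thing must be a vector.
(s/def :vec-tree/children
  (s/and vector?
         (s/* (s/cat :name keyword? :node :vec-tree/node))))

(def tree {:k1 {}
           :children [:n1 {:k2 {}}
                      :n2 {:children [:n3 {}]}]})

(s/valid? :vec-tree/node tree)                      ; => true
(s/valid? :vec-tree/node {:children [:n1 {} :n2]})  ; => false, dangling key
(s/valid? :vec-tree/node {:children [:n1 42]})      ; => false, 42 is not a node
```
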
cjsauer15:12:16

👋 hi folks, is anyone using clojure with kafka? Wondering what the state of the art as far as libraries/tools is in that space nowadays.

vemv15:12:58

See also #apache-kafka

cjsauer14:12:27

much appreciated!

Noah Bogart16:12:48

is there a way to avoid RT.intCast(n) when calling something like (.charAt s n)?

Alex Miller (Clojure team)16:12:08

use int ? why would you call RT.intCast directly

oyakushev16:12:33

Use unchecked-int:

(clj-java-decompiler.core/decompile (defn char-at [^String s ^long n] (.charAt s (unchecked-int n))))

...
    public static Object invokeStatic(final Object s, final long n) {
        return ((String)s).charAt((int)n);
    }
...

oyakushev16:12:37

(int)n would basically be a no-op on little-endian architectures, so this is good enough

Alex Miller (Clojure team)16:12:04

unchecked-int is an intrinsic so gets replaced with the jvm bytecode L2I directly, which is likely optimized away by hotspot in most cases

☝️ 1
Alex Miller (Clojure team)16:12:44

just as a general rule, you should never be calling RT directly

oyakushev16:12:11

I think Noah meant how to avoid the compiler generating an RT.intCast call.

👍 1
Noah Bogart16:12:16

yeah, i'm just toying around with some ways of counting chars in a string (advent of code related), and looking at the decompiled output of a loop i wrote, i see calls to RT.intCast(n)

Alex Miller (Clojure team)16:12:28

if you can let bound n as an int prior, it will be and stay an int

oyakushev16:12:39

For something so trivial, RT.intCast pops up surprisingly often in profiles when doing something string- or array-index-related in a loop. So unchecked-int is a good tool.

oyakushev16:12:21

This also works in my example:

(set! *unchecked-math* true)

(decompile (defn char-at [^String s ^long n] (.charAt s (int n))))

...
return ((String)s).charAt((int)n);
...

Alex Miller (Clojure team)16:12:27

(let [n (.length s)] ...) for example will let bind n as a primitive int and you will not need a long to int cast

oyakushev16:12:05

But one should be careful because I swear I sometimes see RT.uncheckedIntCast instead in case *unchecked-math* is enabled.

Alex Miller (Clojure team)16:12:29

well you should if converting long to int

Noah Bogart16:12:51

(let [s (str/join "?" (repeat 100 "a?s?df"))
      len (.length s)]
  (loop [cnt (int 0)
         idx (int 0)]
    (if (< idx len)
      (if (.equals \? (.charAt s idx))
        (recur (unchecked-inc cnt)
               (unchecked-inc idx))
        (recur cnt (unchecked-inc idx)))
      cnt)))
produces
public static Object invokeStatic() {
    final Object s = string$join.invokeStatic("?", core$repeat.invokeStatic(const__2, "a?s?df"));
    final int len = ((String)s).length();
    long cnt = RT.intCast(0L);
    long idx = RT.intCast(0L);
    while (idx < len) {
        if (((Character)const__6).equals(((String)s).charAt(RT.intCast(idx)))) {
            final long n = cnt + 1L;
            ++idx;
            cnt = n;
        }
        else {
            final long n2 = cnt;
            ++idx;
            cnt = n2;
        }
    }
    return Numbers.num(cnt);
}

oyakushev16:12:58

But doesn't *unchecked-math* already imply you are prepared for pain?

Noah Bogart16:12:03

(with direct-linking enabled)

oyakushev16:12:56

Ah, you mean if the compiler doesn't know n is a primitive, then yes, can't do it the other way.

oyakushev16:12:13

@UEENNMX0T unchecked-int

👍 1
Alex Miller (Clojure team)16:12:25

yeah, loop bindings can't remain as ints, so you really can't avoid a long to int cast this way

👍 1
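
Putting the advice together, a minimal sketch of the counting loop might keep the loop bindings as longs (since loop does not hold ints) and use unchecked-int only at the .charAt call. The function name count-qmarks is made up for this example, and the bytecode it compiles to has not been checked here with a decompiler.

```clojure
(defn count-qmarks
  "Counts occurrences of \\? in s. Loop locals stay primitive longs;
  the index is narrowed with unchecked-int only where Java needs an int."
  [^String s]
  (let [len (.length s)]
    (loop [idx 0
           cnt 0]
      (if (< idx len)
        (recur (unchecked-inc idx)
               (if (= \? (.charAt s (unchecked-int idx)))
                 (unchecked-inc cnt)
                 cnt))
        cnt))))

(count-qmarks "a?s?df") ; => 2
```
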
Noah Bogart16:12:38

i didn't know that about loop bindings

Alex Miller (Clojure team)16:12:40

unless you cheat with int[]

👀 1
Alex Miller (Clojure team)16:12:25

I actually would like to change that about loops, for exactly this kind of case - everything in Java colls and arrays and strings is int-indexed. we have a ticket for this somewhere

Alex Miller (Clojure team)16:12:06

there's not really any reason we couldn't bend loop to allow long, double, and int as primitive bindings

Alex Miller (Clojure team)16:12:37

(it does long and double now)

Noah Bogart16:12:43

the idea being that passing type hints across loops is risky?

oyakushev16:12:02

I guess the only reason was that the compiler is marginally smaller if you only get to handle longs and doubles.

oyakushev16:12:19

And it is also consistent with longs and doubles being the only primitives on function boundary.

Alex Miller (Clojure team)16:12:55

it is consistent with that, but I think bending the rule for ints would catch an extremely common interop pattern

❤️ 1
oyakushev16:12:17

I personally would love that.

Alex Miller (Clojure team)16:12:43

the long/int conversions are really cheap, but they aren't free

oyakushev16:12:46

Can also finally write safepoint-less int loops in pure Clojure.

Alex Miller (Clojure team)16:12:10

yeah, that's it (don't care about float though :)

Noah Bogart16:12:28

lol i'm not here to argue about floats, but ints would be nice 😇

Noah Bogart16:12:46

thanks for the help, friends

❤️ 1
Noah Bogart16:12:56

now it's only marginally slower than calling (count (re-seq #"\?" s)) lmao

oyakushev16:12:07

If you want to get really nerdy about it, and get to pick your JVM version (post-9 uses byte arrays for ASCII strings where it's even more efficient), then you could read the string 8 bytes at a time into a long, then xor it with a prepared mask of ????????, and then use POPCNT to count the matches. Something that Richard Startin calls SWAR (SIMD within a register).

😄 1
oyakushev16:12:17

Useless here for sure, but it is a fun exercise.

Noah Bogart16:12:40

that sounds like fun haha

oyakushev16:12:38

Actually, scratch that, POPCNT wouldn't work as that would be bit-wise, not byte-wise, but there is a solution somewhere along those lines.

Alex Miller (Clojure team)16:12:48

assuming you ignore multi-byte characters of course :)

oyakushev17:12:46

Sure, it's for advent, so inputs are known
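
One possible sketch of "a solution somewhere along those lines", not necessarily what was meant in the chat: after XOR-ing with the mask, every matching byte becomes zero, and a well-known zero-byte detection trick turns each zero byte into a single set high bit, which Long/bitCount can then count. It assumes single-byte (ASCII) input as discussed above; all names here are hypothetical.

```clojure
(import 'java.nio.ByteBuffer)

(def ^:const low7  0x7F7F7F7F7F7F7F7F) ; low 7 bits of every byte
(def ^:const qmask 0x3F3F3F3F3F3F3F3F) ; \? is 0x3F, repeated 8 times

(defn count-zero-bytes
  "Number of zero bytes in the 64-bit word x."
  [^long x]
  (let [t (unchecked-add (bit-and x low7) low7) ; high bit set if low 7 bits non-zero
        z (bit-not (bit-or t x low7))]          ; 0x80 exactly where the byte was zero
    (Long/bitCount z)))

(defn count-qmarks-swar
  "Counts 0x3F bytes in bs, 8 bytes at a time, then the tail byte by byte."
  [^bytes bs]
  (let [n    (alength bs)
        full (- n (rem n 8))
        buf  (ByteBuffer/wrap bs)]
    (loop [i 0, cnt 0]
      (cond
        (< i full) (recur (+ i 8)
                          (+ cnt (count-zero-bytes
                                  (bit-xor (.getLong buf (int i)) qmask))))
        (< i n)    (recur (inc i)
                          (if (== (aget bs i) 0x3F) (inc cnt) cnt))
        :else      cnt))))

(count-qmarks-swar (.getBytes "a?b?cdefgh?" "US-ASCII")) ; => 3
```

Byte order does not matter here because only the number of matching bytes is counted, not their positions.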

Ben Sless18:12:48

Was going to ask for primitive char comparisons 🙃

Noah Bogart18:12:17

what do you mean by "primitive char comparisons"?

Ben Sless19:12:21

There's no clojure.lang.Util/equiv(char, char)

👍 1
Ben Sless19:12:51

Any string processing in Clojure will always suffer a bit in relation to java

Ben Sless19:12:02

I have no idea why I didn't find it, I blame it on being sick

Ben Sless19:12:21

Primitive char arguments would be nice, but I know the reasons for not supporting them

Joel16:12:30

In regard to load-string and variants, can I expect to dynamically update a namespace/function if the code has been loaded using AOT previously? For example, (load-string "(ns com.project.logic) (defn f [] 42)") after I see com.project.logic__init (AOT class) was loaded?

Joel16:12:28

Just asking as we are doing some class loading shenanigans which might be causing this use case to fail. Just wanted to make sure “load after AOT bootstrap” should generally work.

Alex Miller (Clojure team)16:12:06

so you are first loading an aot class, then calling loadString?

✔️ 1
Joel16:12:13

Exactly. Thru Java really:

IFn inString = Clojure.var("clojure.core", "load-string");
inString.invoke("(ns com.project.logic) (defn f [] 42)");

Alex Miller (Clojure team)16:12:06

the namespaces are runtime state, so ns is not going to create anything new. the defn can create a new var in the existing namespace at runtime

Alex Miller (Clojure team)16:12:17

so if I understand what you're asking, I think that should work

Joel16:12:38

I didn’t know if I needed to do (defn ^:dynamic f []…

Alex Miller (Clojure team)17:12:26

is f already loaded or new?

Joel17:12:40

already loaded.

Alex Miller (Clojure team)17:12:54

re-def'ing vars is something you do all the time at the repl, doesn't need to be dynamic

Joel17:12:45

Got it. I think you answered my question, basically loading/redef should still work even after AOT.

Alex Miller (Clojure team)17:12:55

AOT is really irrelevant here
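
A small sketch of the redefinition being discussed. Here the first load-string merely stands in for the namespace having been loaded already (from an AOT class or otherwise); the second call re-defs f in the existing namespace. The vars before and after are names made up for this example.

```clojure
;; Stand-in for the namespace having been loaded earlier (e.g. from AOT classes).
(load-string "(ns com.project.logic) (defn f [] 42)")
(def before ((resolve 'com.project.logic/f)))

;; Re-def the same var at runtime; no ^:dynamic needed.
(load-string "(ns com.project.logic) (defn f [] 43)")
(def after ((resolve 'com.project.logic/f)))

[before after] ; => [42 43]
```

Callers that go through the var, as resolve does here, see the new definition.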

seancorfield18:12:05

Does direct-linking impact this?

Joel18:12:56

@U04V70XH6 From what I gather it looks like if it were direct-linked, ^:dynamic would correct that.

seancorfield18:12:33

I want to say there's some other metadata that is specifically meant to override direct-linking... ^:redef perhaps? I've never had to use it (but we direct-link our AOT'd uberjars at work so it was a conscious tradeoff to lose a degree of REPL-based redefinition in production). Yes, we have REPLs running inside some of our production processes.

Alex Miller (Clojure team)18:12:58

there is ^:redef, but it doesn't affect the ability to rebind

Alex Miller (Clojure team)18:12:23

it does affect whether other directly linked functions see the rebinding

Alex Miller (Clojure team)18:12:33

redef (and dynamic) vars aren't direct linked

Joel19:12:55

Sounds like i should try tagging the AOT function as ^:redef. Is there a way to tell if I’m using direct linking? The build script I presume would indicate somehow.

Alex Miller (Clojure team)19:12:30

it's a compiler option (https://clojure.org/reference/compilation#directlinking) but by default it's off, so you're probably not using it
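
As a hedged sketch of how one might check this from a REPL: *compiler-options* reflects the options of the running process (for example when started with -Dclojure.compiler.direct-linking=true), which is not necessarily how an already AOT-compiled jar was built. The names direct-linking-enabled? and answer are hypothetical.

```clojure
(defn direct-linking-enabled?
  "True if the current process compiles with direct linking turned on."
  []
  (boolean (:direct-linking *compiler-options*)))

;; Marking a var ^:redef keeps calls to it from being direct linked,
;; so other compiled functions see later redefinitions.
(defn ^:redef answer [] 42)

(direct-linking-enabled?)
(:redef (meta #'answer)) ; => true
```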

stephenmhopper22:12:18

I'm working on a distributed application that's going to use NATS as the messaging layer. Messages are written as byte arrays. I'm looking at using Transit for reading and writing messages. Should I be creating a new ByteArrayOutputStream and writer every time I need to convert a message to a byte array before sending it, or is there a way to re-use writers? I currently have this but it means I'm creating a ByteArrayOutputStream and writer every time I need to send a message:

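;; Assumes the transit-clj library (com.cognitect/transit-clj) is on the classpath.
(require '[cognitect.transit :as transit])
(import 'java.io.ByteArrayOutputStream)

(defn clj->byte-array [obj]
  (with-open [baos (ByteArrayOutputStream.)]
    (let [writer (transit/writer baos :json)]
      (transit/write writer obj)
      (.toByteArray baos))))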