This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-12-14
Hey all, trying to wrap my head around spec, I don't know if I'm missing something or it's just not suitable for our needs.
I'm looking for a data validation tool, and spec offers much more (I really liked the generate functionality).
Point is, our data is a nested map, and I don't want to fully qualify (namespace) the keys, as there's already a lot of code that uses it.
Is it a requirement to use spec maps with fully qualified keywords, or can we work around this?
use the :req-un and :opt-un with s/keys
https://clojure.org/guides/spec#_entity_maps
> Much existing Clojure code does not use maps with namespaced keys and so keys can also specify :req-un and :opt-un for required and optional unqualified keys. These variants specify namespaced keys used to find their specification, but the map only checks for the unqualified version of the keys.
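A minimal sketch of what the quoted guide describes, assuming Clojure 1.9+ with clojure.spec.alpha on the classpath. The spec names (::name, ::age, ::nickname, ::person) are made up for illustration and are not from the thread:

```clojure
(require '[clojure.spec.alpha :as s])

;; The specs are registered under namespaced keywords...
(s/def ::name string?)
(s/def ::age pos-int?)
(s/def ::nickname string?)

;; ...but :req-un / :opt-un make s/keys look for the unqualified
;; keys (:name, :age, :nickname) in the map being validated.
(s/def ::person
  (s/keys :req-un [::name ::age]
          :opt-un [::nickname]))

(s/valid? ::person {:name "Ada" :age 36})    ; => true
(s/valid? ::person {:name "Ada"})            ; => false, :age is required
(s/valid? ::person {:name "Ada" :age "36"})  ; => false, :age fails pos-int?
```

Existing code that passes plain-keyword maps around should not need to change; only the spec registry uses qualified names.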
How can I write a recursive spec? I'm trying to write something like:
{:k1 {}
 :k2 {}
 :children [:n1 v1
            :n2 v2]}
Where v1 and v2 have the same structure as this object (recursive object).
@U057T406Y15 I would highly recommend using malli for data validation needs; you can easily define recursive schemas (and do much more): https://github.com/metosin/malli#recursive-schemas
perhaps something a bit like (s/def ::my-map (s/keys :opt-un [::k1 ::k2 ::children])) + (s/def ::children (s/map-of keyword? ::my-map)) might work. it's been a while since I've clojure.specced.
instead of map-of, how would you validate a vector that behaves like a map?
note that the structure is [:k1 :v1 :k2 :v2 ...]
https://clojure.org/guides/spec#_sequences might help, particularly the example about "opts are alternating keywords and booleans"
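One way to combine the two suggestions above, as a sketch rather than a definitive answer: a recursive s/keys spec whose :children value is a vector of alternating keywords and nested nodes, described with the regex ops from the linked guide section. The name ::node and the map? predicates for ::k1 and ::k2 are assumptions made for the example:

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumed shapes for :k1 and :k2; the thread only shows them as {}.
(s/def ::k1 map?)
(s/def ::k2 map?)

;; ::node refers to ::children before it is defined; that is fine
;; because spec looks up keyword references at validation time.
(s/def ::node (s/keys :opt-un [::k1 ::k2 ::children]))

;; [:n1 v1 :n2 v2 ...]: zero or more keyword/node pairs in a vector.
(s/def ::children
  (s/and vector?
         (s/* (s/cat :name keyword? :node ::node))))

(s/valid? ::node {:k1 {}
                  :k2 {}
                  :children [:n1 {:k1 {}}
                             :n2 {:children [:n3 {}]}]})  ; => true
(s/valid? ::node {:children [:n1]})                       ; => false, dangling key
(s/valid? ::node {:children [:n1 {:children [:n2 42]}]})  ; => false, 42 is not a node
```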
this came up in the malli channel a while back: https://clojurians.slack.com/archives/CLDK6MFMK/p1694865844204819?thread_ts=1694457936.119919&cid=CLDK6MFMK
👋 hi folks, is anyone using clojure with kafka? Wondering what the state of the art as far as libraries/tools is in that space nowadays.
some good info here too https://www.youtube.com/watch?v=ogZHQF_cucQ
is there a way to avoid RT.intCast(n) when calling something like (.charAt s n)?
use int? why would you call RT.intCast directly
Use unchecked-int:
(clj-java-decompiler.core/decompile (defn char-at [^String s ^long n] (.charAt s (unchecked-int n))))
...
public static Object invokeStatic(final Object s, final long n) {
return ((String)s).charAt((int)n);
}
...
(int)n would basically be a no-op on little-endian architectures, so this is good enough
unchecked-int is an intrinsic so gets replaced with the jvm bytecode L2I directly, which is likely optimized away by hotspot in most cases
just as a general rule, you should never be calling RT directly
yeah, i'm just toying around with some ways of counting chars in a string (advent of code related), and looking at the decompiled of a loop i wrote, i see calls to RT.intCast(n)
if you can let bound n as an int prior, it will be and stay an int
RT.intCast pops up surprisingly often in the profile for being quite trivial when doing something string/array-index related in a loop. So unchecked-int is a good tool.
This also works in my example:
(set! *unchecked-math* true)
(decompile (defn char-at [^String s ^long n] (.charAt s (int n))))
...
return ((String)s).charAt((int)n);
...
(let [n (.length s)] ...) for example will let bind n as a primitive int and you will not need a long to int cast
But one should be careful because I swear I sometimes see RT.uncheckedIntCast instead in case *unchecked-math* is enabled.
well you should if converting long to int
(let [s (str/join "?" (repeat 100 "a?s?df"))
      len (.length s)]
  (loop [cnt (int 0)
         idx (int 0)]
    (if (< idx len)
      (if (.equals \? (.charAt s idx))
        (recur (unchecked-inc cnt)
               (unchecked-inc idx))
        (recur cnt (unchecked-inc idx)))
      cnt)))
produces
public static Object invokeStatic() {
    final Object s = string$join.invokeStatic("?", core$repeat.invokeStatic(const__2, "a?s?df"));
    final int len = ((String)s).length();
    long cnt = RT.intCast(0L);
    long idx = RT.intCast(0L);
    while (idx < len) {
        if (((Character)const__6).equals(((String)s).charAt(RT.intCast(idx)))) {
            final long n = cnt + 1L;
            ++idx;
            cnt = n;
        }
        else {
            final long n2 = cnt;
            ++idx;
            cnt = n2;
        }
    }
    return Numbers.num(cnt);
}
(with direct-linking enabled)
Ah, you mean if the compiler doesn't know n is a primitive, then yes, can't do it the other way.
yeah, loop bindings can't remain as ints, so you really can't avoid a long to int cast this way
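Putting the advice from this thread together, a rough sketch of the counting loop that keeps the index as a primitive long (which loop supports) and narrows it with unchecked-int only at the .charAt call. The function name count-char is hypothetical, and no benchmark numbers are claimed:

```clojure
(defn count-char
  "Counts how many times the character ch occurs in the string s."
  ^long [^String s ch]
  (let [len (.length s)   ; let-bound, so this stays a primitive int
        c   (char ch)]    ; primitive char for the comparison
    (loop [idx 0
           cnt 0]         ; loop bindings are primitive longs
      (if (< idx len)
        (recur (unchecked-inc idx)
               ;; unchecked-int narrows the long index without the
               ;; range check that (int idx) / RT.intCast performs
               (if (= c (.charAt s (unchecked-int idx)))
                 (unchecked-inc cnt)
                 cnt))
        cnt))))

(count-char "a?s?df" \?)  ; => 2
```

Skipping the range check is safe here only because idx is always below len, which itself is an int.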
i didn't know that about loop bindings
I actually would like to change that about loops, for exactly this kind of case - everything in Java colls and arrays and strings is int-indexed. we have a ticket for this somewhere
there's not really any reason we couldn't bend loop to allow long, double, and int as primitive bindings
(it does long and double now)
the idea being that passing type hints across loops is risky?
I guess the only reason was that the compiler is marginally smaller if you only get to handle longs and doubles.
And it is also consistent with longs and doubles being the only primitives on function boundary.
it is consistent with that, but I think bending the rule for ints would catch an extremely common interop pattern
the long/int conversions are really cheap, but they aren't free
related ask: https://ask.clojure.org/index.php/4612/loop-should-retain-primitive-int-or-float-without-widening
yeah, that's it (don't care about float though :)
lol i'm not here to argue about floats, but ints would be nice 😇
now it's only marginally slower than calling (count (re-seq #"\?" s))
lmao
If you want to get really nerdy about it, and get to pick your JVM version (post-9 uses byte arrays for ASCII strings where it's even more efficient), then you could read the string 8 bytes at a time into a long, then xor it with a prepared mask of ????????, and then use POPCNT to count the matches. Something that Richard Startin calls SWAR (SIMD within a register).
that sounds like fun haha
Actually, scratch that, POPCNT wouldn't work as that would be bit-wise, not byte-wise, but there is a solution somewhere along those lines.
assuming you ignore multi-byte characters of course :)
what do you mean by "primitive char comparisons"?
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Util.java#L118-L120 ?
Primitive char arguments would be nice, but I know the reasons for not supporting them
In regard to load-string and variants, can I expect to dynamically update a namespace/function if the code has been loaded using AOT previously?
For example, (load-string "(ns com.project.logic) (defn f [] 42)") after I see com.project.logic__init (AOT class) was loaded?
Just asking as we are doing some class loading shenanigans which might be causing this use case to fail. Just wanted to make sure “load after AOT bootstrap” should generally work.
so you are first loading an aot class, then calling loadString?
Exactly. Thru Java really:
IFn inString = Clojure.var("clojure.core", "load-string");
inString.invoke("(ns com.project.logic) (defn f [] 42)");
the namespaces are runtime state, so ns is not going to create anything new. the defn can create a new var in the existing namespace at runtime
is f already loaded or new?
re-def'ing vars is something you do all the time at the repl, doesn't need to be dynamic
Got it. I think you answered my question, basically loading/redef should still work even after AOT.
AOT is really irrelevant here
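A small illustration of the point above, run from Clojure rather than Java and without any AOT step. The first load-string call stands in for whatever loaded the namespace originally; f-var is a name invented for the example:

```clojure
;; Initial definition of the function.
(load-string "(ns com.project.logic) (defn f [] 42)")

;; Hold on to the var, the way a long-lived caller might.
(def f-var (resolve 'com.project.logic/f))
(f-var)  ; => 42

;; Loading new source later re-defs the same var in the existing
;; namespace; callers that go through the var see the new root value.
(load-string "(ns com.project.logic) (defn f [] 43)")
(f-var)  ; => 43

;; The var itself was updated in place, not replaced.
(identical? f-var (resolve 'com.project.logic/f))  ; => true
```

This shows only the plain var-indirection case; callers compiled with direct linking are a separate matter.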
Does direct-linking impact this?
@U04V70XH6 From what I gather it looks like if it were direct-linked, ^:dynamic would correct that.
I want to say there's some other metadata that is specifically meant to override direct-linking... ^:redef perhaps? I've never had to use it (but we direct-link our AOT'd uberjars at work so it was a conscious tradeoff to lose a degree of REPL-based redefinition in production). Yes, we have REPLs running inside some of our production processes.
there is ^:redef, but it doesn't affect the ability to rebind
it does affect whether other directly linked functions see the rebinding
redef (and dynamic) vars aren't direct linked
Sounds like i should try tagging the AOT function as ^:redef. Is there a way to tell if I’m using direct linking? The build script I presume would indicate somehow.
it's a compiler option (https://clojure.org/reference/compilation#directlinking) but by default it's off, so you're probably not using it
I'm working on a distributed application that's going to use NATS as the messaging layer. Messages are written as byte arrays. I'm looking at using Transit for reading and writing messages. Should I be creating a new ByteArrayOutputStream and writer every time I need to convert a message to a byte array before sending it, or is there a way to re-use writers?
I currently have this but it means I'm creating a ByteArrayOutputStream and writer every time I need to send a message:
(defn clj->byte-array [obj]
  (with-open [baos (ByteArrayOutputStream.)]
    (let [writer (transit/writer baos :json)]
      (transit/write writer obj)
      (.toByteArray baos))))