This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
I don't know if it's useful, but yesterday I came up with an idea of how to stick tagged literals into json: treat all maps with a single key that starts with "#" as a tagged literal.
E.g. {"#date": "2022-07-16"}
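A rough sketch of how that convention could be decoded after the JSON has already been parsed into Clojure maps with string keys. The names `tag-readers`, `tagged-literal-map?` and `decode-tagged` are made up for illustration, and the `#date` reader (parsing into `java.time.LocalDate`) is just an assumption about what the tag might mean:

```clojure
(require '[clojure.string :as str]
         '[clojure.walk :as walk])

;; Hypothetical registry: tag string -> function that builds the value.
(def tag-readers
  {"#date" (fn [^String s] (java.time.LocalDate/parse s))})

(defn tagged-literal-map?
  "True for a map with exactly one entry whose key is a string starting with #."
  [x]
  (and (map? x)
       (= 1 (count x))
       (let [k (key (first x))]
         (and (string? k) (str/starts-with? k "#")))))

(defn decode-tagged
  "Walks parsed JSON data bottom-up and replaces tagged-literal maps with
  the value produced by the matching reader. Unknown tags are left as maps."
  [data]
  (walk/postwalk
    (fn [x]
      (if (tagged-literal-map? x)
        (let [[tag v] (first x)
              reader  (get tag-readers tag)]
          (if reader (reader v) x))
        x))
    data))

(decode-tagged {"created" {"#date" "2022-07-16"}})
```

One caveat with the convention itself: an ordinary single-key map whose key happens to start with "#" would be indistinguishable from a tagged literal.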
Idiomatic way to transform an empty string to nil, but leave a non-empty string as is? Right now I have:
(when (not-empty s) s)
But something feels off about that.
I answered my own question.
(not-empty s)
I sort of knew about not-empty without knowing the details and was using it as if it were a predicate returning true or false. But in looking it up I learned it is not a predicate: it returns nil for an empty collection and the original collection otherwise. That probably explains why the name doesn't end with ? and why my original use felt somehow off.
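For reference, a quick sketch of that behaviour (the results in the comments are what `not-empty` returns for these inputs):

```clojure
;; not-empty returns its argument unless it is empty (or nil), in which case it returns nil.
(not-empty "")     ;=> nil
(not-empty "abc")  ;=> "abc"
(not-empty nil)    ;=> nil
(not-empty [])     ;=> nil

;; So wrapping it in `when` adds nothing:
(let [s "abc"]
  (= (when (not-empty s) s)
     (not-empty s)))  ;=> true
```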
Alex once recommended that I use str/blank? instead, so the string isn't coerced into a seq
Or rather, it doesn't coerce it to produce the result - only to check the emptiness.
And blank? has different semantics.
Right, that's what I meant in my second message. But it's just a construction of a wrapper object for a non-empty string - not a big deal.
Of course. It's just that one has to be really careful with the semantics here - much more careful than with the construction of a throwaway object (which on a modern JVM seems to be a non-issue).
user=> (str/blank? " ")
true
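To make the difference in semantics concrete, here is a small comparison. The helper `blank->nil` is a hypothetical name, shown only to contrast with `not-empty`:

```clojure
(require '[clojure.string :as str])

;; blank? treats nil, "", and whitespace-only strings as blank.
(str/blank? nil)  ;=> true
(str/blank? "")   ;=> true
(str/blank? " ")  ;=> true
(str/blank? "a")  ;=> false

;; Hypothetical helper built on blank?: whitespace-only strings become nil too.
(defn blank->nil [s]
  (when-not (str/blank? s) s))

(blank->nil " ")  ;=> nil
(not-empty " ")   ;=> " "   (not-empty only cares about emptiness)
```

So swapping one for the other changes what happens to whitespace-only input, which may or may not be what you want.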
Sure, I wasn't saying that this is the answer to a question, just a comment that you might not want to create seq garbage from strings if you don't have to. May or may not be important.
But now that I'm reading back to the question, you're right that this is not what wevrem was after - I guess it was a knee-jerk reaction to: oh someone's using seq on a string again ;)
One man's garbage is another man's treasure. :) Clojure calls seq everywhere - I wouldn't even blink at someone calling it on a string, to be honest. It doesn't actually transform the string, and it doesn't traverse it. It just inspects its length, exactly once.
If something is slow - one should profile it. I'm willing to bet that, unless there's a tight loop doing barely anything other than checking string non-emptiness, seq will not come even close to the top of the "self time" on the profiler report.
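A small REPL sketch of what seq does with a string on the JVM, for anyone curious about the "wrapper object" mentioned above:

```clojure
;; An empty string yields nil: no seq object is created.
(seq "")            ;=> nil

;; A non-empty string yields a lightweight seq view over the original string.
(seq "ab")          ;=> (\a \b)
(class (seq "ab"))  ;=> clojure.lang.StringSeq

;; not-empty hands back the original string, not the seq.
(identical? "ab" (not-empty "ab"))  ;=> true
```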
has anyone tried putting together a formal spec for which characters are allowed where in the language? for example, which are valid in keywords, which are valid in symbols, etc
I'm not sure if this is sufficient, but the LispReader class in clojure.lang has a symbolPat regex that looks like it might be used to read a symbol or keyword - <https://github.com/clojure/clojure/blob/5ffe3833508495ca7c635d47ad7a1c8b820eab76/src/jvm/clojure/lang/LispReader.java#L66>
This is intentionally not formalized. There are characters that are explicitly allowed and a few things explicitly disallowed, and a large intentionally ambiguous area for future expansion
Which is not to say you can't tokenize it (obviously the reader does)
the thing is that I'm going to be tokenizing documentation that may have clojure symbols sprinkled within it and so tokenizing it as part of the full grammar isn't really possible
and so having an idea of what is allowed would let me guess if a stretch of text is a valid symbol or not
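One possible way to make that guess without hand-writing a grammar is to let a reader try the token and check what comes back. The sketch below uses `clojure.edn/read-string` rather than LispReader itself, so it is only an approximation (for example, the edn reader rejects auto-resolved keywords like `::foo`). The function name `symbol-or-keyword-token?` is made up for illustration:

```clojure
(require '[clojure.edn :as edn])

(defn symbol-or-keyword-token?
  "Guesses whether `token` is a single symbol or keyword by reading it as edn.
  The round-trip check (= token (str form)) rejects input where the reader
  consumed only part of the text, e.g. \"foo bar\" or \"foo)\"."
  [token]
  (try
    (let [form (edn/read-string token)]
      (and (or (symbol? form) (keyword? form))
           (= token (str form))))
    (catch Exception _
      false)))

(symbol-or-keyword-token? "str/blank?")  ;=> true
(symbol-or-keyword-token? ":foo/bar")    ;=> true
(symbol-or-keyword-token? "42")          ;=> false
(symbol-or-keyword-token? "foo bar")     ;=> false
```

Because the edn reader is stricter in some places and looser in others than the Clojure reader, treat the result as a heuristic rather than a definition of validity.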
Symbols are intended to allow a pretty wide set of characters
sure seems like it from that pattern
doesn't a large ambiguous area mean that future expansion will likely break backwards compat for code in the wild?
just trying to understand things. this is super helpful
It would depend on what the expansion entails, wouldn't it? If it's expansion of disallowed, yeah, but if it's expansion of explicit allows, then it's "requiring less".
well from the sounds of it the ambiguous area is where things will be expanded, giving certain characters new meaning, which in turn may change the meaning of your code
if that's what you mean
sorry, I'm not 100% sure what you meant there exactly
There are probably not a lot of those kinds of things, but using | for delimiting (similar to Common Lisp) is one thing we've looked at a couple of times
If you want to do what Clojure does, then certainly follow LispReader (which has rarely changed)
cool, thanks alex
fwiw, I just meant that if, for example, the pipe symbol went from ambiguous/unspecified to explicitly disallowed, that'd be a breaking change, but if it went to explicitly allowed, that'd be a non-breaking change
Yes, which is why any such change would only be made with a lot of thinking and early notice
I ask because I want to change tokenizing on cljdoc's docset search so that it can find valid symbols and keywords and such