This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
This might be more of a Java than Clojure issue, but here goes. I'm getting a different sequence of bytes when I call (.getBytes (slurp "/path/to/somefile")) on Sun Java 8 vs OpenJDK 7. How is this possible? I assume it's a Unicode issue, but still...
Is slurp's default behavior going to change between Java versions? (Same version of Clojure on both JVMs.)
@max: does (java.nio.charset.Charset/defaultCharset) return the same thing in both?
@timothypratley: ha no. UTF-8 vs US-ASCII
so in practical terms, I found this bug because I was getting the wrong crc32 checksum of a string taken from a POST request in a Ring app. This particular crc library uses .getBytes.
You can specify a character encoding in the .getBytes call, or set the default encoding as a system property
@otfrom: I use CRFs for tagging parts of natural language phrases, as a module in our search engine, for matching products to cooking recipe ingredients. As I said, directly using the Java APIs works, but could be made nicer.
@max: whenever I see .getBytes in code, I think of a landmine: it's bound to explode someday. The only way to use it safely is with UTF-8, and then you have to be very sure that everything inside and around your app is UTF-8.
@max: you might want to report this as a bug to the library maintainers. Calling .getBytes without an encoding parameter is basically undefined behavior, since the result depends on the JVM's default charset.
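To make the charset point concrete, here is a minimal sketch, assuming the checksum is meant to be computed over UTF-8 bytes. The helper names utf8-bytes and crc32-of are made up for illustration; they are not from the CRC library mentioned above, which may work differently.

```clojure
(import '(java.nio.charset StandardCharsets)
        '(java.util.zip CRC32))

(defn utf8-bytes
  "Encode s as UTF-8, independent of the JVM default charset.
  (Hypothetical helper, not part of any library discussed here.)"
  ^bytes [^String s]
  (.getBytes s StandardCharsets/UTF_8))

(defn crc32-of
  "CRC32 of the UTF-8 encoding of s, using java.util.zip.CRC32."
  [^String s]
  (let [crc (CRC32.)]
    (.update crc (utf8-bytes s))
    (.getValue crc)))

(comment
  ;; slurp also accepts an explicit :encoding option when reading a file.
  (crc32-of (slurp "/path/to/somefile" :encoding "UTF-8")))
```

With the charset passed explicitly, both JVMs should produce the same bytes for the same string. The other route mentioned above is the system property, typically set as -Dfile.encoding=UTF-8 at JVM startup, but that only helps for code you deploy yourself.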
What is the general consensus on shadowing with let assignments? Rather than doing stuff like myvar and then myvar'.
I don't know about the consensus, but I try to avoid it
yes; it's confusing because even when your eyes catch the outer myvar binding, you can't be sure there isn't another inner myvar binding
myvar* is not pretty, but a bit less confusing
Okay. I've always shadowed, but a co-worker pointed out that he prefers the other way unless it's immediately obvious, so you can avoid stuff like myvar' and myvar''.
It depends on the use. I find I don’t need to shadow too often. Sometimes it makes sense, sometimes not.
I don’t think there’s a consensus either way
a related question is whether to shadow clojure.core/type, clojure.core/time etc.
I find I do it often, actually. Let’s say I have a function that processes a string. I tend to just shadow the string in the processing.
@pesterhazy: Personally I’d say beware of that. Can lead to tricky bugs
Not shadowing doesn’t bother me. A little added noise is worth it if it makes things clearer.
@andrewmcveigh: I agree, though it's tempting with common nouns like "type"
Well, the worst thing I find is constructing maps with keys like {:name "something" :type "error" …} and then trying to destructure.
yes I was just about to mention this
Then you call name, and you get some error like cannot call string
so it extends to keywords as well (if you want to use them in destructuring lets)
Just be wary
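A small sketch of the destructuring trap described above. The map and the function names (item, label-shadowed, label-renamed, shadowing-error) are invented for illustration. As far as I can tell, the failure shows up as a ClassCastException, because the local string ends up being called as a function.

```clojure
(def item {:name "something" :type "error"})

;; :keys destructuring binds locals called name and type, which shadow
;; clojure.core/name and clojure.core/type inside the function body.
(defn label-shadowed [{:keys [name type]}]
  (str (name :kind) "=" type))   ; name is the string "something" here

;; Renaming the bindings leaves the core functions reachable.
(defn label-renamed [{item-name :name item-type :type}]
  (str (name :kind) "=" item-type " (" item-name ")"))

(defn shadowing-error
  "Returns the class of the exception thrown by label-shadowed, or nil."
  []
  (try
    (label-shadowed item)
    nil
    (catch ClassCastException e
      (class e))))
```

Calling (label-renamed item) works because name still refers to clojure.core/name; (shadowing-error) reports the exception class from the shadowed variant. Writing clojure.core/name fully qualified is another way out if you want to keep the short binding names.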
Sometimes the best thing to call something is 'reserved'
in particular, if you refactor, the compiler won't catch a mistake (if you rename "type" or "name")
Though I guess, the main ones for me are: type, name, meta,
I never shadow something like map, list, etc.
map is pretty sad as well 😞
you end up having to choose single-letter abbreviations (m), a misspelling (klass) or punctuation (map*)
@akiva: Sure, but there's always a time where the "best" name would shadow something, e.g., (defn operation-on-a-list [list]…)
Though in that case I guess there are unofficial idioms: x, coll, etc.
Well, if it’s still a category...
worst of all, now we also have "update" and "fold", more things to be careful with
ah, not fold, sorry
Yeah, update really kills me - I write a lot of animation/game code where it's a super-common term.
Though I guess more as a function name than a binding name.
For local assignment (and to a slightly lesser extent, destructuring), I find using a gensym fairly reasonable. Something like category# or type#.
voxdolo: is this better in any way than category* or category'?
err. type* or type'
@pesterhazy: what's the difference between the binding category and category*?
Pesterhazy: Since it's a reader literal for the gensym function, I'd say so: https://clojuredocs.org/clojure.core/gensym
@andrewmcveigh: I meant to say type* or type', because category is actually not in clojure.core
I've only been working on Clojure professionally for the last year, but the # at the end is something I subconsciously scan for and attach an additional meaning to: "this thing is meant to alias something else and not to collide with another meaningful variable".
It's meant to be used in macros, but I think judicious use outside of them is also reasonable.
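For anyone weighing the type# convention, a quick sketch of what the reader appears to do with a trailing # in each context. As far as I know, the auto-gensym expansion only happens inside a syntax-quote; outside of one, type# is just an ordinary symbol with a # in its name. The var names below (plain-result, expanded-form, generated-symbol) are made up for illustration.

```clojure
;; Outside syntax-quote, type# is read as an ordinary symbol whose name
;; happens to end in #. It avoids shadowing clojure.core/type, but it is
;; not a generated, unique name.
(def plain-result
  (let [type# "error"]
    type#))

;; Inside syntax-quote, the reader replaces every type# in the form with
;; the same generated symbol.
(def expanded-form
  `(let [type# "error"] type#))

;; The binding symbol the reader generated for type#.
(def generated-symbol
  (first (second expanded-form)))
```

So outside macros the suffix works purely as a naming convention, on the same footing as type* or type'. Calling (gensym "type") directly is what produces a fresh symbol at runtime.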
So, if the :type is semantically a type, then name it type, or be more specific. I don't think you need to get hung up on it.
:type# seems superfluous. Or don't destructure keys like :type, :name, etc.
gensym literals don't work for keywords :) which is why I said local assignment (à la let) and, to a lesser extent, destructuring.
Yeah, sure.
But part of the discussion was about keyword destructuring. That also applies to :type* and :type'.
Personally, I'd not bother with the type# gensym. You rarely find them outside of macros.
If it’s still the same thing we’re talking about, no reason to rename it. If not, think of something else to call it.
Or use some combination of (-> …) and don't bother naming it in the first place.
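A rough illustration of that last suggestion, using a made-up string-processing example (slug-primed and slug-threaded are hypothetical names, and the steps are arbitrary). The first version names each intermediate value with primes; the second threads the value through and names nothing.

```clojure
(require '[clojure.string :as string])

;; Variant with primed names, the style the thread is trying to avoid.
(defn slug-primed [s]
  (let [s'  (string/trim s)
        s'' (string/lower-case s')]
    (string/replace s'' #"\s+" "-")))

;; Variant with ->, where no intermediate value needs a name.
(defn slug-threaded [s]
  (-> s
      string/trim
      string/lower-case
      (string/replace #"\s+" "-")))
```

Both should return the same result for the same input. The threaded form sidesteps the shadow-or-rename question entirely, though it only fits when each step takes the previous result as its first argument.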
Whenever I see gensym, I assume I’m looking at a macro. I’d probably not want to see it outside of that, really.
gensym comes in handy in internal DSLs, even when macros are not involved