This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-02-18
Ok I'm back with another url-encoder type of question... the thing is, the built in function from ring replaces special characters and spaces with %20 and so on... this makes the urls look really ugly. Websites like quora and medium will take a blog post title and have a url "that-is-like-this". So I wonder, can I just write a clojure function that would take a blog title (a string), and ignore all special characters? I know I can use replace to change the spaces to "-" but how do I ignore the special characters, or take only numbers and letters?
If you want a valid URL, then you need things like %20, yes? If you want to avoid those, then you need to find other replacements that work well enough for your use case. However, if your use case is "turn an arbitrary Unicode string into a valid URL", I doubt you can avoid the %20 kind of sequences in the result.
I don't think I need %20 to have a valid url... Medium is a popular website and their url's just turn spaces into "-". Maybe I'm not sure what you mean?
For example if someone has a blog post "my #1 blog's title" then medium has a url to the blog post "/my-1-blogs-title". However url encoders have the output "my%20%231%20blog%27s%20title". I want to figure out how to get the same url that Medium has? It ignores special characters
Then you can look for any documents that explain what Medium uses for their translation of strings to URLs. One question you may or may not need to worry about is "if symbols like # and ' are simply removed, what should happen when two different titles produce the same URL?"
Oh, I see. Yes it appears they have some characters that come after the title, which must be to take care of that problem
There very well might be some widely used function that Medium uses for that title-to-URL translation, but it might also be some custom thing they wrote themselves.
Thanks, that's helpful
For the sake of trying to tie this in with clojure... Any thoughts on how I can take a string and get rid of special characters in clojure?
I'm thinking (replace "my $string" "$" "")
But how would this cover all the special characters? Is there a code I can use?
If you enumerate all characters you want to keep, that is probably a shorter list than all characters you want to remove, but not sure whether you consider most of the Unicode character set as something you want to remove, or something you want to keep. It is a huge character set.
I see... so if I want to just keep certain characters from a string... I would probably use a different function than replace, right?
What about doing something like:
(apply str
(remove #{\# \%} "the#quick%brown")) => "thequickbrown"
Thanks, chase. I think that's similar to what I'm looking for. Is there a way I can remove all special characters without listing every one I can think of? (Or I can probably look up which characters are not allowed in URLs)
Look at the docs for (Java) regular expressions: it has escaped characters that represent entire classes of Unicode characters.
In particular this page: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html and things like \p{Lower} and others listed there.
Here's a function from our code that attempts to check whether something is a vaguely valid URI:
(defn invalid-uri?
"Given a URI, return true if it contains characters that we don't believe
should be in a valid URI. For now, what we allow in a URI are:
* - . ( ) ' / _
* Any Unicode alphanumeric p{L} letter p{M} combining mark p{N} numeric
Anything else, we'll assume it's a bad URI and we'll give a 404 for it.
As of WS-10501, we also use this to validate affiliate IDs."
[uri]
(try
(->> uri
(java.net.URLDecoder/decode)
(decode-location)
(re-find #"[^-\.()'/_\p{L}\p{M}\p{N}]")
(boolean))
(catch Throwable _
;; any URI that won't decode is almost certainly invalid!
true)))
This Java doc page for URL encoder mentions explicit lists of characters that it does not transform: https://docs.oracle.com/javase/7/docs/api/java/net/URLEncoder.html
This uses a negative match [^...] followed by things we do allow.
(don't worry about the decode-location function in the middle -- ignore that, it's specific to some URI parsing we do -- the important part is the re-find call @bradj4333)
Cool thanks! This is really helpful @andy.fingerhut @seancorfield
I still have the question though if you don't mind. Let's say I come up with a list of characters that I want to keep from a string - is there an easy way to just take certain characters from a string in clojure? (for example if I have a string with numbers and letters, and only want to take the numbers - or have alphanumeric mixed with special characters and only want to take alphanumeric)
I see there's ways to remove & replace - should I just focus on using those for my needs?
At this point I kind of want to try writing my own url-translation function for practice
Well the opposite of remove is filter if that is what you are asking.
(apply str
(filter #{\# \%} "the#quick%brown")) ;; "#%"
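A sketch of the same filter idea with a predicate instead of an explicit set, so the characters to keep don't have to be listed one by one. The name keep-alphanumeric is made up for illustration. Character/isLetterOrDigit is the standard Java method, and it also accepts non-ASCII letters and digits.

```clojure
(defn keep-alphanumeric
  "Hypothetical helper: keeps only the letters and digits from s."
  [s]
  (apply str (filter (fn [c] (Character/isLetterOrDigit (char c))) s)))

(keep-alphanumeric "the#quick%brown")    ;; => "thequickbrown"
(keep-alphanumeric "my #1 blog's title") ;; => "my1blogstitle"
```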
I'm still in a "I can do regex but it takes a lot of googling and it's write only (as in I won't understand my own solution 6 weeks from now)" phase
Ahh ok thanks @chase-lambert Yea, I am VERY intimidated by regex at this point! I'll try filter first, but if I can't do that then I'll look more into regex (which I need to learn better at some point anyway)
dev=> (clojure.string/replace "the quick brown fox jumped over the lazy dog" #"[remove]" "")
"th quick bwn fx jupd  th lazy dg"
dev=>
#"[...]" is regex-speak for "the set of characters inside the brackets". Then you just replace those matched substrings with the empty string. Matched characters begone! @bradj4333
dev=> (clojure.string/replace "the quick brown fox jumped over the lazy dog" #"[^remove]" "")
"eroomeovereo"
dev=>
and with ^ inside the brackets -- as I mentioned above and showed in that function from work -- it keeps just the characters you specify, by removing anything that doesn't match.
An example closer to what you're asking about: only keeping alphanumeric characters from a URI:
dev=> (clojure.string/replace "" #"[^a-zA-Z0-9]" "")
"httpsappslackcomclientT03RZGPFRC053AK3F9cdnfallback1"
dev=>
Oohhhhh cool thanks @seancorfield! Something like #"[^a-zA-Z0-9]" is exactly what I was looking for. I wasn't sure how to represent all that so succinctly. Thanks a lot!
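Putting the pieces of this thread together, a title-to-URL function might look roughly like the sketch below. The name slugify and the exact rules (lower-case, ASCII letters and digits only, hyphens between words) are assumptions for illustration, not what Medium actually does.

```clojure
(require '[clojure.string :as str])

(defn slugify
  "Hypothetical helper: lower-cases title, drops everything except ASCII
  letters, digits, whitespace and hyphens, then joins words with hyphens."
  [title]
  (-> title
      str/lower-case
      (str/replace #"[^a-z0-9\s-]" "") ; negative match: remove what we don't keep
      str/trim
      (str/replace #"[\s-]+" "-")))    ; collapse runs of spaces/hyphens

(slugify "my #1 blog's title") ;; => "my-1-blogs-title"
```

As raised earlier in the thread, two different titles can produce the same result, so a real site would still need something extra (such as an id suffix) to keep URLs unique.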
(def t {:a "a" :b "b" :c "c"})
((juxt :a :b) t)
=> ["a" "b"]
How can I pass dynamic parameters to juxt? Given p1 [:a :b] and p2 [:a :c], I want (juxt p1) to mean (juxt :a :b).
fwiw, I think the use of apply and juxt is great, but people might find this more obvious at first glance, if I understand your input and output spec right: (-> {:a "1" :b "2" :c "3"} (select-keys [:a :b]) vals) ;; => ("1" "2").
though juxt might be faster.
I suppose i would hope the order didn't matter given it came from a hashmap. But yea.
You could do it with map as well: (map m ks)
(I still prefer juxt though)
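The apply and juxt approach mentioned above can be sketched as follows. This is a minimal illustration that reuses t, p1 and p2 from the question; nothing else is assumed.

```clojure
(def t {:a "a" :b "b" :c "c"})
(def p1 [:a :b])
(def p2 [:a :c])

;; apply spreads the vector into separate arguments,
;; so (apply juxt p1) behaves like (juxt :a :b)
((apply juxt p1) t) ;; => ["a" "b"]
((apply juxt p2) t) ;; => ["a" "c"]

;; a map is a function of its keys, so mapping it over the keys also works
(map t p1) ;; => ("a" "b")
```

Both forms return the values in the order of the key vector, which select-keys followed by vals does not promise for larger maps.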
(defprotocol DatetimeToStrProtocol
(date-str [date-time])
(date-add-min [date-time min])
(str-date [s pattern]))
(extend-protocol DatetimeToStrProtocol
;; LocalDateTime to string
java.time.LocalDateTime
(date-str
[date-time]
(.format
(java.time.format.DateTimeFormatter/ofPattern "yyyy-MM-dd HH:mm:ss")
date-time))
(date-add-min
[date-time min]
(.plusMinutes date-time min))
(str-date
"string to date
format : yyyy-MM-dd HH:mm:ss"
[s p]
(.parse
(java.time.format.DateTimeFormatter/ofPattern p)
s))
; Date to string
java.util.Date
(date-str [date-time]
(.format
(java.text.SimpleDateFormat. "yyyy-MM-dd HH:mm:ss")
date-time))
(str-date
"string to date
format : yyyy-MM-dd HH:mm:ss"
[s p]
(.parse
(java.text.SimpleDateFormat. p)
s)))
It shows the error message below:
; Evaluating file: date.clj
; Syntax error (IllegalArgumentException) compiling at (src/xing/util/date.clj:12:1).
; Don't know how to create ISeq from: java.lang.Character
; Evaluation of file date.clj failed: class clojure.lang.Compiler$CompilerException
How do I understand this error to know what needs to be fixed?
Remove the docstrings you have in those protocol fn implementations - they are not allowed there.
You should put them inside defprotocol if you need them.
I don't have great advice on how to debug this, but read the docstring of extend-protocol and maybe remove portions of your extend-protocol body one by one and see when it starts failing.
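A sketch of how the code above could look with the docstrings moved into defprotocol. Two things here are assumptions beyond the advice given: date-add-min is also implemented for java.util.Date, and parsing is pulled out into a plain function (the hypothetical parse-local-date-time), because a protocol dispatches on the type of its first argument and str-date receives a String first.

```clojure
(defprotocol DatetimeToStrProtocol
  (date-str [date-time]
    "Format date-time as yyyy-MM-dd HH:mm:ss.")
  (date-add-min [date-time minutes]
    "Return date-time shifted forward by the given number of minutes."))

(extend-protocol DatetimeToStrProtocol
  java.time.LocalDateTime
  (date-str [date-time]
    (.format (java.time.format.DateTimeFormatter/ofPattern "yyyy-MM-dd HH:mm:ss")
             date-time))
  (date-add-min [date-time minutes]
    (.plusMinutes date-time minutes))

  java.util.Date
  (date-str [date-time]
    (.format (java.text.SimpleDateFormat. "yyyy-MM-dd HH:mm:ss") date-time))
  (date-add-min [date-time minutes]
    ;; Date works in milliseconds since the epoch
    (java.util.Date. (long (+ (.getTime date-time) (* 60000 minutes))))))

(defn parse-local-date-time
  "Hypothetical helper: parse s into a LocalDateTime using pattern."
  [s pattern]
  (java.time.LocalDateTime/parse
   s (java.time.format.DateTimeFormatter/ofPattern pattern)))

(-> (parse-local-date-time "2022-02-18 10:30:00" "yyyy-MM-dd HH:mm:ss")
    (date-add-min 45)
    date-str) ;; => "2022-02-18 11:15:00"
```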
What would I need to run on the terminal to get a clojure process to start with 3 aliases selected and then to start a socket repl?
I've tried a few variants on this clj -A:testing:test:app-clj -Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}"
but with no luck
ah figured this out it should be:
clj -A:testing:test:app-clj -J-Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}"
how can I use IN(?,?,?) in next.jdbc with mysql. I wrote code like this:
(defn query-task
[task-id-coll call-times]
(db/query db-con-spec ["SELECT * FROM task_phone_record
where task_id = (?)
and call_times = ?
" (int-array task-id-coll) call-times]))
(query-task [123 124] 1)
sql log:
next.jdbc/execute! [SELECT * FROM task_phone_record
where task_id = (?)
and call_times = ?
#object[[I 0x4f60ec51 [I@4f60ec51] 1]
You probably need to dynamically generate a valid SQL string with ... IN (?, ?, ?), producing one ? for each item in the collection.
Something like:
(let [sql (str "SELECT ... WHERE call_times = ? and task_id IN ("
(clojure.string/join ", " (repeat (count task-id-coll) "?"))
")")
params (into [sql call-times] task-id-coll)]
(db/query db-con-spec params))
Or use HoneySQL which takes care of that for you.
Also, if you're using PostgreSQL, you can pass an array -- the next.jdbc docs have examples of that.
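The placeholder generation can be tried at the REPL without a database by building only the SQL-plus-parameters vector. This is a sketch: in-query is a made-up name, the table and column names come from the question, and returning nil for an empty collection is an assumption added here because IN () is not valid SQL.

```clojure
(require '[clojure.string :as str])

(defn in-query
  "Hypothetical helper: builds [sql & params] with one ? per task id.
  Returns nil when task-ids is empty."
  [task-ids call-times]
  (when (seq task-ids)
    (let [placeholders (str/join ", " (repeat (count task-ids) "?"))
          sql (str "SELECT * FROM task_phone_record"
                   " WHERE call_times = ? AND task_id IN (" placeholders ")")]
      (into [sql call-times] task-ids))))

(in-query [123 124] 1)
;; => ["SELECT * FROM task_phone_record WHERE call_times = ? AND task_id IN (?, ?)" 1 123 124]
```

The resulting vector has the shape that the query call in the question takes as its second argument.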
Is there a nicer way to handle different failure conditions than nested ifs?
(if (s/valid? ...)
(if (s/valid? ...)
;; more nested ifs
(throw (ex-info "Invalid ..."
{::error-id :validation
:errors "..."})))
(throw (ex-info "Invalid ..."
{::error-id :validation
:errors "..."})))
Also there are a couple of libs that are made for this kind of nested error handling:
- https://github.com/adambard/failjure#try-all
- https://funcool.github.io/cats/latest/#mlet
you could also try accumulating error messages as data and then returning them in a group
(let [errors
(cond-> {}
(not (s/valid? ...))
(update :errors conj :foo-error)
(not (s/valid? ...))
(update :errors conj :bar-error))]
(when (not-empty errors)
(throw (ex-info "invalid..." errors))))
in general, nested ifs can be flattened using cond->, cond, case, etc
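For the fail-fast variant, where the first failed check should throw, a flat cond might look like this sketch. The function name validate-user, the :name and :age keys, and the predicates are all made up for illustration; the ex-info shape follows the question.

```clojure
(require '[clojure.spec.alpha :as s])

(defn validate-user
  "Hypothetical example: returns user unchanged, or throws on the first failed check."
  [{:keys [name age] :as user}]
  (cond
    (not (s/valid? string? name))
    (throw (ex-info "Invalid name"
                    {::error-id :validation
                     :errors "name must be a string"}))

    (not (s/valid? pos-int? age))
    (throw (ex-info "Invalid age"
                    {::error-id :validation
                     :errors "age must be a positive integer"}))

    :else user))

(validate-user {:name "Ada" :age 36}) ;; => {:name "Ada", :age 36}
```

Each check sits at the same indentation level, and adding another one means adding one more test/expression pair rather than another level of nesting.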
Is it weird that I always feel I need an extra pair of parentheses for anonymous functions
(defn abracadabra
"Find the number of c's in abracadabra"
[input]
(count (filter #(= \c %) input)))
You can always use the non-shortened form of an anonymous function)
For this particular case though:
(defn abracadabra
"Find the number of c's in abracadabra"
[input]
(count (filter #{\c} input)))
now that is cool
I mean... if you want to get terse:
(def abracadabra #(-> (filter #{\c} %) count))
But you lose the documentation. Getting that back takes the character count back up again.
For completeness, I'll post that here, but don't do it this way! Just use defn
(alter-meta! (def abracadabra #(-> (filter #{\c} %) count)) assoc :doc "Find the number of c's in abracadabra")
Btw. this works too)
(def abracadabra "Find the number ..." #(...))
where would they go? I'll say you get used to it and I don't see it, but I'm interested to see where your instinct says they go
I start typing something similar to -
#( (= \c %...
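A sketch of why the extra pair is not needed: the #(...) reader form already stands for the function body's single call, so the three forms below are equivalent, while an extra pair of parentheses would try to call the result of the comparison.

```clojure
(def input "abracadabra")

;; shorthand: #(= \c %) reads as (fn [x] (= \c x))
(count (filter #(= \c %) input))           ;; => 1

;; the non-shortened form of the same anonymous function
(count (filter (fn [ch] (= \c ch)) input)) ;; => 1

;; a set used as a predicate
(count (filter #{\c} input))               ;; => 1

;; #((= \c %)) would read as (fn [x] ((= \c x))), which calls true or false
;; as if it were a function and throws a ClassCastException
```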
More than verbose, it's just hard to remember the name of the function (at least for me)
constantly is great https://gist.github.com/hiredman/5644dd40f2621b0a783a3231ea29ff1a#file-yield-clj-L30
I'm not sure if I think this is cute because you avoid the use of a temporary var by resetting the array entry as you pass the result of the reset as an ignored argument, or if I think it's horrible because of the obtuseness of the code!
Thinking about it... If this were #clojure then I would opt for "cute". But it's an area for beginners, so I'm going to settle on "horrified"
Can someone explain the goal of that namespace or the context for it? Like is it implementing an algorithm i can read about at a high level first or something?
The go macro in core.async can be seen as an implementation of delimited continuations or shift/reset, and I think there is a comment in the code linking to something more about that
Interesting, I'll need to look up delimited continuations and shift/reset. Ty for the response.
On one hand, I see your point, on the other, it can be seen as "if you use constantly you, too, can be as cool".
Hey. How can I serialize a map entry differently from the rest of it? I have my app state in an atom and I have a datascript db inside my app state under :ds-db. Currently I'm calling (update app-state :ds-db ds/serializable) and then calling transit write. Can I write a function that can serialize my app state the "right" way without hard coding or passing in the path to the datascript db?
Thanks!
I think I could make something work with metadata but I'm wondering if there's already support for this somehow...
Would the section on https://github.com/cognitect/transit-format#extensibility be of any help?