#beginners
2022-02-18
John Bradens00:02:02

Ok I'm back with another url-encoder type of question... the thing is, the built in function from ring replaces special characters and spaces with %20 and so on... this makes the urls look really ugly. Websites like quora and medium will take a blog post title and have a url "that-is-like-this". So I wonder, can I just write a clojure function that would take a blog title (a string), and ignore all special characters? I know I can use replace to change the spaces to "-" but how do I ignore the special characters, or take only numbers and letters?

andy.fingerhut00:02:04

If you want a valid URL, then you need things like %20, yes? If you want to avoid those, then you need to find other replacements that work well enough for your use case. However, if your use case is "turn an arbitrary Unicode string into a valid URL", I doubt you can avoid the %20 kind of sequences in the result.

John Bradens00:02:52

I don't think I need %20 to have a valid url... Medium is a popular website and their url's just turn spaces into "-". Maybe I'm not sure what you mean?

John Bradens00:02:58

For example if someone has a blog post "my #1 blog's title" then medium has a url to the blog post "/my-1-blogs-title". However url encoders have the output "my%20%231%20blog%27s%20title". I want to figure out how to get the same url that Medium has? It ignores special characters

andy.fingerhut00:02:09

Then you can look for any documents that explain what Medium uses for their translation of strings to URLs. One question you may or may not need to worry about is "if symbols like # and ' are simply removed, what should happen when two different titles produce the same URL?"

John Bradens00:02:41

Oh, I see. Yes it appears they have some characters that come after the title, which must be to take care of that problem

andy.fingerhut00:02:32

There very well might be some widely used function that Medium uses for that title-to-URL translation, but it might also be some custom thing they wrote themselves.

John Bradens00:02:44

Thanks, that's helpful

John Bradens00:02:24

For the sake of trying to tie this in with clojure... Any thoughts on how I can take a string and get rid of special characters in clojure?

John Bradens00:02:37

I'm thinking (replace "my $string" "$" "")

John Bradens00:02:48

But how would this cover all the special characters? Is there a code I can use?

andy.fingerhut00:02:36

If you enumerate all characters you want to keep, that is probably a shorter list than all characters you want to remove, but not sure whether you consider most of the Unicode character set as something you want to remove, or something you want to keep. It is a huge character set.

John Bradens00:02:19

I see... so if I want to just keep certain characters from a string... I would probably use a different function than replace, right?

Chase00:02:32

What about doing something like:

(apply str                                                                                       
       (remove #{\# \%} "the#quick%brown")) => "thequickbrown"

John Bradens00:02:05

Thanks, chase. I think that's similar to what I'm looking for. Is there a way I can remove all special characters without listing every one I can think of? (Or I can probably look up which characters are not allowed in URLs)

seancorfield00:02:28

Look at the docs for (Java) regular expressions: it has escaped characters that represent entire classes of Unicode characters.

andy.fingerhut00:02:18

In particular this page: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html and things like \p{Lower} and others listed there.
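
As a rough illustration of those character classes (a sketch, not something posted in the thread): \p{L} matches any Unicode letter and \p{N} any Unicode digit, so a negated set of the two strips everything else while keeping accented letters. The name `letters-and-digits` is made up for this example.

```clojure
(require '[clojure.string :as str])

(defn letters-and-digits
  "Keep only Unicode letters and digits from s (hypothetical helper)."
  [s]
  ;; [^...] matches any character NOT in the set, and each match is removed.
  (str/replace s #"[^\p{L}\p{N}]" ""))

(letters-and-digits "Cr\u00e8me br\u00fbl\u00e9e #1!")
;; => "Crèmebrûlée1"
```

Note that a letter written as a base character plus a separate combining mark would lose the mark here, which is presumably why the work function quoted further down also allows \p{M}.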

seancorfield00:02:26

Here's a function from our code that attempts to check whether something is a vaguely valid URI:

(defn invalid-uri?
  "Given a URI, return true if it contains characters that we don't believe
  should be in a valid URI. For now, what we allow in a URI are:

  * - . ( ) ' / _
  * Any Unicode alphanumeric p{L} letter p{M} combining mark p{N} numeric

  Anything else, we'll assume it's a bad URI and we'll give a 404 for it.

  As of WS-10501, we also use this to validate affiliate IDs."
  [uri]
  (try
    (->> uri
         (java.net.URLDecoder/decode)
         (decode-location)
         (re-find #"[^-\.()'/_\p{L}\p{M}\p{N}]")
         (boolean))
    (catch Throwable _
      ;; any URI that won't decode is almost certainly invalid!
      true)))

andy.fingerhut00:02:29

This Java doc page for URL encoder mentions explicit lists of characters that it does not transform: https://docs.oracle.com/javase/7/docs/api/java/net/URLEncoder.html

seancorfield00:02:58

This uses a negative match [^...] followed by things we do allow.

seancorfield00:02:08

(don't worry about the decode-location function in the middle -- ignore that, it's specific to some URI parsing we do -- the important part is the re-find call @bradj4333)

John Bradens00:02:26

Cool thanks! This is really helpful @andy.fingerhut @seancorfield

John Bradens00:02:20

I still have the question though if you don't mind 😅 Let's say I come up with a list of characters that I want to keep from a string - is there an easy way to just take certain characters from a string in clojure? (for example if I have a string with numbers and letters, and only want to take the numbers - or have alphanumeric mixed with special characters and only want to take alphanumeric)

John Bradens00:02:46

I see there's ways to remove & replace - should I just focus on using those for my needs?

John Bradens00:02:08

At this point I kind of want to try writing my own url-translation function for practice 🙂

Chase00:02:59

Well the opposite of remove is filter if that is what you are asking.

(apply str                                                                                       
       (filter #{\# \%} "the#quick%brown")) ;; "#%"
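
A set is only one kind of predicate that filter accepts. As a sketch of "keep only letters and digits" without listing them all, assuming Java's `Character/isLetterOrDigit` is an acceptable definition of alphanumeric (it is Unicode-aware, so it also keeps non-ASCII letters), something like this should work. The name `keep-alphanumeric` is made up for the example.

```clojure
(defn keep-alphanumeric
  "Keep only letters and digits from s, using filter with a predicate."
  [s]
  ;; (char %) gives the compiler a primitive char so it can pick the right
  ;; overload of Character/isLetterOrDigit without reflection.
  (apply str (filter #(Character/isLetterOrDigit (char %)) s)))

(keep-alphanumeric "the#quick%brown 42")
;; => "thequickbrown42"
```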

Chase00:02:17

But it sounds like you want to be exploring regexes more.

Chase00:02:15

I'm still in a "I can do regex but it takes a lot of googling and it's write only (as in I won't understand my own solution 6 weeks from now)" phase

John Bradens00:02:00

Ahh ok thanks @chase-lambert Yea, I am VERY intimidated by regex at this point! I'll try filter first, but if I can't do that then I'll look more into regex (which I need to learn better at some point anyway 🙂 )

seancorfield00:02:10

dev=> (clojure.string/replace "the quick brown fox jumped over the lazy dog" #"[remove]" "")
"th quick bwn fx jupd  th lazy dg"
dev=> 

seancorfield01:02:01

#"[...]" is regex-speak for "the set of characters inside the brackets". Then you just replace those matched substrings with the empty string. Matched characters begone! @bradj4333

seancorfield01:02:19

dev=> (clojure.string/replace "the quick brown fox jumped over the lazy dog" #"[^remove]" "")
"eroomeovereo"
dev=> 
and with ^ inside the brackets -- as I mentioned above and showed in that function from work -- it keeps just the characters you specify, by removing anything that doesn't match.

seancorfield01:02:38

An example closer to what you're asking about: only keeping alphanumeric characters from a URI:

dev=> (clojure.string/replace "https://app.slack.com/client/T03RZGPFR/C053AK3F9?cdn_fallback=1" #"[^a-zA-Z0-9]" "")
"httpsappslackcomclientT03RZGPFRC053AK3F9cdnfallback1"
dev=> 

John Bradens01:02:14

Oohhhhh cool thanks @seancorfield! Something like #"[^a-zA-Z0-9]" is exactly what I was looking for. I wasn't sure how to represent all that so succinctly. Thanks a lot!
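
Putting the pieces from this thread together, a title-to-slug function might look roughly like the sketch below. It is not Medium's actual algorithm (nobody in the thread knows what that is); `slugify` is a made-up name, it only keeps ASCII letters and digits, and it does nothing about two titles producing the same slug, which is the collision problem raised earlier.

```clojure
(require '[clojure.string :as str])

(defn slugify
  "Turn a title into a lowercase, hyphen-separated slug (illustrative sketch)."
  [title]
  (-> title
      str/lower-case
      ;; drop everything except a-z, 0-9, whitespace and hyphens
      (str/replace #"[^a-z0-9\s-]" "")
      str/trim
      ;; collapse runs of whitespace/hyphens into a single hyphen
      (str/replace #"[\s-]+" "-")))

(slugify "my #1 blog's title")
;; => "my-1-blogs-title"
```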

quan xing04:02:57

(def t {:a "a" :b "b" :c "c"})
((juxt :a :b) t)
=> ["a" "b"]
How can I pass dynamic parameters to juxt? For example, with p1 = [:a :b] and p2 = [:a :c], I want (juxt p1) to mean (juxt :a :b).

dpsutton04:02:23

(let [ks (conj [:a] :b)] ((apply juxt ks) {:a "a" :b "b" :c "c"})) -> ["a" "b"]

quan xing05:02:23

((apply juxt [:a :b]) t) => ["a" "b"] 👌

Drew Verlee05:02:04

fwiw, i think the use of apply and juxt is great, but people might find this more obvious at first glance if i understand your input and output spec right: (-> {:a "1" :b "2" :c "3"} (select-keys [:a :b]) vals) ;; => ("1" "2"). though juxt might be faster.

dpsutton06:02:45

Vals on select keys won't necessarily return the keys in the order expected here

Drew Verlee06:02:19

I suppose i would hope the order didn't matter given it came from a hashmap. But yea.

dpsutton06:02:33

Juxt is how you can do a positional selection though

pavlosmelissinos07:02:55

You could do it with map as well: (map m ks) (I still prefer juxt though)
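
A small side-by-side of the options mentioned above, as a sketch with a made-up map `m` and key vector `ks`. The juxt and map versions return values in the order of `ks`; the select-keys version is ordered by whatever map it builds, so its order should not be relied on.

```clojure
(def m {:a "a" :b "b" :c "c"})
(def ks [:c :a])            ; deliberately not in the map's "natural" order

((apply juxt ks) m)         ;; => ["c" "a"]
(map m ks)                  ;; => ("c" "a")  (a map is a function of its keys)
(mapv m ks)                 ;; => ["c" "a"]
(vals (select-keys m ks))   ; same values, but the order is not guaranteed
```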

quan xing08:02:15

(defprotocol DatetimeToStrProtocol
  (date-str [date-time])

  (date-add-min [date-time min])

  (str-date [s pattern]))

(extend-protocol DatetimeToStrProtocol
  ;; LocalDateTime to string
  java.time.LocalDateTime
  (date-str
    [date-time]
    (.format
     (java.time.format.DateTimeFormatter/ofPattern "yyyy-MM-dd HH:mm:ss")
     date-time))
  (date-add-min
    [date-time min]
    (.plusMinutes date-time min))
  (str-date
    "string to date 
    format : yyyy-MM-dd HH:mm:ss"
    [s p]
    (.parse
     (java.time.format.DateTimeFormatter/ofPattern p)
     s))

  ; Date to string
  java.util.Date
  (date-str [date-time]
    (.format
     (java.text.SimpleDateFormat. "yyyy-MM-dd HH:mm:ss")
     date-time))

  (str-date
    "string to date 
    format : yyyy-MM-dd HH:mm:ss"
    [s p]
    (.parse
     (java.text.SimpleDateFormat. p)
     s)))
The error message is shown below:
; Evaluating file: date.clj
; Syntax error (IllegalArgumentException) compiling at (src/xing/util/date.clj:12:1).
; Don't know how to create ISeq from: java.lang.Character
; Evaluation of file date.clj failed: class clojure.lang.Compiler$CompilerException
How do I understand this error to know what needs to be fixed?

jumar13:02:34

Remove the docstrings you have in those protocol fn implementations - they are not allowed there. You should put them inside defprotocol if you need them. I don't have great advice on how to debug this, but read the docstring and maybe remove portions of your extend-protocol body one by one and see when it starts failing.
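
A trimmed-down sketch of that fix, assuming only the LocalDateTime half is needed: the docstrings move into defprotocol (after each argument vector), and the implementations contain just a name, an argument vector and a body. Roughly speaking, the original error arises because extend-protocol finds a string where it expects the argument vector. The protocol name `DatetimeToStr` and the parameter name `minutes` are changed here for illustration.

```clojure
(defprotocol DatetimeToStr
  ;; In defprotocol the docstring comes AFTER the argument vector(s).
  (date-str [date-time] "Format date-time as yyyy-MM-dd HH:mm:ss.")
  (date-add-min [date-time minutes] "Return date-time plus the given minutes."))

(extend-protocol DatetimeToStr
  java.time.LocalDateTime
  ;; No docstrings here: just name, argument vector, body.
  (date-str [date-time]
    (.format
     (java.time.format.DateTimeFormatter/ofPattern "yyyy-MM-dd HH:mm:ss")
     date-time))
  (date-add-min [date-time minutes]
    (.plusMinutes date-time minutes)))

(date-str (java.time.LocalDateTime/of 2022 2 18 8 15 0))
;; => "2022-02-18 08:15:00"
```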

Adrian Smith13:02:47

What would I need to run on the terminal to get a clojure process to start with 3 aliases selected and then to start a socket repl?

Adrian Smith13:02:50

I’ve tried a few variants on this clj -A:testing:test:app-clj -Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}" but with no luck

Adrian Smith13:02:28

ah figured this out it should be: clj -A:testing:test:app-clj -J-Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}"

quan xing15:02:19

how can I use IN(?,?,?) in next.jdbc with mysql. I wrote code like this:

(defn query-task
  [task-id-coll call-times]
  (db/query db-con-spec ["SELECT * FROM task_phone_record 
                          where task_id = (?)
                          and call_times = ? 
                          " (int-array task-id-coll) call-times]))

(query-task [123 124] 1)

sql log:
next.jdbc/execute! [SELECT * FROM task_phone_record 
                          where task_id = (?)
                          and call_times = ? 
                           #object[[I 0x4f60ec51 [I@4f60ec51] 1]

Ferdinand Beyer16:02:08

You probably need to dynamically generate a valid SQL string with … IN (?, ?, ?), producing one ? for each item in the collection. Something like:

(let [sql (str "SELECT ... WHERE call_times = ? and task_id IN ("
               (clojure.string/join ", " (repeat (count task-id-coll) "?"))
               ")")
      params (into [sql call-times] task-id-coll)]
  (db/query db-con-spec params))
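
To see what that produces without a database, the same idea can be pulled into a pure function that only builds the [sql & params] vector. This is a sketch: `task-query` is a made-up name, the table and column names are taken from the question, and an empty collection would yield `IN ()`, which is not valid SQL, so that case needs handling separately.

```clojure
(require '[clojure.string :as str])

(defn task-query
  "Build a [sql & params] vector with one ? per task id (illustrative sketch)."
  [task-ids call-times]
  (let [placeholders (str/join ", " (repeat (count task-ids) "?"))]
    (into [(str "SELECT * FROM task_phone_record"
                " WHERE call_times = ? AND task_id IN (" placeholders ")")
           call-times]
          task-ids)))

(task-query [123 124] 1)
;; => ["SELECT * FROM task_phone_record WHERE call_times = ? AND task_id IN (?, ?)" 1 123 124]
```

The resulting vector is what would be passed as the second argument to the query function.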

seancorfield18:02:47

Or use HoneySQL which takes care of that for you.

seancorfield18:02:15

Also, if you're using PostgreSQL, you can pass an array -- the next.jdbc docs have examples of that.

Lycheese16:02:18

Is there a nicer way to handle different failure conditions than nested ifs?

(if (s/valid? ...)
    (if (s/valid? ...)
      ;; more nested ifs
      (throw (ex-info "Invalid ..."
                      {::error-id :validation
                       :errors "..."})))
    (throw (ex-info "Invalid ..."
                    {::error-id :validation
                     :errors "..."})))

Lycheese16:02:37

Wait, I can just wrap the preds in not and make it a cond
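
Roughly what that flattening could look like, as a sketch: each failure condition becomes its own cond clause, checked in order, with the happy path in :else. The function name, the error messages and the use of plain predicates as specs are all made up for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

(defn check-user
  "Return user if it passes every check, otherwise throw (illustrative sketch)."
  [user]
  (cond
    ;; first failing condition wins; later clauses are not evaluated
    (not (s/valid? map? user))
    (throw (ex-info "Invalid user"
                    {::error-id :validation
                     :errors "user must be a map"}))

    (not (s/valid? string? (:name user)))
    (throw (ex-info "Invalid name"
                    {::error-id :validation
                     :errors "name must be a string"}))

    :else user))

(check-user {:name "x"})
;; => {:name "x"}
```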

lispyclouds16:02:10

Also there are a couple of libs that are made for this kind of nested error handling: • https://github.com/adambard/failjure#try-all • https://funcool.github.io/cats/latest/#mlet

Lycheese16:02:42

Oooh, those look nice. Thank you very much.

Michael Stokley17:02:21

you could also try accumulating error messages as data and then returning them in a group

(let [errors
      (cond-> {}
        (not (s/valid? ...))
        (update :errors conj :foo-error)

        (not (s/valid? ...))
        (update :errors conj :bar-error))]
  (when (not-empty errors)
    (throw (ex-info "invalid..." errors))))

Michael Stokley20:02:19

in general, nested ifs can be flattened using cond-> , cond, case, etc

Matej Šarlija17:02:16

Is it weird that I always feel I need an extra pair of parentheses for anonymous functions 😅

(defn abracadabra
  "Find the number of c's in abracadabra"
  [input]
  (count (filter #(= \c %) input)))

delaguardo18:02:38

You can always use the non-shortened form of an anonymous function)

quoll18:02:17

That would require extra parens AND square brackets! (fn [c] (= \c c))

quoll18:02:53

For this particular case though:

(defn abracadabra
  "Find the number of c's in abracadabra"
  [input]
  (count (filter #{\c} input)))

Matej Šarlija20:02:32

now that is cool

quoll20:02:00

I mean… if you want to get terse:

(def abracadabra #(-> (filter #{\c} %) count))
But you lose the documentation. Getting that back takes the character count back up again.

quoll20:02:32

For completeness, I’ll post that here, but don’t do it this way! Just use defn

(alter-meta! (def abracadabra #(-> (filter #{\c} %) count)) assoc :doc "Find the number of c's in abracadabra")

delaguardo22:02:22

Btw. this works too)

(def abracadabra "Find the number ..." #(...))
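
Spelled out with a concrete body, the three-argument form of def looks like this (a sketch; the name `count-cs` is made up so it does not clash with the examples above):

```clojure
;; (def name docstring init): the docstring ends up in the var's metadata.
(def count-cs
  "Count the c's in input."
  #(count (filter #{\c} %)))

(count-cs "abracadabra")   ;; => 1
(:doc (meta #'count-cs))   ;; => "Count the c's in input."
```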

quoll03:02:31

I totally went blank on def accepting docs now 😳

dpsutton18:02:04

where would they go? I’ll say you get used to it and I don’t see it, but I'm interested to see where your instinct says they go

Matej Šarlija18:02:45

I start typing something similar to -

#( (= \c %...

hiredman18:02:55

it makes a lot of sense, the lack of extra parens there trips a lot of people up

hiredman18:02:33

people try things like #(1) to return 1 sometimes, which of course doesn't work

quoll18:02:36

… and (constantly 1) just seems so verbose!
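
A short sketch of why the extra parens go wrong: the #(...) reader macro already supplies the outer call, so whatever sits first inside it is treated as the function to call. The var names below are made up for the example.

```clojure
;; #(= \c %) reads as (fn [x] (= \c x)).
(def is-c? #(= \c %))
(is-c? \c)             ;; => true

;; #(1) reads as (fn [] (1)), which tries to call 1 as a function.
(def broken #(1))
;; (broken) throws a ClassCastException, because a Long is not a function

;; Ways to get a function that just returns 1:
(def one (constantly 1))
(one)                  ;; => 1
(one :ignored :args)   ;; => 1

(def also-one #(do 1))
(also-one)             ;; => 1
```

By the same logic, #((= \c %)) would first compute a boolean and then try to call it as a function.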

solf18:02:59

More than verbose, it's just hard to remember the name of the function (at least for me)

quoll19:02:07

I’m not sure if I think this is cute because you avoid the use of a temporary var by resetting the array entry as you pass the result of the reset as an ignored argument, or if I think it’s horrible because of the obtuseness of the code!

quoll19:02:13

Thinking about it… If this were #clojure then I would opt for “cute”. But it’s an area for beginners, so I’m going to settle on “horrified”

Ben Sless21:02:13

oh come on, have you seen the slides? totally cute!

Ben Sless21:02:35

also terrifying, like a tiger playing with a pumpkin

quoll03:02:22

#beginners Ben!

Drew Verlee03:02:33

Can someone explain the goal of that namespace or the context for it? Like is it implementing an algorithm i can read about at a high level first or something?

hiredman03:02:44

It is an incomplete reimplementation of core.async's go macro.

hiredman03:02:55

The go macro in core.async can be seen as an implementation of delimited continuations or shift/reset; I think there is a comment in the code linking to something more about that

hiredman03:02:58

(comment in the yield code above, the core.async has few explanatory comments)

Drew Verlee03:02:34

Interesting, I'll need to look up delimited continuations and shift/reset. Ty for the response.

Ben Sless05:02:42

On one hand, I see your point, on the other, it can be seen as "if you use constantly you, too, can be as cool".

dpsutton19:02:05

I get tripped up in repeat and repeatedly every now and then
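
For anyone else who mixes the two up, a sketch of the difference: repeat takes a value, which is evaluated once and then reused, while repeatedly takes a zero-argument function, which is called once per element. The demo function names are made up.

```clojure
(defn repeat-demo []
  (let [counter (atom 0)]
    ;; (swap! counter inc) runs once; its result is repeated three times
    (repeat 3 (swap! counter inc))))

(defn repeatedly-demo []
  (let [counter (atom 0)]
    ;; the function is called three times; vec forces the lazy seq
    (vec (repeatedly 3 #(swap! counter inc)))))

(repeat-demo)       ;; => (1 1 1)
(repeatedly-demo)   ;; => [1 2 3]
```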

Richie21:02:52

Hey. How can I serialize a map entry differently from the rest of it? I have my app state in an atom and I have a datascript db inside my app state under :ds-db. Currently I’m calling (update app-state :ds-db ds/serializable) and then calling transit write. Can I write a function that can serialize my app state the “right” way without hard coding or passing in the path to the datascript db? Thanks!

Richie21:02:32

I think I could make something work with metadata but I’m wondering if there’s already support for this somehow…

Richie22:02:33

Yes, I think that's what I'm looking for. I don't have a solution yet but I don't want to leave you hanging any longer. Thanks!