clojure 2021-11-30 | Slack Archive

Yeah, I've been trying to ask myself that. 🙂 I'm trying to write a docstring for a function that accepts a list (but not just a Clojure list) of things.

p-himik15:11:29

A map is also a collection of kv pairs. If your function can't accept kv pairs but requires something specific then it would simply be "a collection of the-thing?" or something like that.

flowthing15:11:54

Right. An example would be a function that calculates the median of a collection of numbers. A list, a vector, a set, or an array all work, but a map doesn't.

emccue15:11:27

a map doesn't work because it is a collection of map entries, not numbers

flowthing15:11:25

Oh, yeah. I was overthinking it. That's a good point, thanks. 🙂

jjttjj16:11:06

Just to pick on that example (only because I've been working with medians/percentiles recently too), there's a lot of nuance to this:

> A list, a vector, a set, or an array all work

Like it has to be sorted, and a set of numbers wouldn't truly yield you a median (i.e. (into #{} [1 1 1 1 1 1 50])). So this example isn't super convincing for the case of it being a core function
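
To make the two points above concrete, here is a minimal sketch of such a function with the kind of docstring being discussed. The name `median` and the docstring wording are assumptions for illustration, not anything from clojure.core. It sorts its input itself, and the set example shows how de-duplication changes the answer.

```clojure
(defn median
  "Returns the median of coll, a collection of numbers (a list, vector,
  set, array, or anything else `sort` accepts). Returns nil when coll is
  empty. A map does not qualify: it is a collection of map entries."
  [coll]
  (let [sorted (vec (sort coll)) ; sort here so callers need not pre-sort
        n      (count sorted)
        mid    (quot n 2)]
    (when (pos? n)
      (if (odd? n)
        (nth sorted mid)
        ;; even count: average the two middle values
        (/ (+ (nth sorted (dec mid)) (nth sorted mid)) 2)))))

(median [1 1 1 1 1 1 50])              ;=> 1
(median (into #{} [1 1 1 1 1 1 50]))   ;=> 51/2, the set collapsed to #{1 50}
```

The second call is the nuance mentioned above: a set is accepted, but it is no longer the same data.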

➕ 1

Alex Miller (Clojure team)16:11:11

"seqs to nums" is probably closest to what you want (ignoring sorting)

vemv18:11:31

Been seeing this lately:

OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.

even though I have verify-like flags set nowhere. Is there some less obvious way in which the warning may be triggered?

dpsutton18:11:11

are you using lein?

vemv18:11:04

yes thanks! yeah I suspected it had to do w/ lein's first jvm

dpsutton18:11:15

paulspencerwilliams18:11:48

So, I’ve somehow managed 25 years of (un)professional software development and have yet to really understand regular expressions. I’m working my way through O’Reilly’s Introducing Regular Expressions, which recommends the following sed to extract the first line of some text and wrap it in <h1></h1> tags:

sed -n 's/^/<h1>/;s/$/<\/h1>/p;q' rime.txt

I’ve converted this into the following Clojure that works, but am disappointed that it doesn’t really lean heavily on regex but rather string concatenation.

(str "<h1>" (re-find #".+" rime-intro) "</h1>")

I’ve tried a few ways of writing regexes to capture the first line as a group and use replace, but none worked. How would seasoned Clojure regexers tackle this easy challenge?

Ed18:11:36

the sed command is performing a series of transformations in order, matching the start of a line and putting in a <h1> and inserting a closing </h1> at the end of each line. In Java regexes you need to specify the multiline switch, so I think your sed example is a bit more like this:

(require '[clojure.string :as str])
(let [txt (str/join "\n" ["title 1" "title 2" "title 3"])]
  (-> txt
      (str/replace #"(?m)^" "<h1>")
      (str/replace #"(?m)$" "</h1>")))

cdeszaq19:11:37

> How would seasoned Clojure regexers tackle this easy challenge?

They wouldn’t? RegEx is a tool often best used when other more human-friendly approaches fail for some reason.

paulspencerwilliams19:11:34

Yeah, I’ve heard that opinion too many times for it not to be true!

Ed22:11:06

Yeah ... Having done a whole bunch of perl in my time, regex + html brings me out in hives ... For my money, this problem is better solved by "parsing" the input with something like clojure.string/split , processing with map and joining them together again ...
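
A rough sketch of that split / map / join approach, assuming plain "\n"-separated input. The function name `wrap-lines` is made up for illustration, and `clojure.string/split-lines` stands in for `clojure.string/split` as the line splitter.

```clojure
(require '[clojure.string :as str])

(defn wrap-lines
  "Wraps every line of txt in <h1></h1> tags without using a regex."
  [txt]
  (->> (str/split-lines txt)             ; \"parse\" the input into lines
       (map #(str "<h1>" % "</h1>"))     ; process each line
       (str/join "\n")))                 ; join them together again

(wrap-lines "title 1\ntitle 2")
;=> "<h1>title 1</h1>\n<h1>title 2</h1>"
```

To get only the first line, as the sed one-liner does, take `first` of the split lines instead of mapping over all of them.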

paulspencerwilliams20:12:22

Haha, consider me well warned! I just want to understand regexes enough to know when to use them and how so. I’m well aware of their maintenance burden as well having inherited a few in my time 😢

cdeszaq20:12:51

99% of what I know about regexs I learned from Java’s docs on Regexes, Grep’s docs, and playing with them in an editor to mess with blobs of semi-structured data. You can go REALLY deep on regexes, but “play” + a few concepts gets you most of the way (in my experience)

hiredman18:11:19

well, that re-find is entirely superfluous

hiredman18:11:20

the thing with the sed example is it is not purely a regex operation

paulspencerwilliams18:11:35

Exactly!!

hiredman18:11:44

sed is a stream editor where you can use regexes to drive editor operations like selecting text

paulspencerwilliams18:11:22

The book offers both a sed and perl example and both use program constructs and not purely regexes to tackle this problem. Perhaps how I should do it relies on stuff introduced later in the book, so maybe I’m jumping the gun…

hiredman18:11:55

I don't think you've made something entirely equivalent in functionality either, sed is operating line by line

hiredman18:11:17

while your string concat version is not line by line

paulspencerwilliams18:11:30

I’d imagine the way I should tackle it would be to capture text before the first carriage return and throw away the first carriage return and subsequent characters? And using clojure.string/replace with the template <h1>$1</h1> would be the way?

hiredman18:11:31

something like

user=> (.replaceAll "foo\nbar" "(.+)" "<h1>$1</h1>")
"<h1>foo</h1>\n<h1>bar</h1>"
user=>

might be closer

paulspencerwilliams18:11:58

That sed does operate line by line, but finishes after line one - the expected answer is specifically just the first line wrapped 🤔

paulspencerwilliams18:11:31

I say that with conviction because the book says that the q stops sed after the first line - I’m learning sed passively to get through this book.

dpsutton18:11:35

because it doesn’t have a /g. sed has some concise grammar for its programming language. It is far and beyond just regexes. note the s/../p that you have in your example are not valid regex constructs

paulspencerwilliams18:11:16

Oh, right, I thought that might be the case.

emccue18:11:34

I would recommend finding the undergrad coursework for regular expressions

emccue18:11:58

it makes the whole nightmare make a little more sense with that foundation

emccue18:11:09

like, no language’s regexes are the academic regex

😢 1

dpsutton18:11:21

this began from an oreilly book on regular expressions. I think the only thing needed is to remember the separation between a programming language that uses regexs and regexs themselves

paulspencerwilliams18:11:22

Yeah, I want to focus on regex and not sed / perl, but I was advised this book was a less painful option 🤷

emccue18:11:05

but thinking in terms of the academic version made things make a lot more sense, at least to me

paulspencerwilliams18:11:25

Right, any specific coursework / resources?

Ed18:11:25

I always have to refer to this : https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/util/regex/Pattern.html

👍 1

paulspencerwilliams19:11:40

Thanks

dorab22:11:03

More than you ever want to know about the various types of regex..https://www.regular-expressions.info/

paulspencerwilliams20:12:09

Thanks for the recommendation - I’ll have a browse around that site!

emccue18:11:04

my class used Introduction to the Theory of Computation, 3rd edition

emccue18:11:24

which you can probably find

paulspencerwilliams18:11:13

Thanks, I’ll have a hunt

dpsutton18:11:23

that’s gonna most likely start at nfa and dfa level. and go through things like pumping lemma. up to you if that’s the level you want or just regexs in practice

paulspencerwilliams18:11:01

I think regexs in practice atm.

dpsutton18:11:13

honestly i’d stay with the oreilly book then

paulspencerwilliams18:11:39

So, if I do that, would you say my approach is sound; Capture characters before the first carriage return, and throw away the rest, and exploit the group in a replace?

paulspencerwilliams18:11:12

Something like (clojure.string/replace #"(.*)\n[.*]" "foo\nbar", "<h1>$1</h1>)* - I know this won’t actually work

dpsutton18:11:57

put the arguments in the right order and it works

user=> (clojure.string/replace "foo\nbar" #"(.*)\n.*" "<h1>$1</h1>")
"<h1>foo</h1>"
user=>
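
One caveat with the pattern above: it only behaves for two-line input, because `.` does not match a newline and `clojure.string/replace` replaces every match. A minimal sketch of a variant that keeps just the first line for any number of lines, assuming "\n" line endings (the name `first-line-h1` is made up for illustration):

```clojure
(require '[clojure.string :as str])

;; With three lines the original pattern matches a second time, on "\nbaz",
;; with an empty capture group:
(str/replace "foo\nbar\nbaz" #"(.*)\n.*" "<h1>$1</h1>")
;=> "<h1>foo</h1><h1></h1>"

(defn first-line-h1
  "Wraps the first line of txt in <h1></h1> and drops everything after it."
  [txt]
  ;; (?s) lets the trailing .* run across newlines, so the single match
  ;; anchored at the start of the input swallows the whole string.
  (str/replace txt #"(?s)^([^\n]*).*" "<h1>$1</h1>"))

(first-line-h1 "foo\nbar\nbaz")
;=> "<h1>foo</h1>"
```

Input with "\r\n" line endings would leave a trailing "\r" inside the tags; that case is not handled here.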

dpsutton18:11:22

but just work through the book. some thoughtful authors have spent a lot of time refining their teaching and coming up with examples

paulspencerwilliams19:11:38

Oh, I forget the params are in a different order for replace than for re-seq etc. Oh good, I am learning! Yeah, I think sticking to the book and then applying it to Clojure would be a better bet. Cheers for the advice.

dpsutton19:11:01

yeah. learn regexs. then use them in clojure. if you know regexs all you will need to do is consult https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html for the features and particular dialect of Clojure’s regexs (which are just the native platform’s regexs)

j19:11:28

What's the easiest way to log to file? (spit) ? Is there an easy-to-use logging library that's like using spit ? I've looked at timbre, pedestal, tools.logging so far.

potetm19:11:47

oh sweet summer child

😂 4

potetm19:11:58

“Easy way to log in Java”

j19:11:40

I guess I should've qualified that question with "preferable with no XML"

potetm19:11:07

Yeah, there’s no easy way. I prefer tools.logging, but even that requires a fair amount of dependency management and XML.

potetm19:11:28

Actually

j19:11:30

Got it, I wanted to make sure that I wasn't missing something obvious

potetm19:11:48

You can add-tap a fn that logs to a file.

potetm19:11:56

Then use tap>

dpsutton19:11:05

i think timbre has an “easy” way to do this and set up a file logger

dpsutton19:11:52

so if you don’t want xml, this might be a fit for your needs. https://github.com/ptaoussanis/timbre#basic-file-appender

👍 1

Darin Douglass19:11:29

https://github.com/BrunoBonacci/mulog/blob/master/doc/publishers/simple-file-publisher.md ^ if you wanted to go the structured logging route

potetm19:11:38

Like (add-tap #(spit "my-file" %))

dpsutton19:11:42

tap can work. but it does have less flexibility than a proper logging setup. all taps are logs, no logging levels, etc.

j19:11:26

@potetm I don't really know what tap is. I've been able to pretty easily spit to file. What does tap give you?

potetm19:11:48

Tap will allow you to spit from multiple threads.

potetm19:11:05

You don’t need it if you don’t ever spawn a thread (which is pretty rare, but does happen).

potetm19:11:00

Oh and you want to do (spit "my-file" % :append true)
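
Putting the pieces of the tap suggestion together, a minimal sketch (the helper name `file-logger` and the file name are made up for illustration). Each tapped value is written as one line of printed data. As noted in the thread, this is a convenience and not a production logging setup: there are no levels, and as far as I know `tap>` hands values to a bounded queue and drops them when it is full.

```clojure
(defn file-logger
  "Returns a fn that appends its argument, printed with pr-str, as one
  line to the file at path."
  [path]
  (fn [x]
    (spit path (str (pr-str x) "\n") :append true)))

(comment
  ;; Hypothetical usage at the REPL:
  (def log! (file-logger "my-file.log"))
  (add-tap log!)                         ; register once
  (tap> {:level :info :msg "started"})   ; call from any thread
  (remove-tap log!))                     ; unregister when done
```

Keeping a reference to the fn (here `log!`) matters, since `remove-tap` needs the same fn that was passed to `add-tap`.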

j19:11:02

Ahh, that's good to know. I'm guessing timbre by default can handle logging to file with multiple threads?

potetm19:11:14

Yes, any logging library can do that out of the box.

potetm19:11:33

And like @dpsutton said: the tap thing is easy. It’s not appropriate for production-level logging.

j19:11:30

Ahh, that's good to know. about the spit option and logging libraries. I'll take a look at both. Thanks all!

cdeszaq19:11:27

The biggest things logging libraries give you which make them more prod-ready:
1. Control at a distance (you don’t need to change the log-emitting code to change the logging behavior)
2. Bounds / performance considerations (e.g. what to do when your disk can’t keep up)

🙌 1

j19:11:49

@Darin Douglass Thanks for the ulog link! Looks interesting!

Darin Douglass20:11:36

i love me some structured logging!

devn19:11:14

i have a bunch of bad URL data that i’m ingesting. urls are like “http://example.com/foo” “http://www.example.com/” and then some are just trash like “n/a”. anyone have a rec. on a library for dealing with this? ideally i’d like to validate and/or “fix” them to actually have a protocol, or throw them out if they’re nonsense.

delaguardo19:11:08

You can try https://docs.oracle.com/javase/7/docs/api/java/net/URL.html Unfortunately there is no method to check validity, but enough instruments to build validator yourself.

devn20:11:54

thanks. yeah, should have just looked to good ol’ java.net.*

devn19:11:56

note: perfection is not required. there’s no way to know without actually making requests to know whether say http or https is correct for instance, but i’m looking for a basic normalization/validation step that’s a bit more robust than manually adding a protocol where it doesn’t exist, though as i type this, that may be sufficient for now
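
A rough sketch of that kind of basic normalization step using only `java.net.URI`. The function name `normalize-url`, the default of "http://", and the "host must contain a dot" rule are all assumptions for illustration. The dot rule is a crude heuristic that also rejects hosts like localhost.

```clojure
(require '[clojure.string :as str])
(import '(java.net URI URISyntaxException))

(defn normalize-url
  "Returns s as a URL string with a scheme, or nil if it looks like nonsense.
  Adds http:// when no http(s) scheme is present; whether http or https is
  actually right cannot be known without making a request."
  [s]
  (let [s           (str/trim (str s))
        with-scheme (if (re-find #"(?i)^https?://" s) s (str "http://" s))]
    (try
      (let [uri  (URI. with-scheme)
            host (.getHost uri)]
        ;; heuristic: require a parsed host that contains a dot
        (when (and host (str/includes? host "."))
          (str uri)))
      (catch URISyntaxException _ nil))))

(normalize-url "http://example.com/foo") ;=> "http://example.com/foo"
(normalize-url "www.example.com/")       ;=> "http://www.example.com/"
(normalize-url "n/a")                    ;=> nil
```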

jjttjj19:11:50

https://github.com/michaelklishin/urly might be useful or possibly https://github.com/lambdaisland/uri

devn20:11:12

ah yes, forgot about the good folks and their lambda-shaped island. thank you.
