#clojure
2020-06-10
lilactown00:06:33

fun bug I just tracked down, caused by a fact I didn't know: sorted-set-by uses the comparator to assert identity

lilactown00:06:41

I had something akin to:

(sorted-set-by #(compare (:order %1) (:order %2)) {:order 2 :foo "bar"} {:order 2 :foo "baz"})
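
A minimal sketch of the behaviour described above, assuming the elements are maps and the comparator only looks at :order. The name by-order is made up for illustration.

```clojure
;; Hypothetical comparator that only looks at :order.
(defn by-order [a b]
  (compare (:order a) (:order b)))

;; Both maps compare as 0, so the sorted set treats the second one
;; as a duplicate of the first, even though the maps are not =.
(def s
  (sorted-set-by by-order
                 {:order 2 :foo "bar"}
                 {:order 2 :foo "baz"}))

(count s)                                ; one element, not two
(contains? s {:order 2 :foo "whatever"}) ; membership also goes through the comparator
```

Membership and de-duplication both go through the comparator, so any two elements for which it returns 0 are the same element as far as the set is concerned.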

seancorfield04:06:33

Looks like a typo, the docstring says

user=> (doc vary-meta)
-------------------------
clojure.core/vary-meta
([obj f & args])
  Returns an object of the same type and value as obj, with
  (apply f (meta obj) args) as its metadata.
nil
user=>

seancorfield04:06:11

The arguments are listed incorrectly too.

seancorfield04:06:48

@srijayanth Good catch on that. I submitted a PR to fix it.

hindol06:06:31

Since we have namespaced keywords now, is this acceptable or just a hack?

(defprotocol Service
  (start [this])
  (stop [this]))

(extend-protocol Service
  APersistentMap
  (start [this]
    (if-let [start-fn (::start-fn this)] ; Note: namespaced key
      (start-fn this)
      (throw (ex-info "This map does not satisfy protocol 'Service'"
                      {:map this}))))
  (stop [this]
        ;; Similar to start, omitted
        ))
Then all maps having the right namespaced keys "automatically" satisfy the protocol without using with-meta.

vlaaad06:06:00

what's wrong with with-meta ?

vlaaad06:06:15

this looks fine 🙂

hindol06:06:08

Nothing wrong with with-meta except it is per map. In the above approach, you write the rule only once.

vlaaad06:06:24

it is still per map, you just have to put your start function in a different place

hindol06:06:08

You are right, it is still per map. Every map still needs to add the right key. with-meta is slightly longer I think.

seancorfield06:06:59

@hindol.adhya I'm not sure what problem that solves. ::start-fn resolves to a specific qualified keyword, based on the namespace that code is in. How is that better than adding metadata to a map that refers to protocols?

seancorfield06:06:58

You've introduced a convention (that might be different for each library) compared to a standard way to interact with every protocol.

hindol06:06:48

I meant, the ns that is using this will have the :<right ns>/start-fn key. :: is just for this namespace where I have the protocol.

seancorfield06:06:14

Right, but that is a convention that is separate from the protocol itself.

seancorfield06:06:46

You've called it start-fn (and, presumably, stop-fn) in your service -- but other protocols in other namespaces could do something differently.

seancorfield07:06:09

It's much better to use the fully-qualified names of the protocol functions themselves.

seancorfield07:06:14

You've also restricted it to just a hash map -- protocols can be satisfied via metadata for any IObj

seancorfield07:06:36

(so, in answer to your question, I'd say it's a horrible hack!)

hindol07:06:40

I agree with the "deviation from standard convention is a bad thing" bit. But I would argue a fully qualified protocol fn is equivalent to a fully qualified keyword.

hindol07:06:30

🙂 Yeah, probably it is a bad hack. That's why I am seeking opinions.

seancorfield07:06:39

Protocols can be extended to a lot more types than a qualified keyword can participate in.

seancorfield07:06:30

(with-meta [1 2 3] {'com.stuartsierra.component/start (fn [v] ...)}) for example
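
For comparison, a small sketch of the metadata-based approach, assuming Clojure 1.10 or later and a protocol declared with :extend-via-metadata true. The Service protocol and the svc value here are illustrative, not taken from any library.

```clojure
;; Metadata extension is opt-in per protocol.
(defprotocol Service
  :extend-via-metadata true
  (start [this])
  (stop [this]))

;; Any value that supports metadata can carry an implementation, keyed by
;; the fully-qualified symbol of each protocol function. Syntax-quote
;; resolves `start and `stop against the current namespace.
(def svc
  (with-meta [1 2 3]
    {`start (fn [v] (str "started " (count v) " items"))
     `stop  (fn [_] :stopped)}))

(start svc)
(stop svc)
```

The keys are the protocol functions' own qualified names, so no extra per-library naming convention is needed, and the value does not have to be a map.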

hindol07:06:25

Yes, and the protocol is still available for every other type. I meant this just for maps.

seancorfield07:06:30

@hindol.adhya I think it's a mistake to take a generalized feature like this and add a convention that restricts it to a narrower type...

hindol07:06:03

I am kind of convinced now that this is a bad idea. Mainly because with-meta is more explicit and the above approach is more implicit.

Ykä12:06:58

https://twitter.com/andrestaltz asked on Twitter: > What’s the software or tool or language or framework where you’ve felt the most productive? Clojure seems to be a pretty popular answer. 🎉 If you want to answer his poll, it’s still open for 6 hours: https://twitter.com/andrestaltz/status/1270438843880259590

William Skinner14:06:16

Hi, I'm new here! I wanted to get a few recommendations for must-have books that teach systems architecture and design. I became interested in Clojure back in 2012 after starting my first programming job but only got the chance to use it for toy projects. Now I've come a long way but still don't feel I have a true reference for large-scale system design. Specifically, at my job I've been tasked with rewriting a large batch transaction processing system and would like to read about the state of the art in this field.

lilactown16:06:47

“Designing Data-Intensive Applications” has been recommended to me multiple times and it is currently sitting at the top of my to-read pile 😛

lilactown16:06:28

the way I need to use this structure is I’m taking the “first” element from the set, doing an operation, and then adding a bunch of new elements. so the ordering is rebalanced frequently

William Skinner16:06:58

Thanks @lilactown! This looks promising

alexmiller16:06:12

I would use a queue and a separate "seen" set to avoid re-processing

lilactown16:06:05

is your intuition that it would be faster than using a sorted set?

alexmiller16:06:11

I just find it to be clearer in my head. I doubt the perf difference matters as it's probably dominated by the processing
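
A rough sketch of the queue plus "seen" set idea, under the assumption that processing an item can produce new items. The names process-all and expand are hypothetical, and this version processes items in arrival order rather than by priority.

```clojure
(defn process-all
  "Processes items in FIFO order. `expand` is a hypothetical function that
  returns the new items produced by processing one item. Items that have
  already been seen are never queued again."
  [expand initial]
  (let [initial (distinct initial)]
    (loop [queue (into clojure.lang.PersistentQueue/EMPTY initial)
           seen  (set initial)
           done  []]
      (if (empty? queue)
        done
        (let [item  (peek queue)
              fresh (remove #(contains? seen %) (distinct (expand item)))]
          (recur (into (pop queue) fresh)
                 (into seen fresh)
                 (conj done item)))))))

;; Each number below 4 produces two successors; duplicates are skipped.
(process-all (fn [n] (if (< n 4) [(inc n) (+ n 2)] [])) [1])
;; => [1 2 3 4 5]
```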

bhauman16:06:43

a general question about practice here: it seems safe to generate and distribute a self-signed certificate pem and keystore.jks that only refers to localhost and, say, www.localhost to use for a local devserver

bhauman16:06:10

where developers can import and trust the pem or p12 cert

bhauman16:06:07

any auth attack would have to come from localhost 127.0.0.1???

martinklepsch17:06:05

Does anyone know of examples using datafy to wrap an HTTP API? Is this a bad idea? :thinking_face:

raspasov17:06:44

This is from another discussion, but I was wondering if somebody knows: how consistent is (str {:a 1}), aka (str …) on maps… does it produce the same string every time for the same map (I assume yes)? What about across Clojure vs ClojureScript?

alexmiller17:06:28

for the same instance, yes, seq order is stable

alexmiller17:06:55

for clj vs cljs, not guaranteed and I wouldn't rely on them being the same

raspasov17:06:30

@alexmiller thanks! That’s actually the important part; I assume same instance, you mean like consistent version of Clojure? Or that can change across JVM instances of the same version?

noisesmith17:06:53

if you mean same as in identical? yes, but it will definitely print equal small maps differently from one another

user=> (let [a {:a 0 :b 1}
             b {:b 1 :a 0}]
          [(= a b) (= (str a) (str b))])
[true false]

alexmiller17:06:02

I mean if you have the same object in the same vm and print it twice, it will be the same

alexmiller17:06:19

no guarantee across jvm versions, clojure versions, jvm executions, etc

alexmiller17:06:27

maps are unordered, don't rely on order

raspasov17:06:52

@alexmiller I see, makes sense; thanks again!

alexmiller17:06:05

the actual order is based on map impl, key hash, and possibly on the history of the map. hashing has changed in the past (in Clojure 1.6 most recently)

🎯 1
raspasov17:06:32

yea… hm… that stemmed from another discussion… what’s the best way to (compare …) maps; is that even recommended to do?

noisesmith17:06:25

as in comparator, for sorting?

raspasov17:06:50

rf.shared.nutrients=> (compare (vec (seq {:a 1 :b 2})) (vec (seq {:a 1 :b 2})))
0

noisesmith17:06:57

you can compare by hash

(ins)user=> (hash {:a 0 :b 1})
1561470772
(ins)user=> (hash {:b 1 :a 0})
1561470772

raspasov17:06:59

rf.shared.nutrients=> (compare (vec (seq {:a 1 :b 2})) (vec (seq {:b 2 :a 1})))
-1

noisesmith17:06:11

of course hash is lossy...

raspasov17:06:24

Hash collisions potentially?

noisesmith17:06:38

right, a collision would not be sorted in a stable way

noisesmith17:06:50

and don't use hash alone to remove dups, obviously

noisesmith17:06:58

but other than that, the sorting will work

raspasov17:06:03

Basically, how to stably (is that a word) sort maps

raspasov17:06:34

Sort on, say :a, and if that is equal, sort on the whole map (somehow)

raspasov17:06:38

consistently

noisesmith17:06:49

you could compare hash, and when not identical, compare the set of hashes of entries

raspasov17:06:50

Without losing data

noisesmith17:06:24

but this is half baked, it's an interesting problem

dpsutton17:06:27

i've been down this road and it's not fun

🎩 1
dpsutton17:06:49

you're going to get to a point too where you want to ignore some differences and then you're really sunk

raspasov17:06:38

@dpsutton thanks for the input 🙂

noisesmith17:06:44

@raspasov if you have some domain logic that lets you select keys and be sure of types, you could compare the vector of "interesting" keys

raspasov17:06:06

@noisesmith It was somebody else’s problem 🙂 I just got curious…

ghadi17:06:11

to compare maps, it's as simple as =

dpsutton17:06:27

are you saving state and checking to see if it is unchanged?

raspasov17:06:22

@dpsutton

(defn height-with-tie-break [x y]
  (let [c (compare (get x :height) (get y :height))]
    (if (not= c 0)
      c
      ;; Otherwise we don't care as long as ties are broken consistently
      (compare (vec (seq x)) (vec (seq y))))))

(def s3 (sorted-set-by
          height-with-tie-break
          {:height 0 :foo "bar"}
          {:height 3 :foo "asdf"}
          {:height 0 :foo "baz"}
          {:height 2 :foo "baz"}))
Sort by :height, if that is equal, how to tie break consistently

raspasov17:06:42

(I don’t think this is fully correct)

raspasov17:06:52

Possibly, this is the wrong approach about this… a map cannot be fundamentally “bigger than” or “less than” another map because it is, like @alexmiller said, un-ordered

raspasov17:06:57

so trying to use consistent order where there isn’t one by definition is probably the wrong way of solving that problem

dpsutton17:06:55

clojure's persistent maps are unordered. but you're talking about ordering the set of maps which contain keys :height and :foo. that doesn't seem fundamentally impossible

craftybones17:06:04

There was a great talk many years ago about how to help a team migrate to clojure(or just new languages in general, but using Clojure to illustrate it)

craftybones17:06:22

I can’t remember whose talk it was and I can’t find it any more. Did I hallucinate the whole thing?

dpsutton17:06:53

gotcha. i thought you were going down a different path 🙂

dpsutton17:06:09

i think this should work

(sorted-set-by
  #(let [f (juxt :height :foo)]
     (compare (f %1) (f %2)))
  {:height 0 :foo "bar"}
  {:height 3 :foo "asdf"}
  {:height 0 :foo "baz"}
  {:height 2 :foo "baz"})

raspasov17:06:22

I guess the problem becomes… in the case of map… what if those two keys are equal again, :height and :foo - how do you tie break then in the case of a map? 🙂 It’s like a never ending story

raspasov17:06:09

@dpsutton right… I guess if your maps are guaranteed to NOT have any other keys except for :height and :foo, you can do it

raspasov17:06:30

Seems a bit at odds with Clojure’s philosophy of open data

raspasov17:06:57

(sorted-set-by
  #(let [f (juxt :height :foo)]
     (compare (f %1) (f %2)))
  {:height 0 :foo "bar"}
  {:height 0 :foo "bar" :c "HERE I COME"}
  {:height 3 :foo "asdf"}
  {:height 0 :foo "baz"}
  {:height 2 :foo "baz"})

raspasov17:06:00

{:height 0 :foo "bar" :c "HERE I COME"}
Is gone 🙂

raspasov17:06:20

My personal conclusion is, “don’t do it” lol… unless you are really sure you just want those two keys (:height and :foo)
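
One possible tie-break that does not depend on a fixed key list, sketched under strong assumptions: all map keys are mutually comparable (for example, all keywords) and all values print deterministically. The helper names canonical and height-then-canonical are invented for this sketch, and comparing printed strings is unlikely to be efficient.

```clojure
(defn canonical
  "Prints a map with its keys in sorted order, so equal maps give equal strings."
  [m]
  (pr-str (into (sorted-map) m)))

(defn height-then-canonical
  "Compares by :height first, then falls back to the canonical string."
  [x y]
  (let [c (compare (:height x) (:height y))]
    (if (zero? c)
      (compare (canonical x) (canonical y))
      c)))

(def s
  (sorted-set-by height-then-canonical
                 {:height 0 :foo "bar"}
                 {:height 0 :foo "bar" :c "HERE I COME"}
                 {:height 3 :foo "asdf"}
                 {:height 0 :foo "baz"}
                 {:height 2 :foo "baz"}))

(count s)       ; all five maps are kept
(map :height s) ; still ordered by :height
```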

raspasov17:06:04

@isak potentially… that might be a decent idea

raspasov17:06:39

@isak thanks for the input though 🙂

isak17:06:02

hm, is it because the maps don't have the same keys? if so, could add a step to add missing keys @raspasov

raspasov17:06:33

@isak I think, fundamentally, trying to apply order where there isn’t one is just not a very good approach; a comparator requires not only “equal” semantics, but also “less than” and “greater than”

raspasov17:06:15

Extracting “less than” and “greater than” semantics from a generalized un-ordered collection sounds like a mathematical impossibility; maybe a certain heuristic approach can work reasonably well, but I don’t think a general and consistent solution is possible

raspasov17:06:04

That’s why if you start out with vectors, it’s much easier, they have order built-in

raspasov17:06:12

Which you can rely on

isak18:06:43

I think you can - worst case you just add more things to the comparison, like .getType, but I won't argue. Imo the real problem would be to make a generalized solution that is also efficient

✌️ 1
lilactown18:06:13

I think I figured out what I want: an array of sets 😄

lilactown18:06:28

I might distill this into a lib that implements the appropriate interfaces/protocols

lilactown19:06:51

ahaha, right. sorry, this was a solution to a problem I posted about above: https://clojurians.slack.com/archives/C03S1KBA2/p1591805338491800

lilactown19:06:28

we went around and around w/ how we could hack sorted-set-by to get good behavior

lilactown19:06:33

the problem is that sorted-set-by is built for a total ordering, so you end up having to hack into the comparison fn some way to handle elements that compare as = under your ordering but are actually distinct

lilactown19:06:00

which rolled into the “how do you compare maps” convo
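
A loose sketch of the "array of sets" idea, substituting a sorted map from :order value to a set of elements for the array. That substitution, and the names add-item and pop-first, are assumptions made for this sketch and not the actual design.

```clojure
(defn add-item
  "Adds item to the bucket for its :order value, creating the bucket if needed."
  [buckets item]
  (update buckets (:order item) (fnil conj #{}) item))

(defn pop-first
  "Returns [item remaining-buckets], taking an arbitrary item from the
  lowest :order bucket, or nil when there are no buckets."
  [buckets]
  (when-let [[k items] (first buckets)]
    (let [item (first items)
          more (disj items item)]
      [item (if (empty? more)
              (dissoc buckets k)
              (assoc buckets k more))])))

(def buckets
  (reduce add-item (sorted-map)
          [{:order 2 :foo "bar"}
           {:order 2 :foo "baz"}
           {:order 1 :foo "qux"}]))

(count (get buckets 2))   ; both :order 2 maps are kept
(first (pop-first buckets))
```

Ordering comes from the sorted map's keys and distinctness from ordinary set equality, so the comparator no longer has to decide identity.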

avi21:06:25

👋 I’ve been starting up a socket repl and specifying the port as 0, which IIRC is a unix convention for “find some available port”. This has been working well for me, but then I have to find out what the actual port is before I can connect. Just in case it might be helpful to anyone, here’s what I’m running when I start my REPL: (println "Socket REPL listening on port" (.getLocalPort (get-in @#'clojure.core.server/servers ["repl" :socket])))

alexmiller21:06:44

why don't you just tell it a port to use?

avi21:06:53

Because I’m sometimes working on multiple projects simultaneously, each with its own socket repl. If I hard-code the port numbers, then they’ll either conflict, or I’ll have to remember which port number I’d assigned to which project. Which I don’t think I can do…

seancorfield21:06:16

(and perhaps worth noting there can be multiple Socket REPLs with names corresponding to the JVM option name used to start each, and so the get-in would need a different string from "repl")
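
A small sketch that reports the port of every running socket server, whatever it is named. It reads the same private servers var as the snippet above, so it relies on an implementation detail that may change. server-ports is a made-up helper name.

```clojure
(require 'clojure.core.server)

(defn server-ports
  "Returns a map of server name -> locally bound port for every socket
  server started in this JVM. Relies on the private `servers` var."
  []
  (into {}
        (map (fn [[server-name {:keys [socket]}]]
               [server-name (.getLocalPort socket)]))
        @#'clojure.core.server/servers))

;; Usage: start a server on an OS-chosen port, then look it up by name.
(clojure.core.server/start-server
  {:name "scratch" :port 0 :accept 'clojure.core.server/repl})
(println "Socket REPLs:" (server-ports))
(clojure.core.server/stop-server "scratch")
```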

👍 1
avi21:06:22

Also, I don’t know, I kinda like the semantics of specifying port 0 — it’s neat

avi21:06:53

(good point Sean, thanks!)

seancorfield21:06:27

It's interesting, I hadn't even thought to try port 0. That is neat. But discoverability is kind of ugly.

dominicm21:06:52

I think port 0 is a Java thing.

avi21:06:35

I don’t think so; or at least, not now — it works in Python too, IIRC.

dominicm21:06:21

https://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#ServerSocket(int) > A port number of 0 means that the port number is automatically allocated, typically from an ephemeral port range. This port number can then be retrieved by calling getLocalPort.
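
The quoted behaviour can be checked directly from a REPL. A tiny sketch, with two-free-ports as a made-up name:

```clojure
(import 'java.net.ServerSocket)

(defn two-free-ports
  "Opens two server sockets on port 0 and returns the ports the OS picked.
  Both sockets are closed again before returning."
  []
  (with-open [a (ServerSocket. 0)
              b (ServerSocket. 0)]
    [(.getLocalPort a) (.getLocalPort b)]))

(two-free-ports) ; two distinct non-zero port numbers
```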

dominicm21:06:52

Python probably also special cases it

avi21:06:47

Yeah I saw that in the Java docs, but it doesn’t say where the convention came from. The docs don’t claim it originated in Java. (Not saying that that’s proof that it did not.)

andy.fingerhut21:06:20

I am pretty sure the port=0 choosing a local free port on the system originated with the BSD socket API, at the C level.

👍 2
dominicm21:06:32

Oh, I'm not trying to suggest that. Just that it's unix agnostic :)

avi21:06:32

From https://www.grc.com/port_0.htm → > The designers of the original Berkeley UNIX “Sockets” interface, upon which much of the technology and practice we use today is based, set aside the specification of “port 0” to be used as a sort of “wild card” port. When programming the Sockets interface, the provision of a zero value is generally taken to mean “let the system choose one for me”. Programmers who specify “port 0” know that it is an invalid port. They are asking the operating system to pick and assign whatever non-zero port is available and appropriate for their purpose.

noisesmith21:06:34

yeah, it's a Posix thing for sure

seancorfield21:06:02

~/clojure$ clj -J-Dclojure.server.repl='{:port,0,:accept,clojure.core.server/repl}' -e "(.getLocalPort,(get-in,@#'clojure.core.server/servers,[\"repl\",:socket]))" -r
60438
user=>
~/clojure$ clj -J-Dclojure.server.repl='{:port,0,:accept,clojure.core.server/repl}' -e "(.getLocalPort,(get-in,@#'clojure.core.server/servers,[\"repl\",:socket]))" -r
60439
user=>
(that's WSL 1 on Windows, BTW).

notbad 1
🤯 1
dominicm21:06:03

Ah, the berkeley socket api is closely followed by windows too, makes sense that it's pretty universal then :)

avi21:06:38

I only learned of it recently myself, via Simon Willison, when he created this ticket: https://github.com/encode/uvicorn/issues/530

noisesmith21:06:39

s/closely followed/copy pasted/ :D

avi21:06:41

@seancorfield can you think of some trick to get my project to automatically print out the port every time I start my repl with a particular tools.deps alias? I tried to naïvely add it to :main-opts as a -e arg but ran into the usual issues with escaping. And then I discovered that -r and -e seem to be mutually exclusive, so if I specify -e then the process exits after the expression is evaluated, and if I specify -r along with the -e then the repl starts but the expression doesn’t seem to be evaluated.

avi21:06:22

oh wait, it seems to have worked for you in the snippets you posted above — I’ll try again 😅

avi21:06:18

Aha, the trick is to specify the -e first and the -r second!

seancorfield21:06:22

Confirmed that it works on Powershell (Windows) too -- I added :socket-zero to my dot clojure file, ran clj -A:socket-zero -r on Powershell, then attached to it from WSL:

$ telnet 127.0.0.1 60468
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
user=> (System/getProperty "user.dir")
"C:\\Users\\seanc\\clojure"
user=>

seancorfield21:06:37

Here's where I start it in Powershell:

PS C:\Users\seanc\clojure> clj -A:socket-zero -r
60468
user=>

avi21:06:24

Yup, I’ve got nearly the exact same thing set up for myself now, working great in bash on MacOS

seancorfield21:06:54

I'm off work today so I'm on my ancient little laptop: Windows with WSL 1 (since it can't run any virtualized stuff).

seancorfield21:06:33

I'll try it on macOS when I'm back at work tomorrow. I should update the README in my dot clojure project to describe the new alias.

alexmiller21:06:33

you could also -e and start the server manually (via clojure.core.server/start-server), which returns the ServerSocket

👍 2
avi21:06:03

Good point, that might be less brittle, in that maybe it wouldn’t require a specific name

alexmiller21:06:03

needs the commas etc to go in an alias

alexmiller21:06:13

$ clj  -e "(.getLocalPort ((requiring-resolve 'clojure.core.server/start-server) {:port 0, :accept 'clojure.core.server/repl :name \"repl\"}))" -r
57739
user=>

seancorfield21:06:59

@aviflax Still needs to specify the name in there (right, @alexmiller)

alexmiller21:06:36

is that an issue?

alexmiller21:06:11

if you want a random name (str (gensym))

avi21:06:59

I’m not sure how/where the name is used, so I’m unclear on whether it matters

alexmiller21:06:43

doesn't unless you're starting multiple

alexmiller21:06:27

name is used in the thread names and to key the per-server state

pmonks21:06:26

Would it be possible to put the port # in the name, somehow? I know about the trick of asking for port 0 (to get a randomly available port), immediately unbind from that port, then rebind it directly by number…

dominicm21:06:49

@pmonks probably better to rename than to try and claim & unclaim a port

dominicm21:06:58

but then the thread needs renaming... yuk :)

pmonks21:06:13

Yeah - that was where I started getting stuck too. 😉

pmonks21:06:34

I know that elsewhere the “bind twice” pattern is somewhat common / accepted. Just don’t know if there’s a better way in Clojure.

alexmiller21:06:36

the code could do so, but it doesn't right now

pmonks21:06:02

(and of course it has a race condition)

alexmiller21:06:04

I'm not sure why that would be useful

alexmiller21:06:16

thread names are mutable

alexmiller21:06:23

so it's quite easy to do

pmonks21:06:27

So that if you’re looking at thread names (e.g. in a dump) it’s more obvious which thread is bound to which port.

alexmiller21:06:39

well that's why it's named :)

alexmiller21:06:58

you started the server and gave it a name, so ...

pmonks21:06:14

But you don’t know the actual port number until after the thread is initially named (though thread renaming could get around that, of course).

alexmiller21:06:37

if you start it explicitly, then you have the server socket and know the port and can do whatever you want

👍 1
pmonks21:06:13

Right. Either you double bind (as is done in some other languages), or you mutate the thread name. They’re both a bit icky (though not terrible). 😉

dpsutton21:06:45

What’s icky about changing a thread name?

dominicm21:06:46

Mutating the thread name seems OK. But really you just need to compose the pieces together yourself at this point. The abstraction doesn't hold any more, so compose the smaller pieces (ServerSocket & starting a thread)
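
A sketch of composing those pieces by hand: bind first, then create the accept thread with the port already in its name, so no rename is needed. The function start-named-listener and its handler argument are hypothetical; a real REPL server would also need per-connection threads and stream setup, which are left out here.

```clojure
(import '(java.net InetAddress ServerSocket SocketException))

(defn start-named-listener
  "Binds a loopback ServerSocket to an OS-chosen port, then starts a daemon
  thread named after that port which passes each accepted connection to
  `handler`. Returns the socket, port and thread in a map."
  [handler]
  (let [socket (ServerSocket. 0 0 (InetAddress/getLoopbackAddress))
        port   (.getLocalPort socket)
        accept (fn []
                 (try
                   (loop []
                     (handler (.accept socket))
                     (recur))
                   ;; Closing the socket makes accept throw; treat that as shutdown.
                   (catch SocketException _ nil)))
        thread (doto (Thread. ^Runnable accept (str "repl-on-port-" port))
                 (.setDaemon true)
                 (.start))]
    {:socket socket :port port :thread thread}))

;; Usage: the thread name is known as soon as the listener exists.
(let [{:keys [socket thread]} (start-named-listener (fn [conn] (.close conn)))]
  (println (.getName thread))
  (.close socket))
```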

pmonks21:06:11

@dpsutton how do you retrospectively change the thread name in things like log files? I mean it’s possible, but situations like these are icky.

alexmiller21:06:15

abusing mutable thread names has a long tradition in Java

alexmiller21:06:25

it's a great hack to actually use the thread name to leak a state variable (like a system wide tap>)

😱 2
alexmiller21:06:00

you can then watch it in a mbean or profiler via the debug apis

pmonks21:06:45

Yeah - or a correlation id of some sort, to track a single logical “transaction” through multiple threads / JVMs / whatever.

pmonks21:06:42

Anyhoo, seems like the simplest thing would be to just rename the thread to “repl-on-port-<concrete port number>” after the socket is bound, and try not to do anything in that thread before the rename (i.e. to avoid logging with the wrong thread name, or other side-effecty shenanigans).

isak21:06:56

Little warning when you compare lines of code between Clojure and other languages - clojure docstrings count as lines of code, not a comment (tested with tokei)

avi21:06:07

have you tested with cloc ? That’s the tool I’ve used for years…