#clojure
2021-01-05
didibus01:01:43

Does anybody have a trick for discarding an argument when using #()?

seancorfield03:01:59

I'm actually a little shocked that #_% works in this case. The presence of #_% (or #_%2, #_%3, etc) causes the anonymous function to have that many arguments, but #_ then "throws away" the form that follows it: (mapv #(#_% rand-int 10) (range 100))

👍 9
seancorfield03:01:18

(I know this is a bit of a repeat, but I wanted the channel to see it, in a single coherent response 🙂 )

didibus03:01:17

Ya, I'm a bit worried about this maybe being too dependent on accidental implementation details of the Clojure reader, though

didibus03:01:45

Like, it seems the reader first processes the #() form, and then the #_

didibus03:01:55

But is this order guaranteed?

didibus03:01:50

Nice, it works in different orders too: (mapv #(rand-int 10 #_%) (range 100)) which I find a bit more readable

p-himik04:01:57

Wouldn't simple fn be even more readable?

☝️ 9
didibus04:01:17

Well, yes and no. Yes, because it's ugly to put #_% at the end, and most people might be thrown off by it. But no, for the same reason Rich Hickey added `#()` in the first place :stuck_out_tongue:

seancorfield04:01:14

Re: different orders -- yeah, it's not going to make any difference where the ignored arg form is. I originally stuck it in the middle: #(rand-int #_% 10)

didibus04:01:35

Well, I don't fully know why he did, but for me, there's something visually nice about the parentheses not repeating.

didibus04:01:07

Looks like it works in ClojureScript as well

didibus04:01:27

But not in Babashka 😞

seancorfield04:01:49

File a GitHub issue! @U04V15CAJ will be thrilled by this weirdness 🙂

didibus04:01:10

Haha, he does like that kind of stuff

seancorfield04:01:18

I just looked over the source of the Clojure reader and I think you can rely on this behavior: the ArgReader (which processes % and %<n>) is what tracks the highest arg count in an expression, and the DiscardReader (for #_) has to read the next form in order to discard it. The ArgReader is a reader macro so it will be triggered just by reading. So the arguments are always going to be read (and tracked), even if the resulting form containing them is subsequently discarded.
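You can see this from a REPL just by reading the form: the discarded % still bumps the arity of the generated fn (the gensym name below is illustrative):
user=> (read-string "#(#_% rand-int 10)")
(fn* [p1__1#] (rand-int 10))   ; one parameter, even though the % form itself is gone
user=> (read-string "#(rand-int 10)")
(fn* [] (rand-int 10))         ; no #_%, so a zero-argument fn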

didibus04:01:03

Isn't it possible that the DiscardReader runs first, modifies the form, and then the ArgReader runs, no longer seeing the discarded code?

didibus04:01:28

Or do reader macros all run on the same original forms?

didibus04:01:27

Nice, it also works in Clojerl

seancorfield04:01:55

DiscardReader invokes the reader on the following form: The form to be discarded has to be read. The ArgReader is how the % forms are read. Could the code be changed so that any reader side-effects could also be erased by the DiscardReader? I guess it could but it would be a fair bit of effort to do it correctly without breaking anything I suspect. That said, I'm sure Alex and Rich would say, absolutely don't do this 🙂

😁 3
seancorfield04:01:49

@U2FRKM4TW Yeah, if ever I find myself reaching for %2 I take a step back and think about readability!

p-himik04:01:49

I guess it goes to show how little excitement I have in my life, but seeing #_% within a #() function riles me up just as much as reading political news does. And honestly, I'm a bit surprised by that myself. :)

didibus04:01:52

Ya, but:

(let [agents (repeatedly 5 #(agent []))]
  (run! #(call-api client) agents))
So now it's between:
(let [agents (repeatedly 5 #(agent []))]
  (run! #(call-api client #_%) agents))
and:
(let [agents (repeatedly 5 #(agent []))]
  (run! (fn [_] (call-api client)) agents))

p-himik04:01:45

Maybe I'm just waking up, but how does having agents affect anything in your code? It seems like it can be just

(dotimes [_ 5]
  (call-api client))

p-himik04:01:03

(and I absolutely without a shadow of a doubt prefer the fn version)

didibus04:01:53

Ya, bad example. I can't remember what the situation was, but when you're doing lots of side effects inside loops, there are a few times it comes in handy

didibus05:01:07

Oh, now I remember, it was:

(send-off agnt #(call-api client))

p-himik05:01:06

"Handy" does not mean "worth it". :) So many more things come in handy in other languages - and all of those are almost exclusively the reason why I stopped using them. Or rather, over-reliance on such things by colleagues and overall language community. Oh, I just figured out why I feel so strongly about #_% - it brings back the memories of having to write "clever" C++ code that heavily relied on macros, templates, and quirks of a particular version of MSVC.

didibus05:01:35

I don't disagree, but some things are handy and worth it. Now this particular one, I don't think is worth it, cause it does still seem like an accident that it works.

👍 3
seancorfield05:01:43

Hahaha... Ah, yes, that brings back memories of being on the ANSI C++ Committee for eight years and having several discussions with the MS rep about VC++

😄 3
seancorfield05:01:24

(send-off agnt (fn [_] (call-api client))) ?

didibus05:01:43

That said, in clojure, not all scenarios have the same level of needing to be worthy. Sometimes I code Clojure on my phone for example, and on such a device, edits are really hard lol, so this is a nice trick to know. Same thing, sometimes I do things on a command line REPL with terrible read-line support, so you can't move the cursor back, you have to delete everything, etc. So I can see scenarios where this is useful

didibus05:01:17

But... honestly this syntax is growing on me. I feel things like this are also a matter of idiom; people could pretty easily get used to it. #(rand-int 1 #_%) If I read it as: "call rand-int with arg 1 and discard the passed-in argument" it's not that bad actually. A bit like how using _ is an idiom when you discard an arg in fn

didibus05:01:31

I won't send a PR with it though I swear 😋

emccue07:01:27

This feels like clojure behavior that needs to be silently smothered in 1.11 before anyone realizes it exists

borkdude08:01:37

@U0K064KQV I remember we have been over this on Twitter before where @U064X3EF3 said this was undefined behavior

borkdude08:01:54

If core says this should be supported I would be happy to look into it.

borkdude08:01:10

Having said that, I will take a look; if it's not an invasive change, I might fix it.

Alex Miller (Clojure team)12:01:02

I do not believe this is behavior you should rely on. We’ve even looked at changes to the discard reader recently re tagged literals or reader conditionals and it’s not clear to me that things would still work this way after those changes. So, please don’t do this.

Alex Miller (Clojure team)12:01:54

It’s so much clearer to just use fn

💯 3
borkdude13:01:52

:thumbsup: I agree, and thanks for confirming.

seancorfield16:01:07

Ah, yeah, I remember the tagged literal discussions around the discard reader... true, if you're going to do that much rework on that reader, you might as well also ensure that the side-effecting parts of arg reader are discarded. It occurred to me overnight that people may well expect this to be a single-argument function: #(* 2 %1 #_%2) -- (map #(* 2 %1 #_%2) (range 10)) throws "Wrong number of args (1) passed"

Alex Miller (Clojure team)16:01:21

there are no defined semantics for the combination of discard reader and anonymous function arguments

Alex Miller (Clojure team)16:01:33

so you should not have any expectations imo

seancorfield17:01:45

This might be a good thing for a linter to check then, I guess: if % args appear inside a discarded form inside #( .. ), squawk! 🙂 /cc @U04V15CAJ

borkdude17:01:37

So the discussion has gone from: babashka doesn't support this, to: clj-kondo should forbid this? 😆

seancorfield18:01:44

New evidence was brought to the table: a likely rework of the DiscardReader which may invalidate this construct 🙂

Alex Miller (Clojure team)18:01:03

this is regardless of any potential change in the discard reader (that's just one example of a way in which false expectations could come back to haunt you)

didibus19:01:05

That Twitter thread would imply some other people might be using this trick as well hehe. Personally, I think the intuitive behavior is that the discard reader would discard the form, even when used inside other reader macros. Thus (mapv #(rand-int #_%) (range 10)) should say Wrong number of args (1) passed to: user/eval8628/fn--8629. I feel this is what most people would assume if you quizzed them on it. And I'd be happy actually if that became a guarantee of the reader, and a formal semantic.

borkdude19:01:10

This is how bb does it right now

seancorfield19:01:47

I am now officially sorry that my devious mind came up with this idea in the first place... but I'll blame @U0K064KQV for asking that intriguing question yesterday... 🙂

😅 6
didibus19:01:43

(defn &-tag-reader
  [[f & args]]
  ;; expands to a variadic fn that ignores whatever args it is called with
  `(fn [~'& ~'args] (~f ~@args)))

(set! *data-readers* (assoc *data-readers* '& user/&-tag-reader))

(mapv #&(rand-int 5) (range 10))
;;=> [3 0 4 0 2 0 4 4 4 2]
Maybe this is a more sane approach if I want such convenience. Yes, yes, except for the fact that unqualified tagged literals are reserved for Clojure 😛

didibus01:01:10

Say I'm doing: (mapv #(rand-int 10) (range 100))

didibus01:01:57

I guess I can do: (mapv #(do % (rand-int 10)) (range 100)) hum... :thinking_face:. But that's the same as using fn [_]

raspasov02:01:43

Perhaps not what you’re looking for but:

(vec (repeatedly 100 #(rand-int 10)))

didibus02:01:59

Good answer 😛, but ya I was just using this as an example. I kind of often stumble on scenarios where I don't care about the input, and would like a short way to wrap my thing in a side-effect. Like when I use agents for example, sometimes I don't actually care what the current agent value is.

raspasov02:01:54

Right… yeah, I don't know of a way to ignore it in those cases; incidentally, your example works on ClojureScript, but that's very much by accident, because of the different way JS handles extra arguments 🙂

👍 3
seancorfield03:01:33

@U0K064KQV (mapv #(#_% rand-int 10) (range 100))

seancorfield03:01:11

You can use it to ignore multiple anonymous arguments too:

user=> (mapv #(#_%1 #_%2 rand-int 100) (range 10) (range 10))
[47 57 65 16 15 28 56 72 10 82]

didibus03:01:28

Wow, neat, haha, that's the exact kind of trick I was looking for.

dpsutton04:01:47

that's super neat and crazy and phenomenal to know, but if this is for real code and not playing around, that is absolutely not the trick you are looking for 🙂

didibus04:01:56

Yes, I've still not decided if I'm that kind of madman or not haha

dpsutton04:01:14

it's 100% madman. and anyone reading it will be quite confused

didibus04:01:23

But I will def use it when messing around at the REPL

seancorfield04:01:50

@U11BV7MTK Have you seen my "trick" with discard forms in deps.edn so you can embed code forms that can be eval'd from an editor?

dpsutton04:01:07

no. is it in your deps.edn repo?

seancorfield04:01:49

No, I showed it in my RDD talk to Clojure Provo. I'll show it in my London talk too.

dpsutton04:01:41

i couldn't make the provo one. if london is convenient for my time zone i'm gonna definitely be there

seancorfield04:01:55

Because I use add-libs from t.d.a to add deps to a running REPL, and I don't want deps.edn to get out of sync, I add an ns with a :require of t.d.a's repl namespace and then put a repl/add-libs call between :deps and the hash map. Then with just a minor edit, you can run add-libs with all your deps, and then with a minor edit, turn it back into valid EDN.
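The general shape of the idea (a generic illustration, not the actual deps.edn layout described above): #_ is legal in EDN, so a file can carry a code form that EDN readers skip but that an editor can still select and send to a REPL:
;; generic illustration only
{#_ (println "ignored by the EDN reader, but selectable and eval-able from an editor")
 :deps {org.clojure/clojure {:mvn/version "1.10.1"}}}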

seancorfield04:01:16

@U11BV7MTK January 12th, 10:30 am Pacific time.

dpsutton04:01:48

that's not so terrible. 8:30 here in central. i'll be there with some coffee

seancorfield04:01:35

Surely it's 12:30 Central?

dpsutton05:01:19

You are totally correct. Had that backwards. Thanks

zendevil.eth15:01:21

I have the following code that I'm trying to access a webpage with on the route about/something, but I'm getting a 405 error:

(ns humboiserver.routes.home
  (:require
   [humboiserver.layout :as layout]
   [clojure.java.io :as io]
   [humboiserver.middleware :as middleware]
   [ring.util.response]
   [ring.util.http-response :as response]))

(defn home-page [request]
  (layout/render request "home.html" {:docs (-> "docs/docs.md" io/resource slurp)}))

(defn about-page [request]
  (layout/render request "about.html"))

(defn home-routes []
  [""
   {:middleware [middleware/wrap-csrf
                 middleware/wrap-formats]}
   ["/" home-page]
   ["/about"
    ["/something"
     (ring.util.response/response {:something "something else"})]]])
The home page is rendered, but I expect to see the map returned when accessing localhost:3000/about/something. How do I fix this error?

noisesmith15:01:32

@ps pardon my ignorance, but I can't tell from that snippet what your router is - you have a function that returns a data structure that is clearly meant to describe routes, but no indication of what program is using that structure

noisesmith15:01:20

405 indicates that the request method is wrong, but nothing in your route description indicates what methods are valid

noisesmith15:01:37

@ps the reitit examples I see don't use data structure nesting for child routes, they use it for enumerating request methods

noisesmith15:01:18

they would have ["about/something" ...] and ["about/something-else" ...] as separate entries

zendevil.eth15:01:47

but it works if I have a layout/render with that route

noisesmith15:01:08

OK - I'll let someone that knows reitit help

zendevil.eth15:01:59

I was thinking that it had something to do with incorrectly using ring.util.response

noisesmith15:01:44

I would be very surprised if that caused a 405, a 405 has a precise meaning, and to me that points to giving something else where reitit thinks it's getting data describing request methods

zendevil.eth15:01:47

what should I do to diagnose and fix this?

zendevil.eth15:01:28

i think that the response map has to be wrapped in something, but I don’t know what specifically

lukasz15:01:31

Just a guess - but your home-page and about-page need to return {:status 200 :body <html>}

noisesmith15:01:35

well, the other route is taking as an argument a function that takes a request and renders a response

noisesmith15:01:51

your broken route is rendering a response inline with the data, before seeing a request

noisesmith15:01:39

@lukaszkorecki that's what ring.util.response/response is doing - it doesn't do much else actually

zendevil.eth15:01:12

I made it an anonymous function

noisesmith15:01:25

@ps a hunch - it doesn't error since a response map is a callable, but it just fubars when it gets passed a request

zendevil.eth15:01:41

but that gives wrong number of arguments

noisesmith15:01:52

@ps well it should take a request

noisesmith15:01:46

standard ring always uses a function of one argument (request map) returning a result map (or some other data which ring will then coerce into a response map)
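A minimal sketch of that shape, reusing the route and response from the earlier snippet (reitit route-data details may vary):
["/about"
 ["/something"
  ;; a handler: request map in, response map out
  (fn [_request]
    (ring.util.response/response {:something "something else"}))]]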

lukasz15:01:00

@noisesmith Right, but the root route ("/") is using home-page function directly, in the snippet, ring.util is used in only one place. That said, I'm just guessing here - not sure what that router is

noisesmith15:01:48

ring is very smart about coercing results, the error here is happening on the layer of route dispatch

zendevil.eth15:01:14

it works when wrapped in (fn [req] …)

noisesmith15:01:29

that's what I'd expect, cheers :D

souenzzo16:01:57

Hello. My task is to process an indefinitely long reader. My first approach was a simple loop/recur, but then I had the idea of implementing it as a lazy-seq. I have some questions: 1. read-all uses non-tail-call recursion; can it run into a "stackoverflow" problem? 2. Will read-all cause some GC issue? Does it just clean up nodes at the end, or something like that? 3. Is there any advantage to the loop/recur approach?

(letfn [(read-all
          [rdr]
          ;; [clojure.data.json :as json]
          (let [v (json/read rdr
                             :eof-error? false
                             :eof-value rdr)]
            (when-not (identical? rdr v)
              (cons v (lazy-seq
                        (read-all rdr))))))
        (proc-all-loop [rdr]
          (loop []
            (let [v (json/read rdr
                               :eof-error? false
                               :eof-value rdr)]
              (when-not (identical? rdr v)
                (my-proc v)
                (recur)))))
        (proc-all-lazy [rdr]
          (run! my-proc (read-all rdr)))]
  ;; which is "better"
  (proc-all-loop *in*)
  (proc-all-lazy *in*))

noisesmith16:01:23

@souenzzo the classic problem with laziness is resource usage, here you can't really know when to close the reader / the stream the reader is built on

noisesmith16:01:35

(in the lazy version that is)

noisesmith16:01:07

if your intent is to eagerly consume, and you throw away the produced values (via run!), I don't know why you are using laziness

👍 3
emccue16:01:55

proc-all-loop

emccue16:01:37

semantics are clear, no laziness other than waiting on the reader

noisesmith16:01:06

lazy-seqs don't cause stack overflow unless you nest large numbers of unrealized lazy transforms, which is caused by mixing lazy and eager code sloppily

noisesmith16:01:46

(usually - I mean you could have (->> coll (map a) (map b) (map c) ...) until the stack blows up but your code would be a huge mess before that happened...)
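A contrived illustration of that failure mode (the depth needed depends on the JVM stack size):
;; stack thousands of unrealized lazy transforms, then realize them
(let [xs (reduce (fn [s _] (map inc s)) (range 10) (range 10000))]
  (first xs))
;; => StackOverflowError when the nested lazy seqs are finally realized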

emccue16:01:19

If you want to be able to handle the whole thing in sequence, you can make an IReduceInit from the reader that will be invalid when the reader is closed

emccue16:01:58

Since your semantics really aren't the same as a lazy-seq - 32 elements at a time will block and maybe deadlock your program

noisesmith16:01:43

right, but using lazy-seq directly won't impose that chunking

emccue16:01:45

just an iterator

emccue16:01:54

oh it won't?

emccue16:01:00

nvm ignore me

noisesmith16:01:28

other ops that take multiple collections could take that lazy-seq and return a chunking one, but that's more convoluted

noisesmith16:01:08

@emccue and the root point is a good one - lazy-seqs are bad for situations where realizing an element blocks or changes the state of some resource

emccue16:01:24

(defn reducible-json-rdr [rdr]
  (reify IReduceInit
    (reduce [f start]
      (let [v (json/read rdr :eof-error? false :eof-value rdr)]
        (if (identical? rdr v)
          start
          (recur f (f start v))))))

emccue16:01:48

^ just because it always feels neat to write out

dpsutton16:01:08

IReduceInit needs an init value

emccue16:01:01

public interface IReduceInit {
    Object reduce(IFn f, Object start);
}

dpsutton16:01:22

(reduce [_ f start] ...)

emccue16:01:34

oh yeah this

emccue16:01:18

(defn reducible-json-rdr [rdr]
  (reify IReduceInit
    (reduce [_ f start]
      (loop [value start]
        (let [v (json/read rdr :eof-error? false :eof-value rdr)]
          (if (identical? rdr v)
            value
            (recur (f value v)))))))

souenzzo16:01:50

it's the simple loop/recur wrapped in a reify IReduceInit, wrapped in a function

😁 3
ghadi18:01:39

Don’t forget IReduceInit implementations need reduced? handling
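A sketch of what that could look like, grafted onto the version above (same json/read usage assumed):
(defn reducible-json-rdr [rdr]
  (reify clojure.lang.IReduceInit
    (reduce [_ f start]
      (loop [acc start]
        (let [v (json/read rdr :eof-error? false :eof-value rdr)]
          (if (identical? rdr v)
            acc
            (let [acc' (f acc v)]
              (if (reduced? acc')
                @acc'              ; honor early termination, e.g. from take-style transducers
                (recur acc')))))))))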

Célio19:01:07

Hi all. I need some sort of bounded queue that automatically drops elements older than X (seconds, minutes, hours, whatever). In general terms, every time I add a new element to the queue, I want it to remove elements that don't satisfy a certain predicate. I was trying to accomplish that using sorted-set and disj but I'm not sure if this is optimal, something roughly like this:

(let [queue (sorted-set 5 3 4 1 2 9 6 8 7 0)]
  (println queue)
  (println (apply disj queue (take-while #(< % 3) queue))))
The console output is this:
#{0 1 2 3 4 5 6 7 8 9}
#{3 4 5 6 7 8 9}
So what’s the best approach to this problem? Is there anything like that available in Clojure?

hiredman19:01:49

There are many many options for this kind of thing, but you may need to work on your requirements to figure out what you actually want

hiredman19:01:22

e.g. doing stuff on a time limit is much easier than doing stuff for an arbitrary function

hiredman19:01:06

do you need immutable data structures? is this building some kind of cache? etc

Célio19:01:37

@U0NCTKEV8 Imagine an in-memory collection of maps, each containing a :timestamp field whose value is a ZonedDateTime. Every time a new element (ie: a new map) is added to the collection, I need to remove all elements older than, say, 24 hours.

hiredman19:01:24

that is a bounded cache with ttl eviction

Célio19:01:22

Correct (Thanks, I was also looking for the terminology 🙂)

dpsutton19:01:03

strange to see the ttl requirement enforced solely on addition and not retrieval

Célio19:01:44

@U11BV7MTK In my case the eviction on retrieval would be a nice bonus.

dpsutton19:01:58

seems a necessity

Célio19:01:24

For my purposes, not strictly necessary.

dpsutton19:01:28

if you don't add anything for 7 days, everything is evicted, but if that's only enforced on addition you'll get bad data

hiredman19:01:45

the reason arbitrary function vs. time matters is you can build an index based on a known field, but not on an arbitrary function

Célio19:01:59

@U11BV7MTK Not a problem for my application.

👍 3
dpsutton19:01:18

then i think you can use clojure.core.cache. the caches let you get a seq or iterator of the underlying hashmap that keeps the values. and when getting the iterator or seq, the cache invalidation is not respected (ie could have expired things in there)

Célio20:01:34

Thanks @U11BV7MTK I think that’s what I need.

dpsutton20:01:54

i think they are composable. ie, you can wrap a ttl around a bounded queue one.
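For example, something roughly along these lines with clojure.core.cache's wrapped API (function names per the core.cache docs; eviction-on-read and composition caveats still apply):
(require '[clojure.core.cache.wrapped :as cw])

;; TTL cache (24h), held in an atom by the wrapped API
(def c (cw/ttl-cache-factory {} :ttl (* 24 60 60 1000)))

(cw/lookup-or-miss c :some-key (fn [_] {:timestamp (java.time.Instant/now)}))
(cw/lookup c :some-key)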

Célio20:01:02

that’s awesome

jjttjj16:01:22

This came up a few days ago, but now I need something similar: just a ring-buffer with a TTL (expired items are evicted on new conj). I'm currently using the ordered-map library + clojure.core.cache, which gets the job done; just curious if there's a more direct implementation out there?

Lone Ranger22:01:53

Does anyone have any recommended best practices for doing remote REPL work in a sensitive data environment (HR/accounting etc)? Permissions, policies, technologies, ACL? I think auditability, monitoring, and permissions are the primary concerns here.

noisesmith22:01:30

a good baseline is ssh access, with the same user as the app runs under - and don't provide access to anyone you wouldn't provide a root shell on that machine to

👍 6
noisesmith22:01:19

I think going finer grained would just be a mess - there's too many ways to get permissions in a jvm, and no way to truly hide data once you have vm access

Lone Ranger22:01:58

How about something like auditability or monitoring of REPL sessions?

Lone Ranger22:01:04

Ever worked with anything like that?

noisesmith22:01:05

(by ssh access, I mean tunneling on an ssh connection, and the standard logging of ssh access)

Lone Ranger22:01:20

would ssh access log remote REPL stuff?

noisesmith22:01:36

beyond the layer of logging when a connection happens, I think there's too many ways to undermine it

noisesmith22:01:29

of course you could take clojure.main and make a logged version, then make a policy of "always use the logged repl"
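A rough sketch of that idea built on clojure.main/repl's :read hook (the log path and format are just placeholders):
(require '[clojure.main :as main])

(defn logged-repl []
  (main/repl
   :read (fn [request-prompt request-exit]
           (let [form (main/repl-read request-prompt request-exit)]
             (when-not (#{request-prompt request-exit} form)
               (spit "repl-audit.log" (str (pr-str form) "\n") :append true))
             form))))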

Lone Ranger22:01:52

interesting, I didn't know that was a possibility. Makes total sense!

noisesmith22:01:19

but that's not very easy to enforce - it's so easy to get a repl once you have a connection

noisesmith22:01:54

@U3BALC2HH yeah, at the root REPL is just a loop, you could say "only use this specific repl", and then check the logs, but there's still some layer of honor system there surely

Lone Ranger22:01:21

do you suppose most folks just use the honor system? I mean, surely someone out there uses the REPL on sensitive systems

dpsutton22:01:48

if its nrepl you could have a middleware that logs all messages back and forth
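Roughly along these lines - nREPL middleware is just a function from handler to handler (descriptor details and the log destination here are placeholders):
(require '[nrepl.middleware :refer [set-descriptor!]])

(defn wrap-audit [handler]
  (fn [{:keys [op code session] :as msg}]
    (when (= "eval" op)
      (spit "nrepl-audit.log"
            (str (java.util.Date.) " " session " " (pr-str code) "\n")
            :append true))
    (handler msg)))

(set-descriptor! #'wrap-audit {:expects #{"eval"} :requires #{} :handles {}})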

noisesmith22:01:57

sure - I guess I'm no expert, I'm just reasoning first principles on what one can do once you have a repl for the most part

noisesmith22:01:18

@U11BV7MTK right, sure - that's easy to attach to any repl, the hard part is actually enforcing that that repl is used and not modified

noisesmith22:01:41

clojure.main is not a lot of code, a logged version is an afternoon project at most

dpsutton22:01:27

remote repl sounds like something else manages the process. you have whatever repls it exposes. seems like a logging nrepl server exposed on that would be the easiest

🔥 3
noisesmith22:01:58

I guess there's always "some things are logged, if it looks like you are doing something shady you better have a good explanation", but that's nearly implicit on remote hardware

dpsutton22:01:06

if it's a socket repl it might be even easier to have the functions spit their ins and outs to a file. dunno. just thinking of ease of use tooling wise. nrepl is pretty standard to work with

noisesmith22:01:15

@U11BV7MTK that kind of sandboxing is fragile and illusory

noisesmith22:01:20

with clojure that is

noisesmith22:01:24

"something else manages the process" - until you run a single line of code that starts a new unlogged repl

noisesmith22:01:15

for example, someone on #clojure IRC found a one liner that turned the number 3 into the number 5

Lone Ranger22:01:38

it's like that Jimi Hendrix song

noisesmith22:01:42

of course that didn't affect cached unboxed values

noisesmith22:01:03

but otherwise, the cached Long instance of 3, was changed to now contain 5

noisesmith22:01:15

it was remarkable that some things didn't break lol

Ben Sless22:01:16

Why expose a remote repl to begin with? Can it be avoided? If not, what about exposing a sci session instead?

😮 3
noisesmith22:01:30

sci session?

hiredman22:01:44

remote repls are great

Ben Sless22:01:44

Small Clojure Interpreter

Ben Sless22:01:01

That way you can't modify the running program

hiredman22:01:17

but you have to trust whoever has access to the repl

💯 9
noisesmith22:01:18

that's a big assertion to be making

Ben Sless22:01:23

They are, but with great power and all. They're a security nightmare

noisesmith22:01:27

people can modify running C programs lol

Lone Ranger22:01:53

I think "trust but verify" would be an acceptable solution (in my particular situation... not dealing with nuclear missiles or anything)

Ben Sless22:01:37

Well, sci is a better sandbox than just giving someone repl access. You can control which functions are exposed, for example

Lone Ranger22:01:46

@UK0810AQ2 I'd take sci over nothing! As long as there's a jdbc connector

hiredman22:01:59

I would be so annoyed with sci

Ben Sless22:01:22

There is everything you would expose to the sandboxed environment

hiredman22:01:28

the crazy stuff I've done with a repl over the years just would not be possible

Lone Ranger22:01:02

(that sounds like it should be its own channel... #hiredmanstories 😄 )

Ben Sless22:01:16

It's a compromise. For every crazy creative programmer you have a stumbling newbie who can break production

Lone Ranger22:01:02

So does everyone either just roll the dice or not provide a production REPL...?

Ben Sless22:01:41

We don't expose production repls

Lone Ranger22:01:51

Even for data analysis?

hiredman22:01:52

"oh, this doesn't have as much instrumentation as I would like, but I want to monitor it when it gets first deployed" -> write a program to connect the remote repl, execute code that reflective walks some objects and pulls out numbers and sends it back so I can stick it in a locally running graphite

dpsutton22:01:56

would love to read that blog post

hiredman22:01:16

oh, we lost a bunch of customer data and need an error recovery process -> write it up, stick it in a (company controlled) pastebin, use pssh to load it into the repl of every server via slurp

😮 6
👀 3
Ben Sless22:01:19

A lot can be achieved with running locally with production data. Another compromise is running in a staging environment which mirrors or reads production data but can't change anything

hiredman22:01:39

but yeah, repl access is the keys to the kingdom, so if you can't trust people then don't give it to them

Lone Ranger22:01:54

Yeah, fair enough. I guess that's the bottom line

noisesmith22:01:23

I think this is all dev-trust complete. The most sensitive thing (liability wise in particular) is the customer data. Either a dev can be trusted to access it responsibly or not, the rest introduces a lot of work and frustration with little established benefit.

Lone Ranger22:01:31

So I guess I need to figure out a way to find a small, read-only, sand castle kingdom 😛

noisesmith22:01:10

@U3BALC2HH there's a lot you can do with configurable loggers plus an environment where a repl can process those logs

noisesmith22:01:36

then security obfuscation / monitoring can be introduced as a middleware - you have a lot more control

Lone Ranger22:01:30

@noisesmith so you're saying something like, use a REPL to consume/transform obfuscated logs (as opposed to directly consuming data?)

hiredman22:01:05

some of production servers don't always run a repl server now, you have to restart them with a special flag to turn on a repl, and even that bums me out

😢 3
noisesmith22:01:34

@U3BALC2HH right, logs as data (or even db entries used as if they were logs, ordered by timestamp), plus a separate process (not the production app) to consume and manipulate that data

noisesmith22:01:27

for example all sensitive customer info (everything identifiable) can be UUIDs / numeric ids, whatever pointing to a separate table you shouldn't need for dev

noisesmith22:01:07

you can debug app logic / data flow using the id to id correlation without exposing anything especially important (at least not in an easy to extract way...)

Lone Ranger22:01:48

Yeah makes sense. Hopefully someone listens!

seancorfield23:01:03

hiredman has said most of what I was going to say -- yes, we run socket REPLs in several of our production processes; yes, we ssh tunnel into production and connect a client to production (I connect VS Code on my desktop to production sometimes 🙂 ); since we AOT-compile our uberjars with direct-linking enabled, there's a limit to what we can redefine dynamically -- except in a couple of legacy processes that load Clojure from source at runtime (for reasons) and those can be live-patched all day long. Perhaps one of the most important considerations here is that any process that runs Clojure can be told to start a REPL via JVM properties at startup -- no code is needed inside the Clojure codebase, so anyone who has access to how a Clojure-based process is (re)started can enable a socket REPL in it, and then you have unlimited access, assuming you can get network access to that socket!
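For reference, that's the built-in clojure.core.server socket server, enabled purely via a JVM system property at startup (the jar name here is a placeholder):
java -Dclojure.server.repl="{:port 5555 :accept clojure.core.server/repl}" -jar the-app.jar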

didibus23:01:51

I'm no ssh expert, but I'd be surprised if there was no way to log what is transferred through the ssh connection

didibus23:01:11

At the very least, ssh should have access logs.

didibus23:01:27

The thing is, the REPL won't give you more power than the SSH itself. Once I'm SSHed in, I can simply replace the service with another one, change the class files or source files, I can read the computer memory, steal the credential files, etc.

didibus23:01:42

Well root ssh

didibus23:01:59

So if that's allowed, the REPL through SSH isn't any riskier

didibus23:01:58

You could argue the data to steal is made more obscure without a REPL, but :man-shrugging:

noisesmith23:01:49

@U0K064KQV sure, but once I am in a jvm with a clojure process I can open up any method I find convenient to communicate - I'm not limited to the repl I first connected to

noisesmith23:01:21

access logs are a great start, and maybe even logging what comes across the wire in that first connection - just don't pretend it's especially limiting

didibus23:01:03

Maybe I explained myself wrong. I meant, once ssh with sudo is compromised, you're f***ed REPL or no REPL.

didibus23:01:42

So if your company allows ssh with sudo, and they deem they have secured that to allow it, the REPL doesn't add to the threat vector

noisesmith23:01:55

right - I don't think sudo / root access is a given (we can and should drop app privileges when running)

noisesmith23:01:24

but you have at least the privs of the process running the jvm, if you can repl in that jvm

noisesmith23:01:28

and if there are operating systems in production without local privilege escalations, they aren't used often

didibus23:01:00

I thought most places gave dev ssh sudo access to prod hosts

didibus23:01:42

So what I mean is, if you are already granted that permission, they trust you with a lot, the REPL doesn't let you do more things than ssh + sudo

didibus23:01:51

So I don't see why they'd be against it

didibus23:01:21

Now, if you only get ssh with some restricted user permissions, and those permissions are less than those of the JVM user, that's different

didibus23:01:58

But as long as your ssh user has the same or more permissions as the user of the JVM, the REPL does not expose more things to you, it's just a nicer UX

didibus23:01:03

For example, your DB credentials are going to be stored in some file which the user of the JVM has permission to read, so I can easily ssh, read the file, get the creds, ssh tunnel my SQL Workbench and connect to your DB

seancorfield23:01:43

If you have ssh access and no permissions, you can still tunnel to the server and connect to a socket REPL.

seancorfield23:01:17

The socket connection on the loopback isn't restricted to just certain user accounts.

seancorfield23:01:25

We tunnel in via a low-privilege user and the JVM runs under a separate user to which that tunneling account has pretty much no access, yet it can still connect to the REPL's port.

seancorfield23:01:53

So a socket REPL is more access than just what ssh allows, in that respect.

didibus23:01:46

Yes, when your ssh user has fewer permissions. I'm saying, if your InfoSec department lets you have SSH access with equal or more permissions than your JVM user, then the REPL isn't doing anything worse.

didibus23:01:28

I don't know if that's the case for OP, but they should check. If they are already allowed to SSH with a user of similar or more permissions to the user running their app, then they shouldn't need to do anything more to "secure" the use of the REPL, since all data that can be accessed by the REPL, and all commands the REPL can execute on the machine can also be accessed and executed through other means.

didibus23:01:56

Otherwise, and something I've done in the past is that our app does not run with a REPL open. Instead, you ssh into the host, and you start a second instance of your app with a REPL in it, that second app instance is thus launched with your ssh user, and is restricted to those permissions, then you can REPL into that.

didibus23:01:35

We also do this to protect ourselves from accidentally reloading some buggy or broken code and causing prod issues

seancorfield00:01:51

True, running a local REPL instance on a production server, that can spin up the same DB connections etc, is "safer" in terms of not letting you accidentally blow up a running production process. Well, at least not letting you redefine the code -- you can still blow up production via the DB at that point 🙂

seancorfield00:01:11

With great power (REPL) comes great responsibility!

Lone Ranger00:01:17

Hey everyone, this is a wonderful discussion! Glad to come back and find so many comments. I think that's a great point @U0K064KQV -- the argument for a remote Clojure REPL (if I'm understanding this correctly) -- and the argument for ssh are effectively the same argument. (also @U04V70XH6 your notable insight is really helpful. Always good to see/hear how the pros like you and @U0NCTKEV8 do it!)

hiredman23:01:20

apropos of repls, for some reason a while ago I wanted the feature I think a number of clojure ide kind of environments have, where you can get a repl running in the context of some other running code. I didn't have that feature so I wrote this code to "throw" a repl over tap https://gist.github.com/a2630ea6153d06840a2723d5b2c9698c

Alex Miller (Clojure team)23:01:23

if I understand what that does, that's cool

hiredman23:01:12

you put a call to start-repl somewhere in your code, then run your code, then in a repl connected to the same process (if your code runs in another thread you can use the same repl that you used to run your code) you call wait-for-repl, and once execution hits start-repl the repl where wait-for-repl is running is taken over and inputs and outputs are forwarded to and from a repl running where the call to start-repl is

hiredman23:01:58

yeah, PrintWriter-on looks handy, I need to remember it next time I write one of these

borkdude23:01:42

Reminds me a bit of https://github.com/technomancy/limit-break (although it's different)

Alex Miller (Clojure team)23:01:51

well it was born out of doing this kind of stuff from prepl :)

vlaaad09:01:20

Not sure what this REPL does, can you explain?

vlaaad09:01:39

Ah, so it blocks until repl is tapped, interesting! I'm still not sure what's the purpose of tapping it and waiting on tapped REPL...

vlaaad09:01:04

Why not just start a repl with lexical eval?

hiredman23:01:40

it is kind of neat