#clojure
2020-09-01
vemv07:09:59

Looking for a defn that does the following: given n hashmaps which have deeply nested entries and may have a heterogeneous shape between them, determine a "sub hashmap", possibly nested, that all n items are a "super hashmap" of. Example:

(let [corpus [{:foo {:a 1 :b 2 :c 3}}
              {:foo {:a 1 :b 9}}]]
  (is (= (desired-fn corpus)
         ;; :b is discarded because the :b values aren't equal
         ;; :c is discarded because it's only present in one item of the corpus
         {:foo {:a 1}})))

Daniel Stephens09:09:39

(defn keys-in
  "Returns a sequence of all key paths in a given map using DFS walk."
  [m]
  (letfn [(children [node]
            (let [v (get-in m node)]
              (if (map? v)
                (map (fn [x] (conj node x)) (keys v))
                [])))
          (branch? [node] (-> (children node) seq boolean))]
    (->> (keys m)
         (map vector)
         (mapcat #(tree-seq branch? children %)))))

(defn find-submap [ma mb]
  (let [find-in (fn [m ks]
                  (find (get-in m (drop-last ks))
                        (last ks)))
        ks (keys-in ma)]
    (reduce
      (fn [m k] (let [va (get-in ma k)
                      eb (find-in mb k)]
                  (if (and eb (= va (val eb)))
                    (assoc-in m k va)
                    m)))
      {}
      ks)))
this basically takes the first map, collects every key path v for which (get-in m v) has some answer, and then builds a map containing only the paths where get-in gives the same answer for both maps. keys-in code taken from https://dnaeon.github.io/clojure-map-ks-paths/ which is the important bit really
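
The question asked about n maps, while find-submap compares two. A minimal sketch of one way to cover the whole corpus is to fold a pairwise function over it. The names common-submap and common-submap-all are hypothetical, and this is a recursive variant written for illustration rather than the code posted above. It assumes a non-empty corpus of maps.

```clojure
(defn common-submap
  "Returns the nested map of entries on which maps a and b agree."
  [a b]
  (reduce-kv
    (fn [acc k va]
      (if-let [[_ vb] (find b k)]
        (cond
          ;; identical values (including identical nested maps) are kept
          (= va vb) (assoc acc k va)
          ;; differing nested maps: keep only the part they share, if any
          (and (map? va) (map? vb))
          (let [sub (common-submap va vb)]
            (if (seq sub) (assoc acc k sub) acc))
          :else acc)
        ;; key missing from b: drop it
        acc))
    {}
    a))

(defn common-submap-all
  "Folds common-submap over a non-empty corpus of maps."
  [corpus]
  (reduce common-submap corpus))

(common-submap-all [{:foo {:a 1 :b 2 :c 3}}
                    {:foo {:a 1 :b 9}}])
;; => {:foo {:a 1}}
```

Because the result of each step is itself a map, the same fold works for any number of items.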

👍 3
🙌 3
vemv09:09:10

Very nice one! Yes, I had thought of the same algo after posting the question. Had seen keys-in somewhere, not sure if I could have googled it anyway.

vemv09:09:13

Thanks much!

😊 3
Daniel Stephens14:09:36

ooh that's a really nice idea 👍 (comp last clojure.data/diff)

Daniel Stephens14:09:47

I guess it acts a little differently as map values will also get diffed if they are sequences or other handled things
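
For reference, a small sketch of the clojure.data/diff idea mentioned above. diff returns a three-element vector (things only in a, things only in b, things in both), so taking the last element gives the shared part. The name common is just an illustrative alias.

```clojure
(require '[clojure.data :as data])

;; The third element of diff's result is the part both arguments share.
(def common (comp last data/diff))

(common {:foo {:a 1 :b 2 :c 3}}
        {:foo {:a 1 :b 9}})
;; => {:foo {:a 1}}

;; Vectors are diffed position by position, so a vector value that only
;; partly matches can still contribute to the result, unlike with an
;; equality check on the whole value.
(common {:xs [1 2 3]} {:xs [1 9 3]})
```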

dpsutton13:09:41

@ghadi a while ago you posted an implementation of pipeline that did not preserve order. Just dotting i's but wanted to be explicit if we could use that code in our codebase at work

ghadi13:09:53

go for it

👍 3
emccue14:09:19

I'm getting some wacky behavior from transit

emccue14:09:28

don't know if anyone has run into anything similar

emccue14:09:32

(editscript/diff [:div {:class "Apple"}
                  [:p "I hate pears!"]]
                 [:div {:class "Apple" :id "aaa"}
                  [:p "I hated pears!"]])
=> [[[1 :id] :+ "aaa"] [[2 1] :r "I hated pears!"]]
(let [out (ByteArrayOutputStream. 4096)
      writer (transit/writer out :json)]
  (transit/write writer [[[1 :id] :+ "aaa"] [[2 1] :r "I hated pears!"]])
  (.toString out))
=> "[[[1,\"~:id\"],\"~:+\",\"aaa\"],[[2,1],\"~:r\",\"I hated pears!\"]]"
(let [out (ByteArrayOutputStream. 4096)
      writer (transit/writer out :json)]
  (transit/write writer (editscript/diff [:div {:class "Apple"}
                                          [:p "I hate pears!"]]
                                         [:div {:class "Apple" :id "aaa"}
                                          [:p "I hated pears!"]]))
  (.toString out))
Execution error (NullPointerException) at com.cognitect.transit.impl.AbstractEmitter/marshalTop (AbstractEmitter.java:203).
null

emccue14:09:01

If I pass it the result of calling the function directly I get a null pointer exception

emccue14:09:20

but if I pass it via copy pasting the value from the repl it works

Joe Lane14:09:49

What is the type of the result from editscript/diff ?

emccue14:09:02

EditScript

emccue14:09:06

oh, it's a gnarly deftype

emccue14:09:18

(deftype ^:no-doc EditScript [^:unsynchronized-mutable ^PersistentVector edits
                     ^boolean auto-sizing?
                     ^:unsynchronized-mutable ^long size
                     ^:unsynchronized-mutable ^long adds-num
                     ^:unsynchronized-mutable ^long dels-num
                     ^:unsynchronized-mutable ^long reps-num]

Joe Lane14:09:43

Yeah. It's printing nice because it has

#?(:clj (defmethod print-method EditScript
          [x ^java.io.Writer writer]
          (print-method (get-edits x) writer))
   :cljs (extend-protocol IPrintWithWriter
           EditScript
           (-pr-writer [o writer opts]
             (write-all writer (str (get-edits o))))))
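
To illustrate why the REPL output was misleading, here is a minimal sketch with a made-up type. Wrapper is hypothetical and stands in for EditScript; it only shows that a custom print-method can make an object print exactly like a vector without it being one.

```clojure
;; A hypothetical wrapper type, standing in for a deftype like EditScript.
(deftype Wrapper [edits])

;; Print the wrapper as if it were the plain data it holds.
(defmethod print-method Wrapper
  [x ^java.io.Writer writer]
  (print-method (.-edits ^Wrapper x) writer))

(def wrapped (Wrapper. [[[1 :id] :+ "aaa"]]))

(pr-str wrapped)   ;; same text as printing the inner vector
(vector? wrapped)  ;; => false
(type wrapped)     ;; the Wrapper class, not a vector
```

Checking (type x) is a quick way to spot this when a value copied from the REPL behaves differently from the original.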

emccue14:09:11

get-edits seems to be the key

👍 3
rutledgepaulv14:09:12

get-edits returns a plain data representation that you could transfer over the wire and then turn back into an EditScript instance. I've done that before

emccue14:09:25

(let [out (ByteArrayOutputStream. 4096)
      writer (transit/writer out :json)]
  (transit/write writer (edit/get-edits (editscript/diff [:div {:class "Apple"}
                                                          [:p "I hate pears!"]]
                                                         [:div {:class "Apple" :id "aaa"}
                                                          [:p "I hated pears!"]])))
  (.toString out))
=> "[[[1,\"~:id\"],\"~:+\",\"aaa\"],[[2,1],\"~:r\",\"I hated pears!\"]]"

emccue14:09:29

yeah that works

💯 3
emccue14:09:25

Maybe editscript should have a less pretty representation

emccue14:09:03

EditScript([[[1 :id] :+ "aaa"] [[2 1] :r "I hated pears!"]])

emccue14:09:24

It would have been enough of a clue for me to search for get-edits

wombawomba17:09:21

What’s an easy way to split a vec of n items into a vec of vecs, where the number of items in each ‘split’ is given by another vec? E.g. (get-splits (range 10) [5 2 3]) => [[0 1 2 3 4] [5 6] [7 8 9]]

wombawomba17:09:44

The nicest solution I can come up with is

(loop [[n & ns] [5 2 3], xs (range 10), acc []]
  (if n
    (recur ns (drop n xs) (conj acc (vec (take n xs))))
    acc))
…but I’m thinking there’s probably a more clever way :)

schmidt7317:09:50

look at this:

schmidt7317:09:01

you map over the indices of subsequences you want and extract those out

schmidt7317:09:28

yeah you could use subvec instead of drop take

Chris O’Donnell18:09:50

Your loop recur implementation is the most readable IMO.

jcf18:09:38

Use of reductions is super elegant, @UA5LS8DMJ!

schmidt7318:09:05

i don't know if it would be the best to put in production, loop-recur is more standard and anyone can understand that, but I am fond of the elegant solutions myself 🙂

jcf18:09:42

Legibility and maintainability are important to me too, but so is composition. Loop/recur doesn’t compose well with other operations.

jcf18:09:41

Good answers on the topic of loop/recur here for anyone who’s interested: https://stackoverflow.com/questions/32641902/clojure-loop-recur-pattern-is-it-bad-to-use

👍 3
edwaraco18:09:16

Wdyt about this (using reductions and subvec):

(let [elements (vec (range 10))
      indexes [5 2 3]]
  (map (fn [from to]
         (subvec elements from to))
       (reductions + (cons 0 indexes))
       (reductions + indexes)))

👍 3
schmidt7317:09:47

I am hoping to be able to put my webserver in production by packaging it as an uberjar, running the uberjar, and then put it behind a nginx reverse proxy for SSL support.

schmidt7317:09:12

I'm using the component framework, and I've been having some troubles with having a graceful exit.

schmidt7317:09:25

Namely, I'd like for components to be stopped when I hit Ctrl+C

schmidt7317:09:14

I figured I could do something simple with Runtime/getRuntime and addShutdownHook, but it does not seem to be working and I am getting rather frustrated.

schmidt7317:09:17

None of the "not here"s are triggered...

schmidt7317:09:15

I've had great success with component when developing in CIDER, I

schmidt7317:09:30

I'm wondering if I am going about this the wrong way...

emccue17:09:51

my only guess is that the way you have been testing it somehow doesn't count as an "orderly shutdown" for the jvm

schmidt7318:09:47

@emccue I've been hitting Ctrl+C

emccue18:09:05

Also, if the O/S gives a SIGKILL signal (kill -9 in Unix/Linux) or TerminateProcess (Windows), then the application is required to terminate immediately without even waiting for any cleanup activities. In addition to the above, it is also possible to terminate the JVM without allowing the shutdown hooks to run by calling the Runtime.halt() method.

emccue18:09:10

I'm truly just guessing

emccue18:09:31

your code seems right based on all the tutorials ive read

emccue18:09:28

maybe component/start runs forever

emccue18:09:33

like if it starts a web server

schmidt7318:09:49

yeah that isn't the case though because okay... gets printed

schmidt7318:09:53

and because ik it doesn't

schmidt7318:09:59

it does it in a separate thread

schmidt7318:09:01

and returns instantly

schmidt7318:09:22

sometimes the first "not here" is printed

schmidt7318:09:45

where the first not here is the first "not here" that appears when reading the snippet from top to bottom

schmidt7318:09:56

but never any others...

schmidt7318:09:55

@emccue I think I figured it out... it was occurring because I was testing with "lein run"

schmidt7318:09:14

so if I send a sigint, it is sent to the lein process that is running my process...
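
The snippet under discussion is not preserved in this log, so the following is only a minimal sketch of the general technique: registering a JVM shutdown hook from Clojure. stop-system! is a hypothetical stand-in for whatever stops the components (for example a call to component/stop on the running system), and add-shutdown-hook! is an illustrative helper name.

```clojure
(defn add-shutdown-hook!
  "Registers f (a no-argument function) to run when the JVM shuts down
  in an orderly way. Returns the hook thread so it can be removed later."
  [f]
  (let [hook (Thread. ^Runnable f "shutdown-hook")]
    (.addShutdownHook (Runtime/getRuntime) hook)
    hook))

;; Hypothetical cleanup function; a real app would stop its components here.
(defn stop-system! []
  (println "stopping components"))

(def hook (add-shutdown-hook! stop-system!))
```

Since the hook runs in the JVM that receives the signal, it is probably worth testing by running the uberjar directly with java -jar (or with lein trampoline run) rather than under plain lein run, where the application lives in a child process of Leiningen.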

emccue18:09:43

I've been noodling around today trying to get some imitation of live view working

emccue18:09:01

the biggest roadblock for me right now is just the process model

emccue18:09:20

clojure agents use a thread pool, which means they probably aren't the best fit

emccue18:09:40

considering this approach probably only gets scalable with virtual threads

emccue18:09:50

but right now my thought is

emccue18:09:53

some internal state per user -> ... -> rendering function -> ... -> Hiccup -> Editscript Diff that Hiccup -> Send over websocket -> Reagent render that hiccup

emccue18:09:54

... -> events come from users somehow -> events fed to the "process" that maintains that state -> ... -> update state -> ... -> GOTO START

Joe Lane18:09:56

Encode that process model as data, let me put it in a database, then I can be stateless, then servers and connections don't matter.

emccue18:09:17

sure, but then every user is served directly from the db

emccue18:09:33

and unlike normal db queries i don't see an opportunity for caching

emccue18:09:57

so that feels like a bottleneck that would start to matter

emccue18:09:16

and in terms of what I am trying to emulate - I think elixir has in memory processes

emccue18:09:29

though i really don't know enough to say

emccue18:09:54

I just want to steal a cool feature and show a use case for virtual threads

Joe Lane19:09:50

use core async go-loops / agents with some process loop then?

emccue19:09:13

@joe.lane Nah, I'd rather not

emccue19:09:29

go blocks impose explicit scheduling points

emccue19:09:59

working with virtual threads seems a lot more fun

schmidt7319:09:20

anyone have experience with having user.clj along with dependencies in a separate development directory?

schmidt7319:09:24

I tried creating dev/user.clj so that user.clj wouldn't be under src/ and then modifying the :dev profile appropriately, but now I'm getting that my tests can't find any code namespaces in src when I reload everything...

emccue19:09:25

reading through this there are a lot of things about the JVM that get closer to BEAM with the addition of virtual threads

emccue19:09:32

is my main driver/motivation

phronmophobic19:09:35

what's the difference between a thread pool and virtual threads?

emccue19:09:14

thread pool is N operating system threads

emccue19:09:25

and you can run N tasks concurrently

emccue19:09:41

virtual threads is N operating system threads

emccue19:09:49

and you can run M tasks concurrently

emccue19:09:52

where M >> N

emccue19:09:40

to run more than N tasks concurrently on N threads, you need to structure those tasks so that they yield control at specific scheduling points

emccue19:09:09

which is what core.async does, and it's what any async/await style syntax does
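
The "N threads, N concurrent tasks" point can be observed with a small experiment. This is a sketch, not a benchmark: max-concurrency is a hypothetical helper that runs blocking tasks on a fixed pool and records the highest number of tasks seen running at once.

```clojure
(import '(java.util.concurrent Executors))

(defn max-concurrency
  "Runs task-count sleeping tasks on a fixed pool of pool-size threads and
  returns the highest number of tasks observed running at the same time."
  [pool-size task-count]
  (let [pool    (Executors/newFixedThreadPool pool-size)
        running (atom 0)
        peak    (atom 0)
        task    (fn []
                  (let [now (swap! running inc)]
                    (swap! peak max now)
                    ;; a blocking call occupies the pool thread until it returns
                    (Thread/sleep 50)
                    (swap! running dec)))]
    (try
      (let [futures (doall (repeatedly task-count
                                       #(.submit pool ^Runnable task)))]
        ;; wait for every task to finish
        (run! deref futures)
        @peak)
      (finally
        (.shutdown pool)))))

(max-concurrency 2 8)
;; never more than 2, however many tasks are queued
```

With virtual threads the same blocking tasks would not be limited by the number of OS threads. On JDK 21 and later, Executors/newVirtualThreadPerTaskExecutor is one way to try that; it is left out of the sketch so the code runs on older JDKs too.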

dpsutton19:09:34

http://cr.openjdk.java.net/~rpressler/loom/loom/sol1_part1.html is a good overview if it's the document I think it is

👍 3
emccue19:09:06

for virtual threads, the scheduling points are added by the JVM and not the programmer, so your code isn't "infected" by needing explicit async-style constructs

emccue19:09:49

same idea as goroutines

emccue20:09:30

clojure gets as close as it can to goroutines via rewriting code around puts and takes in a macro

emccue20:09:53

but fairly widespread and fundamental stuff like JDBC will never be able to participate

emccue20:09:35

so there are inevitably blocking tasks who will not yield back control

phronmophobic20:09:12

are virtual threads possible on the jvm without changes to the jvm?

emccue20:09:31

I think there were some attempts using bytecode manipulation

emccue20:09:27

like rewriting the whole program to be a state machine

emccue20:09:17

there was even a library for clojure

emccue20:09:26

but if you notice by the latest commit

emccue20:09:44

the person who last worked on that is the person who is the public face of project loom

emccue20:09:29

which tells me at least that the old approach probably won't be well maintained

emccue20:09:20

and probably had some issues

phronmophobic20:09:17

is the big difference between M virtual threads and M OS threads just the amount of memory consumed per thread? iirc, most OS's have improved their schedulers to work better with high numbers of threads

emccue20:09:21

I'm personally not super sure about the properties of OS threads that require a larger stack space

emccue20:09:36

this article seems to cover it

emccue20:09:21

...each OS thread has its own fixed-size stack. Though the size is configurable, in a 64-bit environment, the JVM defaults to a 1MB stack per thread. You can make the default stack size smaller, but you tradeoff memory usage with increased risk of stack overflow. The more recursion in your code, the more likely you are to hit stack overflow. If you keep the default value, 1k threads will use almost 1GB of RAM! 

ghadi20:09:55

even if the kernel scheduler is "improved" to juggle many threads, parking and unparking OS threads is not cheap

ghadi20:09:32

I remember testing 2 million threads with core.async a long time ago

ghadi20:09:43

Loom could give 10s of millions

emccue20:09:57

one cool thing that could maybe be done is eventually make shutdown-agents a no-op

emccue20:09:21

and just spawn those threads as one off virtual threads without a pool

hiredman20:09:43

you can't do that

hiredman20:09:54

we could get rid of shutdown-agents today

hiredman20:09:41

but that would change the behavior of old programs

emccue20:09:27

hmm, wasn't the main issue with shutdown-agents just that if you don't do it your program won't halt for 30 seconds or something?

emccue20:09:46

even if there are no tasks running

hiredman20:09:56

the agent threadpool creates non-daemon threads by default

hiredman20:09:24

and the jvm won't exit while non-daemon threads are running

emccue20:09:08

so the behavior old programs would expect is

emccue20:09:20

if I sent off a task, I won't exit until it is complete

emccue20:09:26

because it is on a non-daemon thread

hiredman20:09:36

that the jvm doesn't exit while the agent threadpool is running

emccue20:09:03

what could be built off of that assumption?

hiredman20:09:23

futures run on the same pool

hiredman20:09:09

we could mark the pool as daemon threads, which would cause the jvm to exit as soon as the main thread exits (assuming no other non-daemon threads are created)

emccue20:09:58

hmm, so there are 3 possible behaviors we are talking about, 2 of which could be made true today

emccue20:09:12

the first is that until the threadpool is killed, the jvm is alive

emccue20:09:35

xxxx^x----------|
xxxx^xx---------|

emccue20:09:52

so if a task is running on any one of those threads it will finish

emccue20:09:59

and then the jvm will hang

emccue20:09:19

the second is that we make those threads daemons

emccue20:09:33

xxxx|
xxxx|

emccue20:09:37

and work is cut off early

emccue20:09:11

and the one we can't do because it would break expectations is, we have any running threads be marked non-daemon, but they aren't pooled so whenever their tasks finish they die

emccue20:09:23

xxxx^x|
xxxx^xx|

Alex Miller (Clojure team)20:09:58

so programs can "fix" this now if they really want to
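
One reading of this is that a program can swap in its own executor. The following is a minimal sketch under that assumption: it installs a cached pool of daemon threads as the send-off executor, which, as noted above, is also the pool that futures run on. daemon-thread-factory is an illustrative name.

```clojure
(import '(java.util.concurrent Executors ThreadFactory))

;; A thread factory that marks every thread it creates as a daemon thread.
(def daemon-thread-factory
  (reify ThreadFactory
    (newThread [_ runnable]
      (doto (Thread. ^Runnable runnable)
        (.setDaemon true)))))

;; Replace the executor used by send-off (and by future).
(set-agent-send-off-executor!
  (Executors/newCachedThreadPool daemon-thread-factory))

@(future (.isDaemon (Thread/currentThread)))
;; => true
```

The trade-off is the one described in the conversation: with daemon threads the JVM can exit as soon as the main thread finishes, so in-flight work may be cut off.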

hiredman20:09:34

the jvm also runs at least one default threadpool now (for the completablefuture stuff) which maybe agents should be running on, but that is also a compatibility issue

emccue20:09:21

yeah, the reason I am thinking of it at all is because I was trying to make my own agents

emccue20:09:32

(defn process-create [initial-state update-function]
  #:process{:state (atom initial-state)
            :error (atom nil)
            :message-queue (ArrayBlockingQueue. 10)
            :update-function update-function})
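
The rest of the experiment is not shown in the log. As a rough sketch of how such a process map could be driven, the code below repeats process-create so it is self-contained and adds two hypothetical helpers, process-send! and process-start!. It uses a plain platform thread and a :process/stop sentinel message, both of which are assumptions made for illustration.

```clojure
(import '(java.util.concurrent ArrayBlockingQueue))

(defn process-create [initial-state update-function]
  #:process{:state (atom initial-state)
            :error (atom nil)
            :message-queue (ArrayBlockingQueue. 10)
            :update-function update-function})

(defn process-send!
  "Hypothetical helper: enqueues a message, blocking if the queue is full."
  [process message]
  (.put ^ArrayBlockingQueue (:process/message-queue process) message))

(defn process-start!
  "Hypothetical helper: starts a thread that applies update-function to the
  state for each message until it receives :process/stop."
  [process]
  (let [{:process/keys [state error message-queue update-function]} process
        run (fn []
              (loop []
                (let [message (.take ^ArrayBlockingQueue message-queue)]
                  (when-not (= :process/stop message)
                    (try
                      (swap! state update-function message)
                      (catch Exception e
                        ;; remember the failure and keep processing
                        (reset! error e)))
                    (recur)))))]
    (doto (Thread. ^Runnable run)
      (.start))))

(def counter (process-create 0 (fn [state message] (+ state message))))
(def worker (process-start! counter))

(doseq [n [1 2 3]]
  (process-send! counter n))
(process-send! counter :process/stop)
(.join ^Thread worker)

@(:process/state counter)
;; => 6
```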

emccue20:09:52

just so I could use virtual threads for this mess-around experiment

emccue20:09:21

still not super-duper happy with a global option, but it works for now

emccue20:09:58

(set-agent-send-executor! (Executors/newUnboundedExecutor (-> (Thread/builder)
                                                              (.daemon false)
                                                              (.virtual)
                                                              (.factory))))
(set-agent-send-off-executor!  (Executors/newUnboundedExecutor (-> (Thread/builder)
                                                                   (.daemon true)
                                                                   (.virtual)
                                                                   (.factory))))

emccue20:09:24

which matches the descriptions of the semantics

hiredman20:09:00

it does not

hiredman20:09:44

both are non-daemon as of now

hiredman20:09:27

and the send executor is bounded in size, the send-off is not

emccue20:09:37

I'm like the living embodiment of why ben franklin made that quote

emccue20:09:40

wait internet says thats king james bible

ghadi20:09:04

dynamic vars may have some friction with Loom virtual threads

ghadi20:09:11

remains to be seen

ghadi20:09:25

dynvars rely on ThreadLocal
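
A small sketch of what per-thread bindings mean in practice. *user* and read-in-plain-thread are made-up names; the example only shows that a binding is visible in the thread that established it, is conveyed to a future, and is not seen by a thread started by hand.

```clojure
(def ^:dynamic *user* :root)

(defn read-in-plain-thread
  "Reads *user* from a freshly started thread and returns what it saw."
  []
  (let [result (promise)]
    (doto (Thread. ^Runnable (fn [] (deliver result *user*)))
      (.start))
    @result))

(binding [*user* :alice]
  [*user*                   ; the binding thread sees :alice
   (read-in-plain-thread)   ; a hand-made thread sees the root value
   @(future *user*)])       ; future conveys the binding
;; => [:alice :root :alice]
```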

emccue20:09:38

ThreadLocals work afaik

ghadi20:09:47

I didn't say they don't work

emccue20:09:00

and there have at least been writeups about a JVM native Scoped<T> type

ghadi20:09:08

loom is looking at their own lighterweight scoped vars

emccue20:09:13

which seems to match the semantics

emccue20:09:42

so in the far future clojure could steal it

Alex Miller (Clojure team)21:09:59

I'll be living on Mars by then

😆 3
Mark Gerard12:09:04

Is there space on the craft taking you to Mars?