#clojure
2021-04-22
craftybones06:04:34

Hello, @magnars and I are working on an issue I recently found on stasis. https://github.com/magnars/stasis/issues/30

craftybones06:04:00

Does getting class path entries differ between using Clojure as a repl and a lein repl?

craftybones06:04:03

It might simply be the approach to getting the classpath entries

craftybones06:04:27

the code used is:

(defn class-path-elements []
  (->> (s/split (System/getProperty "java.class.path" ".") #":")
       (remove (fn [#^String s] (.contains s "/.m2/")))))

craftybones06:04:36

Clearly lein does stuff since it launches differently and loads libs etc

seancorfield07:04:01

@srijayanth The local Maven repo can be in a different directory, at least with the Clojure CLI / deps.edn.

seancorfield07:04:35

Also, the separator is different on Windows than macOS/Linux

craftybones07:04:19

@seancorfield - I think the local Maven repo being in a different directory is certainly one part of it

craftybones07:04:30

however, it doesn’t respect a few other things too

seancorfield07:04:33

(System/getProperty "path.separator") is what you need for the separator. It isn’t : on Windows.
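A minimal sketch of a separator-aware version of the snippet above. The helper name `split-class-path` is hypothetical (it is not part of stasis), and the `/.m2/` filter is carried over unchanged even though, as noted, the local Maven repo may live in a different directory.

```clojure
(require '[clojure.string :as str])
(import '(java.util.regex Pattern))

(defn split-class-path
  "Hypothetical helper: splits a class path string on the given separator.
  The separator is quoted so it is matched literally, not as a regex."
  [class-path separator]
  (str/split class-path (re-pattern (Pattern/quote separator))))

(defn class-path-elements
  "Like the original, but splits on the platform's path separator
  (\":\" on macOS/Linux, \";\" on Windows) instead of a hard-coded \":\"."
  []
  (->> (split-class-path (System/getProperty "java.class.path" ".")
                         (System/getProperty "path.separator"))
       (remove (fn [^String s] (.contains s "/.m2/")))))
```

Keeping the splitting in its own function makes it easy to check against class path strings from other platforms without changing JVM properties.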

craftybones07:04:35

for instance, my deps.edn has

{:paths ["src" "examples"]}

craftybones07:04:14

But when I run the class-path-elements fn on a plain old clojure repl, I don’t see the “examples” dir listed at all

seancorfield07:04:21

How are you starting the REPL?

craftybones07:04:44

just plain old clojure

seancorfield07:04:37

(! 1316)-> clojure 
Downloading: org/clojure/clojure/1.10.3/clojure-1.10.3.pom from central
Downloading: org/clojure/core.specs.alpha/0.2.56/core.specs.alpha-0.2.56.pom from central
Downloading: org/clojure/spec.alpha/0.2.194/spec.alpha-0.2.194.pom from central
Downloading: org/clojure/pom.contrib/0.3.0/pom.contrib-0.3.0.pom from central
Downloading: org/clojure/pom.contrib/0.3.0/pom.contrib-0.3.0.pom from central
Downloading: org/clojure/spec.alpha/0.2.194/spec.alpha-0.2.194.jar from central
Downloading: org/clojure/core.specs.alpha/0.2.56/core.specs.alpha-0.2.56.jar from central
Downloading: org/clojure/clojure/1.10.3/clojure-1.10.3.jar from central
Clojure 1.10.3
user=> (System/getProperty "java.class.path")
"src:examples:/Users/sean/clojure/r/libraries/org/clojure/clojure/1.10.3/clojure-1.10.3.jar:/Users/sean/clojure/r/libraries/org/clojure/core.specs.alpha/0.2.56/core.specs.alpha-0.2.56.jar:/Users/sean/clojure/r/libraries/org/clojure/spec.alpha/0.2.194/spec.alpha-0.2.194.jar"
user=> ^D

Thu Apr 22 00:07:54
(sean)-(jobs:0)-(~/clojure/r)
(! 1317)-> cat deps.edn 
{:mvn/local-repo "libraries"
 :paths ["src" "examples"]}

See that deps are downloaded into the local libraries folder as a Maven repo.

seancorfield07:04:48

Also see that examples is on the class path.

dpsutton07:04:35

> Does getting class path entries differ between using Clojure as a repl and a lein repl?

I updated that issue with the extra classpath roots that lein includes by default: test, target, dev-resources, etc.

dpsutton07:04:02

so the answer to that question is kinda

dpsutton07:04:31

> Downloading: org/clojure/clojure/1.10.3/clojure-1.10.3.jar from central

@seancorfield how in the world did you not have clojure 1.10.3 on your computer already? I figured you'd have had the alphas and the release on day 1

seancorfield07:04:43

It wasn’t in the custom :mvn/local-repo for that deps.edn project. That was what I was demonstrating.

seancorfield07:04:56

We’re already on 1.11.0-alpha1 in production 🙂

craftybones07:04:16

The core of the problem in that issue seems to be that under lein, the class path entries are listed as full paths, whereas when run as just Clojure, it uses a relative path
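One way to make the two cases comparable is to resolve every entry to an absolute path before using it. This is only a sketch under that assumption; `absolute-entry` is a hypothetical name and not necessarily how the issue was resolved in stasis.

```clojure
(require '[clojure.java.io :as io])

(defn absolute-entry
  "Hypothetical helper: resolves a class path entry against the JVM's
  working directory. Entries that are already absolute come back unchanged."
  [entry]
  (.getAbsolutePath (io/file entry)))

;; Under the Clojure CLI an entry such as "src" becomes
;; "<working directory>/src"; an absolute entry from lein stays as-is.
(map absolute-entry ["src" "examples"])
```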

craftybones07:04:35

@magnars assumed the way it works under lein would be the right way. Not a bad assumption, but it turns out it doesn’t quite work that way. But thanks for your help, I’ve figured out how to deal with it

craftybones07:04:02

I sometimes wonder if @seancorfield is some sort of a hyperintelligent chat bot. I’ve no idea how he responds to every query

🙌 2
seancorfield07:04:56

I don’t sleep very well, and between my macOS desktop, my Windows/WSL2 laptop, and my Android phone, I’m online pretty much the entire time I’m not actually asleep 🙂

Yehonathan Sharvit10:04:06

A question related to lazy sequences: Is it possible to count the number of elements in a lazy sequence that doesn’t fit in a program’s memory? For instance, a sequence of 1e6 elements where each element is 1MB. Not sure if the question is well formulated.

borkdude10:04:30

not possible since sequences are linked lists (conceptually) and to count them you have to realize all the elements, traversing them from left to right

p-himik10:04:05

What if you keep calling next to get rid of the head?

thheller10:04:07

but you don't have to hold on to the elements

borkdude10:04:49

if you don't want to hold on to the head, then yes, that is possible, but this is quite an expensive operation to count items if that's all you do with them
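The idea of repeatedly calling `next` so that the head is never retained can be sketched as follows. `count-seq` is a hypothetical name used for illustration; as far as I know, the built-in `count` walks a lazy sequence in a similar way, provided nothing else holds a reference to its head.

```clojure
(defn count-seq
  "Hypothetical helper: counts the items of coll by walking it with next.
  Only the current position is kept, so already-visited elements can be
  garbage collected as long as the caller does not hold on to the head."
  [coll]
  (loop [s (seq coll)
         n 0]
    (if s
      (recur (next s) (inc n))
      n)))

;; Each element is realized, counted, and then dropped.
(count-seq (map (fn [x] (vec (range 100))) (range 1000)))
;; => 1000
```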

borkdude10:04:13

there is probably a better solution

Yehonathan Sharvit10:04:24

I need to count the rows in an HBase/Bigtable table

Yehonathan Sharvit10:04:48

As far as I know, it’s not possible to do it on the server side

Yehonathan Sharvit10:04:18

One has to count all the row keys

Yehonathan Sharvit10:04:02

cbass (an HBase Clojure lib) provides an API that returns a lazy sequence of rows

Yehonathan Sharvit10:04:38

I want to know what to expect in terms of memory when I count the rows

borkdude10:04:44

@viebel The cbt command line seems to have a count option so maybe it's possible after all somehow?

restenb10:04:05

is there a point to providing chan with a buffer size, if I have no idea how many things will have to be on it?

Yehonathan Sharvit10:04:07

No @borkdude. cbt scans the rows and counts them

Yehonathan Sharvit10:04:51

Let’s continue the discussion on a thread

jumar10:04:06

What exactly is it that you need to do, again? Just count the elements?

Yehonathan Sharvit10:04:35

I think that I got the answer with the help of babashka (thanks @borkdude). It takes a while but doesn’t explode the memory:

bb -e "(count (map (fn [x] (into [] (range 1e3))) (range 1e6)))"

Yehonathan Sharvit10:04:51

while this one explodes the memory:

bb -e "(count (mapv (fn [x] (into [] (range 1e3))) (range 1e6)))"

delaguardo10:04:56

(reduce #(inc % #_ %2) 0 (range 1e6)) as a possible alternative

Yehonathan Sharvit11:04:23

Why would it be better than count @U04V4KLKC?

delaguardo11:04:11

it is the same in terms of memory consumption and performance, but it can be composed with transducers like filter
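A small sketch of what composing the counting step with a transducer might look like. `count-matching` is a hypothetical name; `transduce`, `filter`, and `completing` are from clojure.core.

```clojure
(defn count-matching
  "Hypothetical helper: counts the items of coll that satisfy pred.
  The reducing step ignores each item and only increments the counter,
  so no intermediate sequence is built."
  [pred coll]
  (transduce (filter pred)
             (completing (fn [n _] (inc n)))
             0
             coll))

;; Note the integer literal: (range 1e6) would produce doubles,
;; and even? only accepts integers.
(count-matching even? (range 1000000))
;; => 500000
```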

delaguardo11:04:22

also I remember the hbase CLI tool has a rowcounter command

usage: hbase rowcounter <tablename> [options] [<column1> <column2>...]
Options:
    --starttime=<arg>       starting time filter to start counting rows from.
    --endtime=<arg>         end time filter limit, to only count rows up to this timestamp.
    --range=<arg>           [startKey],[endKey][;[startKey],[endKey]...]]
    --expectedCount=<arg>   expected number of rows to be count.
For performance, consider the following configuration properties:
-Dhbase.client.scanner.caching=100
-Dmapreduce.map.speculative=false

delaguardo11:04:41

not sure if it helps in your case though

Yehonathan Sharvit11:04:39

I am pretty sure that hbase CLI does the counting on the client side

restenb10:04:59

hm. i guess unbuffered channels are meant for information exchange only, not actual queues

thiru20:04:19

With leiningen the library author's version would usually be in the defproject of project.clj. What are people typically doing for tools.deps projects? I guess putting it in deps.edn doesn't make sense?

seancorfield20:04:13

I keep it in pom.xml and update it with depstar when I build a library JAR. My deployment process is documented in the depstar README.

borkdude20:04:33

@thiru0130 You are allowed to put data in deps.edn, but don't use unqualified keywords on the top-level since they may conflict with tools.deps in the future.

borkdude20:04:17

having said that, I more or less adopted a convention of putting a file in resources/MY_LIB_VERSION so users can do (io/resource "MY_LIB_VERSION") and I can also use it in scripts. Not saying this is a good convention. I was already doing this in lein-managed libs.

notbad 2
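A minimal sketch of reading such a version file, assuming a file named `MY_LIB_VERSION` sits at the root of a directory on the class path (for example `resources/`). The function name `lib-version` is hypothetical.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn lib-version
  "Hypothetical helper: returns the trimmed contents of the
  MY_LIB_VERSION resource, or nil when it is not on the class path."
  []
  (some-> (io/resource "MY_LIB_VERSION")
          slurp
          str/trim))
```

Because the file is plain text, the same value can also be read from shell or build scripts without starting a JVM.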
thiru20:04:25

Cool, thanks for the tips guys

Eric Auld23:04:30

Hi, y’all. Have a question about some code I’m writing. The problem I was given as a sort of kata was to take two Clojure maps (`m1` and `m2` representing “the same” map at two different times) and find the “minimal transaction” taking `m1` to `m2`.

I’m going to restrict “minimal transaction” to mean “using only simple ‘add’ and ‘remove’ commands”, since this is sort of a baby version of doing the same thing for a Datascript database, where those really are the only two things you can do.

I thought it would be nice if it were fairly easy to go from `(minimal-transaction m1 m2) -> (m1 |-> m2)` …in other words, if I represented the minimal transaction in such a way that it could be easily converted into something executable. And then when that executable thing were called on `m1`, it would produce `m2`. (Like a list of Datascript transactions can easily have datascript/q called on it, and update the database from old state to new state.)

https://www.codepile.net/pile/Z1KREX8b was a bit hacky; it produces things like

example-map1
=> {:a 1, :b 2, :c {:d 3, :e #{7 6 9 8}, :f 10}}
example-map2
=> {:a 1, :b 11, :c {:d 3, :e #{13 6 12 8}, :g 14}}
(minimal-transactions example-map1 example-map2)
=>
((update-in [:b] dissoc)
 (update-in [:c :f] dissoc)
 (update-in [:c :e] disj 7)
 (update-in [:c :e] disj 9)
 (assoc-in [:b] 11)
 (update-in [:c :e] conj 13)
 (update-in [:c :e] conj 12)
 (assoc-in [:c :g] 14))
So if you could “thread-first” `example-map1` through that list of commands, it would become `example-map2`. I can accomplish this threading via
(eval `(-> example-map1 ~@(minimal-transactions example-map1 example-map2)))
=> {:a 1, :b 11, :c {:d 3, :e #{13 6 12 8}, :f 10, :g 14}}
But I think I should avoid `eval`. There’s probably a better way to do it. One thing that occurred to me was to make the output of minimal transaction a function, although then it’s not so easily readable by the user. The only way I could think to build that function iteratively (as I constructed the above transaction list iteratively, using `mapcat`, in the code I linked above) was to build the function with `reduce` and `compose`, turning each step into a lambda, and composing it at each stage to end up with the giant function which is the composition of all the transactions. That didn’t seem too great to me, either.

Any suggestions? Possibilities:
1. Who cares about it being executable? It’s not explicitly stated in the problem, and someone can always figure out how to do that if they need to.
2. Who cares if the function is immediately readable? Someone can always go read the source code if they want to view the series of transactions.
3. Using `reduce` and `compose` that way to build a huge function is ill-advised because (I don’t know why).
4. People use `reduce` and `compose` together all the time to put a big function together out of little pieces; don’t be scared.
5. ?

Thanks Clojurians, happy to provide more info (on top of this post which is already a bit long 🙂) if you want. Let me know if this is better placed in the “code review” channel. Didn’t seem like there was much activity there.
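One possible alternative to `eval`, sketched under the assumption that each step is represented as a plain vector such as `[:assoc-in path value]` rather than as a code form. The operation keywords and the names `dissoc-in`, `apply-op`, and `apply-transaction` are all hypothetical. Note also that `(update-in m [:c :f] dissoc)` calls `dissoc` on the value `10` and leaves the key in place, which is why `:f 10` survives in the `eval` output above; the sketch removes the key from its parent map instead.

```clojure
(defn dissoc-in
  "Hypothetical helper: removes the key at the end of path from its parent map."
  [m path]
  (let [parent-path (butlast path)
        k           (last path)]
    (if (seq parent-path)
      (update-in m parent-path dissoc k)
      (dissoc m k))))

(defn apply-op
  "Applies one operation, given as [op path & args], to the map m."
  [m [op path & args]]
  (case op
    :assoc-in  (assoc-in m path (first args))
    :dissoc-in (dissoc-in m path)
    :conj-in   (update-in m path conj (first args))
    :disj-in   (update-in m path disj (first args))))

(defn apply-transaction
  "Threads m through every operation in order, with no eval involved."
  [m ops]
  (reduce apply-op m ops))

(def example-map1 {:a 1, :b 2, :c {:d 3, :e #{7 6 9 8}, :f 10}})

(def example-ops
  [[:dissoc-in [:c :f]]
   [:disj-in [:c :e] 7]
   [:disj-in [:c :e] 9]
   [:assoc-in [:b] 11]
   [:conj-in [:c :e] 13]
   [:conj-in [:c :e] 12]
   [:assoc-in [:c :g] 14]])

(apply-transaction example-map1 example-ops)
;; => {:a 1, :b 11, :c {:d 3, :e #{13 6 12 8}, :g 14}}
```

The transaction stays readable as data, and the "executable" form is just a `reduce` over it, so neither a giant composed function nor `eval` is needed.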

Michael Gardner00:04:38

my first inclination would be to represent the diff as plain old data, like what clojure.data/diff produces

👍 3
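For reference, a short sketch of what `clojure.data/diff` returns for the two example maps from the question. The var name `example-diff` is only for illustration.

```clojure
(require '[clojure.data :as data])

(def example-map1 {:a 1, :b 2, :c {:d 3, :e #{7 6 9 8}, :f 10}})
(def example-map2 {:a 1, :b 11, :c {:d 3, :e #{13 6 12 8}, :g 14}})

;; diff returns a vector of three items:
;; [things-only-in-a things-only-in-b things-in-both]
(def example-diff (data/diff example-map1 example-map2))

example-diff
;; => [{:b 2, :c {:e #{7 9}, :f 10}}
;;     {:b 11, :c {:e #{13 12}, :g 14}}
;;     {:a 1, :c {:d 3, :e #{6 8}}}]
```

The first element roughly corresponds to the "remove" steps and the second to the "add" steps of a transaction.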
Eric Auld00:04:31

They asked for both the diff and the “minimal transactions”, but if that is your instinct, that’s informative, because maybe it means that the data is what’s important, and executing it is a minor variation.

Michael Gardner00:04:44

I don't like the way he put it, but Linus Torvalds' quote has always stuck with me:

Bad programmers worry about the code. Good programmers worry about data structures and their relationships.

👍 3
Huahai02:04:36

Does that

👍 3