This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-07-09
Channels
- # announcements (17)
- # babashka (8)
- # beginners (68)
- # calva (28)
- # clj-kondo (36)
- # cljsrn (1)
- # clojure (232)
- # clojure-dev (3)
- # clojure-europe (13)
- # clojure-nl (14)
- # clojure-spec (9)
- # clojure-uk (11)
- # clojuredesign-podcast (3)
- # clojurescript (38)
- # core-async (3)
- # cursive (1)
- # datahike (4)
- # datomic (4)
- # fulcro (56)
- # graphql (1)
- # helix (3)
- # honeysql (5)
- # introduce-yourself (1)
- # kaocha (2)
- # lsp (67)
- # malli (7)
- # meander (2)
- # off-topic (1)
- # pathom (9)
- # re-frame (55)
- # reitit (3)
- # releases (8)
- # remote-jobs (12)
- # shadow-cljs (12)
- # sql (3)
- # tools-deps (55)
- # vim (5)
- # xtdb (3)
For context, I'm trying to employ it to do some caching inside a Docker image, and it seems to stop the build.
It does. When the command line arguments are incorrect or when there was an exception building the classpath.
I have a question about a statement in the clojure documentation. Can someone help me understand a sentence at https://clojure.org/reference/metadata ? The 2nd sentence in the 3rd paragraph says: One consequence of this is that applying metadata to a lazy sequence will realize the head of the sequence so that both objects can share the same sequence. I don't understand "both" here. What are the two objects? Is it saying that the metadata and the sequence share the same metadata?
or is it saying the sequence and its head share the same metadata?
The first sentence in that paragraph:
> That said, metadata and its relationship to an object is immutable - an object with different metadata is a different object
The two objects are:
• A lazy seq that you pass to with-meta
• The result of with-meta
ahhh, that makes sense. I think this is a case of ambiguous antecedent.
That said, metadata and its relationship to an object, A, is immutable - an object, B, with different metadata is a different object. One consequence of this is that applying metadata to a lazy sequence, A, will realize the head of the sequence so that objects A and B can share the same meta-data.
perhaps it also should be "can share the same metadata" NOT "can share the same sequence" ?
what two things share the same sequence? Sorry that I am confused.
(def s some-lazy-seq)
;; `s` is an instance of `LazySeq` that has a private member `s` of type `ISeq`.
(def s+m (with-meta some-lazy-seq {:hello :there}))
;; `s+m` is an instance of `LazySeq` that has the very same member that points to the very same data. But `s+m` has a different metadata.
Here's the implementation of LazySeq/withMeta:
public Obj withMeta(IPersistentMap meta){
    if(meta() == meta)
        return this;
    return new LazySeq(meta, seq());
}
And of LazySeq/seq:
final synchronized public ISeq seq(){
    sval();
    if(sv != null)
        {
        Object ls = sv;
        sv = null;
        while(ls instanceof LazySeq)
            {
            ls = ((LazySeq)ls).sval();
            }
        s = RT.seq(ls);
        }
    return s;
}
Sorry, that bit of java is a bit beyond my skill level. I'm not a java programmer. In my opinion, I would like to be able to understand the documentation (at least the simple documentation) without having to understand the java implementation. I think I'll just figure out how to file a bug/issue report about a confusing grammar error in the doc which leads to an ambiguity.
A lazy seq by default doesn't realize anything - it's a wrapper around a function that does that. Upon iterating over it, it stops wrapping the function and starts wrapping a seq that results from that function. Adding metadata to a lazy seq has to create another lazy seq that points to the same data. But you can't share that function that exists in unrealized lazy seqs because the function wraps a mutable Java iterator - calling the function again will yield different results. That's why Clojure has to call the function once, but use its result twice - first in the original lazy seq, and then in the lazy seq with new metadata.
I've added this issue: https://ask.clojure.org/index.php/10756/which-two-objects-share-the-same-sequence
feel free to comment
"what two things share the same sequence?" the answer is the original lazy-seq and the new lazy-seq returned by with-meta
They share the same underlying sequence, but have different meta-data. @U2FRKM4TW’s answer above explains this well. I see the point about confusing antecedent, however it doesn't make sense for the head of a sequence to have the same underlying sequence as the sequence of which the head is a part. Hence "both" must be the original lazy-seq and the one returned by with-meta
I could really be off on this one... but when I read "immutable"... I immediately think of how Clojure uses a "trie" to copy the data to a new instance... but will share a head in the trie.
since sequence is a higher level abstraction over collections... like a vector... and vectors use tries (trees)... which share nodes (heads) when a copy is made in memory (persistent)... I think this is what the docs are getting at?
I am reading articles all over the web explaining sequences as "abstractions for collections"
TBH I care very little about high-level API digests. :) I just look at the implementation most of the time. The sentence in OP talks specifically about lazy sequences. A lazy sequence is a specific thing with a specific implementation. That statement draws its content from that implementation. Vectors and abstract "collections" have nothing to do with this - they only complicate things.
Because a lazy sequence is a concrete thing?.. When I say "a signed 32 bit integer 2147483647 becomes -2147483648 when you add 1", you can't say "that's not how numbers work". A "signed 32 bit integer" is a very specific implementation, with its specific traits. Just how a lazy seq is a very specific implementation of the collections abstraction, with its own specific traits - one of which is that peculiarity about metadata.
ahhhh... so sequence is actually the implementation of the abstract concept of a collection... I was thinking just the opposite... makes more sense.
Yes. A lazy sequence is a concrete sequence. A sequence is a more concrete collection.
People often confuse what an "abstraction" is. In programming, an abstraction is any partially defined construct.
A function is an abstraction for example, because it partially defines a computation, where the function parameters are missing their values. You can't run it until you call it and pass it some arguments. You would say the function is an abstraction, but once provided arguments it is concrete and can now be executed.
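To make that idea concrete, here's a small sketch using plain clojure.core: partial fills in some parameters and returns a new function that is still "waiting" for the rest.

```clojure
;; `partial` fixes some arguments of a function, leaving a partially
;; defined computation; supplying the remaining argument makes it
;; concrete and runnable.
(def add5 (partial + 5))

(add5 3)   ;; => 8
```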
Basically, everything that is a template of some sort is an abstraction.
Another such abstraction is the Interface. An interface is an abstraction, because it defines a set of function definitions which are missing their implementation body.
In Clojure, sequence is such an abstraction, it is defined by the ISeq interface. Thus the Clojure sequence is an abstraction defined by the interface ISeq.
It defines four function definitions: first, next, more and cons, but their implementation body is undefined, so it's a partial definition for a type which supports those four operations.
In order to make it concrete (that is fully defined, no longer partial), you need to provide an implementation for their bodies. This is done by the seq function.
Each collection has its own implementation of ISeq, and seq will return the appropriate one for the given collection.
But this doesn't need to be an inheritance, so collections aren't necessarily sequences; there is not an inherent IS-A relationship. You could use composition as well, so seq could return a type which wraps a collection and implements first, next, more and cons in a way that uses the collection, but the collection itself wouldn't implement those functions.
This is kind of what LazySeq does. A LazySeq is a sequence, it implements first, next, more and cons. LazySeq is not an abstraction, it is a concrete type, not an Interface. (Side note, a class is an abstraction as well, since it's a template for creating concrete instances of Objects, so LazySeq is a class abstraction for concrete Objects, but it's not an Interface abstraction which is what we mean here).
What LazySeq does is that it wraps another sequence in a way that makes it lazy. So LazySeq is a concrete sequence implementation that takes any abstract sequence and can make them lazy.
So sequence is an abstraction. LazySeq is a concrete implementation of that abstraction. And each Clojure collection has a corresponding implementation of the sequence abstraction which is returned by seq.
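A quick REPL illustration of that last point (plain clojure.core, nothing assumed): seq hands back a different concrete ISeq implementation per collection, but callers only rely on the abstract operations.

```clojure
;; `seq` returns whatever concrete ISeq implementation suits the
;; collection; the caller only uses the abstract operations.
(class (seq [1 2 3]))   ;; a chunked seq over a vector
(class (seq {:a 1}))    ;; a seq over map entries
(class (seq "abc"))     ;; a seq over a string

;; Uniform access through the abstraction:
(first (seq [1 2 3]))   ;; => 1
(first (seq "abc"))     ;; => \a
```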
As for the original question. The paragraph is saying that two objects of equal value but different metadata are considered to be equal, but changing the metadata of an object will result in an equal copy of that object. Which is just to say that metadata on an object is immutable, so changing the metadata of an object actually creates a copy of it with different metadata. One consequence of this is that for LazySeq, where sequences are supposed to cache the realized value, the original LazySeq object and its copy with altered metadata are meant to have the same value. To do so, they need to share the same sequence, otherwise they'd be double realizing things. Which means that before creating the copy with altered metadata, the LazySeq has to create the sequence by realizing the head so it can pass it to the copy.
It's definitely a caveat of the implementation of LazySeq, but also, if Clojure allowed to mutate metadata this would not be an issue, so it's also a consequence of metadata being immutable.
An example to demonstrate:
(def ls (lazy-seq (println 1)))
(def ls2 (with-meta ls {:foo :bar}))
;;=> 1
(first ls)
;> nil
(first ls2)
;> nil
If with-meta didn't realize before creating ls2, then when you would call first on ls and ls2, both of them would print 1. But conceptually with-meta should return the same sequence, one where things don't get realized twice, because that's not what sequences are supposed to do. Unfortunately because of the way lazy-seq works, you can't guarantee that if you copy the LazySeq without having realized the head. So it needs to first realize the head as soon as with-meta is called so that both ls and ls2 behave the same afterwards.
An even better example:
(def ls (lazy-seq (println "realized head") [(rand-int 10)]))
(def ls2 (with-meta ls {:foo :bar}))
;;=> realized head
(first ls)
;> 3
(first ls2)
;> 3
If with-meta didn't realize the head, then when you called first on ls and ls2 you could get a different value back, but this would be wrong, because ls2 should be equal to ls and have the same value, differing only by the meta.
And in order to achieve this, lazy-seq has to realize the head prior to returning an equal copy with different meta.
I have yet another question about https://clojure.org/reference/metadata . Paragraph 3 says: metadata and its relationship to an object is immutable
Yet there is a function alter-meta! which is documented to: Modify or reset the metadata respectively for a namespace/var/ref/agent/atom.
So my question is: is the metadata mutable or immutable?
@jimka.issy alter-meta! is for stateful containers, like swap! is.
so should I understand that metadata on a sequence is immutable but metadata on stateful containers is mutable?
The metadata itself is an immutable hashmap, but irefs (the name for a stateful container apparently) allow that immutable value to atomically change to another.
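A small illustration of that split (plain clojure.core): with-meta returns a new collection value, while alter-meta! changes the metadata slot of a reference type in place.

```clojure
;; Metadata on a value: with-meta returns a NEW object.
(def v  [1 2 3])
(def v2 (with-meta v {:source "user"}))
(meta v)    ;; => nil -- `v` itself is untouched
(meta v2)   ;; => {:source "user"}

;; Metadata on an iref: alter-meta! changes it in place.
(def a (atom 0))
(alter-meta! a assoc :doc "a counter")
(meta a)    ;; => {:doc "a counter"}
```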
I think/hope I understand. What I have is a program implemented in Python, Scala, and Clojure. The clojure program seems to be much slower than the other two. I think this is due to memoization. In Scala and Python I have objects (of application specific classes) which memoize (per object) the return values of certain functions.
The clojure version doesn't use application defined classes, but rather just s-expressions (sequences, sometimes lazy). I'd like to similarly memoize, and I was hoping to put meta data on the sequences.
I don't know what data to memoize at object creation time, but only when certain multi-methods are called.
I considered, but rejected, the idea of simply using the memoize function because it memoizes forever. I'd like the information to be GC'd when the object dies.
so while Clojure can be faster... it can use a lot of memory... so I would monitor your memory (space complexity)
comparing Python to Clojure... you must consider at least a few factors.... yes, speed... but also memory space complexity, work load (concurrent or not)
Python has a GIL... it cannot offer the concurrency like Clojure can... and Clojure by default is for vertical scaling (large memory) and multiple CPUs
but you can also speed Python up with Cython or Jython in many cases (compatible foreign function interface)... I hope that helps.
Sorry, my answer was confusing as it's a subtlety. If you're trying to do memoization against object keys which will be GC'd at some point note that: • Hashmaps won't do the job, you'll need to use one of the weak kinds of reference things • You probably want a java-y thing which supports the weak stuff such as https://github.com/ben-manes/caffeine • https://www.baeldung.com/java-weakhashmap is OK if your values won't be held onto after the key object is GC'd. tl;dr, use caffeine 🙂
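For illustration, plain interop with java.util.WeakHashMap looks like this. Note the caveat above: keys are weak, but if a cached value strongly references its key, the entry can never be collected.

```clojure
(import 'java.util.WeakHashMap)

;; Keys are held weakly: once a key is otherwise unreachable, its
;; entry becomes eligible for removal by the GC. While we hold a
;; strong reference to `k`, the entry stays put.
(def cache (WeakHashMap.))

(def hit
  (let [k (Object.)]
    (.put cache k :expensive-result)
    (.get cache k)))
;; hit => :expensive-result
```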
I've had bad experiences using these weak-hash-maps in the past. once bitten twice shy.
@dominicm is there some reason you are warning me against using metadata on s-expressions to mimic mutable fields in the corresponding class instances from Python or Scala?
it has been a while since I used them, but I seem to recall that they tripled the memory usage of my previous application (at least a Scala wrapper around the java weak-values-hash-map) I suspect a reason for 3xing the memory size is to avoid what would otherwise be an additional quadratic complexity to the garbage collector.
There's also https://github.com/clojure/core.memoize which provides mechanisms for cleaning out the cached calls, if that's helpful
I mentioned memoize in my original post, including the justification why I can't use it.
for the most part, the problem with memoize is that it memoizes until the end of time. If you use this function in a recursive function of large complexity it can demand huge resources, and there's no API for the programmer to free the memory.
that's not the clojure.core/memoize fn, it's a library that implements memoization whilst solving the "memoizes forever" problem
It would be great if there were optional arguments to memoize which told the implementation to use a weak hash table, which would allow things to be un-memoized if memory fills up.
ahhhh!!!!
nice. I didn't know about that.
I've used it in the past to get past the forever part of memoize ... no idea if it'll be helpful in your situation, but I'm kinda with dominicm in that I'm not sure I get how your metadata strategy thing will work
one way my meta data idea would work is as follows.
Imagine that you have a Scala/Python class with a method named foo; for simplicity assume that foo has no arguments. If foo is expensive to compute, then I sometimes make a private field called _foo whose default value is None. When foo is called, it first checks _foo and returns its value if not None; otherwise it computes the value and sets _foo.
Therefore when the object is GC'ed, then the value in object._foo goes away with it.
I don't see a comment about the stored-forever problem in the readme of https://github.com/clojure/core.memoize
Did I overlook it?
There are a few strategies for clearing out the cache, for example: https://github.com/clojure/core.memoize/blob/ff20137720a36e0e1ded75ebcdd53d4d76fbe6eb/src/main/clojure/clojure/core/memoize.clj#L354 ... First in, first out
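As a pure-clojure.core sketch of what such a strategy does (illustration only, not the core.memoize implementation or API), a FIFO-bounded memoize might look like:

```clojure
;; Minimal FIFO-bounded memoization sketch. Real code should prefer
;; clojure.core.memoize, which provides this (and LRU, TTL, ...)
;; with proper concurrency handling.
(defn memo-fifo [f limit]
  (let [state (atom {:m {} :q clojure.lang.PersistentQueue/EMPTY})]
    (fn [& args]
      (if-let [e (find (:m @state) args)]
        (val e)                                        ; cache hit
        (let [v (apply f args)]                        ; cache miss
          (swap! state
                 (fn [{:keys [m q]}]
                   (let [m (assoc m args v)
                         q (conj q args)]
                     (if (> (count m) limit)
                       {:m (dissoc m (peek q)) :q (pop q)}  ; evict oldest
                       {:m m :q q}))))
          v)))))
```

The library variants additionally avoid subtleties this sketch ignores, such as racing threads computing the same entry twice.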
I'm not sure I understand the relationship between the python object and the lazy-seq/metadata. Are you saying that you want to cache the generation of each element in a seq by attaching metadata to it? If so, the seq will do that for you; once a seq has got its first element, all calls to first on that seq will be a cache lookup and it won't recalculate the first element ...
no that's not what I mean.
imagine in python I create an object as follows: x = CreateMyObject(...), and later call y = x.canonicalize(), and then call z = y.canonicalize(). It's a silly example, but pedagogical.
my code for canonicalize takes self and allocates a new object, and also sets self.saveCanonicalize to the new object, and also sets newObject.saveCanonicalize = newObject
the result is that even though I've never called canonicalize on y, when I do call it, it is already memoized. This is possible in Scala and Python because I have an object where I can put fields, and I can make sure the eq function ignores that field. But in clojure I'm using a sequence, perhaps lazy, perhaps not, to store the information. I was thinking about putting metadata on the sequence using the clojure metadata facility -- to me that seems like the logical place to put it.
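A hedged sketch of that metadata idea (the names make-expr and cached-call are hypothetical, not from any library): stash an atom in the object's metadata at creation time and cache per-object results in it, so the cache dies with the object.

```clojure
;; Hypothetical sketch of per-object memoization via metadata.
;; The cache atom lives in the object's metadata, so it becomes
;; garbage together with the object.
(defn make-expr [sexpr]
  (with-meta sexpr {::cache (atom {})}))

(defn cached-call
  "Look up `k` in obj's metadata cache, computing and storing on a miss."
  [obj k compute]
  (if-let [cache (::cache (meta obj))]
    (if-let [e (find @cache k)]
      (val e)
      (let [v (compute)]
        (swap! cache assoc k v)
        v))
    (compute)))
```

One caveat raised in the discussion above: with-meta returns a new object, so expressions have to be built through a factory like make-expr for the cache to be there at all.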
With regard to least-recently-used, is there a way to tell it to use a weak hash table, and clear automatically the way the weak hash would? i.e., don't bother unless getting short of memory, then remove unreferenced items. I didn't see that option.
core.memoize is built on core.cache which has a soft reference cache, so you could probably try something like
(require '[clojure.core.memoize :as m]
         '[clojure.core.cache :as c])
(let [f (m/memoizer identity (c/soft-cache-factory {}))]
  (f {:test 123}))
which should allow the cached memoized values to be gc'd
that certainly looks easy.
I'm still a bit confused by your python example. Does y.canonicalize() return x.canonicalize()?
yes it returns the same (eq) value which has already been memoized
since canonicalize is an idempotent function
no, x and y are sequences
expressions such as (and (or (and a b) (not b)) (or c (not (and a b))))
you can imagine what canonicalize does to such a sequence
so in python it's not just a sequence but an object of type SAnd or SOr or SNot etc.
yes, it applies a long list of recursive searches to see if it can reduce to a canonical form. the leaf elements are types, so it also checks for disjointness and subtype-ness
well, I guess I'd say I want to experiment and see whether a bit of memoization makes the clojure version roughly the same speed as the python and scala versions
the unit tests in python take about 30 seconds to run and the unit tests in clojure haven't finished since I started them this morning
I guess I'm confused by the metadata thing, because to add metadata to a seq, you're going to have to create a new object
(I once worked on a project where the tests took over 8 hours to run ... we just ran the tests every night and emailed everyone the results ... it wasn't the best 😉 )
about creating a new object, yes, I see that as well. that is indeed an issue which I'd have to figure out, such as always going through a factory function.
thus my question about whether metadata is immutable. ideally I'd like to attach meta data to an existing sequence. I guess there's no api for that.
my factory function could always create a metadata as an atom when it allocates a new sequence
I think that the python structure is using mutable objects to combine the data with the canonicalize function
if, as you suggest, the core.memoizer cooperates with the GC, that's probably even easier
yes my python and scala code are written in a very functional style, but do use mutation for this memoization process. just as clojure, under the hood, does the same thing.
BTW, before this project I didn't know much about python. I still don't know much about the destructive functions, but I do quite like the python object model and writing functional style python via map, flat_map, list-comprehensions etc.
One thing I really like about python as well as common-lisp which I miss in both clojure and scala is the ability to immediately return from a named block.
👍 ... just with clojure's data structures you don't really have the option to add that type of cacheing 😉 ...
def conversion8(self):
from genus.s_not import notp
# (or A (not B)) --> STop if B is subtype of A, zero = STop
# (and A (not B)) --> SEmpty if B is supertype of A, zero = SEmpty
for a in self.tds:
for n in self.tds:
if notp(n) and self.annihilator(a, n.s):
return self.zero()
return self
vs
(defn conversion-C8
"(or A (not B)) --> STop if B is subtype of A, zero = STop
(and A (not B)) --> SEmpty if B is supertype of A, zero = SEmpty"
[self]
(if (exists [a (operands self)]
(exists [n (operands self)]
(and (gns/not? n)
(= true (annihilator self a (operand n))))))
(zero self)
self))
I've written my own exists macro
I've been using it for 2 weeks, so I'm not an expert either
my python code looks like lisp code.
oops, I'm glad I posted that, because the python code has a bug. I need self.annihilator is True
like in clojure
yeah ... I worked on a 4 month project in python a year or two ago, and i've written the odd ansible script ... my python isn't very pythonic either 😉
what's rubber duck?
https://en.wikipedia.org/wiki/Rubber_duck_debugging ... I was just making a bad joke ... feel free to ignore me 😉
about returning early. I don't use it often, but I find that using it conservatively obviates lots of boilerplate.
think of it as a kinder-and-gentler exception
Clojure's reduce has the very useful reduced which is sort of the same thing.
I just wish it had named-reduced because reduced can get confused in reentrant code.
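For readers following along, here is reduced in action (plain clojure.core); it's the closest thing to that early-return flavor inside a reduce:

```clojure
;; `reduced` wraps a value to tell `reduce` to stop immediately,
;; even over an infinite input.
(def stopped
  (reduce (fn [acc x]
            (if (> acc 10)
              (reduced acc)   ; stop: we have enough
              (+ acc x)))
          0
          (range)))           ; infinite seq, safely short-circuited
;; stopped => 15
```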
OK, back to work. I'll try out your memoized suggestion. Thanks for that. strange that that important feature isn't mentioned in the readme.
OH by the way. when using
(require '[clojure.core.memoize :as m]
         '[clojure.core.cache :as c])
(let [f (m/memoizer identity (c/soft-cache-factory {}))]
  (f {:test 123}))
how can it put something into the cache without calling f?
otherwise, you can keep a reference to the cache, which implements this protocol:
(defprotocol CacheProtocol
"This is the protocol describing the basic cache capability."
(lookup [cache e]
[cache e not-found]
"Retrieve the value associated with `e` if it exists, else `nil` in
the 2-arg case. Retrieve the value associated with `e` if it exists,
else `not-found` in the 3-arg case.")
(has? [cache e]
"Checks if the cache contains a value associated with `e`")
(hit [cache e]
"Is meant to be called if the cache is determined to contain a value
associated with `e`")
(miss [cache e ret]
"Is meant to be called if the cache is determined to **not** contain a
value associated with `e`")
(evict [cache e]
"Removes an entry from the cache")
(seed [cache base]
"Is used to signal that the cache should be created with a seed.
The contract is that said cache should return an instance of its
own type."))
why? because I want to always put the return value of canonicalize into the cache, associating the value with itself. So that (canonicalize (canonicalize something)) only does work at most once.
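A sketch of that seeding idea, with a plain map in an atom standing in for the core.cache cache, and canonicalize* as a hypothetical stand-in for the real simplifier:

```clojure
;; After computing a canonical form, seed BOTH entries:
;; input -> result and result -> result, so canonicalizing the
;; result again is an immediate cache hit.
(def canon-cache (atom {}))

(defn canonicalize* [x]
  ;; stand-in for the real (expensive) simplification rules
  x)

(defn canonicalize [x]
  (if-let [e (find @canon-cache x)]
    (val e)
    (let [r (canonicalize* x)]
      (swap! canon-cache assoc x r r r)  ; two key/value pairs seeded
      r)))
```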
for the special case of canonicalize, yes. of course I have other memoized functions which are not idempotent.
does this mean I need to call hit from within canonicalize?
or maybe miss; that's the only one with the necessary 3 arguments: cache, input-value, output-value.
off the top of my head, I don't remember ... probably ... there's a good blog post about core.cache (by dpsutton I think) cos it's a bit tricksy
https://dev.to/dpsutton/exploring-the-core-cache-api-57al ... that I think??
I don't know anything about protocols. How can I, the application programmer, call one of the protocol functions? Aren't those functions only used internally, and are not part of the API?
anyway, enforcing idempotency is a problem I can attack later. it's not 100% necessary for the first pass solution.
np ... get some real work done 😉 ... protocols just create functions in the namespace they're defined in, so you can call c/miss or whatever ... but they're usually considered an internal thing in a lib, so you'll often find them wrapped by a normal clojure fn
With gc-friendly-memoization, my tests finish in a few seconds rather than hours
hi l0st3d, I'm having what looks like a problem in core.memoize. I'm investigating. It looks really similar to a problem I had when using the java weak hash tables from Scala a year or so ago. The problem is that when the underlying java function removes the value from the hash table, it fails to immediately remove the key. I'm not sure if this is a bug or a feature. but the consequence is that accessing the key returns the java null as a value.
does this sound in any way familiar? perhaps this is a bug in my program and only coincidentally looking familiar to me.
the consequence for clojure is that when a function is memoized and you call it with some arguments, it SHOULD thereafter return the same thing given the same arguments. However, if the corresponding value gets gc-ed then calling with the same arguments will result in the hash value being java-null and the clojure function consequently returns nil rather than re-computing the function in question the hard way
that doesn't sound immediately familiar, but I've generally not relied on gc to control eviction from these sorts of caches ... I can see how that might happen with a weak reference
it does look like the memoize lib puts a derefable thing in the cache though, so it should know the difference between your function returning nil and the gc process having cleared out the soft reference ...
@jimka.issy can you write a minimal test case of the soft-cache that proves it's a problem?
(require '[clojure.core.memoize :as m]
'[clojure.core.cache :as c])
(import '(java.lang.ref ReferenceQueue SoftReference)
'(java.util.concurrent ConcurrentHashMap)
'(clojure.core.cache SoftCache))
(let [c (SoftCache. (doto (ConcurrentHashMap.)
                      (.put (list 1) (SoftReference. nil)))
                    (ConcurrentHashMap.)
                    (ReferenceQueue.))
      f (m/memoizer #(do (prn '>> %) %) c)]
  [(f 1) (f 1)]
  (mapv #(vector (key %) '-> (.get (val %))) (seq c)))
I think that ^ forces the cache into the state where it contains the equivalent of a gc'd reference and it seems to run the function only once and return the correct val ... right?
hmmm ... no ignore me... I think I've misread the code ... but I need to go now ... can try and have another look in a bit
I debugged it. I think it was a problem in my own code. it just looked curiously like a problem I had in Scala some time ago.
was the problem that you were using scala? 😜 .... but seriously, glad you got it sorted ... I think it looks like the soft cache does the right thing
@jimka.issy I don't fully understand your suggestion with metadata and sexprs for solving this problem tbh 🙂. The tooling around cleaning stuff up on GC is pretty complicated, so it's better to lean into something on the JVM than try to roll your own as a v1 imo. But your use-case might have easier reference tracking so you can easily know when to clear the cache keys out.
one trick I do in the Python and Scala versions is the following. I have a function called canonicalize which I call on an arbitrarily large expression tree. The function applies a large suite of simplification rules, and returns a canonicalized expression tree. So in this case I need to memoize TWO things, not ONE. I need to mark the original tree as having a certain canonicalization, but I also need to mark the new tree as having itself as canonicalization. I.e., I want to avoid someone trying to canonicalize the new object, having that compute a long time and return an isomorphic object.
I don't know how to do this using the memoize function. I don't know yet whether the core.memoize replacement allows the programmer to manipulate the cache through the API.
my idea was just to put the metadata on the objects they concern. then when the objects are cleaned up, the metadata will be cleaned up as well.
Are there any examples of clojurescript working with aero and clip
Storing the system config in an edn file that gets loaded and passed to clip. Clojurescript targeting node
@mail024 There're some examples at the bottom of the clip readme of aero/clip, none targeting node afaik but maybe you can adjust them easily enough?
@mail024 Just under that 🙂 https://github.com/juxt/clip#example-application
Have hit a case where clip doesn't seem to resolve if the form passed in uses a symbol to refer to a function. Aero then doesn't resolve this to the actual function, so clip then doesn't seamlessly start a system if the config comes from aero in cljs. Will investigate later and pick this up in an issue on clip.
@mail024 Thanks, I'll look out for your issue later. symbols definitely resolve in all the tests, so it must be something specific to your context.
you can use clojure -P to download all the deps it needs. This would fetch all jars that are specified
@U11BV7MTK I think you've missed that @U0AHTPQBG is after the "sources" dependency. As you can download a pom or javadoc for a dependency. Afaik @U0AHTPQBG there's no solution to this atm.
Every published artifact also has a source jar published which has the source code bundled, which helps with debugging
deps.edn uses $ to mark it out, e.g. foo.bar/baz$pom would be for the "pom" <thing> of foo.bar/baz
Maven lets you say "hey, get me the sources for my deps", so you can get the java code and jump to source
Feel free to keep track of https://github.com/clojure-emacs/enrich-classpath/issues/2 (I plan to implement it this weekend) it will be a solution used by cider and anything else that wants to
I'm trying to refine the way I write clojure code to be closer to the way the community and experienced clojure devs would write their code. Can someone take a look at this function I wrote and refactor it into a form more typical of clojure code or suggest some improvements:
(defn get-numbers-from-words
"Creates a vector of number values from number strings in the argument
string."
[s]
(let [;; Creating a map of numeric words to numbers.
number-map
{"one" 1
"two" 2
"three" 3
"four" 4
"five" 5
"six" 6
"seven" 7
"eight" 8
"nine" 9}
;; Lowercasing the entire string.
lowercase-str
(.toLowerCase s)
;; Splitting the string into separate words.
words
(.split lowercase-str " ")
;; Replacing all of the numeric words with numbers, and filtering out
;; all of the unique numbers into a vector.
number-values
(->> words
(map #(get number-map %))
(filter (comp not nil?))
(set)
(vec))]
number-values))
(def numbers-by-word
{"one" 1
"two" 2
"three" 3
"four" 4
"five" 5
"six" 6
"seven" 7
"eight" 8
"nine" 9})
(require '[clojure.string :as str])
(sequence
(comp
(map str/lower-case)
(map numbers-by-word)
(remove nil?)
(distinct))
(str/split "THree point one four one five nine" #"\s"))
would this be a more clojure-like way of writing this code, or is the above code better:
(def number-map
"A map of word numbers to numeric numbers."
{"one" 1
"two" 2
"three" 3
"four" 4
"five" 5
"six" 6
"seven" 7
"eight" 8
"nine" 9})
(defn get-numbers-from-words
"Creates a vector of number values from number strings in the argument
string."
[s]
(as-> s $
(.toLowerCase $)
(.split $ " ")
(map #(get number-map %) $)
(filter some? $)
(set $)
(vec $)
(sort $)))
I like transducers, but I don’t always use them. For simple things like this, I still use threading macros
(ns clj8394.example
(:require [clojure.string :as string]))
(def numbers ["zero" "one" "two" "three" "four" "five" "six" "seven" "eight" "nine"])
(def number-map (zipmap numbers (range)))
(defn get-numbers-from-words
"Creates a vector of number values from number strings in the argument string."
[s]
(let [lowercase-str (string/lower-case s)]
(->> (string/split lowercase-str #" ")
(keep number-map)
distinct
vec)))
@jetrepilto maps are functions, so it's unnecessary to wrap number-map in #(get number-map %)
for static structures (like your map of words to numbers) I typically put them into a def. I know that not everyone would
If I need it hidden/inaccessible, then the function definition can be in a let block. Though it makes me feel icky to say:
(let [number-map ....]
(defn get-numbers-from-words [s] ...))
So then I would usually use:
(def get-numbers-from-words
(let [number-map ....]
(fn [s] ...)))
But for this sort of thing? I define the map with def
I usually put these static structures in let to keep them within the same scope as the function, but it looks like typically in clojure these are at the top level of the files. Is there a reason for this? more performant?
map, but yes you’re right. But it looks like it’s run every time. And it’s a bad habit that can occasionally lead to things that DO get run each time (ask me how I know)
Besides, I have found that things that never change will often be useful in more than one context. Not always, but :woman-shrugging:
One of the reasons we can move them out to a top level def is that the map is immutable
break your function into three parts: a function that turns a single word into a number (a literal map will do nicely as a function) a function that splits a string into words (returned as a seq or vector) then something that uses the other two functions
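One possible sketch of that three-part split (the names here are my own choices, not from the thread):

```clojure
(require '[clojure.string :as str])

;; 1. a single word -> number (the literal map IS the function)
(def word->number
  {"one" 1 "two" 2 "three" 3 "four" 4 "five" 5
   "six" 6 "seven" 7 "eight" 8 "nine" 9})

;; 2. a string -> seq of lowercase words
(defn words [s]
  (str/split (str/lower-case s) #"\s+"))

;; 3. glue the two together (distinct keeps the original's
;;    unique-numbers behavior)
(defn numbers-from-words [s]
  (into [] (comp (keep word->number) (distinct)) (words s)))
```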