#clojure
2023-08-14
pyr08:08:07

Hi everyone, I am working with a Java dependency that needs to load native code (via System.load). In Java this is done in a static { System.load("..."); } block. With Clojure though, how would I approach this? Doing a naïve System/load does not work (likely due to how class loaders are handled in Clojure).

Alex Miller (Clojure team)08:08:27

"does not work" is what

Alex Miller (Clojure team)08:08:41

what did you do? what did you get?

Alex Miller (Clojure team)08:08:23

are you setting java.library.path JVM property

pyr08:08:25

System.load succeeds, but when I use the classes in the dependency that access the native symbols, I get an UnsatisfiedLinkError. This happens regardless of whether java.library.path is set (which only matters for System.loadLibrary, not System.load). It seems as though the call to load loads the libraries in a context Clojure does not have access to, if that makes any sense.

Alex Miller (Clojure team)12:08:52

I know it’s tricky to line everything up, but there’s no reason you shouldn’t be able to make something like this work in Clojure. Can you get the equivalent call in a Java app to work?

pyr12:08:52

With the static block Java worked as expected. I'll scratch more (but good to know there's no supposed barrier)

Joshua Suskalo13:08:39

When I ran into this problem when writing #C02EEAUHSJJ, I put the System.load call in a single java class which gets compiled and then used from Clojure. It avoids the issues with classloaders, at least for my case where I also do the lookup from inside another static method in the same class.
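A minimal sketch of the class loader difference mentioned above, assuming the code is evaluated from source (REPL or script) rather than AOT-compiled. It only inspects loaders and does not load any native library. That the JVM associates a library loaded via System.load with the loader of the calling class is the working assumption behind the Java-wrapper workaround, not something this snippet proves.

```clojure
;; Loader that defined a function compiled at runtime by Clojure.
(def fn-loader (.getClassLoader (class (fn []))))

;; Loader that defined a class from the ordinary classpath.
(def rt-loader (.getClassLoader clojure.lang.RT))

(println "fn loader:" (class fn-loader))
(println "RT loader:" (class rt-loader))
(println "same loader?" (identical? fn-loader rt-loader))
```

If the two loaders differ, a System/load issued from Clojure code and the native methods of a Java class on the classpath may end up on different sides of that boundary, which would fit the UnsatisfiedLinkError described earlier.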

pyr13:08:39

@U5NCUG8NR Thanks, I actually went and read your code as well, since I considered dropping the intermediate Java wrapper and going straight for coffi. For some reason, this doesn't do it for me (but I might be holding it wrong). As a small tangent, sadly Java 20 moved the API around (dropping MemoryAddress and Addressable and going for all pointers as long), so it doesn't work on this version.

pyr13:08:37

All signs point to this being possible and me doing something wrong so I'll dig further

Joshua Suskalo13:08:43

Yes, I was intending to update to Java 20 but work got in the way. Next month JDK 21 comes out and stabilizes the FFM api so I'll be updating to that fairly rapidly and cutting a 1.0 release.

pyr13:08:01

Java 21 is the safer bet yes

Joshua Suskalo13:08:39

In the meantime I'm happy to lend thoughts and experience with writing coffi to help you with your ffi needs.

🙏 2
Karol Wójcik09:08:22

I love the idea of clojure.core.reducers. Is there a library that provides a Foldable implementation of sort, sort-by, distinct-by?

Karol Wójcik14:08:42

Perfect! Thanks!!

chrisn18:08:24

Another option is ham-fisted - I gave a talk about this at the Conj - https://www.youtube.com/watch?v=ralZ4j_ruVg.

👍 2
Karol Wójcik08:08:30

I have just watched the talk and it is pure gold! Do you plan to add support for EDN in charred?

chrisn13:08:27

There is some interest but not currently enough. People don't use edn in performance sensitive places as far as I can tell so I am not sure there is a need.

chrisn13:08:02

thanks btw - I am proud of that talk.

❤️ 2
Lidor Cohen09:08:03

Hello everyone 🙂 I noticed that in the doc of hash it says:

"Returns the hash code of its argument. Note this is the hash code
   consistent with =."
And then I looked at =:
"Equality. Returns true if x equals y, false if not. Compares
  numbers and collections in a type-independent manner.  Clojure's immutable data
  structures define -equiv (and thus =) as a value, not an identity,
  comparison."
and saw that it uses -equiv, which doesn't seem to be strongly related to hash. So I wondered: what is the relation of hash to equivalence checking (if there is one)? And if there is none, what is the use of hash?

p-himik09:08:49

As far as I understand it, "consistent" here means that if two things are = then their hashes are also =.

p-himik09:08:45

> what is the use of hash?
To be used in data structures that rely on hash, e.g. hash maps/sets.
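A small sketch of what "consistent with =" means in practice, using only clojure.core on the JVM. The sample pairs are chosen for illustration; each pair is equal under = despite having different concrete types.

```clojure
;; Pairs of values that are = although their concrete types differ.
(def pairs
  [[1 1N]                      ; Long vs BigInt
   [[1 2] '(1 2)]              ; vector vs list
   [{:a 1} (hash-map :a 1)]])  ; array map vs hash map

;; For each pair, print whether the values are = and whether their hashes match.
(doseq [[a b] pairs]
  (println (= a b) (= (hash a) (hash b))))
```

Because equal values hash alike, either member of a pair can be used to look up the other in a hash map or hash set.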

Lidor Cohen09:08:13

does that mean if I have a type and it is not implementing -hash there's some kind of ramification on using it as key in hash-map?

p-himik09:08:41

Only if you implement equality checking without implementing hashing, yes.

Lidor Cohen09:08:11

What would happen?
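A hedged sketch of the failure mode on JVM Clojure: a hypothetical Point type that overrides equals but not hashCode. The type and its field are invented for illustration.

```clojure
;; Equality is defined by value, but hashing is left at the default
;; (identity-based) Object.hashCode.
(deftype Point [x]
  Object
  (equals [_ other]
    (and (instance? Point other)
         (= x (.-x ^Point other)))))

(def a (Point. 1))
(def b (Point. 1))

(println (= a b))                 ; equal according to the custom equals
(println (= (hash a) (hash b)))   ; identity hashes, almost certainly different
(println (contains? #{a} b))      ; hash-based lookup, so b is very likely not found
```

Sets are used here because small literal maps are array maps, which scan keys by equality and can hide the problem until the map grows.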

Lidor Cohen09:08:22

Is this the same for cljs?

p-himik09:08:55

It's the same regardless of the language.

🙏 2
pauld18:08:09

Having been forced to make a hash-table by hand (in comp-sci for example) in C makes this relationship clear. Also the stack overflow answer is pretty good.

itaied18:08:24

what are the cons or limitations of creating an object and using its hash when implementing equiv? for example:

IEquiv
(-equiv [this other]
  (= (-hash this) (-hash other)))
IHash
(-hash [this]
  (-hash [(internal-comp this)]))

p-himik19:08:37

Hash equality does not mean regular equality.

p-himik19:08:12

(Unless you can implement it that way for your specific type - then it's alright.)
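To illustrate why hash equality is weaker than regular equality, here is a small sketch with two well-known colliding strings. It relies on Clojure's string hash being derived from String.hashCode, which holds for current JVM Clojure as far as I know.

```clojure
(def s1 "Aa")
(def s2 "BB")

;; Both strings share the same String.hashCode, so their Clojure hashes match.
(println (= (hash s1) (hash s2)))  ; true

;; They are still different values.
(println (= s1 s2))                ; false
```

An -equiv that only compares hashes would report such values as equal, so it is only safe when the type can guarantee that distinct values never share a hash.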

valerauko13:08:07

hashes are for the case when comparing the entire inner state (all the data) of an object is too expensive, so you pre-compute a hash that can be quickly compared (and efficiently sorted etc)

p-himik15:08:34

That's not entirely correct, especially the sorting part.

zimablue11:08:32

iiuc, if you want to dynamically 'add' a protocol to an existing object, whilst keeping the existing ones, you can use specify! in clojurescript, but how does one do it in clojure? reify iiuc only allows one to expose interface methods specified in the body of the reify, but cannot merge protocols from an existing object (?)

zimablue11:08:23

just thought of a way, extend-via-metadata and then manually add the metadata to the object? is that the best way?

zimablue12:08:42

Thanks but that's not what I want, I want the protocol associated with the object, not the type of the object, eg to an instance of {:a 1} not to the type of hash-map

Ed12:08:09

The JVM doesn't have value based dispatch the same way JS runtimes do, so the extend via metadata mechanism is the only way to specify an implementation of a protocol on a specific object, and you won't be able to do that for objects that can't have metadata (like strings)
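A minimal sketch of the metadata approach, assuming Clojure 1.10 or later. The protocol name Describable and its method describe are made up for the example.

```clojure
;; The protocol must opt in to metadata-based extension.
(defprotocol Describable
  :extend-via-metadata true
  (describe [this]))

;; Attach an implementation to one specific map instance.
;; The metadata key is the fully qualified method symbol.
(def tagged
  (with-meta {:a 1}
    {`describe (fn [this] (str "map with " (count this) " entries"))}))

(println (describe tagged))     ; map with 1 entries
(println (= tagged {:a 1}))     ; true, metadata does not affect equality
```

A plain {:a 1} without that metadata has no implementation, so calling describe on it throws.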

teodorlu17:08:30

Hi! 🙂 Is there a Clojure function to list all loaded namespaces? I want to iterate over namespaces and vars at runtime to look for vars with certain metadata within namespaces in my app.

igrishaev17:08:55

all-ns

❤️ 2
teodorlu17:08:50

Exactly what I was looking for. Thank you! 🙌

teodorlu18:08:01

Having this access at runtime is amazing. I figured, what clojure vars have been deprecated? Why not ask.
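One way such a query could look, as a sketch; the helper name vars-with-meta is an assumption, not the code that was actually run.

```clojure
(defn vars-with-meta
  "Returns the interned vars of all namespaces in the runtime whose
  metadata has a truthy value under key k."
  [k]
  (for [n     (all-ns)
        [_ v] (ns-interns n)
        :when (get (meta v) k)]
    v))

;; Print each deprecated var together with the version recorded in its metadata.
(doseq [v (sort-by str (vars-with-meta :deprecated))]
  (println v (:deprecated (meta v))))
```

The same helper works for application-specific metadata keys, which was the original goal.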

👍 6
Alex Miller (Clojure team)18:08:31

Good old replicate :)

😄 2
flowthing19:08:28

Also (loaded-libs), although I can’t rightly say what the difference is.

👀 2
teodorlu20:08:49

Interesting. (all-ns) returns namespaces, (loaded-libs) returns symbols. Somehow (all-ns) returns cheshire.core on my project, but it’s not present in the list returned from (loaded-libs). But (loaded-libs) still returns more libraries. Personal hypothesis: (loaded-libs) returns only things that have been required, ie is already in memory. But the count of loaded-libs is still higher than the count of all-ns. :thinking_face:

Alex Miller (Clojure team)21:08:26

the sources of data are different for these. all-ns will list all namespaces created in the runtime. loaded-libs is everything that has been loaded. Files can be loaded that don't necessarily create a namespace (via ns).
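A small sketch of how the two views can diverge. It shows the opposite direction from the one described above: a namespace created directly, without loading any lib. The namespace name is invented for the example.

```clojure
;; Create a namespace object without requiring or loading anything.
(create-ns 'scratch.only-a-namespace)

(def ns-names (set (map ns-name (all-ns))))

(println (contains? ns-names 'scratch.only-a-namespace))       ; true
(println (contains? (loaded-libs) 'scratch.only-a-namespace))  ; false
```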

👍 2
💯 2
dpsutton18:08:09

does anyone know of a java embedding solution? I see word2vec but it’s quite heavy and uses a strange version of a pom so it’s difficult to use the whole thing. Wondering if anyone knows of some easy examples

respatialized19:08:17

https://github.com/stanfordnlp/CoreNLP/blob/main/src/edu/stanford/nlp/neural/Embedding.java CoreNLP supports embedding, but its license (GPLv3) may not be compatible with your intended use case

jjttjj21:08:37

I've used https://github.com/jelmerk/hnswlib: [might be a small bug or two, I quickly extracted this from some larger code] EDIT: ah, I immediately see that you meant getting the actual embeddings, not a db/store for them. I need to read better

(import [com.github.jelmerk.knn DistanceFunctions Item ProgressListener]
        [com.github.jelmerk.knn.hnsw HnswIndex]
        [com.github.jelmerk.knn.util VectorUtils])



(defrecord Word [word float-array md5]
  Item
  (id [this] word)
  (vector [this] float-array)
  (dimensions [this] (count float-array)))

(defn ->word [text embeddings]
  (map->Word
    {:word        text
     :float-array embeddings}))


(defn new-index [num-dimensions num-items]
  (->
    (HnswIndex/newBuilder num-dimensions DistanceFunctions/FLOAT_COSINE_DISTANCE num-items)
    (.withM 36)
    (.withEf 400)
    (.withEfConstruction 400)
    .build))

(defn add-to-index [index word]
  (.add index word))

(defn build-index [texts emb-fn]
  (let [words (mapv ->word texts (map emb-fn texts))
        index (new-index 1536 (count texts))]
    (doseq [word words]
      (add-to-index index word))
    {:index index}))

(defn from-words [words]
  (let [index (new-index 1536 (count words))]
    (doseq [word words]
      (add-to-index index word))
    {:index index}))

(defn result-data [result]
  (let [i (.item result)]
    {:distance (.distance result)
     :item     {:id     (.id i)
                :vector (.vector i)}}))

(defn -nearest [{:keys [index]} embeddings k]
  (->> (.findNearest index embeddings k)
       (map result-data)))

jaide22:08:50

If Clojure were invented now, would it contain more popular features like pattern-matching, or does that work against Clojure's goals and/or host platform constraints?

jaide22:08:20

Awesome! Thanks

dpsutton22:08:48

> Type errors, pattern matching errors and refactoring tools are venerated for facilitating change instead of being recognized as underscoring (and perhaps fostering) the coupling and brittleness in a system.
from https://dl.acm.org/doi/pdf/10.1145/3386321

seancorfield22:08:11

@U8WFYMFRU Pattern-matching isn't a new feature (it dates back many decades) but I'm genuinely curious why you think it might be "more popular" now than, say, fifteen years ago?

jaide22:08:44

I understood it wasn’t new, but it seems like it’s common in more recent languages like Elixir, Rust, and ReScript. The languages I’ve primarily worked in, like Python, Ruby, JS, and now TS, don’t have it, so I don’t really use it. However, in TS’ case I can see where it would be useful. There is something tedious about writing if (typeof arg === "string") { … } every time you have a union type. That then made me wonder why it was not included in Clojure.

jaide22:08:35

Fennel has some pattern matching capabilities, but have not really needed it much.

seancorfield23:08:28

Ruby has structural pattern matching, yes? I thought Python did too but it looks like that's new in 3.10. Scala, Swift, C#, F#, Prolog, Haskell (and nearly all its predecessors back into the '70s) all have it. I'm not sure I'd agree that it's becoming more common but perhaps it's just hyped more these days... Interesting observation/position tho'... thanks.

seancorfield23:08:26

(Forgot that Java got pattern matching in 20 too... 🙂 )

jaide23:08:56

When showing one of our backend Go devs how to write TS, they asked about pattern matching and I had to show them how to settle for that runtime conditional type check. Somewhere along the way it seems that concept is on the radar of a wider spread of developers. Not saying I need it, or it’s better, just like Python and Java getting it seems like there’s a growing number of people asking for it

seancorfield23:08:02

Hmm, yes, Go... Kotlin too... Hahaha... I guess maybe languages without pattern matching are in a minority already!

seancorfield23:08:45

There's always core.match I guess 🙂

seancorfield23:08:04

The joy of Lisp: you can add nearly any language feature yourself...

jaide23:08:34

I’m aware, and that is a super awesome lisp capability, but was curious from a language design perspective why it was not included

jaide23:08:54

The links people shared above were helpful

seancorfield23:08:52

It's nearly always felt like a case statement to me (and in some languages it actually is the case statement or expression) that's a closed set of options, and aside from my FP experience "back in the day", I guess it was always drummed into me that case is a "bad" choice in OOP which I did for two decades...

jaide23:08:26

Multimethods, and capturing that kind of relationship as data in an extensible hash-map, have a lot of advantages imo, as the first link pointed out.
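As a sketch of the open-dispatch idea, here is a multimethod that branches on a :shape key. The shapes and the area function are invented for illustration.

```clojure
;; Dispatch on the value of :shape; new cases can be added from any namespace.
(defmulti area :shape)

(defmethod area :rect [{:keys [w h]}]
  (* w h))

(defmethod area :circle [{:keys [r]}]
  (* Math/PI r r))

;; Fallback for shapes nobody has registered.
(defmethod area :default [shape]
  (throw (ex-info "Unknown shape" {:shape shape})))

(println (area {:shape :rect :w 2 :h 3}))  ; 6
```

Unlike a closed match or case expression, adding a new shape means adding a defmethod rather than editing the existing branches.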

seancorfield23:08:27

...I wonder if there will be a backlash against pattern matching at some point?

jaide23:08:19

Yeah case and cond have covered my similar use-cases so far

jaide23:08:30

I keep waiting for the greater programming community to start realizing OOP is not a great default approach to most application dev problems but I just had to fight over a PR that implemented a Signer and Verifier class when a simple sign and verify function would do 😔

seancorfield23:08:50

I am reminded of that every time I review a PR at work for our frontend JS team 🙂

jaide23:08:34

Oof that’s rough. In this case, our frontend team mostly comes from a FP background or motivation without even enforcing it, have been using it as a default given the direction react and remix have headed. It was a Go dev from the backend team that created a frontend PR that was designed that way. Amusingly when I pushed back that it could be simpler, they lectured me about how our team should design programs given how common and widely adopted OOP is. Even so, the PR didn’t match our codebase conventions, and didn’t have a good enough reason to break them.

vemv23:08:25

> https://gist.github.com/reborg/dc8b0c96c397a56668905e2767fd697f#why-no-pattern-matching
I tried to follow this guideline for my first few years of Clojure, but I quickly found out that most pros are happily using cond, which is the worst of both worlds IMO 🙃 At some point I stopped caring about this particular point, sadly

jaide00:08:21

What makes you say that?

vemv00:08:01

cond couples every condition with the other one, you cannot understand them independently

jaide00:08:11

Trying to follow. You're saying you have to understand every conditional test to understand any of them?

vemv00:08:48

yes, in a cond, a condition is not independent, it depends on where it is placed, so in a way, understanding one means understanding all of them, which isn't exactly scalable in terms of maintainability

jaide00:08:27

(defn normalize-filename
  [filename basename]
  (cond (and filename (s/includes? filename ".")) filename
        filename (str filename (.extname path basename))
        :else basename))
Found an example of where I used it, how would you solve that instead?

Joshua Suskalo00:08:15

My experience so far in the code I've seen in production at my company at least is that cond isn't generally used for branching on structure of data, which is the main argument I see in that link above. Mostly cond is used for actual fixed, closed sets of conditions which are independent of the structure of the argument.

Joshua Suskalo00:08:06

A common example I've seen of cond is to use it as a chain of argument validation in an http request handler. I think cond is a reasonable choice in a place like that, you have a logical sequence of things to check which each result in a different http code returned. It's not based on structural types of your input except for a general validity check.

Alex Miller (Clojure team)00:08:37

My understanding hearing Rich talk about pattern matching is that he did look seriously at it (certainly he was familiar with Erlang) but did not think the performance tradeoffs were worth it. Can’t say I’ve heard him say anything has changed his mind.

👍 2
jaide00:08:16

Interesting! Thanks for the insight

vemv00:08:35

Probably normalize-filename couples validation with a fix for it. Something like this would seem more decoupled:

(defn valid-filename [filename]
  (when (s/includes? filename ".")
    filename))

(defn process-filename [filename basename]
  (or (valid-filename filename)
      (valid-filename (str filename (.extname path basename)))
      (valid-filename basename)))
Worth noting that in your example, order is intrinsic to the logic. cond, or, reduce are all "sequential" so the choice doesn't matter as much. As I see it, (or ,,,) makes this ordering relationship very evident. With multimethods (or hashmaps), one can see how the author was very explicit about making things order-independent. With cond it could be anything. One may not even know what the original author intended.

dpsutton00:08:17

I’ve had terrible experiences using or as control flow like that when false ends up being a value that flows through. I much prefer a cond over the fall through usually
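A minimal sketch of that pitfall, using a hypothetical :enabled? option that defaults to true.

```clojure
;; With or, an explicit false is indistinguishable from a missing value,
;; so it falls through to the default.
(defn enabled-with-or [opts]
  (or (:enabled? opts) true))

;; With cond, presence of the key is checked separately from its value.
(defn enabled-with-cond [opts]
  (cond
    (contains? opts :enabled?) (:enabled? opts)
    :else                      true))

(println (enabled-with-or   {:enabled? false}))  ; true, the false was swallowed
(println (enabled-with-cond {:enabled? false}))  ; false
```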

vemv00:08:04

It depends, in codebases where most defns have some sort of spec/schema behind, one is more strict about return values, enabling more techniques to be used. (A good example being predicates: I always make sure that they return a boolean)

seancorfield00:08:14

And your "decoupled" code isn't even equivalent to the original...

vemv00:08:13

It's 2am here so I'll leave you to miss the forest for the trees 😴

seancorfield00:08:12

You took simple, easy-to-read, correct code and turned it into more complex, harder-to-read, incorrect code. Not entirely sure you're making the point you think you are 🙂

seancorfield00:08:53

I'm with @U11BV7MTK that using or for control flow is often a case of being far too "clever" for your own good...

dpsutton00:08:09

Especially if there’s some condition that explicitly rejects and then it falls through checking the next condition

dpsutton00:08:06

I’m not saying never use an or. But I’ve been burned before and I always keep that in mind when I see that construct

jaide01:08:46

Perhaps interestingly, fennel does not have cond; its (if…) form accepts multiple condition & value pairs, where the last arg is the default used when everything above failed. In my mind, cond communicates an elseif from imperative languages. To that end, I can see where if you have a lot of different kinds of checks across many different branches that it becomes difficult to parse those and keep that in your head

jaide01:08:42

In that case, I’d probably work towards something more data driven like:

{:intention :option-a :payload {:foo "bar"}}
Then can use case against the intention field. But that assumes you can encode the intention from the start, so there may be times where that’s not viable.
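A sketch of what case over such an intention field might look like. The handler name, the second option, and the returned strings are assumptions for illustration.

```clojure
(defn handle
  "Branches on the :intention key of a message map."
  [{:keys [intention payload]}]
  (case intention
    :option-a (str "A got " (:foo payload))
    :option-b (str "B got " (:foo payload))
    ;; No match: fail loudly instead of silently returning nil.
    (throw (ex-info "Unknown intention" {:intention intention}))))

(println (handle {:intention :option-a :payload {:foo "bar"}}))  ; A got bar
```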

Darrick Wiebe04:08:53

core.match was mentioned and is indeed very good if it covers your use case. For a solution that is more oriented toward expressiveness and flexibility, you can also try out #C05BLM27PUM.

Ben Sless04:08:20

(Not flaming just interested in counter arguments)

Terje Dahl10:08:54

Seems to me condp pretty much allows you to do pattern matching. You just need to provide the predicate function yourself, be it a standard function or some amazing new thing you just created. Obviously it isn't "compile time" like case, but you would get the readability and ease-of-use you might want for some set of conditions.
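Two small condp sketches along those lines; the function names and categories are invented for the example. condp calls the given predicate with each test expression and the value, in that order.

```clojure
;; Each clause is a predicate; #(%1 %2) applies it to x.
(defn classify [x]
  (condp #(%1 %2) x
    string?  :string
    number?  :number
    keyword? :keyword
    :other))

;; Each clause is a regex; re-find returns nil when there is no match.
(defn token-kind [s]
  (condp re-find s
    #"^\d+$"    :digits
    #"^[a-z]+$" :letters
    :mixed))

(println (classify "hi") (classify 42) (classify nil))  ; :string :number :other
(println (token-kind "123") (token-kind "abc") (token-kind "a1"))  ; :digits :letters :mixed
```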

✔️ 2
Ed10:08:50

seeing as this discussion was sparked by a conversation about typescript pattern matching, it might be worth pointing out that there are several options for pattern matching in ts, such as https://github.com/gvergnaud/ts-pattern

Mario G11:08:34

I feel like case / cond / condp (for easiest cases / quick fixes) and multimethods (for when things get serious) is a rich menu list to pick from, for those scenarios where pattern matching would be a fit in other languages. So I don't miss/need pattern matching in Clojure, I'm more likely to miss multimethods in other languages 🙂

jaide14:08:01

@U03TNNMS3N1 yeah I’ve seen that, but mixed feelings on it