clojure 2021-04-03 | Slack Archive

Hi! Working my way through Java interop and method overloading. I have this Java class:

package pets;
import java.util.Arrays;
public class Pets {
    public interface Pet { }
    public static class Dog implements Pet { }
    public static class Cat implements Pet { }
    public static String isA(Pet p) { return "pet"; }
    public static String isA(Dog d) { return "dog"; }
    public static String isA(Cat c) { return "cat"; }
    public static void main(String[] args) {
        System.out.println(isA(new Dog()));
        System.out.println(isA(new Cat()));
        for (Pet x : Arrays.asList(new Dog(), new Cat()))
            System.out.println(isA(x));
    }
}

and I get this as expected: overload-selection via type Pet at compile-time.

dog
cat
pet
pet

In Clojure we can do better:

(import (pets Pets Pets$Dog Pets$Cat))
(println (Pets/isA (Pets$Dog.)))
(println (Pets/isA (Pets$Cat.)))
(doseq [x [(Pets$Dog.) (Pets$Cat.)]]
  (println (Pets/isA x)))

We get this: dynamic overload-selection at runtime.

dog
cat
dog
cat

But when I add this to the Java class:

public static String isA(Object x) { return "thing"; }

I get this:

dog
cat
thing
thing

I'm sure this question must have come up many times: can someone briefly explain why this is so or point to some documentation? I assume that the Clojure compiler makes this happen and I tried to use partial and apply to somehow enforce the dynamic overload-selection but with no success. Using (println (Pets/isA ^pets.Pets$Pet x)) gives me the same as for Java but that's not what I'm looking for.

alasdair14:04:11

I think your doseq is equivalent to:

alasdair14:04:12

for (Object x : Arrays.asList(new Dog(), new Cat())) System.out.println(isA(x));

alasdair14:04:01

and I think (that's two thinks in a row) the Clojure compiler only tries to match on interfaces if it can't find a direct match - your isA(Object x) can be called without any more inspection of type hierarchies so it goes for that
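A minimal Java sketch of the point above, assuming the un-hinted Clojure argument is treated like a value declared as Object. The class name StaticOverloadDemo and the helpers viaObject and viaPet are made up for illustration; only the isA overloads mirror the question.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the isA overloads mirror the ones in the question.
class StaticOverloadDemo {
    interface Pet { }
    static class Dog implements Pet { }
    static class Cat implements Pet { }

    static String isA(Object x) { return "thing"; }
    static String isA(Pet p) { return "pet"; }
    static String isA(Dog d) { return "dog"; }
    static String isA(Cat c) { return "cat"; }

    // javac picks the overload from the declared type of the argument,
    // not from the class of the object it holds at runtime.
    static String viaObject(Object x) { return isA(x); } // always isA(Object)
    static String viaPet(Pet p) { return isA(p); }       // always isA(Pet)

    public static void main(String[] args) {
        List<Pet> pets = Arrays.asList(new Dog(), new Cat());
        for (Pet p : pets) {
            // Prints "thing pet" for both elements.
            System.out.println(viaObject(p) + " " + viaPet(p));
        }
    }
}
```

With the Object overload present, a call site that only knows "this is some Object" has an exact match and never needs to look at the runtime class.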

alasdair14:04:32

maybe post this on the Clojure FAQ site - it's an Alex Miller-ish type of question

thegeez14:04:27

@henrikheine When you provide isA(Object x) in the Java code then the Clojure compiler will at compile time insert a call to that method. If you leave out isA(Object x) then at compile time the Clojure compiler cannot find a matching method isA(Object x) and will insert a reflection call to find a matching isA method at run-time (you can check this by setting (set! *warn-on-reflection* true)). You could invoke the reflector to find and call the most specific match:

(doseq [x [(Pets$Dog.) (Pets$Cat.)]] 
 (println (Pets/isA x))
 (println "REF: " (clojure.lang.Reflector/invokeStaticMethod Pets "isA" ^objects (into-array [x]))))

;;=> thing
;;=> REF: dog
;;=> thing
;;=> REF: cat

Because I'm not sure of the definition of 'most specific match' in all cases I would recommend using instance? checks and calling the method you want with type hints such as (Pets/isA ^pets.Pets$Cat x), as you would with casting in Java code.

👍 3

henrik4208:04:52

Thx. I understand that the compiler opts for isA(Object) when it's there but still wonder why. Is it for performance? It bugs me to have something that breaks when adding a static method. Using instance? would force me to know all types in advance, no? So what I'm looking for is a solution that uses reflection to select the 'most specific' overloaded method like you said. I would probably go the way the Java compiler does it. I would use reflection to find the match and then memoize/dispatch on class. Does that make sense?
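The "find the match by reflection, then memoize on class" idea could look roughly like this in plain Java. This is a sketch under assumptions, not how clojure.lang.Reflector is implemented: the class name CachedOverloadDispatch and the helpers resolve and isADynamic are hypothetical, and "most specific" is taken to mean the one applicable parameter type that is assignable to every other applicable parameter type.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: pick an isA overload from the runtime class, cache per class.
class CachedOverloadDispatch {
    interface Pet { }
    static class Dog implements Pet { }
    static class Cat implements Pet { }

    static String isA(Object x) { return "thing"; }
    static String isA(Pet p) { return "pet"; }
    static String isA(Dog d) { return "dog"; }
    static String isA(Cat c) { return "cat"; }

    private static final Map<Class<?>, Method> CACHE = new ConcurrentHashMap<>();

    // Collect the applicable one-argument static isA overloads, then return the
    // one whose parameter type is a subtype of all the others.
    static Method resolve(Class<?> argType) {
        List<Method> candidates = new ArrayList<>();
        for (Method m : CachedOverloadDispatch.class.getDeclaredMethods()) {
            if (m.getName().equals("isA")
                    && Modifier.isStatic(m.getModifiers())
                    && m.getParameterCount() == 1
                    && m.getParameterTypes()[0].isAssignableFrom(argType)) {
                candidates.add(m);
            }
        }
        for (Method best : candidates) {
            boolean mostSpecific = true;
            for (Method other : candidates) {
                if (!other.getParameterTypes()[0].isAssignableFrom(best.getParameterTypes()[0])) {
                    mostSpecific = false;
                    break;
                }
            }
            if (mostSpecific) return best;
        }
        // Either nothing applies or two unrelated overloads both apply.
        throw new IllegalArgumentException("no unique most specific isA for " + argType);
    }

    // Reflection is only used for lookup on the first call per class;
    // x must be non-null because the cache is keyed on x.getClass().
    static String isADynamic(Object x) {
        Method m = CACHE.computeIfAbsent(x.getClass(), CachedOverloadDispatch::resolve);
        try {
            return (String) m.invoke(null, new Object[] { x });
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isADynamic(new Dog()));
        System.out.println(isADynamic(new Cat()));
        System.out.println(isADynamic("not a pet"));
    }
}
```

The cache removes the repeated lookup, but each call still goes through Method.invoke, so it stays slower than a direct, type-hinted call.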

henrik4209:04:13

Of course the idea is not new: https://groups.google.com/g/clojure-dev/c/X3-CkPrSLM0

henrik4210:04:41

@U06D9RGQM I just realized that your solution does just what I asked for. Great. Thank you. The code can still break when adding more overloaded methods to Pets, but that's due to the ambiguity and will break the Java compiler too.

nilern17:04:27

I always do (set! *warn-on-reflection* true) and add type hints because calling methods reflectively is slow and going through the Reflector makes it even slower. You could use the double dispatch pattern to get non-reflective dynamic dispatch for the argument (see e.g. https://github.com/metosin/jsonista/blob/master/src/clj/jsonista/core.clj#L166-L216). Protocols make it more powerful than in Java but it still is quite a bit of boilerplate.

henrik4206:04:19

In my case I have to deal with static methods and in that case I cannot use a protocol. Yes, reflection makes it slower but I first want to make it correct. And the way I see it the compiler does produce a reflective invocation when isA(Object) is missing, which I think is correct, and then suddenly ignores all types once I add isA(Object), which I would argue is wrong albeit performant. So using the Reflector makes it correct for me. So how can it be made fast? At load/compile-time the compiler knows about all overloaded methods isA(<type>). So it could produce a memoized-cond-on-type-dispatch to non-reflective invocations. That should then be correct and fast.

nilern08:04:56

You can still use a protocol for static method arguments.

nilern08:04:38

(defprotocol PetIsA
  (pet-isa? [pet]))

(extend-protocol PetIsA
  Pets$Dog
  (pet-isa? [dog] (Pets/isA dog)))

nilern08:04:13

Incidentally I am reading Effective Java and it says to avoid overloads with the same arity. It is especially confusing if the argument types are in a subtype relation.

nilern08:04:08

The compiler is not going to help you. But if you really want you can use the Reflector or the underlying Java reflection in your own dispatch-generating macro. But I would change the confusing overloads if possible or else use the protocol trick. Or both.

henrik4220:04:42

Thanks a lot. Cool. I didn't know about the protocol trick. And yes, one should not overload with same arity. Usually people use their IDE to tell them which implementation is invoked ...😆

nilern20:04:11

I learned the protocol variation from those Jsonista sources but double dispatch is a longstanding OO pattern, especially as Visitor
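For reference, a small Java sketch of double dispatch in its Visitor form. It assumes you control the pet classes, which was not the case in the question; the names DoubleDispatchDemo, PetVisitor, accept and IsAVisitor are all hypothetical.

```java
// Hypothetical Visitor sketch: the runtime class of the pet selects the
// visitor method, so no reflection and no same-arity overloads are needed.
class DoubleDispatchDemo {
    interface PetVisitor<R> {
        R visitDog(Dog d);
        R visitCat(Cat c);
    }

    interface Pet {
        <R> R accept(PetVisitor<R> visitor);
    }

    static class Dog implements Pet {
        // First dispatch: ordinary virtual call on the pet.
        // Second dispatch: the pet calls the visitor method for its own type.
        public <R> R accept(PetVisitor<R> visitor) { return visitor.visitDog(this); }
    }

    static class Cat implements Pet {
        public <R> R accept(PetVisitor<R> visitor) { return visitor.visitCat(this); }
    }

    static class IsAVisitor implements PetVisitor<String> {
        public String visitDog(Dog d) { return "dog"; }
        public String visitCat(Cat c) { return "cat"; }
    }

    static String isA(Pet p) {
        return p.accept(new IsAVisitor());
    }

    public static void main(String[] args) {
        Pet[] pets = { new Dog(), new Cat() };
        for (Pet p : pets) {
            System.out.println(isA(p));
        }
    }
}
```

The price is the boilerplate nilern mentions: every new pet type needs an accept method and a new method on the visitor interface.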

mokr15:04:22

Hi, in an effort to improve my coding it would be nice to read some best practices on the use of keywords (plain, namespaced and the auto namespaced). Considering topics like:

• Destructuring (:person/name easily collides with :pet/name, :street/name, …)
• Store and retrieve from DB (typically a string based name)
• Interop with JavaScript/JSON
• Records (no ns support)
• Specs (namespaced keys are automatically checked)
• Library code and collisions
• Domain data vs other data

Maybe it is too much to hope for a resource that takes all of this into account?

To elaborate: While I don’t enforce any hard rules I tend to use namespaced keywords (:person/name) for domain data, auto namespaced keys for “private” data (e.g. when storing in re-frame’s app-db, I store under ::/people to indicate that this key should only be retrieved by the namespace that stored it). I’m not a big fan of the (:require [my.app.foo :as foo]) (::foo/something some-map) syntax as it’s a bit noisy for my taste with all those :: prefixing everything. But I also find myself breaking especially the domain data “rule” when JS/DB interop is involved, and worst case I get an inconsistent codebase with more than one key for a given piece of data.

flowthing15:04:05

There’s some discussion on that topic here: https://ask.clojure.org/index.php/10380/when-to-use-simple-qualified-keywords Doesn’t have an answer for all your questions, though.

mokr15:04:35

Thanks, @U4ZDX466T, that’s a nice resource to add to my list.

jjttjj16:04:47

There's also this other in progress discussion: https://clojureverse.org/t/dont-quite-understand-rules-for-namespacing-keywords/7434?u=jjttjj

mokr17:04:00

That was a nice discussion I’ll keep an eye on. I really struggle with accepting namespace qualified keywords for domain data as a good idea. To me that feels like it introduces some unattractive coupling between namespaces that need to work on domain data, plus a noisy syntax. A kind of “belt and suspenders” reaction to the fear of key conflicts, completely ignoring data lifespan, localisation and the likelihood of conflicts in a given use case. But very experienced developers are proponents of it, so I get the feeling I’m missing something bigger here, apart from the obvious that it won’t ever collide. Possibly we just don’t agree on the trade-offs involved though… 🙂

didibus23:04:16

I think namespaced keywords "appear" strongly encouraged but are not

didibus23:04:43

They're a useful feature to have when you need it, but most of the time you'll be fine with unqualified keywords

didibus23:04:59

Like I answered in that latest ClojureVerse question. If you're going to collocate keywords with a risk of collision, then they are a good way to avoid that, such as Spec's use of them or Datomic. They also allow collocated keywords to still be grouped, which is what Datomic does.

didibus23:04:42

And they're nice too if you want contextual lineage.

didibus23:04:40

But those are the less common use cases. In practice you will most often just have keywords inside maps that represent entities, or as a way to club function inputs/outputs together, and unqualified keywords are the norm

didibus00:04:45

But for domain data I will say I do find them quite nice. But you need to use them like:

#:user{:name "john" :email ""}
{:order/id 1234, :user/name "john", :wallet/balance 456}

Basically you just go :entity-name/property-name

didibus00:04:56

What gets syntactically heavy is when you try to start using namespace aliases and fully package qualified namespaces

marciol19:04:51

Hi @U1S4F3M4M, someone started this same question days ago: https://app.slack.com/client/T03RZGPFR/C03S1KBA2/thread/C03S1KBA2-1616759029.465400

Célio20:04:21

Hi all. I thought I knew how laziness works in functions like map and filter but I was just caught by surprise with this code:

(->> [nil "hello" nil nil]
     (map #(do (println "processing " %) %))
     (filter (comp not nil?))
     first)

When eval’d in the REPL it outputs this:

processing  nil
processing  hello
processing  nil
processing  nil
"hello"

I was expecting it to print only the first and second elements, but instead it printed all of them. What’s going on here? Also, any tips on how to make it stop processing elements after the first element returned by filter?

pavlosmelissinos21:04:53

not really what you asked but fyi you could use some? instead of (comp not nil?) and (some identity) instead of

(filter (comp not nil?))
     first

seancorfield21:04:49

(some some?) -- identity will not "match" false but (comp not nil?) will.

👍 3

seancorfield21:04:02

Also (comp not f) == (complement f)

👍 3

pavlosmelissinos22:04:31

Right, missed the false case, thanks! (some some?) returns true/false instead of the actual item though, which I don't think is the desired behaviour here (on the other hand there's probably no "desired behaviour", since this is just an example anyway)

seancorfield22:04:20

Ah, good point. You can't actually use some if you want false as a match.

👍 3

seancorfield22:04:02

(since some only returns "logical true" values)

seancorfield22:04:59

So we're both wrong, for different reasons 🙂

🤲 3

seancorfield22:04:49

(keep identity coll) will at least return false so (filter some?) and (keep identity) are the same I believe.

❤️ 3

Gang Liang23:12:10

I am late to this thread... laziness in Clojure here works in chunks of 32 elements. Your starting vector is shorter than that, so all of it is processed at once. If you put more than 32 elements there, you will see that map processes the first 32 items and leaves the rest alone.

👍 2

Célio20:04:19

Hah! The lazy seqs produced by map and filter are chunked.

hiredman20:04:32

The lazy seq produced by calling seq on a vector is chunked

hiredman20:04:04

map and filter return chunked seqs if given one

👍 6
