#clr
2023-02-02
bobcalco21:02:43

Struggling with the System.Span<T> struct type (https://learn.microsoft.com/en-us/archive/msdn-magazine/2018/january/csharp-all-about-span-exploring-a-new-net-mainstay) - creating, calling overloaded methods, etc. In a hurry, so any quick pointers appreciated. Otherwise I'll figure it out when I have more time.

bobcalco21:02:18

It does not seem possible to do anything with it:

user ~> (import 'System.Span)
Bad type
Execution error (NullReferenceException) at System.Diagnostics.StackFrame/NameForType (NO_FILE:0).
Object reference not set to an instance of an object.

bobcalco21:02:04

it's implemented as a readonly ref struct, so that's the issue with trying to use it as a CLR class, I guess; but it's quite common in modern libraries, so it definitely needs to be supported.

bobcalco21:02:50

ah |Span`1[T]|

bobcalco21:02:21

but still struggling with correct incantations

dmiller21:02:36

I'm not sure the compiler can handle it. I'll take some time over the weekend to look into it. The problem is the stack allocation. Clojure likes to box structs, and boxing stack-allocated objects is a no-no. I'm just basing it on the recent conversation we had on here about overloads involving Span on System.IO classes. I'll generate some C# code and look at the IL and see if I can figure out what techniques will work. Sigh.

dmiller00:02:11

IronScheme handles it?

bobcalco12:02:51

Yes. When I get to the office in a bit I'll send you the code I was trying to port when I ran into this.

bobcalco13:02:01

@dmiller The following code returns a delegate for interning strings. This is for use with the Sylvan.Data.Csv library, using Ben.StringIntern's InternPool for an implementation optimal for very large datasets:

(define make-intern-pool
  (lambda ()
    (clr-new InternPool)))

(define intern-pool/string-factory
  (lambda (pool)
    (lambda (buffer offset length)
      (let* ((span (clr-new (Span char) buffer))
             (slice (clr-call (Span char) Slice span offset length))
             (result (clr-call (Span char) ToArray slice)))
        (clr-call InternPool (Intern char[]) pool result)))))

The signature of the delegate expected by the CsvDataReaderOptions class is:

public delegate string StringFactory(char[] buffer, int offset, int length);

Here's how I use it:

(define dt-reader/open-with-defaults
  (lambda (file-name)
    (let* ((opts (make-dt-reader-options))
           (string-pool (make-intern-pool))
           (factory (intern-pool/string-factory string-pool)))
      (begin
        (dt-reader-options/delimiter-set! opts (default-delimiter))
        (dt-reader-options/string-factory-set! opts factory)
        (let ((csv (make-dt-reader file-name opts)))
          csv)))))

bobcalco13:02:55

@dmiller Ditto ArraySegment<T> - another generic value type I can't seem to use as an alternative to Span<T>...

bobcalco14:02:40

All that said, the canonical Clojure way to do this is: (->> buffer (drop offset) (take length))

bobcalco14:02:44

I do not however know the performance trade off of using that vs the Span or ArraySegment types.
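On the trade-off question, here is a rough sketch in plain Clojure of what the seq pipeline actually produces. Nothing in it is specific to Span or ArraySegment, and the sample values for `buffer`, `offset`, and `length` are made up for illustration. It measures nothing; it only shows that drop/take yield a lazy sequence of boxed chars rather than a char[], so an API expecting char[] still needs a copy.

```clojure
;; Sketch only: buffer, offset, and length are illustrative values.
(def buffer (char-array "you heard me right"))
(def offset 4)
(def length 5)

;; drop/take walk the array as a lazy sequence of boxed chars;
;; no char[] is produced at this point.
(def lazy-slice (->> buffer (drop offset) (take length)))

;; Copying the slice into a primitive array gives a real char[].
(def array-slice (char-array lazy-slice))

(seq? lazy-slice)        ; a sequence, not an array
(alength array-slice)    ; 5
(apply str array-slice)  ; "heard"
```

So the seq version allocates per element and then copies, whereas a Span or ArraySegment is only a view over the original array.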

dmiller14:02:05

I'll be taking a look this afternoon. Obv need to handle Spans and the like in some way. Do you have any little piece of code that blows up? Is there a way to see what IronScheme generates for it? (I'll also translate into C# and see how it is handled.) I'm going to have to do some serious reading on Span and its implementation details and the restrictions on dealing with stack-allocated objects.

bobcalco14:02:24

well fresh from my repl ArraySegment travails looked like this (wasn't sure how to try to instantiate it properly):

user ~> (import '[System |ArraySegment`1|])
System.ArraySegment`1
user ~> (new |ArraySegment`1[int]|)
Syntax error (TypeNotFoundException) compiling new at (REPL:1:2).
Unable to find type: ArraySegment`1[int]
user ~> (new |ArraySegment`1(int)|)
Syntax error (ParseException) compiling new at (REPL:1:2).
Unable to resolve classname: ArraySegment`1(int)

bobcalco14:02:26

Note in the above snippet I forgot to include the original buffer (char array) and the constructors include one that also takes an offset and length. But adding those didn't help.

user ~> (new |ArraySegment`1[char]| charray offset length)
Syntax error (TypeNotFoundException) compiling new at (REPL:1:2).
Unable to find type: ArraySegment`1[char]
user ~>
where charray was defined as (def charray (chars (char-array "you heard me right")))

bobcalco14:02:25

also while we're at this can you remind me how to implement a delegate using (gen-delegate ...) - it seems to be missing from https://github.com/clojure/clojure-clr/wiki/Defining-types

bobcalco14:02:54

in this case i am trying to implement StringFactory described above

dmiller15:02:11

Generics in ClojureCLR overall have problems, as you know. I hope for a comprehensive solution as we go through ClojureCLR.Next, and I'll backfill whatever I can into the current ClojureCLR. If you just (import '[System |ArraySegment`1|]) you are importing the generic base class. Not helpful. If you (import '[System |ArraySegment`1[System.Int32]|]) it will work, but bizarrely now |ArraySegment`1| will refer to the int version. And you can't do a second import. Need to fix that. At any rate:

user=> (import '[System |ArraySegment`1[System.Char]|])
System.ArraySegment`1[[System.Char, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]
user=>  (def charray (chars (char-array "you heard me right")))
#'user/charray
user=>
user=> (new |ArraySegment`1| charray 2 5)
System.ArraySegment`1[System.Char]
user=> *1
System.ArraySegment`1[System.Char]

bobcalco15:02:19

interesting

bobcalco15:02:03

I imagine type erasure was a feature and not a bug of the JVM as far as Rich was concerned. lol.

dmiller15:02:38

Obv need to fix the wiki re gen-delegate. A few examples:

(gen-delegate EventHandler [sender args]
  (let [c (Double/Parse (.Text tb))]
    (.set_Text f-label (str (+ 32 (* 1.8 c)) " Fahrenheit"))))

(doto (System.Threading.Thread.
        (gen-delegate System.Threading.ThreadStart []
          (binding [*bind-me* :thread-binding]
            (send a (constantly *bind-me*)))
          (await a)))
  (.Start)
  (.Join))

General format:

(gen-delegate delegate-type [argumentlist...] body)

dmiller15:02:28

Generics are a major pain. Nothing in Clojure is designed to deal with them. As you say, type erasure was a blessing.

bobcalco16:02:00

when calling an instance method of a class with tons of overloads, remind me plz the incantation that I can use to ensure it resolves to the right one. In this case I'm trying to call Intern(char[] value). Is the (type-args ...) macro also usable here? And which syntax is preferred - special dot syntax of some kind or what?

bobcalco16:02:41

(.Intern InternPool pool segment) is what I'll be testing shortly in the optimistic supposition it will "just work" but if it doesn't, what should I try? If specifying the type of buffer, which is char[], what's the correct type annotation? ^:char[] ?

dmiller16:02:20

Where is InternPool? (Need to look at the overloads and experiment)

bobcalco16:02:35

most of them involve ReadOnlySpan lol

bobcalco16:02:37

but there is one that takes char[] at line 594

bobcalco16:02:43

that's the one I'm calling

dmiller16:02:01

chars is short-hand for the char array type.

bobcalco16:02:48

probably good to have a table of basic type mappings at the wiki

bobcalco16:02:24

how do I invoke Intern so it knows which overload to resolve to

dmiller16:02:25

I didn't bother because chars and the others are basic Clojure.

dmiller16:02:06

(.Intern x ^chars charray)

bobcalco16:02:15

so the translation from Scheme so far is:

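(defn make-intern-pool-string-factory
  [pool]
  (gen-delegate StringFactory
                [buffer offset length]
                ;; drop/take return a lazy seq, and a ^chars hint does not convert it,
                ;; so copy the slice into a char[] before passing it to Intern.
                (let [segment (char-array (->> buffer (drop offset) (take length)))]
                  (.Intern pool ^chars segment))))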

dmiller16:02:46

That seems reasonable (though I don't know the specifics of StringFactory). What I'm interested in from Scheme: does it handle directly passing a Span? If so, what is the IL that is generated to make that work?

bobcalco19:02:39

is byte[] also ^bytes ?

bobcalco19:02:19

(like char[] is ^chars I mean)

bobcalco20:02:51

When annotating arguments to a defn I'm getting "Only long and double primitives are supported" when I try to use, e.g., ^Boolean or ^int. Is there some place I can find the definitive list of primitive types one can annotate? I am able to annotate with class names except those wrapping primitives. A little confusing why this is the case.

bobcalco20:02:34

LOL, and now: "fns taking primitives support only 4 or fewer args" I'm wrapping a .NET API for crying out loud. LOL.

bobcalco20:02:30

@dmiller is it better to annotate them in the call to the API being called (as opposed to in the args list)?

dmiller22:02:53

Yes on ^bytes. Longs and doubles are the only primitive types allowed for typing defn args. Not my rule. (I'm thinking about ways to break that in ClojureCLR.Next.) Type-annotate or cast them for interop calls. That's really the only way.

dmiller22:02:53

And it's one of the reasons dealing with ref structs (such as Span) is going to be a problem. They are structs, can't avoid boxing when passing them around, and you can't box a ref struct.
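
To make the type-hint rules from the last few messages concrete, here is a minimal sketch in plain Clojure. The function names `scale` and `char-at` are hypothetical, chosen only for illustration. It shows the two kinds of parameter hints that do compile, and coercing an int at the point of use instead of hinting the parameter as ^int.

```clojure
;; Hypothetical helpers, for illustration only.

;; ^long and ^double are the only primitive hints allowed on fn params,
;; and only on fns taking four or fewer args.
(defn scale
  [^long n ^double factor]
  (* n factor))

;; Array hints such as ^chars are ordinary (non-primitive) hints, so they
;; are fine on params; the index is coerced with (int ...) where it is used.
(defn char-at
  [^chars buf i]
  (aget buf (int i)))

(scale 3 1.5)                    ; 4.5
(char-at (char-array "abc") 1)   ; \b
```

Hinting a param as ^int instead produces the "Only long and double primitives are supported" error quoted above, which is why the coercion moves into the body or the interop call.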