#clr
2023-02-18
nwalkr 20:02:21

I feel like I found two small bugs, but I'm a bit unsure in both cases.

1st: with the socket REPL, disconnecting a client terminates the server process. Steps to reproduce:

server terminal:

> Clojure.Main -e "(do (require 'clojure.core.server) (import '[System.Net IPAddress]) (add-tap println) (clojure.core.server/start-server {:port 4444 :accept 'clojure.core.server/repl :server-daemon false :name :server-repl :address (IPAddress/Parse ""127.0.0.1"")}))"
#object[TcpListener 0x181da58 "System.Net.Sockets.TcpListener"]
client terminal:
> telnet localhost 4444
user=> a^]
Microsoft Telnet> q 
back to server:
Unhandled exception. System.IO.IOException: Unable to write data to the transport connection: An established connection was aborted by the software in your host computer
 ---> System.Net.Sockets.SocketException (10053): An established connection was aborted by the software in your host computer
   at System.Net.Sockets.NetworkStream.Write(ReadOnlySpan`1 buffer)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Write(ReadOnlySpan`1 buffer)
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.IO.StreamWriter.Write(String value)
   at clojure.core$eval8400fn__8405__8409.invoke(Object , Object )
   at clojure.lang.MultiFn.invoke(Object arg1, Object arg2) in C:\work\clojure-clr\Clojure\Clojure\Lib\MultiFn.cs:line 416
   at clojure.core$pr_on__3608.invokeStatic(Object , Object )
   at clojure.core$pr_on__3608.invoke(Object , Object )
   at clojure.core$pr__3616.invokeStatic(Object )
   at clojure.core$pr__3616.invoke(Object )
   at clojure.core$print__3640.invokeStatic(ISeq )
   at clojure.core$print__3640.doInvoke(Object )
   at clojure.main$repl_prompt__18694.invokeStatic()
   at clojure.main$repl_prompt__18694.invoke()
   at clojure.main$repl__18965.invokeStatic(ISeq )
   at clojure.main$repl__18965.doInvoke(Object )
   at clojure.lang.RestFn.invoke(Object arg1, Object arg2, Object arg3, Object arg4) in C:\work\clojure-clr\Clojure\Clojure\Lib\RestFn.cs:line 542
   at clojure.core.server$repl__19416.invokeStatic()
   at clojure.core.server$repl__19416.invoke()
   at clojure.lang.AFn.ApplyToHelper(IFn fn, ISeq argList) in C:\work\clojure-clr\Clojure\Clojure\Lib\AFn.cs:line 185
   at clojure.lang.AFn.applyTo(ISeq arglist) in C:\work\clojure-clr\Clojure\Clojure\Lib\AFn.cs:line 174
   at clojure.lang.Var.applyTo(ISeq arglist) in C:\work\clojure-clr\Clojure\Clojure\Lib\Var.cs:line 1092
   at clojure.core$apply__502.invokeStatic(Object , Object )
   at clojure.core$apply__502.invoke(Object , Object )
   at clojure.core.server$accept_connection__19283.invokeStatic(Object , Object , Object , Object , Object , Object , Object , Object )
   at clojure.core.server$accept_connection__19283.invoke(Object , Object , Object , Object , Object , Object , Object , Object )
   at clojure.core.server$start_serverfn__19299fn__19304fn__19309__19313.invoke()
   at lambda_method76(Closure )
   at System.Threading.Thread.StartCallback()
(process exited)
I think it's caused by *out* trying to auto-flush, or something similar, and it looks like it can be fixed easily in clojure.core.server/accept-connection:
(let [accept-fn (resolve accept)]
        (apply accept-fn args)))
---    (catch SocketException _disconnect)
+++    (catch IOException _disconnect)
    (finally
      (with-lock lock 
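I can't reproduce it on a Linux machine, which makes me think there is some kind of race between a SocketException from the reader and an IOException from the writer. On the JVM SocketException is a subclass of IOException; on .NET it derives from Win32Exception instead, but the failure here is an IOException wrapping the SocketException, so this quick fix should work for this case.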

dmiller 14:02:41

Your diagnosis appears to be correct. The accept-connection code is catching SocketException, but the SocketException is being wrapped by an IOException and so is not caught. The current code matches the Java code, so this is a platform difference. I'm not sure how Linux might play into this other than some difference in .NET on that OS. And your proposed solution does solve the problem. I'll get it out in another alpha release, probably tomorrow. (I have a few other changes ready to go.) I think you have also solved a problem I was running into while working on the port of nREPL. Exactly this error. The report coming out showed the SocketException but not the wrapping IOException, or I just missed it. It wasn't a show-stopper, since there was an encompassing try block catching all exceptions, but it certainly had been bothering me. Thanks!
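
A minimal ClojureCLR sketch of the wrapping described above. It simulates the failure from the trace (socket error 10053 wrapped in an IOException) and reports which catch clause fires. `classify-disconnect` is a hypothetical helper written for illustration, not part of clojure.core.server. Listing both handlers is a conservative assumption: on .NET, SocketException derives from Win32Exception rather than IOException, so neither clause covers the other.

```clojure
(import '[System.IO IOException]
        '[System.Net.Sockets SocketException])

;; Hypothetical helper: runs thunk and reports which handler fired.
(defn classify-disconnect
  [thunk]
  (try
    (thunk)
    :ok
    ;; A SocketException thrown directly, not wrapped in anything.
    (catch SocketException _ :bare-socket-exception)
    ;; The case from the trace: NetworkStream.Write wraps the
    ;; SocketException in an IOException.
    (catch IOException e
      (if (instance? SocketException (.InnerException e))
        :io-wrapping-socket
        :other-io))))

;; Simulate the wrapped failure seen on the server.
(classify-disconnect
  #(throw (IOException. "Unable to write data to the transport connection"
                        (SocketException. (int 10053)))))
```

With only the SocketException clause present, the simulated exception would propagate out of the try, which matches the unhandled exception on the server thread.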

nwalkr 21:02:44

2nd: incorrect class names in the exception trace.

CLR:

user=> (try (/ 1 0) (catch Exception ex ex))
#error {
 :cause "Divide by zero"
 :via
 [{:type System.ArithmeticException
   :message "Divide by zero"
   :at [System.Diagnostics.StackFrame divide "C:\\work\\clojure-clr\\Clojure\\Clojure\\Lib\\Numbers.cs" 1071]}]
 :trace
 [[System.Diagnostics.StackFrame divide "C:\\work\\clojure-clr\\Clojure\\Clojure\\Lib\\Numbers.cs" 1071]
  [System.Diagnostics.StackFrame divide "C:\\work\\clojure-clr\\Clojure\\Clojure\\Lib\\Numbers.cs" 1079]
  [System.Diagnostics.StackFrame invokeStatic "NO_FILE" 0]]}
JVM
user=> (try (/ 1 0) (catch Exception ex ex))
#error {
........
 :trace
 [[clojure.lang.Numbers divide "Numbers.java" 190]
........
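The cause is (symbol (.FullName (.GetType stackframe))) in StackTraceElement->vec. GetType is a misleading name here: it returns the type of the frame object itself (System.Diagnostics.StackFrame), not the class that declares the method. I was able to fetch class names via the method info from the stack frame, but it quickly got a bit more complex. Good case: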
user=> (import '[System.Diagnostics StackTrace StackFrame])
System.Diagnostics.StackFrame
user=> (-> (try (/ 1 0) (catch Exception t t))
    (StackTrace. true)
    (.GetFrames)
    (->>
      (map (fn [^StackFrame frame]
             (when-let [mi (.GetMethod frame)]
               (let [typ (or (.ReflectedType mi) (.DeclaringType mi))]
                 [(some-> typ (.Name)) (.ToString mi)])))))
    (clojure.pprint/pprint))
(["Numbers" "System.Object divide(System.Object, System.Object)"]
 ["Numbers" "System.Object divide(Int64, Int64)"]
 ["user$eval24264fn__24278__24282" "System.Object invoke()"])
And not so good
user=> (-> (try (clojure.lang.Compiler/load "" nil nil) (catch Exception t t))
    (StackTrace. true)
    (.GetFrames)
    (->>
      (map (fn [^StackFrame frame]
             (when-let [mi (.GetMethod frame)]
               (let [typ (or (.ReflectedType mi) (.DeclaringType mi))]
                 [(some-> typ (.Name)) (.ToString mi)])))))
    (clojure.pprint/pprint))
([nil   <-------------- THIS
  "System.Object CallSite.Target(System.Runtime.CompilerServices.Closure, System.Runtime.CompilerServices.CallSite, System.Object, System.Object, System.Object, System.Object)"]
 [nil   <-------------- THIS
  "System.Object CallSite.Target(System.Runtime.CompilerServices.Closure, System.Runtime.CompilerServices.CallSite, System.Object, System.Object, System.Object, System.Object)"]
 ["user$eval24288fn__24302__24306"
  "System.Object __interop_load24308(System.Object, System.Object, System.Object, System.Object)"]
 ["user$eval24288fn__24302__24306" "System.Object invoke()"])
In my own code I just stubbed this case with "UNKNOWN", but I doubt that would be a good decision for a general audience.

dmiller 15:02:19

I'm not sure there is a good solution to this problem. Your code is a good substitute to improve the printout. The nil name problem is probably not solvable. Interop calls often end up being dynamic callsites and it is very hard to get any usable information out of it. The method in the top frame of the stacktrace in your Compiler/load example is an instance of System.Reflection.Emit.DynamicMethod+RTDynamicMethod. It has essentially no useful information in it that I can find. One could just filter out the callsites, I suppose. You'd see the interop_load on the third frame down at least.
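
The filtering idea could be sketched roughly as follows in ClojureCLR. This assumes that frames whose method reports neither a ReflectedType nor a DeclaringType are the dynamic call sites worth dropping. `frame->vec` and `trace-vecs` are hypothetical names used for illustration and are not how StackTraceElement->vec is implemented.

```clojure
(import '[System.Diagnostics StackTrace StackFrame])

;; Hypothetical helper: builds a [class method file line] vector for a frame,
;; or returns nil when no type can be recovered (e.g. a dynamic call site).
(defn frame->vec
  [^StackFrame frame]
  (when-let [mi (.GetMethod frame)]
    (when-let [typ (or (.ReflectedType mi) (.DeclaringType mi))]
      [(symbol (or (.FullName typ) (.Name typ)))
       (symbol (.Name mi))
       (or (.GetFileName frame) "NO_FILE")
       (.GetFileLineNumber frame)])))

;; Hypothetical helper: converts an exception's frames, silently skipping
;; the ones frame->vec cannot describe.
(defn trace-vecs
  [^Exception ex]
  (->> (.GetFrames (StackTrace. ex true))
       (keep frame->vec)
       vec))

(trace-vecs (try (/ 1 0) (catch Exception t t)))
```

Dropping frames hides the fact that a dynamic call happened at all, so replacing them with a placeholder, as in the "UNKNOWN" stub mentioned earlier, is the alternative trade-off.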