
So my understanding of AD is in terms of dual numbers, see this section. There's also a good blog post on it in Haskell here. It seems that it's easier to implement if you can overload your operators to work on your newly defined "dual numbers" in addition to doubles and ints and everything. Could this be an argument in favour of static typing, which the Clojure community is against? In this clojure-AD stub you can see that they exclude the imports of the core operators +, -, * etc. and then define their new operators to work on dual numbers


The Haskell implementation looks so clean, I'm quite jealous


now your function maps dual numbers to dual numbers; if you evaluate the function at (x 1) you'll get (f(x) f'(x)) out, so just reading off the second number gives you the derivative
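To make that concrete, here's a minimal forward-mode sketch in Python rather than Haskell or Clojure (all names are mine, not from any library mentioned above): a `Dual` class that overloads `+` and `*`, so evaluating f at (x, 1) yields (f(x), f'(x)).

```python
class Dual:
    """A dual number a + b*eps with eps^2 = 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def derivative(f, x):
    # evaluate f at (x, 1); the eps coefficient of the result is f'(x)
    return f(Dual(x, 1.0)).b


# f(x) = x^2 + 3x, so f'(x) = 2x + 3
print(derivative(lambda x: x * x + 3 * x, 2.0))  # → 7.0
```

The only change to user code is that `f` must accept a `Dual` wherever it previously took a double, which is exactly why overloading (or typeclasses in Haskell) makes this so painless.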


I seem to remember that there isn't much flexibility to change the behaviour of basic operators like + and - in Clojure


That part I get. Also Clojure has such a small language core that a dispatch operator table would be a fine solution
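One way I could imagine such a table working (this is purely a guess at the idea, sketched in Python with duals as plain (value, deriv) pairs; every name here is made up): key a table by operator and argument kind, and dispatch per call instead of overloading methods on a class.

```python
# Hypothetical "dispatch operator table": implementations are looked up by
# (operator, kind of arguments) rather than attached to a Dual class.

def lift(x):
    # promote a plain number to a dual with zero derivative
    return x if isinstance(x, tuple) else (x, 0.0)

def add_plain(a, b):
    return a + b

def add_dual(a, b):
    a, b = lift(a), lift(b)
    return (a[0] + b[0], a[1] + b[1])

def mul_plain(a, b):
    return a * b

def mul_dual(a, b):
    a, b = lift(a), lift(b)
    # product rule on the derivative component
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])

OP_TABLE = {
    ('+', 'plain'): add_plain,
    ('+', 'dual'):  add_dual,
    ('*', 'plain'): mul_plain,
    ('*', 'dual'):  mul_dual,
}

def apply_op(op, a, b):
    kind = 'dual' if isinstance(a, tuple) or isinstance(b, tuple) else 'plain'
    return OP_TABLE[(op, kind)](a, b)

print(apply_op('+', 1.0, 2.0))         # plain + plain
print(apply_op('*', (3.0, 1.0), 2.0))  # dual * plain
```

Since Clojure's core arithmetic is a small, closed set of functions, shadowing `+`, `-`, `*` with table-dispatching versions (as the clojure-AD stub does) covers the whole language surface.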


can you elaborate on the dispatch operator table?


The part I get a bit confused on is the backwards part, and then the tracing through so that you can do logic operators


Still haven’t found time to look at all the docs though, so I’m sure I’m still missing big chunks


It’s on my ever increasing todo list 🙂


I'm not sure how the logic operators work. The example I have in my mind is a piecewise linear function, say (defn abs [x] (if (< x 0) (- x) x)). This is not differentiable at zero


I'm not sure what AD would do here; perhaps what was said earlier is that the logical statement cannot depend on the variable itself
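With a dual-number implementation you can actually see what happens (a self-contained Python sketch of my own, not any of the libraries discussed here): the comparison in the `if` only inspects the primal value, so AD follows whichever branch is taken and happily reports a one-sided derivative, even at 0 where none exists.

```python
class Dual:
    """Minimal dual number (a, b); b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __lt__(self, other):
        # the branch condition sees only the primal value
        return self.a < other

    def __neg__(self):
        return Dual(-self.a, -self.b)


def abs_(x):
    return -x if x < 0 else x


print(abs_(Dual(-2.0, 1.0)).b)  # → -1.0
print(abs_(Dual(3.0, 1.0)).b)   # → 1.0
print(abs_(Dual(0.0, 1.0)).b)   # → 1.0, though abs is not differentiable at 0
```

So the conditional itself poses no mechanical problem; the issue is only that at the kink the answer is a branch-dependent convention, not a mathematical derivative.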


@michaellindon In the docs here, it says that it can handle conditional branches and while loops


but the derivative of abs(x) does not mathematically exist at 0


anything it gave out would be incorrect


I guess not everything would work


@michaellindon: 'not everything is differentiable everywhere' - that's the point that Dragan made a while back. Also, doing this with overloading is not optimal (or method dispatch, which would be even worse). The 'right' way is via program transformation. I believe @sophiago is working on a version which does this.


@jsa-aerial what do you mean by program transformation?


@michaellindon just that - the original code is transformed to calculate the derivatives along with the original calculation. There are quite a few papers on this. Actually, I just checked and I see it is even mentioned in that Wikipedia article:


Doesn't really say much there, though. Another reason why the overloading approach can be very suboptimal is that you will likely be allocating memory at a high rate (and then either manually cleaning up or exercising the GC - either of which will add loads of operations to the basic operation you really want)


@jsa-aerial It's hard for me to imagine how code transformation is different from symbolic differentiation


I can imagine specifying a set of term-rewriting rules which accepts a form and spits out another for computing the derivative, but the line between symbolic differentiation and source-transformation AD seems to blur
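For what it's worth, the term-rewriting picture of plain symbolic differentiation fits in a few lines. A sketch on s-expressions, here encoded as nested Python tuples like ('*', 'x', 'x') to stay runnable (my own toy, supporting only + and *):

```python
def d(expr, var):
    """Rewrite an s-expression into one computing its derivative w.r.t. var."""
    if isinstance(expr, (int, float)):
        return 0                      # constant rule
    if isinstance(expr, str):
        return 1 if expr == var else 0  # variable rule
    op, a, b = expr
    if op == '+':
        return ('+', d(a, var), d(b, var))       # sum rule
    if op == '*':
        # product rule: (a*b)' = a'*b + a*b'
        return ('+', ('*', d(a, var), b), ('*', a, d(b, var)))
    raise ValueError(f"unknown operator: {op}")


print(d(('*', 'x', 'x'), 'x'))
# → ('+', ('*', 1, 'x'), ('*', 'x', 1))
```

The blur you mention is real: source-transformation AD applies essentially these rules, but to whole programs rather than closed-form expressions, so it also has to handle variable bindings, assignments, function calls, and control flow - which is where the "mini compiler" machinery comes in.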


are you able to explain?


'Basically' you have to write a specialized mini-compiler. This will take the original source, generate an AST and symbol table, then you need some phases for attribute synthesis, and finally a phase for code generation. It is a lot more than just a few rules, as for symbolic differentiation. The 'mini-compiler' aspect, though, is an indication of how Lisp languages can shine here - code is data - as for example:


thanks for the link


I think I understand


I can't quite articulate my frustration here


A compiler for the VLAD language? Nobody is using it. Why didn't they create something that serves a larger community, like Racket or Clojure?


Why do you think that nobody is using it?


Certainly somebody is using it, maybe just the authors. If they really were after the widest audience reach, they'd probably choose Java as a target, or .NET, or PHP (the papers are from 2008), I guess...


I think it's just a proof of concept


Well, there are also some C++ libs for this (from the 'Almost Anything is Possible Dept' 😱) if that helps


Even a MATLAB lib


But even that should be ok. They seem to have had a NSF grant, they did the work, someone got the PhD, goals accomplished. I doubt they were in the game to serve Clojure and Racket community for free 🙂


Also, in 2008 Clojure was barely known and the work was probably mostly done in the 2-3 years prior


That's an overloading approach that 'works', but is not exactly at the optimal level...


by the same author


Well 'overloading' using methods...


It's a good example of how memory use can be a killer here. All those recs being allocated...


@gigasquid "Python control flow operations are invisible to Autograd. In fact, it greatly simplifies the implementation". Sounds like Autograd just traces straight through conditionals.
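Presumably "invisible" means something like the following (my own plain-Python sketch with duals as (value, deriv) pairs, not Autograd's actual mechanism): loop and branch conditions are decided by ordinary primal values, so the AD machinery only ever records the arithmetic operations that actually ran.

```python
def f(x):
    """Compute x^3 with an explicit loop; x is a (value, deriv) pair."""
    acc = (1.0, 0.0)
    n = 0
    while n < 3:  # control flow depends only on a plain int - AD never sees it
        # product rule applied to acc * x at each step
        acc = (acc[0] * x[0], acc[0] * x[1] + acc[1] * x[0])
        n += 1
    return acc


print(f((2.0, 1.0)))  # → (8.0, 12.0): x^3 = 8 and 3x^2 = 12 at x = 2
```

The loop itself never needs to be "differentiated"; it just determines which sequence of multiplications gets executed, and the derivative of that straight-line sequence comes along for free.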


@jsa-aerial not optimal? I would say pessimal! 🙂


It seems to use Newton's method of approximation.


Well, we're being diplomatic here...


@blueberry thats just an illustration


That's why the 🙂 is there


I do not mean to imply that the author did wrong or worthless work.


Just that it's not something that leads to any viable solution to the topic at hand.


I did look at clj-auto-diff a fair amount and do think there is a middle ground here. One that isn't full code transformation, but which doesn't need to continually allocate memory, and which can 'resolve' method calls 'statically'. But it occurred to me that if you go that route, may as well go the Full Monty...


I don't know enough about these things to have an opinion yet 🙂