#data-science
2022-11-13
Niklas Heer21:11:03

Hi 👋 I’m trying to switch one of my silly projects, which compares the speed of different programming languages by calculating Pi through the Leibniz formula, from Python and Crystal to Clojure for all the result processing and data visualisation. The data processing part was straightforward, but for the data visualisation I’m a little overwhelmed by the choice of tools. In Python I’m using pandas (basically only for DataFrames) and seaborn to generate a barplot as a .png. What would be the recommended tools to achieve a similar result with Clojure? I’ve seen Hanami mentioned, but I don’t really need interactive charts. (here is an example image of what I’m trying to generate)

genmeblog09:11:06

Generally, most charting libraries rely on Vega/Vega-Lite, so you can get exactly what is possible in Vega. To get offline charts you can reach for https://github.com/applied-science/darkstar or https://github.com/generateme/cljplot
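
Since the libraries above consume Vega/Vega-Lite specs, a bar chart comes down to producing a small JSON document. The following is only a sketch in Java (the other JVM language that appears in this thread) of what such a spec could look like. The class name, the field names `label` and `mean_s`, and the choice of data (the two hyperfine means quoted later in this thread) are assumptions for illustration. Rendering the spec to an image is left to the tools mentioned above and is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

final class BarChartSpec {
    // Build a minimal Vega-Lite bar chart spec from label -> seconds pairs.
    // Assumes labels contain no characters that need JSON escaping.
    static String vegaLiteBarSpec(Map<String, Double> meanSeconds) {
        StringJoiner values = new StringJoiner(", ", "[", "]");
        for (Map.Entry<String, Double> e : meanSeconds.entrySet()) {
            values.add(String.format(Locale.ROOT,
                "{\"label\": \"%s\", \"mean_s\": %.4f}", e.getKey(), e.getValue()));
        }
        return "{\n"
            + "  \"$schema\": \"https://vega.github.io/schema/vega-lite/v5.json\",\n"
            + "  \"data\": {\"values\": " + values + "},\n"
            + "  \"mark\": \"bar\",\n"
            + "  \"encoding\": {\n"
            + "    \"x\": {\"field\": \"label\", \"type\": \"nominal\"},\n"
            + "    \"y\": {\"field\": \"mean_s\", \"type\": \"quantitative\"}\n"
            + "  }\n"
            + "}";
    }

    public static void main(String[] args) {
        Map<String, Double> means = new LinkedHashMap<>();
        // Hyperfine means (seconds) for Clojure, as quoted later in this thread
        means.put("Clojure before hint", 1.4111583370133332);
        means.put("Clojure after hint", 0.8219248784400001);
        System.out.println(vegaLiteBarSpec(means));
    }
}
```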

genmeblog10:11:16

btw. can you share your Clojure code generating PI? I'm pretty sure it can be done as fast as in Java 🙂

Daniel Gerson10:11:16

I'm using Echarts in https://github.com/scicloj/clay & Clerk. Clay's API is changing, so I'm not quite sure of its current state. Echarts is just incredible. https://echarts.apache.org/en/index.html

Rupert (All Street)11:11:36

Agree with @genmeblog - Clojure should be much faster. @Niklas Heer do you allow for JIT warm-up before timing?

Rupert (All Street)11:11:01

The poor performance appears to be coming from JVM cold start + JIT cold start being part of the timing

genmeblog11:11:34

I've made some tests and here is the small hint (and surprise discovery).

Rupert (All Street)11:11:42

One solution to JVM cold start + JIT cold start would be to have it run for much longer, e.g. make the languages run for at least 60 seconds each. The cold starts would still be there, but they would be less dominant in the results.
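
To make the warm-up idea concrete, here is a rough Java sketch that calls the function a number of times before starting the clock, so the timed calls are more likely to run JIT-compiled code. The class name, method names and call counts are assumptions for illustration. This is not how the speed-comparison project measures, and a serious measurement would use a benchmarking library such as criterium, which is mentioned further down.

```java
final class WarmupTiming {
    // Leibniz series with the same loop shape as the code discussed in this thread
    static double calcPi(long rounds) {
        double x = 1.0;
        double pi = 1.0;
        for (long i = 2; i < rounds + 2; i++) {
            x = -x;
            pi += x / (2 * i - 1);
        }
        return 4.0 * pi;
    }

    // Mean wall-clock nanoseconds per call, measured only after the warm-up calls
    static double meanNanos(long rounds, int warmupCalls, int timedCalls) {
        double sink = 0.0; // accumulate results so the calls cannot be optimised away
        for (int k = 0; k < warmupCalls; k++) {
            sink += calcPi(rounds);
        }
        long start = System.nanoTime();
        for (int k = 0; k < timedCalls; k++) {
            sink += calcPi(rounds);
        }
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            throw new IllegalStateException("unexpected NaN");
        }
        return (double) elapsed / timedCalls;
    }

    public static void main(String[] args) {
        // Arbitrary illustrative sizes: 100 warm-up calls, then 100 timed calls
        double nanos = meanNanos(1_000_000L, 100, 100);
        System.out.printf(java.util.Locale.ROOT, "mean per call: %.3f ms%n", nanos / 1e6);
    }
}
```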

genmeblog11:11:02

On Java 18, it's enough to add a type hint for rounds + change = to == (numerical) to get a 10x boost. EDIT: the change to == doesn't matter.

genmeblog11:11:34

But (surprise!) it's not true on Java 11. Java 11 slows down around 2x.

genmeblog11:11:08

(verified with criterium)

Rupert (All Street)11:11:52

What's the type hint for rounds? I guess we should try to hint these as primitives/unboxed integers.

genmeblog11:11:48

(defn calc-pi-leibniz2 
  "Translation of Java solution to Clojure"
  [^long rounds]
  (let [end (+ 2 rounds)]
    (loop [i (long 2) x 1.0 pi 1.0]
      (if (== i end)
        (* 4.0 pi)
        (recur (inc i) (- x) (- pi (/ x (dec (* 2 i)))))))))

Rupert (All Street)11:11:04

int might be quicker if we don't need the extra length. We probably do need it though

genmeblog11:11:05

public static Object invokeStatic(final long rounds) {
        final long end = 2L + rounds;
        long i = 2L;
        double x = 1.0;
        double pi = 1.0;
        while (i != end) {
            final long n = i + 1L;
            final double n2 = -x;
            pi -= Numbers.divide(x, 2L * i - 1L);
            x = n2;
            i = n;
        }
        return Numbers.unchecked_multiply(4.0, pi);
    }

Rupert (All Street)11:11:16

How did you generate that?

genmeblog11:11:11

You can't use int as a type hint for function arguments

genmeblog11:11:02

The Java for the original function looks like:

public static Object invokeStatic(final Object rounds) {
        final Object end = Numbers.unchecked_add(2L, rounds);
        long i = 2L;
        double x = 1.0;
        double pi = 1.0;
        while (!Util.equiv(i, end)) {
            final double x2 = -x;
            final long n = i + 1L;
            final double n2 = x2;
            pi += Numbers.divide(x2, 2L * i - 1L);
            x = n2;
            i = n;
        }
        return Numbers.unchecked_multiply(4L, pi);
    }
The difference is in the Object end and the equiv call. Java 11 does some magic here.
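
The two decompiled loops can be imitated in plain Java to see the difference in isolation. This is only a sketch: `Object.equals` with autoboxing stands in for Clojure's `Util.equiv`, which is not the same code path, and the class and method names are made up for illustration. Both variants compute the same value. Only the type of the loop bound and the comparison differ.

```java
final class BoxedVsPrimitive {
    // Mirrors the un-hinted version: the loop bound is an Object and every
    // comparison goes through a method call on a boxed value.
    static double piBoxedEnd(Object rounds) {
        Object end = ((Number) rounds).longValue() + 2L; // boxed to a Long
        long i = 2L;
        double x = 1.0;
        double pi = 1.0;
        while (!end.equals(i)) { // i is autoboxed for each comparison
            double x2 = -x;
            pi += x2 / (2L * i - 1L);
            x = x2;
            i++;
        }
        return 4.0 * pi;
    }

    // Mirrors the ^long-hinted version: primitive bound, primitive comparison.
    static double piPrimitiveEnd(long rounds) {
        long end = 2L + rounds;
        long i = 2L;
        double x = 1.0;
        double pi = 1.0;
        while (i != end) {
            pi -= x / (2L * i - 1L);
            x = -x;
            i++;
        }
        return 4.0 * pi;
    }

    public static void main(String[] args) {
        System.out.println(piBoxedEnd(1_000_000L));
        System.out.println(piPrimitiveEnd(1_000_000L));
    }
}
```

Which variant is faster depends on the JVM version, as the timings in this thread show, so it is worth measuring both rather than assuming.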

Rupert (All Street)11:11:39

Maybe try binding rounds to a new local that is a long, e.g. (let [num-rounds ^long (+ 0 (long rounds))] ...), since rounds is an Object

genmeblog11:11:22

It will work for sure. The main issue (kind of unexpected) is a difference between Java 11 and 18. Java 11 works faster with Object end and equiv and slower with long end and primitive comparison.

genmeblog11:11:20

My times were:

Java 11
calc-pi-leibniz  "Elapsed time: 3.457612 msecs"
calc-pi-leibniz2 "Elapsed time: 4.133226 msecs"

Java 18
calc-pi-leibniz  "Elapsed time: 10.807806 msecs"
calc-pi-leibniz2 "Elapsed time: 0.942277 msecs"

Rupert (All Street)11:11:22

I think the primitive is the way to go - performance isn't that much worse on Java 11 and much better on Java 18. It also feels conceptually that it should be faster 🙂 .

Niklas Heer12:11:35

@genmeblog thank you for sharing the tools, I'll have a look 🙂

Niklas Heer12:11:34

For your questions around https://github.com/niklas-heer/speed-comparison:
• Regarding cold starts: I'm testing the performance with https://github.com/sharkdp/hyperfine. It has a warm-up option, but since the command to run the program is executed per run, it is still booting up the runtime. The warm-up option just optimizes cached files which are used during the execution. (see https://github.com/niklas-heer/speed-comparison/blob/master/Earthfile#L27)
• There is an issue (https://github.com/niklas-heer/speed-comparison/issues/59) about measuring without the cold start, but that isn't that easy to implement.
• For Clojure I'm using the https://hub.docker.com/_/clojure:temurin-19-tools-deps-alpine image, so it's Temurin 19 (https://adoptium.net/temurin/releases/?version=19).
• You can also run the code for just Clojure locally with earthly --config earthly-config.yml +clj if you clone the repo. (see https://github.com/niklas-heer/speed-comparison/blob/master/Earthfile#L114)
Generally I'm very open to pull requests to improve the speed 🙂
PS: At https://niklas-heer.github.io/speed-comparison/#raw-results you can find the raw results in table form.

genmeblog12:11:39

The type hint changes this:

{
  "Language": "Clojure",
  "Version": "1.11.1.1189",
  "Command": "clj leibniz.clj",
  "CalculatedPi": "3.141592663589326",
  "Accuracy": 8.497170166376472,
  "Mean": "1.4111583370133332s",
  "Stddev": "0.023889375761580768s",
  "UserTime": "2.7031969466666665s",
  "SystemTime": "0.37826133333333334s",
  "Median": "1.40588699668s",
  "Min": "1.39034487068s",
  "Max": "1.43724314368s",
  "TimesPerRun": [
    1.39034487068,
    1.40588699668,
    1.43724314368
  ],
  "ExitCodesPerRun": [
    0,
    0,
    0
  ]
}
Into this:
{
  "Language": "Clojure",
  "Version": "1.11.1.1189",
  "Command": "clj leibniz.clj",
  "CalculatedPi": "3.141592663589326",
  "Accuracy": 8.497170166376472,
  "Mean": "0.8219248784400001s",
  "Stddev": "0.01017200819926966s",
  "UserTime": "1.9248804666666668s",
  "SystemTime": "0.2618478866666667s",
  "Median": "0.8221420704400001s",
  "Min": "0.8116460134400001s",
  "Max": "0.8319865514400001s",
  "TimesPerRun": [
    0.8221420704400001,
    0.8116460134400001,
    0.8319865514400001
  ],
  "ExitCodesPerRun": [
    0,
    0,
    0
  ]
}
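
As a quick sanity check on the numbers, the sketch below computes the speed-up ratios from figures quoted in this thread: the two hyperfine means above, and the Java 18 "Elapsed time" figures reported earlier. The class and method names are illustrative. The two pairs were measured on different setups, so the comparison is only indicative.

```java
import java.util.Locale;

final class SpeedupCheck {
    // Speed-up factor: how many times faster "after" is than "before"
    static double ratio(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        // Hyperfine means in seconds: whole process, including JVM start-up
        double endToEnd = ratio(1.4111583370133332, 0.8219248784400001);
        // Java 18 "Elapsed time" figures in milliseconds: function call only
        double inProcess = ratio(10.807806, 0.942277);
        System.out.printf(Locale.ROOT, "end-to-end speed-up: %.2fx%n", endToEnd);
        System.out.printf(Locale.ROOT, "in-process speed-up: %.2fx%n", inProcess);
    }
}
```

The end-to-end ratio comes out far smaller than the in-process one, which is consistent with the earlier remark that JVM and JIT cold start are part of what hyperfine times.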

Rupert (All Street)12:11:59

@Niklas Heer thanks for the info. For most languages min/max numbers are highly consistent, so hyperfine doesn't appear to be helping that much. I like the suggestion in https://github.com/niklas-heer/speed-comparison/issues/59 - but this won't totally remove JIT warm-up time either. Another option would be to optionally time inside of each language and print it with the result.
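
The "optionally time inside of each language" idea could look roughly like the Java sketch below. The `--time` flag and the `elapsed_ns=` output line are hypothetical conventions invented for this example and are not part of the speed-comparison project. A program that does not print the timing line would simply fall back to the existing external measurement.

```java
import java.util.ArrayList;
import java.util.List;

final class SelfTimedLeibniz {
    static double calcPi(long rounds) {
        double x = 1.0;
        double pi = 1.0;
        for (long i = 2; i < rounds + 2; i++) {
            x = -x;
            pi += x / (2 * i - 1);
        }
        return 4.0 * pi;
    }

    // Returns the lines the program prints: the result, and optionally
    // the time spent in the computation itself (excluding JVM start-up).
    static List<String> run(long rounds, boolean reportTiming) {
        long start = System.nanoTime();
        double pi = calcPi(rounds);
        long elapsedNanos = System.nanoTime() - start;
        List<String> lines = new ArrayList<>();
        lines.add(Double.toString(pi));
        if (reportTiming) {
            lines.add("elapsed_ns=" + elapsedNanos); // hypothetical output format
        }
        return lines;
    }

    public static void main(String[] args) {
        // "--time" is a made-up flag for this sketch
        boolean reportTiming = args.length > 0 && args[0].equals("--time");
        for (String line : run(1_000_000L, reportTiming)) {
            System.out.println(line);
        }
    }
}
```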

Niklas Heer15:11:04

@Rupert (All Street) that was also suggested, but that isn't practical. The project is too far gone. I couldn't implement that for all the different languages. I think the suggestion in the issue is the most pragmatic solution which can be done without changing all the programs.

Rupert (All Street)15:11:50

Yup, that's fair enough, no point adding extra work if it's not practical. I was thinking the timing could be optional: the programs can print their recorded time to the command line or not (in which case the current timing logic applies). If going with the suggestion in https://github.com/niklas-heer/speed-comparison/issues/59, then it might be worth subtracting a run with rounds = 1000 so that some JIT warm-up time is also eliminated.
