#java
2020-03-05
orestis06:03:11

Was surprised to find that my ETL pipeline spends a considerable amount in SSL processing (JDK 8, for now). I don’t control the code that handles the connections (MongoDB and JDBC Postgres) - is there a way to speed things up other than throwing hardware at the problem?

aisamu12:03:57

This is a long shot, but does the flame graph show wait time (e.g. waiting for I/O)? Have you checked the entropy levels on the machine that's running this (especially if it's a remote instance)?

orestis13:03:51

I think that I/O is built into the various function calls. Not sure what entropy levels are about…
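
For context on the entropy question: on Linux the kernel exposes an estimate of its random pool, and a JVM whose `SecureRandom` seeds from a blocking source can stall during TLS handshakes when that pool runs low. This would show up as wait time rather than CPU time in decryption. The sketch below reads that estimate; it assumes a Linux host, the class and method names are made up for illustration, and recent kernels report a constant value, so treat the number as a rough hint only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalInt;

// Hypothetical helper, not part of any library: reads the kernel's entropy estimate.
final class EntropyCheck {
    // Linux-only path; it does not exist on macOS or Windows.
    static final Path ENTROPY_AVAIL = Paths.get("/proc/sys/kernel/random/entropy_avail");

    // Parses the file content (a single integer followed by a newline).
    static OptionalInt parse(String raw) {
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Returns the estimate in bits, or empty when the file is missing or unreadable.
    static OptionalInt read() {
        if (!Files.isReadable(ENTROPY_AVAIL)) {
            return OptionalInt.empty();
        }
        try {
            byte[] bytes = Files.readAllBytes(ENTROPY_AVAIL);
            return parse(new String(bytes, StandardCharsets.UTF_8));
        } catch (IOException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        OptionalInt bits = read();
        if (bits.isPresent()) {
            System.out.println("entropy_avail = " + bits.getAsInt() + " bits");
        } else {
            System.out.println("entropy estimate not available on this system");
        }
    }
}
```

If the pool does turn out to be the problem, a commonly cited workaround is starting the JVM with `-Djava.security.egd=file:/dev/./urandom`; whether that is acceptable depends on your security requirements.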

orestis13:03:03

It might be that I’m also saturating some I/O channel. Need to dig deeper into AWS to see the metrics.

seancorfield06:03:53

@orestis That sounds like your code is creating and tearing down a lot of connections -- perhaps look at ways to have longer-lived connections and more connection pooling?

orestis06:03:08

Sorry, I meant that the JVM is spending quite a lot of time doing SSL operations. I’ll paste the relevant part of the flame graph here

seancorfield06:03:44

Sorry, I don't know how to read that.

orestis06:03:53

This is the mongo-specific part — sun.security.ssl.InputRecord.decrypt is taking half the time.

orestis06:03:15

Width is time spent in a specific function, bottom layers include the top.

orestis06:03:24

Effectively this looks like when I’m loading documents from mongo, 55% of the CPU is spent on doing the BSON decoding, and 45% on the SSL decoding.

seancorfield06:03:08

(this feels like yet more support for our decision to stop using MongoDB 🙂 )

orestis06:03:26

Heh, you’d think — but here’s the relevant part for JDBC 😄

orestis06:03:57

again, sun.security.ssl.AppOutputStream.write is taking like 40% of the time

seancorfield06:03:44

Security is expensive 🙂

orestis06:03:52

(I’m loading documents from Mongo, doing light massage and dumping them into Postgres)

orestis06:03:38

Good thing I decided to profile because I naively thought that the massaging time would be my bottleneck, but turns out it isn’t.

seancorfield06:03:39

I guess I'd ask "Is the process fast enough?" rather than "Where is it spending its time?"

orestis06:03:06

Nope, it’s not 😄

seancorfield06:03:35

Normally, the big speed ups come from algorithmic changes -- and it doesn't seem like you've got much hope of those.

seancorfield06:03:51

So... hardware? Maybe that is your only option?

orestis06:03:12

There might be “free” performance gains by bumping to JDK 11, which I’ll try next. Googling about JVM SSL performance shows that it is a common issue (Netty says use OpenSSL)
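
Before comparing JDK versions or swapping in an OpenSSL-backed provider, it may help to see which TLS provider and cipher suites the running JVM offers by default, since the negotiated suite influences how much CPU the record encryption costs. AES-GCM suites in particular are often reported to be considerably slower on JDK 8 than on later releases. A minimal sketch using only standard `javax.net.ssl` calls follows; the class and method names are illustrative, and the drivers in use may configure their own `SSLContext`, in which case these defaults do not apply.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;

// Hypothetical helper: reports the JVM-wide default TLS provider and cipher suites.
final class TlsDefaults {
    private static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            // Wrapped so that callers do not need to handle a checked exception.
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    // Name of the security provider backing the default context.
    static String providerName() {
        return defaultContext().getProvider().getName();
    }

    // Cipher suites enabled by default, in preference order.
    static String[] defaultCipherSuites() {
        return defaultContext().getDefaultSSLParameters().getCipherSuites();
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("TLS provider = " + providerName());
        for (String suite : defaultCipherSuites()) {
            System.out.println("  " + suite);
        }
    }
}
```

Running this under both JDK 8 and JDK 11 shows whether the default suite ordering differs between the two, which is one thing to rule out before attributing any speedup to the JDK upgrade itself.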

seancorfield06:03:03

We moved from 8 to 11 a while back for nearly all our processes but I can't say we noticed much speedup.

seancorfield06:03:21

(I suspect we're bottlenecked on other stuff than SSL tho')

orestis06:03:35

Yeah this is a special case in that it syncs Mongo to Postgres and has to pretty much pipe the entire database. It wouldn’t be noticeable with normal day-to-day operations.

orestis07:03:26

I’ll try throwing hardware at it and see what changes

orestis07:03:53

Hardware does make a difference, and this is a nicely parallel problem so I think I can live with this for now.

jumar08:03:26

This is interesting. Given that the SSL stuff seems to take at most 50% of the time, I wouldn't hope to get a huge performance gain from optimizing it (Amdahl's law). Did you try to run it without SSL? Does it make a huge difference?

orestis08:03:17

Running without SSL is not an option, so I didn’t even try :) I’m trying to run this on production configurations, so disabling SSL isn’t possible.

orestis08:03:24

The thing is that people claim that OpenSSL is 1 or 2 orders of magnitude faster so essentially I would get twice the speed if SSL becomes insignificant.

jumar09:03:34

Assuming 45% spent on SSL and a 10x performance speedup (which would be really interesting to verify if that's possible) you get at most 1.7x speedup: https://www.google.com/search?q=1+%2F+(+(1+-+0.45)+%2B+0.45%2F10)+)&rlz=1C5CHFA_enCZ836CZ837&oq=1+%2F+(+(1+-+0.45)+%2B+0.45%2F10)+)&aqs=chrome..69i57j6j69i64l2.11187j0j7&sourceid=chrome&ie=UTF-8 (https://en.wikipedia.org/wiki/Amdahl%27s_law) So it depends whether that is enough - I assume if you noticed it is slow, this speedup might not be enough 🙂
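
The arithmetic behind that estimate can be reproduced with a few lines of Java. This is only a sketch of Amdahl's law using the 45% figure from the flame graph discussion above; the class and method names are made up for illustration. With p = 0.45 and a 10x faster SSL path the result is 1 / (0.55 + 0.045), which is about 1.68x, and even with SSL costing nothing the ceiling is 1 / 0.55, about 1.82x.

```java
// Hypothetical helper illustrating Amdahl's law; not taken from any library.
final class AmdahlSketch {
    /**
     * Overall speedup when a fraction p of the runtime (0..1)
     * is made s times faster and the rest is unchanged.
     */
    static double speedup(double p, double s) {
        if (p < 0.0 || p > 1.0 || s <= 0.0) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and s > 0");
        }
        return 1.0 / ((1.0 - p) + p / s);
    }

    public static void main(String[] args) {
        double sslFraction = 0.45; // share of CPU time attributed to SSL in the chat
        System.out.printf("10x faster SSL:  %.2fx overall%n", speedup(sslFraction, 10));
        System.out.printf("100x faster SSL: %.2fx overall%n", speedup(sslFraction, 100));
        // Upper bound: SSL cost drops to zero.
        System.out.printf("SSL for free:    %.2fx overall%n", 1.0 / (1.0 - sslFraction));
    }
}
```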