#jvm
2023-09-01
RAJKUMAR22:09:38

I have a question related to Java threads

RAJKUMAR22:09:49

We are using the again library https://github.com/liwp/again/tree/master for retries

RAJKUMAR22:09:13

This library internally uses Thread/sleep for delay purposes https://github.com/liwp/again/blob/master/src/again/core.clj#L90-L92

RAJKUMAR22:09:50

The problem I am facing is that once Thread/sleep is invoked, the thread does not wake up, even after it has slept longer than the delay time

RAJKUMAR22:09:04

I took thread dumps over a span of 2 minutes

RAJKUMAR22:09:18

around 5 thread dumps

RAJKUMAR22:09:35

In all of those, the same thread is always sleeping

RAJKUMAR22:09:07

what is the best solution in this case?

hiredman22:09:34

that is retrying

hiredman22:09:43

sleep -> try and fail -> sleep in a loop

hiredman22:09:09

depending on how fast the failures happen, how long the sleeps are, etc., it is unlikely that a thread dump taken at any given moment will catch the thread doing anything other than sleeping

hiredman22:09:27

for example with this program

(defn work []
  (Thread/sleep 10))

(defn backoff-and-retry []
  (Thread/sleep 10000))

(loop []
  (if (zero? (rand-int 5))
    (do
      (work)
      (recur))
    (do
      (backoff-and-retry)
      (recur))))
The thread dump will pretty much always contain the backoff-and-retry function, and not the work function, because of the ratio of the amount of time it spends executing each
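A rough back-of-the-envelope check of that ratio, assuming the numbers in the program above (work takes 10 ms and is chosen one time in five, the backoff takes 10000 ms otherwise). The function name fraction-of-time-in is made up for illustration:

```clojure
(defn fraction-of-time-in
  "Expected share of wall-clock time spent in a branch that is taken with
  probability p and lasts ms milliseconds, when the other branch is taken
  with probability (- 1 p) and lasts other-ms milliseconds."
  [p ms other-ms]
  (let [mine  (* p ms)
        other (* (- 1 p) other-ms)]
    (double (/ mine (+ mine other)))))

;; backoff-and-retry: probability 4/5, 10000 ms
;; work:              probability 1/5, 10 ms
(def backoff-share (fraction-of-time-in 4/5 10000 10))

(println backoff-share)
```

Under those assumptions a thread dump taken at a random moment lands inside the backoff sleep more than 99.9% of the time.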

RAJKUMAR22:09:58

my concern is that even after a couple of hours it is still sleeping

hiredman22:09:14

what makes you think it is the same sleep?

RAJKUMAR22:09:37

this is the retry-strategy

RAJKUMAR22:09:40

{:initial-retry-count            2
 :initial-delay-ms               10
 :exponential-backoff-multiplier 2.0
 :max-delay-ms                   1000}
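A minimal sketch of the delay schedule such a map would produce, assuming the keys mean what their names suggest (start at :initial-delay-ms, multiply by :exponential-backoff-multiplier, cap at :max-delay-ms). This is plain Clojure, not the again library's API, and delays-ms is a hypothetical helper; :initial-retry-count is not used here because its meaning is not clear from the map alone:

```clojure
(def retry-strategy
  {:initial-retry-count            2
   :initial-delay-ms               10
   :exponential-backoff-multiplier 2.0
   :max-delay-ms                   1000})

(defn delays-ms
  "Lazy, infinite sequence of delays: start at :initial-delay-ms, multiply by
  :exponential-backoff-multiplier each step, never exceed :max-delay-ms."
  [{:keys [initial-delay-ms exponential-backoff-multiplier max-delay-ms]}]
  (->> (iterate #(* % exponential-backoff-multiplier) initial-delay-ms)
       (map #(long (min % max-delay-ms)))))

;; First nine delays in milliseconds
(println (take 9 (delays-ms retry-strategy)))
```

If the keys behave this way, no single sleep lasts longer than one second, so a thread seen sleeping in dumps taken minutes apart would be in a different sleep each time.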

hiredman22:09:23

ok, but is that the only retry? what is the behavior of the code that is driving this code

RAJKUMAR22:09:44

the operation is an HSET update in Redis

hiredman22:09:08

sure, but this code with the retry around it doesn't exist in isolation right?

hiredman22:09:14

something else is calling it

hiredman22:09:28

like as a result of web requests?

hiredman22:09:39

how often are web requests coming in?

hiredman22:09:03

are those being served on a threadpool that is re-using the same thread? is the hset always failing quickly?

RAJKUMAR22:09:04

requests are very few in the STG environment

RAJKUMAR22:09:31

like 1k req/hr

hiredman22:09:40

that would be enough

RAJKUMAR22:09:54

then it is idle for the remaining 23 hours of the day

hiredman22:09:39

there could be some bug in the again library, where it sleeps too long I guess

hiredman22:09:54

but Thread/sleep itself is pretty solid, it is used day in and day out by basically every non-trivial jvm program in existence

hiredman22:09:50

"initial-retry-count" as a string doesn't seem to appear in the library, maybe check to make sure your configuration is correct

RAJKUMAR22:09:34

I think it is the same sleep because some of the messages from the Kinesis stream are not processed

RAJKUMAR22:09:53

and all the threads are idle in the output of top -H -p $pid

hiredman22:09:57

that would also be true if it was looping around a sleep

hiredman22:09:15

an infinite try-fail-retry loop
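One way to tell a single long sleep from a try-fail-retry loop would be to count attempts. The sketch below is only an illustration: retry is a hypothetical stand-in for a retry library (not the again library's API), and counting and attempts are made-up names:

```clojure
(def attempts (atom 0))

(defn counting
  "Wraps f so that each invocation bumps the attempts counter before calling f."
  [f]
  (fn [& args]
    (swap! attempts inc)
    (apply f args)))

(defn retry
  "Hypothetical stand-in for a retry library: calls f, and on an exception
  sleeps for the next delay and tries again; rethrows the last exception
  once the delays are exhausted."
  [delays f]
  (loop [[d & more] delays]
    (let [result (try {:ok (f)} (catch Exception e {:error e}))]
      (cond
        (contains? result :ok) (:ok result)
        (nil? d)               (throw (:error result))
        :else                  (do (Thread/sleep (long d))
                                   (recur more))))))

;; An operation that always fails, standing in for a failing HSET
(def always-failing (counting #(throw (ex-info "HSET failed" {}))))

(try
  (retry [1 2 4] always-failing)
  (catch Exception _ nil))

(println "attempts:" @attempts)
```

A counter that keeps climbing between thread dumps would point to a loop around short sleeps; a counter that stays put would point to one sleep that never returns.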

RAJKUMAR22:09:56

one more question

RAJKUMAR22:09:17

what if I used async/timeout instead of sleep

RAJKUMAR22:09:28

(defn- sleep [delay]
  (clojure.core.async/timeout delay))

hiredman22:09:46

timeout itself doesn't do anything

hiredman22:09:55

it returns a channel that will be closed after some delay

hiredman22:09:16

and you can block on the channel waiting for it to close
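A small sketch of the difference, assuming core.async is on the classpath. sleep-returning-channel mirrors the function from the question, while sleep-blocking and elapsed-ms are made-up names for illustration:

```clojure
(require '[clojure.core.async :as async])

(defn sleep-returning-channel
  "The version from the question: creates the timeout channel and returns it
  immediately, so the caller does not wait."
  [delay-ms]
  (async/timeout delay-ms))

(defn sleep-blocking
  "Blocks the calling thread until the timeout channel closes.
  <!! returns nil when it takes from a closed channel."
  [delay-ms]
  (async/<!! (async/timeout delay-ms)))

(defn elapsed-ms
  "Runs f and returns roughly how many milliseconds it took."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

(println "returning channel:" (elapsed-ms #(sleep-returning-channel 200)))
(println "blocking take:    " (elapsed-ms #(sleep-blocking 200)))
```

A blocking take still ties up the calling thread for the whole delay, so in a thread dump that thread would still show up as waiting, much as it does with Thread/sleep.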

hiredman22:09:02

https://github.com/clojure/core.async/blob/master/src/main/clojure/clojure/core/async/impl/timers.clj#L43 is the loop that services timeout channels, it is built on top of a delayqueue

hiredman23:09:04

the reason core.async has timeout channels is that 1. Thread/sleep blocks a thread, so you shouldn't use it in go blocks, and 2. you can use timeout channels in things like alts, which you cannot do with Thread/sleep
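Both points can be sketched as follows, assuming core.async is on the classpath. take-with-timeout and delayed-value are hypothetical helper names; alts!! is the blocking variant of alts used outside go blocks:

```clojure
(require '[clojure.core.async :as async])

(defn take-with-timeout
  "Waits for a value on ch for at most timeout-ms milliseconds.
  Returns the value, or :timed-out if the timeout channel closes first."
  [ch timeout-ms]
  (let [t        (async/timeout timeout-ms)
        [v port] (async/alts!! [ch t])]
    (if (= port t) :timed-out v)))

(defn delayed-value
  "Returns a channel that yields v after delay-ms. Inside the go block the
  wait is a parking take (<!), so no thread is held while waiting."
  [delay-ms v]
  (async/go
    (async/<! (async/timeout delay-ms))
    v))

;; The value arrives well before the timeout
(println (take-with-timeout (delayed-value 50 :done) 1000))

;; The timeout fires well before the value arrives
(println (take-with-timeout (delayed-value 1000 :done) 50))
```

Thread/sleep cannot take part in a choice like this, because it has no channel to hand to alts.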