etaoin

Tomas Brejla 2025-08-27T13:28:03.399479Z

Hello. I'm using etaoin together with chrome (headless) & chrome webdriver (`139.0.7258.154/linux64`). I made a docker image out of my little testing app and it's working fine for me (linux) and even for my colleagues (macos) when they run the image on their computers. But when I run the exact same docker image in a CI pipeline, the same code times out after a few waiting attempts in e/wait-predicate. I never saw this happening on my machine, but on that CI runner it happens constantly. 🧵

Tomas Brejla 2025-08-27T13:28:45.225829Z

The timeout exception goes through this path:

{:type java.net.http.HttpTimeoutException
   :message "request timed out"
   :at [jdk.internal.net.http.HttpClientImpl send "HttpClientImpl.java" 953]}]
 :trace
 [[jdk.internal.net.http.HttpClientImpl send "HttpClientImpl.java" 953]
  [jdk.internal.net.http.HttpClientFacade send "HttpClientFacade.java" 133]
  [babashka.http_client.internal$request invokeStatic "internal.clj" 322]
  [babashka.http_client.internal$request invoke "internal.clj" 303]
  [babashka.http_client$request invokeStatic "http_client.clj" 125]
  [babashka.http_client$request invoke "http_client.clj" 100]
  [etaoin.impl.client$http_request invokeStatic "client.clj" 85]
  [etaoin.impl.client$http_request invoke "client.clj" 82]
  [etaoin.impl.client$call$fn__2226 invoke "client.clj" 121]
  [etaoin.impl.client$call invokeStatic "client.clj" 121]
  [etaoin.impl.client$call invoke "client.clj" 91]
  [etaoin.api$execute invokeStatic "api.clj" 264]
  [etaoin.api$execute invoke "api.clj" 238]
  [etaoin.api$get_url invokeStatic "api.clj" 533]

Tomas Brejla 2025-08-27T13:29:59.904549Z

Which comes from (e/get-url driver) call:

(e/wait-predicate (fn []
                    (let [current-url (e/get-url driver)]
                      (log/info ".. ⏳ awaiting redirect with access-token")
                      (str/includes? current-url "access_token=")))
                  {:timeout driver-wait-timeout})

Tomas Brejla 2025-08-27T13:31:52.584279Z

In the log output, I can see that awaiting redirect with access-token got logged several times .. and then timeout. I wonder what might have happened. Webdriver got stuck? Chrome got stuck? Any hint regarding how to debug similar issue, happening only at CI runner? Anyone dealt with anything similar in the past perhaps?

borkdude 2025-08-27T13:37:11.647269Z

> debug similar issue, happening only at CI runner?

FWIW, I usually debug these things by SSH-ing into the CI runner

Tomas Brejla 2025-08-27T13:38:08.475749Z

that would be awesome. Not sure I can easily SSH there though. worth checking

borkdude 2025-08-27T13:38:52.074229Z

what is it, github actions?

Tomas Brejla 2025-08-27T13:39:05.732669Z

nope, custom self-hosted gitlab (not my own)

borkdude 2025-08-27T13:39:29.484659Z

ok, don't know about that one. with circleci and github actions (using some kind of tmux action) I had some luck in the past

cormacc 2025-08-27T14:59:18.165069Z

You can also install the gitlab CI runner locally and submit your docker job to run there - might be easier to debug. Vaguely recall this being possible without any additional tooling, but a web search also turns this up https://github.com/firecow/gitlab-ci-local

Tomas Brejla 2025-08-27T15:15:19.825169Z

that sounds promising, thank you

lread 2025-08-27T17:13:55.803269Z

By default Etaoin suppresses webdriver output. You might try setting https://cljdoc.org/d/etaoin/etaoin/1.1.43/doc/user-guide#opt-log-stderr to :inherit to see if that tells you anything.
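For reference, a minimal sketch of that suggestion, assuming headless Chrome, Etaoin on the classpath, and the `:log-stdout` / `:log-stderr` driver options described in the linked user guide. With `:inherit`, the chromedriver output should go wherever the calling process's output goes, which in a CI job is usually the job log:

```clojure
(require '[etaoin.api :as e])

;; Assumption: headless Chrome, as used in this thread.
;; :inherit makes the webdriver process write to the same stdout/stderr
;; as the JVM that launched it, instead of being discarded.
(def driver
  (e/chrome-headless {:log-stdout :inherit
                      :log-stderr :inherit}))

;; ... run the browser flow here ...

(e/quit driver)
```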

lread 2025-08-27T17:16:32.822829Z

If the chromedriver webdriver does not match the chrome browser version, bad things can happen. That'd be a thing to check too.

lread 2025-08-27T17:19:09.205629Z

If it is truly a timeout (I know that GitHub Actions on Windows can be really slow), you can try adjusting the ETAOIN_TIMEOUT env var. The default is 60 seconds.

Tomas Brejla 2025-08-27T17:19:25.867139Z

the versions are matching. Regarding `:log-stdout` and `:log-stderr`: I've already configured them to log to /tmp files & am slurping & logging them on exception. But there doesn't seem to be anything useful

Tomas Brejla 2025-08-27T17:22:05.175529Z

btw I don't insist on using chromedriver. Locally I'm using firefox driver and it works and is even as fast as the chrome version. But in CI pipeline, it was really unexpectedly slow. Thus I used chrome.

Tomas Brejla 2025-08-27T17:22:45.217969Z

I might try switching to some recent firefox+firefoxdriver and see if it works in CI or not 🤔

lread 2025-08-27T17:25:41.987129Z

My only experience is with GitHub Actions CI. On GHA CI, Linux is by far the fastest and most reliable. Windows is slow and relatively unreliable (https://github.com/clj-commons/etaoin/blob/1db76f857dbe2bdc304fa11f083fa187e1a253c6/.github/workflows/test.yml#L158). MacOS is typically slow, and Safari can be a gong show.

Tomas Brejla 2025-08-27T17:26:57.482589Z

the pipeline we're using is running on linux

lread 2025-08-27T17:28:13.668999Z

Yeah, on GHA CI, Linux is the clear winner in terms of speed and reliability.

lread 2025-08-27T17:32:33.245719Z

Maybe the CI you are stuck with is dead slow. Does it seem sluggish while running your tests? If so, try bumping ETAOIN_TIMEOUT to something ridiculously big to see if that has an effect.

lread 2025-08-27T17:35:29.217219Z

Anyhow, if you remain stuck, keep sharing what you've tried. And if you find success, please do share that too!

๐Ÿ‘ 1
Tomas Brejla 2025-08-27T17:36:01.299729Z

not really, it seems to kick in and go relatively fast..

2025-08-27T12:28:15.636Z c2a9d8143280 INFO [core:318] - Initializing bank-id
2025-08-27T12:28:15.667Z c2a9d8143280 INFO [core:328] - Browsing through BankId flow
2025-08-27T12:28:16.859Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:17.197Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:17.533Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:17.869Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:18.206Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:18.543Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:18.880Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token

those awaiting redirect with access-token checks seem to get executed regularly, without issues, every ~300ms.. But after a couple of those checks (7 in this case), the log output stops and later the timeout occurs

Tomas Brejla 2025-08-27T17:37:01.541069Z

Thanks for the ideas. I'll try them and see if something makes any difference

lread 2025-08-27T17:38:37.805869Z

Long shot: I guess if you are hitting a live site, it could be reacting differently because requests are coming from a different IP address?

Tomas Brejla 2025-08-27T17:40:58.915749Z

good thinking, that can definitely be a difference. Perhaps it could help to enable some screenshot recording to be able to see "what the browser sees" before the timeout

lread 2025-08-27T17:41:40.020169Z

Oh yeah, that's been helpful in the past too, good idea.
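A minimal sketch of the screenshot idea. The helper `call-with-failure-hook` is hypothetical (not part of Etaoin): it runs a function and, if that throws, runs a diagnostics hook before rethrowing. The usage in the `comment` form assumes `e` aliases `etaoin.api`, `str` aliases `clojure.string`, `driver` and `driver-wait-timeout` are as in the snippet earlier in the thread, and the screenshot path is made up:

```clojure
(defn call-with-failure-hook
  "Calls (f). If it throws, calls (on-failure ex) for diagnostics, then
  rethrows the original exception. Exceptions thrown by the hook itself
  are swallowed so they cannot mask the original failure."
  [f on-failure]
  (try
    (f)
    (catch Exception ex
      (try
        (on-failure ex)
        (catch Exception _ nil))
      (throw ex))))

(comment
  ;; Usage sketch with Etaoin (needs a running driver, so it is not
  ;; evaluated here). The output path is hypothetical.
  (call-with-failure-hook
    #(e/wait-predicate
       (fn [] (str/includes? (e/get-url driver) "access_token="))
       {:timeout driver-wait-timeout})
    (fn [_ex] (e/screenshot driver "/tmp/redirect-timeout.png"))))
```

One caveat: if the webdriver itself is unresponsive (here the call that timed out was `e/get-url`), the screenshot request may time out as well, which is why the hook's own errors are swallowed.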

Tomas Brejla 2025-08-27T17:41:43.965269Z

it's still weird that the timeout happens on (e/get-url driver)

Tomas Brejla 2025-08-27T17:42:08.356499Z

I mean.. it returns instantly 6 times.. and then the 7th attempt times out? strange..

lread 2025-08-27T17:43:42.057129Z

Yeah, absolutely, the world of webdrivers can be a mysterious and confusing one!

Tomas Brejla 2025-08-27T19:57:01.568339Z

HA! Got it working 🎉 🎆 And guess what? IT'S ALWAYS DNS! @lee You were very right to question the "live site". In fact the "live site" itself wasn't the problem. The returnUrl leading from that 3rd-party "live site" back to "our site" was. The hostname of "our site" was not reachable from the CI pipeline.. in that nasty "it will take forever" way.
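A failure like this can be surfaced early with a bounded reachability check run from inside the same container as the browser, before the browser flow starts. The sketch below is plain JVM interop, not an Etaoin feature; `reachable?` and the host name in the usage are hypothetical. The attempt runs in a future because hostname resolution is not covered by the socket connect timeout:

```clojure
(import '(java.net InetSocketAddress Socket))

(defn reachable?
  "Returns true if a TCP connection to host:port succeeds. Resolution and
  connect run in a future, and the caller gives up after 2x timeout-ms,
  so a hanging DNS lookup cannot block the caller indefinitely."
  [host port timeout-ms]
  (let [attempt (future
                  (try
                    (with-open [sock (Socket.)]
                      ;; The InetSocketAddress constructor resolves the host name;
                      ;; an unresolved name makes connect throw an IOException.
                      (.connect sock
                                (InetSocketAddress. ^String host (int port))
                                (int timeout-ms))
                      true)
                    (catch java.io.IOException _ false)))]
    ;; Give up and report false if the attempt has not finished in time.
    (deref attempt (* 2 timeout-ms) false)))

(comment
  ;; Hypothetical host: the one the returnUrl points back to.
  (when-not (reachable? "our-site.example.internal" 443 3000)
    (throw (ex-info "return URL host not reachable from this runner" {}))))
```

A slow or hanging lookup leaves the background thread running until the resolver gives up, so this is a diagnostic aid rather than a hard timeout on DNS itself.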

lread 2025-08-27T20:10:16.808189Z

Ah nice! Glad you are sorted out!
