Hello. I'm using etaoin together with chrome (headless) & chrome webdriver (`139.0.7258.154/linux64`)
I made a docker image out of my little testing app and it's working fine for me (linux) and even my colleagues (macos), when they execute the image on their computers.
But when I run the exact same docker image in a CI pipeline, the same code actually times out after a few waiting attempts in e/wait-predicate. I never saw this happening on my machine, but on that CI runner, it happens constantly.
🧵
The timeout exception goes through this path:
{:type java.net.http.HttpTimeoutException
:message "request timed out"
:at [jdk.internal.net.http.HttpClientImpl send "HttpClientImpl.java" 953]}]
:trace
[[jdk.internal.net.http.HttpClientImpl send "HttpClientImpl.java" 953]
[jdk.internal.net.http.HttpClientFacade send "HttpClientFacade.java" 133]
[babashka.http_client.internal$request invokeStatic "internal.clj" 322]
[babashka.http_client.internal$request invoke "internal.clj" 303]
[babashka.http_client$request invokeStatic "http_client.clj" 125]
[babashka.http_client$request invoke "http_client.clj" 100]
[etaoin.impl.client$http_request invokeStatic "client.clj" 85]
[etaoin.impl.client$http_request invoke "client.clj" 82]
[etaoin.impl.client$call$fn__2226 invoke "client.clj" 121]
[etaoin.impl.client$call invokeStatic "client.clj" 121]
[etaoin.impl.client$call invoke "client.clj" 91]
[etaoin.api$execute invokeStatic "api.clj" 264]
[etaoin.api$execute invoke "api.clj" 238]
[etaoin.api$get_url invokeStatic "api.clj" 533]
Which comes from the (e/get-url driver) call:
(e/wait-predicate (fn []
                    (let [current-url (e/get-url driver)]
                      (log/info ".. ⏳ awaiting redirect with access-token")
                      (str/includes? current-url "access_token=")))
                  {:timeout driver-wait-timeout})
In the log output, I can see that awaiting redirect with access-token got logged several times .. and then timeout.
I wonder what might have happened. Webdriver got stuck? Chrome got stuck?
Any hint regarding how to debug similar issue, happening only at CI runner? Anyone dealt with anything similar in the past perhaps?
> debug similar issue, happening only at CI runner?
FWIW, I usually debug these things by SSH-ing into the CI runner
that would be awesome. Not sure I can easily SSH there though. worth checking
what is it, github actions?
nope, custom self-hosted gitlab (not my own)
ok, don't know about that one. with circleci and github actions (using some kind of tmux action) I had some luck in the past
You can also install the gitlab CI runner locally and submit your docker job to run there - might be easier to debug. Vaguely recall this being possible without any additional tooling, but a web search also turns this up https://github.com/firecow/gitlab-ci-local
that sounds promising, thank you
By default Etaoin suppresses webdriver output. You might try setting https://cljdoc.org/d/etaoin/etaoin/1.1.43/doc/user-guide#opt-log-stderr to :inherit to see if that tells you anything.
If the chromedriver webdriver does not match the chrome browser version, bad things can happen. That'd be a thing to check too.
If it is truly a timeout (I know that GitHub Actions on Windows can be really slow), you can try adjusting the ETAOIN_TIMEOUT env var. The default is 60 seconds.
the versions are matching. Regarding the
:log-stdout
:log-stderr
I've already configured them to log to /tmp files & am slurping & logging them on exception. But there doesn't seem to be anything useful
btw I don't insist on using chromedriver. Locally I'm using firefox driver and it works and is even as fast as the chrome version. But in CI pipeline, it was really unexpectedly slow. Thus I used chrome.
I might try switching to some recent firefox+firefoxdriver and see if it works in CI or not 🤔
My only experience is with GitHub Actions CI. On GHA CI, Linux is by far the fastest and most reliable. Windows is slow and relatively unreliable (https://github.com/clj-commons/etaoin/blob/1db76f857dbe2bdc304fa11f083fa187e1a253c6/.github/workflows/test.yml#L158). MacOS is typically slow, and Safari can be a gong show.
the pipeline we're using is running on linux
Yeah, on GHA CI, Linux is the clear winner in terms of speed and reliability.
Maybe the CI you are stuck with is dead slow. Does it seem sluggish while running your tests? If so, try bumping ETAOIN_TIMEOUT to something ridiculously big to see if that has an effect.
Anyhow, if you remain stuck, keep sharing what you've tried. And if you find success, please do share that too!
not really, it seems to kick in and go relatively fast..
2025-08-27T12:28:15.636Z c2a9d8143280 INFO [core:318] - Initializing bank-id
2025-08-27T12:28:15.667Z c2a9d8143280 INFO [core:328] - Browsing through BankId flow
2025-08-27T12:28:16.859Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:17.197Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:17.533Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:17.869Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:18.206Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:18.543Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
2025-08-27T12:28:18.880Z c2a9d8143280 INFO [core:346] - .. ⏳ awaiting redirect with access-token
those awaiting redirect with access-token checks seem to get executed regularly, without issues, every ~300ms.. But after a couple of those checks (7 in this case), the log output stops and later the timeout occurs
Thanks for the ideas. I'll try them and see if something makes any difference
Long shot: I guess if you are hitting a live site, it could be reacting differently because requests are coming from a different IP address?
good thinking, that could definitely be a difference. Perhaps it could help to enable some screenshot recording to see "what the browser sees" before the timeout
Oh yeah, that's been helpful in the past too, good idea.
it's still weird that the timeout happens on (e/get-url driver)
I mean.. it returns instantly 6 times.. and then the 7th attempt times out? strange..
Yeah, absolutely, the world of webdrivers can be a mysterious and confusing one!
HA! Got it working 🍾
And guess what?
IT'S ALWAYS DNS!
@lee You were very right to question the "live site". In fact the "live site" itself wasn't the problem. The returnUrl leading from that 3rd-party "live site" back to "our site" was. The hostname of "our site" was not reachable from the CI pipeline.. in that nasty "it will take forever" way.
Ah nice! Glad you are sorted out!
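Since the root cause was a hostname the CI runner could not reach, a quick resolution check run inside the same container before the browser flow starts might surface this kind of problem earlier. Below is a minimal sketch that uses only JDK classes via interop. The function name resolve-host, the 5000 ms timeout and the hostname in the usage line are made up for illustration, not anything from etaoin or from the app discussed above. It only checks name resolution, so a host that resolves but silently drops connections would still need a separate connect test.

```clojure
(import '(java.net InetAddress UnknownHostException))

(defn resolve-host
  "Hypothetical helper: tries to resolve `host` within `timeout-ms`.
  Returns a map with :host and :status (:ok, :unknown-host or :timed-out),
  plus :addresses when the lookup succeeds."
  [host timeout-ms]
  (let [lookup (future
                 (try
                   {:status    :ok
                    :addresses (mapv #(.getHostAddress ^InetAddress %)
                                     (InetAddress/getAllByName host))}
                   (catch UnknownHostException _
                     {:status :unknown-host})))
        ;; deref with a timeout so a hanging resolver cannot block us forever
        result (deref lookup timeout-ms {:status :timed-out})]
    ;; best effort only: an in-flight lookup may keep running in the background
    (future-cancel lookup)
    (assoc result :host host)))
```

Usage could look like (resolve-host "our-site.example" 5000), logging the result before e/wait-predicate is entered. When this runs as a standalone script, calling (shutdown-agents) at the end avoids the JVM lingering on the future thread pool.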