[š Bug]: Hub hangs with "Unable to find session" after a while
See original GitHub issueWhat happened?
Iāve set up a Hub and 2 Nodes on three different Ubuntu servers. The Hub is on Ubuntu 16.04.7 LTS and the two nodes on Ubuntu 20.04.4 LTS. Selenium is driven by a Java program using standard libraries.
All works fine for a while (an hour or so), then suddenly the Hub stops responding. The symptoms are:
- when connecting to
http://myhub:4444/ui/index.html
I get the normal āSelenium Gridā page but it is stuck on the āLoadingā¦ā message - the hub output shows many messages containing this:
"exception.message": "Unable to execute request for an existing session: Unable to find session with ID: 98d1a9efa8b787398b29c2d6c5f2ddab
Build info: version: '4.1.2', revision: '9a5a329c5a'
System info: host: 'controller', ip: 'xx.xx.xx.xx', os.name: 'Linux', os.arch: 'amd64', os.version: '4.4.0-210-generic', java.version: '1.8.0_292'
Driver info: driver.version: unknown"
- the java program gets the exception shown in the āRelevant log outputā box.
If I restart the Hub, all goes back to normal: I donāt have to restart either the nodes or the Java program.
Command to start the hub:
java -jar selenium-server-4.1.2.jar hub --config hub.toml
Content of hub.toml:
[server]
port = 4444
registration-secret = "xxxxxx"
[sessionqueue]
session-request-timeout = 300
Command to start the node:
java -jar selenium-server-4.1.2.jar node --config node.toml
Content of node.toml:
[server]
port = 5555
registration-secret = "xxxxxx"
[node]
hub = "xx.xx.xx.xx"
override-max-sessions = true
max-sessions = 8
session-timeout = 60
detect-drivers = false
[[node.driver-configuration]]
display-name = "Chrome Latest"
stereotype = "{\"browserName\": \"chrome\"}"
webdriver-executable = '/mypath/chromedriver'
Output of top (last line is the hub):
top - 15:11:10 up 22:00, 3 users, load average: 0.20, 0.23, 0.19
Tasks: 208 total, 1 running, 207 sleeping, 0 stopped, 0 zombie
%Cpu(s): 12.5 us, 1.2 sy, 0.0 ni, 82.3 id, 0.2 wa, 0.0 hi, 1.7 si, 2.2 st
KiB Mem : 8175040 total, 1773696 free, 3951596 used, 2449748 buff/cache
KiB Swap: 8388604 total, 8348196 free, 40408 used. 3875604 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
908 tomcat8 20 0 4055844 1.196g 17808 S 21.9 15.3 115:44.34 java
1183 mysql 20 0 1670676 528960 15472 S 2.3 6.5 25:39.78 mysqld
15065 root 20 0 4685516 346108 19128 S 2.0 4.2 11:59.94 java
If I restart just the hub, all goes back to normal: nodes register and I see the queue growing in the Selenium Grid web console.
How can we reproduce the issue?
I know you need to reproduce the problem but I can't share my whole application: it's not just a script and it needs a lot of work to be set up (tomcat, mysql, user data etc.)
Relevant log output
org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
Build info: version: '4.1.2', revision: '9a5a329c5a'
System info: host: 'controller', ip: 'xx.xx.xx.xx', os.name: 'Linux', os.arch: 'amd64', os.version: '4.4.0-210-generic', java.version: '1.8.0_292'
Driver info: net.yadaframework.selenium.YadaSeleniumUtil$1
Command: [null, newSession {capabilities=[Capabilities {browserName: chrome, goog:chromeOptions: {args: [user-data-dir=/srv/xxx/..., user-agent=Mozilla/5.0 (Win...], extensions: []}, proxy: {httpProxy: yy.yy.yy.yy:9092, proxyType: manual, sslProxy: yy.yy.yy.yy:9092}}], desiredCapabilities=Capabilities {browserName: chrome, goog:chromeOptions: {args: [user-data-dir=/srv/xxx/..., user-agent=Mozilla/5.0 (Win...], extensions: []}, proxy: Proxy(manual, http=yy.yy.yy...}}]
Capabilities {}
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:576)
at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:245)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:161)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:143)
at net...
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.UncheckedIOException: java.io.IOException: Stream closed
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.makeCall(NettyHttpHandler.java:80)
at org.openqa.selenium.remote.http.RetryRequest.lambda$apply$6(RetryRequest.java:80)
at net.jodah.failsafe.Functions.lambda$get$0(Functions.java:48)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:66)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:66)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:66)
at net.jodah.failsafe.Execution.executeSync(Execution.java:128)
at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:379)
at net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:68)
at org.openqa.selenium.remote.http.RetryRequest.lambda$apply$7(RetryRequest.java:80)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyHttpHandler.execute(NettyHttpHandler.java:51)
at org.openqa.selenium.remote.http.RetryRequest.lambda$apply$6(RetryRequest.java:80)
at net.jodah.failsafe.Functions.lambda$get$0(Functions.java:48)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:66)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:66)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:66)
at net.jodah.failsafe.Execution.executeSync(Execution.java:128)
at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:379)
at net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:68)
at org.openqa.selenium.remote.http.RetryRequest.lambda$apply$7(RetryRequest.java:80)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:56)
at org.openqa.selenium.remote.http.netty.NettyClient.execute(NettyClient.java:110)
at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:55)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:102)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:84)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:62)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:156)
at org.openqa.selenium.remote.TracedCommandExecutor.execute(TracedCommandExecutor.java:51)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:558)
... 17 common frames omitted
Caused by: java.io.IOException: Stream closed
at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:159)
at java.io.BufferedInputStream.available(BufferedInputStream.java:410)
at java.io.FilterInputStream.available(FilterInputStream.java:168)
at java.io.PushbackInputStream.available(PushbackInputStream.java:277)
at io.netty.handler.stream.ChunkedStream.isEndOfInput(ChunkedStream.java:82)
at io.netty.handler.stream.ChunkedStream.readChunk(ChunkedStream.java:109)
at io.netty.handler.stream.ChunkedStream.readChunk(ChunkedStream.java:36)
at io.netty.handler.stream.ChunkedWriteHandler.doFlush(ChunkedWriteHandler.java:229)
at io.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:131)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:750)
at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:765)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:790)
at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:758)
at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1020)
at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:311)
at org.asynchttpclient.netty.request.body.NettyInputStreamBody.write(NettyInputStreamBody.java:77)
at org.asynchttpclient.netty.request.NettyRequestSender.writeRequest(NettyRequestSender.java:433)
at org.asynchttpclient.netty.channel.NettyConnectListener.writeRequest(NettyConnectListener.java:80)
at org.asynchttpclient.netty.channel.NettyConnectListener.onSuccess(NettyConnectListener.java:156)
at org.asynchttpclient.netty.channel.NettyChannelConnector$1.onSuccess(NettyChannelConnector.java:92)
at org.asynchttpclient.netty.SimpleChannelFutureListener.operationComplete(SimpleChannelFutureListener.java:26)
at org.asynchttpclient.netty.SimpleChannelFutureListener.operationComplete(SimpleChannelFutureListener.java:20)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:300)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:335)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:710)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 common frames omitted
Operating System
Ubuntu 16.04.7 LTS
Selenium version
Java 1.8.0_292, Selenium 4.1.2.
What are the browser(s) and version(s) where you see this issue?
Chrome 98.0.4758.102
What are the browser driver(s) and version(s) where you see this issue?
ChromeDriver 98.0.4758.80
Are you using Selenium Grid?
4.1.2
Issue Analytics
- State:
- Created 2 years ago
- Comments:20 (5 by maintainers)
Top GitHub Comments
Here is a jar I generated with this fix, you can try it while the release is done.
selenium-server-4.1.3-df41b58162.zip
I upgraded my grid to 4 and now have a similar issue. After about an hour of running tests, Iāll start getting āCould not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.ā when trying to get a new chrome instance. When this happens, I notice that the hub api stops responding to my requests when I call the graphql endpoint to get status details. 4.1.3 did not fix the issue for me.