Selenium standalone (grid) hangs from time to time (4.0 beta or rc1)
💥 Regression Report
Last working Selenium version
Worked up to version: 3.141.59 (selenium standalone)
Stopped working in version: 4.0 (beta, rc1)
Steps to reproduce:
- Start selenium standalone:
java -Dwebdriver.chrome.driver=chromedriver.exe -Dwebdriver.edge.driver=msedgedriver.exe -jar selenium-server-4.0.0-rc-1.jar standalone --port 4444 --max-sessions 8 --override-max-sessions true --log ./server.log --session-timeout 3600
- Run several tests in parallel; they complete successfully.
- Don’t stop or restart the Selenium standalone process.
- Over the following days, run the tests again.
Expected: the behavior is the same as in step 2.
Actual: sometimes (not always on the day after the Grid was started; it can take 2 to 4 days) WebDriver can’t establish a connection to Selenium. If I open the Selenium console in a browser at http://<selenium-host>:4444, it tries to display the sessions but never finishes loading.
The log output is as follows:
09:38:06.467 WARN [SeleniumSpanExporter$1.lambda$export$0] - {"traceId": "1b8f54a543ea9542950248f1deb040cb","eventTime": 1631601486466429800,"eventName": "exception","attributes": {"exception.message": "Unable to execute request for an existing session: Unable to find session with ID: 3686a38bf7fed013fee68a9554aaa8f6\nBuild info: version: '4.0.0-rc-1', revision: 'bc5511cbda'\nSystem info: host: 'PC-GENAT01', ip: '10.98.74.219', os.name: 'Windows 10', os.arch: 'amd64', os.version: '10.0', java.version: '1.8.0_291'\nDriver info: driver.version: unknown","exception.stacktrace": "org.openqa.selenium.NoSuchSessionException: Unable to find session with ID: 3686a38bf7fed013fee68a9554aaa8f6\nBuild info: version: '4.0.0-rc-1', revision: 'bc5511cbda'\nSystem info: host: 'PC-GENAT01', ip: '10.98.74.219', os.name: 'Windows 10', os.arch: 'amd64', os.version: '10.0', java.version: '1.8.0_291'\nDriver info: driver.version: unknown\r\n\tat
org.openqa.selenium.grid.sessionmap.local.LocalSessionMap.get(LocalSessionMap.java:129)\r\n\tat
org.openqa.selenium.grid.router.HandleSession.lambda$loadSessionId$3(HandleSession.java:136)\r\n\tat
io.opentelemetry.context.Context.lambda$wrap$2(Context.java:219)\r\n\tat
org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:110)\r\n\tat
org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:373)\r\n\tat
org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat
org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat
org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat
org.openqa.selenium.grid.router.Router.execute(Router.java:91)\r\n\tat
org.openqa.selenium.grid.web.CheckOriginHeader.lambda$apply$0(CheckOriginHeader.java:66)\r\n\tat
org.openqa.selenium.grid.web.CheckContentTypeHeader.lambda$apply$0(CheckContentTypeHeader.java:70)\r\n\tat
org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)\r\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\r\n\tat
org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat
org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat
org.openqa.selenium.remote.http.Route$NestedRoute.handle(Route.java:270)\r\n\tat
org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat
org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat
org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat
org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:336)\r\n\tat
org.openqa.selenium.remote.http.Route.execute(Route.java:68)\r\n\tat
org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)\r\n\tat
org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\r\n\tat
org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\r\n\tat
org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\r\n\tat
org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:64)\r\n\tat
org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)\r\n\tat
java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)\r\n\tat java.util.concurrent.FutureTask.run(Unknown Source)\r\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)\r\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)\r\n\tat java.lang.Thread.run(Unknown Source)\r\n","exception.type": "org.openqa.selenium.NoSuchSessionException","http.flavor": 1,"http.handler_class": "org.openqa.selenium.grid.router.HandleSession","http.host": "pc-genat01:4444","http.method": "GET","http.request_content_length": "0","http.scheme": "HTTP","http.target": "\u002fsession\u002f3686a38bf7fed013fee68a9554aaa8f6\u002fscreenshot","http.user_agent": "selenium\u002f4.0.0-beta.4 (js windows)","session.id": "3686a38bf7fed013fee68a9554aaa8f6"}}
This repeats every time WebDriver tries to connect to Selenium. Restarting the selenium-standalone process helps, but not always; it can take one, two, or more restart attempts before it works.
Environment
OS: Windows 10
Browser: Chrome, Edge
Browser version: 93 (also reproduced previously in 90, 91, 92)
Browser Driver version: 93 (also reproduced previously in 90, 91, 92)
Language Bindings version: JavaScript
Selenium Grid version (if applicable): 4.0 rc1
Issue Analytics
- Created 2 years ago
- Comments:11 (5 by maintainers)
Top GitHub Comments
@diemol I’ve found a pattern in when this issue occurs. It seems that if, for some reason, a WebDriver session wasn’t quit properly (driver.quit() wasn’t called) and the process exited with code 1, then the next time the Selenium web console and Grid hang, and a new WebDriver session can’t be started either. I still can’t find any useful information in the logs. But, as far as I understand, if the WebDriver process was killed for some unexpected reason, shouldn’t Selenium Grid close the session after a timeout as well?
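The pattern described above (a test process exiting without calling driver.quit()) can be guarded against on the client side. Below is a minimal sketch in JavaScript, the bindings used in this report. `withDriver` is a hypothetical helper name, not part of selenium-webdriver; it takes a driver factory so that quit() always runs in a `finally` block. It only helps when the test fails through a normal exception path. A hard kill of the process would still leave the session for the Grid to clean up, presumably after the --session-timeout of 3600 seconds set in the command from this report.

```javascript
// Hypothetical helper (not a selenium-webdriver API): creates a driver,
// runs the test body, and always calls quit(), even if the body throws.
async function withDriver(createDriver, body) {
  const driver = await createDriver();
  try {
    return await body(driver);
  } finally {
    // Runs on success and on failure, so the Grid session is released.
    await driver.quit();
  }
}

// Assumed usage with selenium-webdriver; the host is the placeholder
// from the report and the test body is only an illustration:
//
// const { Builder } = require('selenium-webdriver');
// withDriver(
//   () => new Builder()
//     .usingServer('http://<selenium-host>:4444')
//     .forBrowser('chrome')
//     .build(),
//   async (driver) => { await driver.get('https://example.com'); }
// ).catch((err) => { console.error(err); process.exitCode = 1; });
```

Setting `process.exitCode` instead of calling `process.exit(1)` directly lets pending cleanup finish before the process ends.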
This seems hard to troubleshoot from our side because the details and steps to reproduce are very ambiguous. One thing that I keep seeing in your comments is
Unable to execute request for an existing session: Unable to find session with ID:
, which means that you are trying to query a session that has already stopped. In the end, what we can do to help is just give you hints on what to check (such as monitoring CPU, RAM, and running processes to identify what is going on when the situation happens).
I have had Grids running overnight and have not run into this issue. We would be happy to troubleshoot if you give us a step-by-step guide to reproducing it, including the tests needed to create the same situation you are seeing on your end.
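One way to act on the monitoring hint is to poll the Grid's /status endpoint with a client-side timeout, so that a hang gets a timestamp that can be matched against CPU, RAM, and process data. The following is a rough sketch using only Node's built-in http module. `gridStatus` is a hypothetical name, and the `value.ready` field reflects how the Grid 4 status response is commonly shaped, so treat it as an assumption to verify against your own server.

```javascript
const http = require('http');

// Hypothetical helper: resolves with the parsed /status body, or rejects
// if the Grid does not answer within timeoutMs (for example, when it hangs).
function gridStatus(baseUrl, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const req = http.get(new URL('/status', baseUrl), (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('error', reject);
      res.on('end', () => {
        try {
          resolve(JSON.parse(body));
        } catch (err) {
          reject(err);
        }
      });
    });
    // Abort the request if the socket stays idle for too long.
    req.setTimeout(timeoutMs, () => {
      req.destroy(new Error(`no response within ${timeoutMs} ms`));
    });
    req.on('error', reject);
  });
}

// Assumed usage: log one line per minute (host is the report's placeholder).
//
// setInterval(() => {
//   gridStatus('http://<selenium-host>:4444')
//     .then((s) => console.log(new Date().toISOString(),
//       'ready =', s.value && s.value.ready))
//     .catch((e) => console.log(new Date().toISOString(),
//       'status check failed:', e.message));
// }, 60000);
```

A failed check in that log marks roughly when the Grid stopped responding, which narrows down which test run or orphaned session to look at.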