[Grid] Feature Request: Request Access Logs
Feature Proposal
For all Grid components (Hub, Node, Router, etc.), it would be beneficial to support storing request access logs to assist in investigating unauthorized access. Per the recommendation of security engineers, these access logs should include the following fields (Apache Combined Log Format plus X-Forwarded-For):
- Client IP
- Timestamp
- HTTP Method
- Path
- HTTP Response Code
- User agent
- Number of bytes returned to the client
- Referer
- X-Forwarded-For
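To make the requested format concrete, here is a minimal Python sketch of how one such line could be assembled from the fields above. The function name `format_access_log` and the choice to append X-Forwarded-For as a trailing quoted field are assumptions for illustration; nothing here is Grid code.

```python
from datetime import datetime, timedelta, timezone


def _quoted(value):
    """Render a quoted field, using "-" when the value is missing."""
    if value in (None, ""):
        return '"-"'
    # Escape backslashes and quotes so a header cannot forge extra fields.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def format_access_log(client_ip, user, when, method, path, protocol,
                      status, bytes_sent, referer, user_agent, forwarded_for):
    """Build one Combined Log Format line with X-Forwarded-For appended.

    Hypothetical helper: the trailing X-Forwarded-For field is an assumed
    extension, not part of the standard combined format.
    """
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")  # needs an aware datetime
    request = "{} {} {}".format(method, path, protocol)
    return " ".join([
        client_ip,
        "-",                                   # identd value, rarely available
        user or "-",
        "[" + timestamp + "]",
        _quoted(request),
        str(status),
        "-" if bytes_sent is None else str(bytes_sent),
        _quoted(referer),
        _quoted(user_agent),
        _quoted(forwarded_for),
    ])


if __name__ == "__main__":
    when = datetime(2000, 10, 10, 13, 55, 36,
                    tzinfo=timezone(timedelta(hours=-7)))
    print(format_access_log("127.0.0.1", "frank", when, "GET", "/status",
                            "HTTP/1.1", 200, 2326,
                            "http://localhost:4444/status",
                            "selenium/3.141.59 (java unix)", None))
```

Escaping quotes in header values matters here because the User-Agent, Referer, and X-Forwarded-For headers are all client-controlled.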
These can either be:
- Stored within a file
- Embedded within the OpenTelemetry data (though I'd need to research how to export it into its own dedicated file)
I noticed OpenTelemetry already has some of the data (method, path, status code, and an IP, although I'm unsure whether that's the client's or the host machine's IP).
If Grid used Reactor Netty, access logs could likely be enabled easily.
My search for "netty access logs" only turns up Reactor Netty, which is why I concluded it's not possible with Netty out of the box.
Motivation
Security engineers may ask for services to log requests. While it's possible to put Nginx (with logs) in front of the Grid Hub or Router, the requests to a Grid Node are not logged. It's important to see what kinds of requests were made across all nodes in the event there are unauthorized requests.
In Se3 Grid, it was possible to put Nginx in front of a Node and launch the node with -remoteHost "http://NODE_IP:80" (port 80 so that the hub would pass requests through Nginx, and Nginx would forward them to 5555). The -remoteHost flag appears to have been removed in Se4 Alpha.
Example
- Launch
java -jar selenium-server-4.0.0-alpha-7.jar hub --access-log-file access.log
or within the config something like:
[logging]
# Configure logging
# Type: boolean
enable = true
# Store Access Logs within a file
# Type: string
access_log_file = access.log
- Make a request to http://localhost:4444/status
- Within access.log, a line like:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /status HTTP/1.1" 200 2326 "http://localhost:4444/status" "selenium/3.141.59 (java unix)"
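For the investigation side, a line in that shape is straightforward to pick apart again. The following Python sketch parses a combined-format line into named fields; the regular expression and the `parse_line` helper are illustrative assumptions and only cover the standard combined fields, not any Grid-specific extras.

```python
import re
from datetime import datetime

# Pattern for the standard Combined Log Format (assumed layout).
COMBINED = re.compile(
    r'(?P<client_ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" byte count means no body was sent.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


if __name__ == "__main__":
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /status HTTP/1.1" 200 2326 '
              '"http://localhost:4444/status" "selenium/3.141.59 (java unix)"')
    record = parse_line(sample)
    print(record["client_ip"], record["method"], record["path"], record["status"])
```

With records in this form, filtering for unexpected client IPs or paths across all nodes becomes a simple loop over the log files.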
Issue Analytics
- Created 3 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
Thank you for providing the feature details in such great depth. I understand the need for the requested feature. We are leveraging OpenTelemetry to provide event logs. More details are described here. The first iteration of event logs added the mandatory fields for an HTTP request as per the OpenTelemetry specification. The fields that are missing (and needed in request logs) are also part of the OpenTelemetry specification. As a result, adding those fields seemed like a viable solution. It would address the request logs feature and strengthen the HTTP event-log context.
The changes are made as part of https://github.com/SeleniumHQ/selenium/pull/8902 . It includes all the fields except Referer (which might not be a meaningful field for Selenium Grid requests) and response content-length (the current way to obtain it does not seem efficient, since it involves writing all the bytes to memory and counting the length twice).
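As a general illustration of the content-length concern, one common alternative to buffering the whole body is to count bytes as they pass through to the client. The Python sketch below shows that idea only; `CountingWriter` and `send_response` are hypothetical names, and this is not how Grid or the linked pull request handles responses.

```python
import io


class CountingWriter:
    """Wrap a binary stream and count bytes as they are written through it."""

    def __init__(self, sink):
        self._sink = sink
        self.bytes_written = 0

    def write(self, chunk):
        written = self._sink.write(chunk)
        # Some raw streams return None; fall back to the chunk length then.
        self.bytes_written += len(chunk) if written is None else written
        return written


def send_response(body_chunks, sink):
    """Stream chunks to the sink and return the total for the access log."""
    out = CountingWriter(sink)
    for chunk in body_chunks:
        out.write(chunk)
    return out.bytes_written


if __name__ == "__main__":
    # BytesIO stands in for the client connection in this demo.
    connection = io.BytesIO()
    total = send_response([b'{"value":', b' {"ready": true}}'], connection)
    print(total == len(connection.getvalue()))
```

The count is only known after the body has been sent, so with this approach the access log entry would have to be written once the response completes.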
The logs can be written to a file by passing the "--log <file-name>" flag when starting the Grid in Standalone mode or in fully distributed mode.
Example:
java -jar /Users/Puja/Projects/repos/selenium/bazel-bin/java/server/src/org/openqa/selenium/grid/selenium_server_deploy.jar standalone --log /Users/Puja/Desktop/LOG.log
Thank you for pointing that out. I am fixing that and re-checking in case some other requests were missed. The main aim is to log any incoming request from the client.
However, I am afraid that request access logs alone would have to be the driving force for moving the Grid to Reactor Netty in the next release. I have raised this with the maintainers to get a definitive answer about future plans to move to a Reactor Netty server.