With more than 200 employees, 1.3M developers served, and more than 20 Fortune 100 companies among its customers, WhiteSource has to manage its R&D resources effectively and ensure site reliability for its customers. WhiteSource is an agile company, deploying new software versions every two weeks.
Quickly Identifying an Error in Production
The error occurred in a method that collected results from several threads, but WhiteSource could not immediately identify which of the tasks had the problem. The stack trace showed that the last line of code running was for collecting the thread results.
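A minimal Java sketch of this situation follows. It is not WhiteSource's code: the class name, task names, and error text are hypothetical, and it assumes the tasks run on a standard ExecutorService. It shows why the failure surfaces at the line that collects results (Future.get() rethrows the worker's exception wrapped in an ExecutionException) and how keeping a name next to each task makes the failing one identifiable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; names are assumptions, not taken from the case study.
final class ThreadResultCollector {

    /** Runs every named task on a pool and returns one "name: outcome" line per task. */
    static List<String> collect(Map<String, Callable<Integer>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Submit all tasks first, remembering which Future belongs to which name.
            Map<String, Future<Integer>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<Integer>> e : tasks.entrySet()) {
                futures.put(e.getKey(), pool.submit(e.getValue()));
            }
            List<String> report = new ArrayList<>();
            for (Map.Entry<String, Future<Integer>> e : futures.entrySet()) {
                try {
                    report.add(e.getKey() + ": ok " + e.getValue().get());
                } catch (ExecutionException ex) {
                    // get() rethrows the worker's failure here, at the collection line.
                    // Without the task name, every failure would look the same.
                    report.add(e.getKey() + ": failed " + ex.getCause().getMessage());
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while collecting results", ex);
                }
            }
            return report;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Callable<Integer>> tasks = new LinkedHashMap<>();
        tasks.put("countProjects", () -> 3);
        tasks.put("countInventory", () -> {
            throw new IllegalStateException("SQL syntax error");
        });
        collect(tasks).forEach(System.out::println);
    }
}
```

If the collecting loop only called get() and let the exception propagate, the top application frame in the stack trace would be the collection line for every task, which matches the symptom described above.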
[Screenshots omitted: the last WS line and the last line in the stack trace.]
WhiteSource was able to identify that the overall exception was an SQL syntax error. However, the log made it very difficult to identify which query was throwing the exception and causing the failure.
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘) group by projectinv1_.projectId’ at line 1
The logs were not informative and did not enable WhiteSource to identify the solution in the current version running in production. They had to find a way to identify the root cause.
In the past, WhiteSource had resorted to adding new logs to suspected lines of code in the next deployment. Occasionally, they had to go through several iterations of adding logs to new versions until the issue was detected. They would remove the logs in the following deployment to decrease overhead and logging costs.
This process would sometimes take weeks of iterations and many hours of developer time: waiting for the changes to be deployed to production, inspecting code behavior, and re-exploring the issue. The developer would also have to deal with many context switches: every iteration required re-reading the relevant code, recalling the assumptions, and continuing from there. In the meantime, the version would be running with an error.
“Using Lightrun to debug an actual issue in production enabled us to react instantly. We were able to add the right logs and identify the root cause in a real-time session, instead of waiting for redeployments.”
Adding Logs with Lightrun On-demand
WhiteSource used Lightrun to dynamically add logs to each thread in production. They needed to identify the problematic query among all the MySQL queries in their system.
With Lightrun integrated into their IDE, WhiteSource was able to add these logs in real time. The process was simple and took only a few moments. They were then able to quickly identify where the problematic flow occurred and which lines were executed.