10-Oct-2023

Author Lightrun Team

Best Practices

Debugging Modern Applications: Advanced Techniques

Today’s applications are designed to be always available and serve users 24/7. Performing live debugging on such applications is akin to doctors operating on a patient.

Since the advent of the “as a service” model, software is like a living, breathing entity, akin to an anatomical system. Operating on such entities requires more dexterity on the developer’s part, to ensure that the software application lives on while being debugged and improved continuously.

Let’s look at time travel debugging, continuous observability, and other advanced live debugging techniques available to developers working on modern applications.

1. Time Travel Debugging for Live Issue Analysis

Time travel debugging allows developers to reconstruct and replay the historical runtime state of a running application. The runtime state consists of logs, snapshots, and other metrics, each captured with a timestamp. Because every record is timestamped, developers can step backward and forward through the recorded state to understand the series of events that led to a bug.
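As a rough illustration of the idea, the Python sketch below stores timestamped snapshots and looks up the state as it was at any earlier moment. The `Recorder` class, the `cart` dictionary, and the timestamps are all hypothetical names made up for this example; real time travel debuggers capture far richer state.

```python
import copy
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Recorder:
    """Stores timestamped snapshots so past state can be inspected later."""

    timestamps: List[float] = field(default_factory=list)
    snapshots: List[Dict[str, Any]] = field(default_factory=list)

    def capture(self, timestamp: float, state: Dict[str, Any]) -> None:
        # Deep-copy the state so later mutations do not rewrite history.
        # Assumption: captures arrive in non-decreasing timestamp order.
        self.timestamps.append(timestamp)
        self.snapshots.append(copy.deepcopy(state))

    def state_at(self, timestamp: float) -> Optional[Dict[str, Any]]:
        # Return the latest snapshot taken at or before the requested time.
        index = bisect_right(self.timestamps, timestamp) - 1
        return self.snapshots[index] if index >= 0 else None


recorder = Recorder()
cart = {"items": 0, "total": 0.0}  # hypothetical application state
recorder.capture(1.0, cart)

cart["items"], cart["total"] = 2, 40.0
recorder.capture(2.0, cart)

cart["total"] = -5.0  # the bug: the total goes negative
recorder.capture(3.0, cart)

# Travel back to just before the bug, then forward to the moment it appears.
before_bug = recorder.state_at(2.5)
after_bug = recorder.state_at(3.0)
print(before_bug, after_bug)
```

Comparing `before_bug` with `after_bug` narrows the fault to whatever ran between the second and third capture.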

Replaying the runtime execution sequence makes it possible to understand the system behavior better. Visualization also plays an important role in this process. There are several ways of visualizing the runtime state for assisting in time travel debugging, such as:

Timeline view: a chart of logged events plotted along a timeline.
Object graphs: a graph that depicts objects and their properties as nodes, and the references between objects as edges.
Memory heat maps: a map that illustrates memory allocation and access patterns.

Apart from these visualization approaches, it is also possible to reconstruct a visual illustration of the runtime behavior based on standard UML (Unified Modelling Language) diagrams, such as state diagrams and sequence charts.

2. Chaos Testing for Live Simulation of Disasters

Chaos testing is a technique that intentionally introduces various failures into a software system. The main goal of this test is to measure the resiliency of the software and its ability to recover from unpredictable conditions.

This is not a debugging technique to fix a specific problem. Instead, it is a strategic debugging approach for assessing software reliability in the face of extreme disasters.

Some of the primary approaches to performing chaos testing include:

Injecting failures. Failures such as network delays, server crashes, and expired certificates are simulated at random to trigger anomalous behavior.
Exceeding thresholds. The load on the system is deliberately increased to breach technical thresholds, such as network bandwidth, data storage, or computing power, and cause resource exhaustion.
Global disruption. Essential services the system depends on, such as databases, message queues, caches, and APIs, are disrupted by stopping or killing processes, or by shutting down critical infrastructure such as servers and availability zones or regions.
Forced security intrusion. Security breaches are forced in the form of simulated attacks, access loopholes, and failed authentication procedures to validate system sanity and understand attack vectors.
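The first approach, injecting failures, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a chaos engineering tool: `with_chaos`, `call_with_retry`, `fetch_profile`, and `InjectedFailure` are hypothetical names, and a bounded retry stands in for whatever recovery mechanism the real system uses.

```python
import random


class InjectedFailure(Exception):
    """Raised by the chaos wrapper in place of a real outage."""


def with_chaos(func, failure_rate, rng):
    # Wrap func so that a fraction of calls fail before reaching it.
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFailure("simulated dependency failure")
        return func(*args, **kwargs)

    return wrapper


def call_with_retry(func, attempts):
    # The recovery mechanism under test: retry a bounded number of times,
    # then let the failure propagate.
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except InjectedFailure:
            if attempt == attempts:
                raise


def fetch_profile():
    # Hypothetical stand-in for a call to a database or remote API.
    return {"user": "demo"}


rng = random.Random(42)  # seeded so the chaos run can be repeated
flaky_fetch = with_chaos(fetch_profile, failure_rate=0.3, rng=rng)

total_calls = 200
recovered = 0
for _ in range(total_calls):
    try:
        call_with_retry(flaky_fetch, attempts=3)
        recovered += 1
    except InjectedFailure:
        pass  # the request was lost despite retries

print(f"{recovered}/{total_calls} requests survived injected failures")
```

The ratio of recovered to total requests gives a simple resiliency measure that can be tracked as the failure rate is raised.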

3. Shift Right Testing for Live Performance Predictions

Shift right testing is a DevOps practice. It calls for testing the software under real-world conditions, in production or in an environment that closely mirrors it, rather than only during development. This approach is the opposite of the shift left methodology, which requires developers to perform quality and security checks early in development, before the code reaches production.

Both approaches complement each other. However, achieving shift right testing is operationally intensive. That is because it involves reproducing the production environment and simulating heavy user traffic, which should be of the same order of magnitude as production traffic.
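To give a feel for the traffic simulation involved, here is a small Python sketch that issues concurrent requests against a handler and reports the 95th percentile latency. It is only an illustration: `handle_request` is a hypothetical stand-in for the service under test, and the request count and concurrency are arbitrary values far below real production volumes.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(payload):
    # Hypothetical stand-in for the service under test.
    time.sleep(0.001)
    return len(payload)


def timed_call(payload):
    # Measure the wall-clock latency of a single request.
    start = time.perf_counter()
    handle_request(payload)
    return time.perf_counter() - start


def run_load(payloads, concurrency):
    # Issue requests from many workers at once, as real users would.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, payloads))
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = statistics.quantiles(latencies, n=100)[94]
    return {"requests": len(latencies), "p95_seconds": p95}


report = run_load(["x" * 10] * 200, concurrency=20)
print(report)
```

In practice the payloads would be sampled or replayed from real usage, so that the simulated load resembles production in shape as well as in volume.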

Like chaos testing, shift right testing is a broader debugging strategy. It de-risks production deployments by surfacing severe bugs before they can cause disruptions.

4. Continuous Observability for Live Debugging

Continuous observability allows developers to observe and record the internal state of software during the entire DevOps cycle. More importantly, this is performed without any alteration at the source code level. This approach is best suited for live debugging of specific issues without halting runtime execution or forcing changes to the source code to capture telemetry data.

Continuous observability is best achieved by injecting an agent into the running software. The agent has a minimal footprint and captures the logs, snapshots, and other metrics required for analysis during live debugging. This technique also complements time travel debugging, since the data captured during live debugging can be sorted in time order to analyze the bug.
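The core idea, capturing data from running code without editing its source, can be imitated in Python by wrapping a function at runtime. The sketch below is a simplified illustration only and does not reflect how any particular agent is implemented; `attach_probe`, `captured`, and `PricingService` are hypothetical names.

```python
import functools
import time

captured = []  # in-memory sink; a real agent would ship records elsewhere


def attach_probe(owner, name):
    """Wrap owner.name at runtime and return a function that detaches the probe."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def probed(*args, **kwargs):
        result = original(*args, **kwargs)
        # Record a timestamped snapshot of the call without halting execution.
        captured.append(
            {
                "time": time.time(),
                "function": name,
                "args": args,
                "kwargs": kwargs,
                "result": result,
            }
        )
        return result

    setattr(owner, name, probed)
    return lambda: setattr(owner, name, original)


class PricingService:
    # Hypothetical application code; note that it contains no logging.
    def quote(self, quantity, unit_price):
        return quantity * unit_price


service = PricingService()
detach = attach_probe(PricingService, "quote")  # probe added while "live"
service.quote(3, 9.5)                           # this call is observed
detach()                                        # probe removed again
service.quote(1, 1.0)                           # this call is not observed

print(len(captured), captured[0]["result"])
```

Because each record carries a timestamp, the captured list can be sorted by time and fed into the kind of replay described in the time travel debugging section.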

Supercharge Live Debugging with Lightrun

At Lightrun, we are passionate about helping developers improve their debugging productivity. Lightrun is designed to integrate with IDEs just like their native debuggers but with advanced live debugging support.

Unlike traditional debuggers, which halt the runtime execution of software during debugging, Lightrun allows developers to perform these steps dynamically while the runtime execution carries on. Behind the scenes, this capability is backed by dynamic logs, dynamic telemetry, and dynamic instrumentation.

Dynamic logs from Lightrun can be exported to a visualization platform for time travel debugging. Dynamic telemetry allows chaos and shift right tests to capture valuable data about system performance under various simulated load conditions. Above all, dynamic instrumentation allows developers to set virtual breakpoints anywhere in the source code for continuous observability of software running in production.

If you want to experience what it is like to perform live debugging on running production software, sign up for a free Lightrun trial and get started within minutes with your Java, Python, Node.js, or .NET applications. If you’d rather know more before you start, feel free to request a Lightrun demo.
