Proposal: Destructible Types

Background

C# is a managed language. One of the primary things that’s “managed” is memory, a key resource that programs require. Programs are able to instantiate objects, requesting memory from the system, and at some point later when they’re done with the memory, that memory can be reclaimed automatically by the system’s garbage collector (GC). This reclaiming of memory happens non-deterministically, meaning that even though some memory is now unused and can be reclaimed, exactly when it will be is up to the system rather than being left to the programmer to determine. Other languages, in particular those that don’t use garbage collection, are more deterministic in when memory will be reclaimed. C++, for example, requires that developers explicitly free their memory; there is typically no GC to manage this for the developer, but that also means the developer gets complete control over when resources are reclaimed, as they’re handling it themselves.

Memory is just one example of a resource. Another might be a handle to a file or to a network connection. As with any resource, a developer using C++ needs to be explicit about when such resources are freed; often this is done using a “smart pointer,” a type that looks like a pointer but that provides additional functionality on top of it, such as keeping track of any outstanding references to the pointer and freeing the underlying resource when the last reference is released.

C# provides multiple ways of working with such “unmanaged” resources, resources that, unlike memory, are not implicitly managed by the system. One way is by linking such a resource to a piece of memory; since the system does know how to track objects and to release the associated memory after that object is no longer being referenced, the system allows developers to piggyback on this and to associate an additional piece of logic that should be run when the object is collected. This logic, known as a “finalizer,” allows a developer to create an object that wraps an unmanaged resource, and then to release that resource when the associated object is collected. This can be a significant simplification from a usability perspective, as it allows the developer to treat any resource just as it does memory, allowing the system to automatically clean up after the developer.
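
As a minimal sketch of the finalizer approach described above, the following type ties an unmanaged allocation to a managed object. The name `NativeBufferHolder` and its `IsAllocated` property are hypothetical and only for illustration; `Marshal.AllocHGlobal` and `Marshal.FreeHGlobal` are the existing .NET calls for unmanaged memory.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: ties an unmanaged allocation to a managed object.
public sealed class NativeBufferHolder
{
    private IntPtr _buffer;

    public NativeBufferHolder(int byteCount)
    {
        // Unmanaged memory: the GC does not track or free this by itself.
        _buffer = Marshal.AllocHGlobal(byteCount);
    }

    public bool IsAllocated => _buffer != IntPtr.Zero;

    // Finalizer: runs at some point after the object becomes unreachable.
    // Exactly when is up to the GC, not the developer.
    ~NativeBufferHolder()
    {
        if (_buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_buffer);
            _buffer = IntPtr.Zero;
        }
    }
}
```

Nothing in the calling code releases the buffer; it is freed only when the GC eventually finalizes the holder.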

However, there are multiple downsides to this approach, and some of the biggest reliability problems in production systems have resulted from an over-reliance on finalization. One issue is that the system is managing memory, not unmanaged resources. It has heuristics that help it to determine the appropriate time to clean up memory based on the system’s understanding of the memory being used throughout the system, but such a view of memory doesn’t provide an accurate picture about any pressures that might exist on the associated unmanaged resources. For example, if the developer has allocated but then stopped using a lot of file-related objects, unless the developer has allocated enough memory to trigger the garbage collector to run, the system will not know that it should run the garbage collector because it doesn’t know how to monitor the “pressure” on the file system. Over the years, a variety of techniques have been developed to help the system with this, but none of them have addressed the problem completely. There is also a performance impact to abusing the GC in this manner, in that allocating lots of finalizable objects can add a significant amount of overhead to the system.
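
One technique of this kind, which the text above does not name, is `GC.AddMemoryPressure`: it lets a small managed object tell the GC how much unmanaged memory it keeps alive. The sketch below assumes a hypothetical `PressuredBuffer` type; it helps only with memory-like resources and does nothing for file handles or connections.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper that reports its unmanaged allocation to the GC.
public sealed class PressuredBuffer : IDisposable
{
    private readonly int _byteCount;
    private IntPtr _buffer;

    public PressuredBuffer(int byteCount)
    {
        if (byteCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(byteCount));

        _byteCount = byteCount;
        _buffer = Marshal.AllocHGlobal(byteCount);
        // Hint to the GC that this small object keeps byteCount bytes alive.
        GC.AddMemoryPressure(byteCount);
    }

    public bool IsReleased => _buffer == IntPtr.Zero;

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Fallback for callers that never call Dispose.
    ~PressuredBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_buffer == IntPtr.Zero)
            return;

        Marshal.FreeHGlobal(_buffer);
        _buffer = IntPtr.Zero;
        // Withdraw the hint with the same amount that was added.
        GC.RemoveMemoryPressure(_byteCount);
    }
}
```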

The biggest issue with relying on finalizers is the non-determinism that results. As mentioned, the developer doesn’t have control over when exactly the resources will be reclaimed, and this can lead to a wide variety of problems. Consider an object that’s used to represent a file: the object is created when the file is opened, and when the object is finalized, the file is closed. A developer opens the file, manipulates it, and then releases the object associated with it; at this point, the file is still open, and it won’t be closed until some non-deterministic point in the future when the system decides to run the garbage collector and finalize any unreachable objects. In the meantime, other code in the system might try to access the file, and be denied, even though no one is actively still using it.

To address this, the .NET Framework has provided a means for doing more deterministic resource management: IDisposable. IDisposable is a deceptively simple interface that exposes a single Dispose method. This method is meant to be implemented by an object that wraps an unmanaged resource, either directly (a field of the object points to the resource) or indirectly (a field of the object points to another disposable object), which the Dispose method frees. C# then provides the ‘using’ construct, which makes it easier to create a resource, use it within a particular scope, and free it at the end of that scope:

using (var writer = new StreamWriter("file.txt")) { // writer created
    writer.WriteLine("hello, file");
}                                                   // writer disposed
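
For comparison, the ‘using’ statement above behaves roughly like the hand-written try/finally below. This is a sketch of the expansion, not the compiler's exact output; the `UsingExpansion` class and `WriteGreeting` method are hypothetical names.

```csharp
using System;
using System.IO;

public static class UsingExpansion
{
    // Roughly what the 'using' statement expands to.
    public static void WriteGreeting(string path)
    {
        StreamWriter writer = new StreamWriter(path); // writer created
        try
        {
            writer.WriteLine("hello, file");
        }
        finally
        {
            // Runs whether the body completes normally or throws.
            if (writer != null)
                writer.Dispose();                     // writer disposed
        }
    }
}
```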

Problem

While helpful in doing more deterministic resource management, the IDisposable mechanism does suffer from problems. For one, there’s no guarantee made that it will be used to deterministically free resources. You’re able to, but not required to, use a ‘using’ to manage an IDisposable instance.

This is complicated further by cases where an IDisposable instance is embedded in another object. Over the years, FxCop rules have been developed to help developers track cases where an IDisposable goes undisposed, but the rules have often yielded non-trivial numbers of both false positives and false negatives, resulting in the rules often being disabled.

Additionally, the IDisposable pattern is notoriously difficult to implement correctly, compounded by the fact that because objects may not be deterministically disposed of via IDisposable, IDisposable objects also frequently implement finalizers, making the pattern that much more challenging to get right. Helper classes (like SafeHandle) have been introduced over the years to assist with this, but the problem still remains for a large number of developers.
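
To give a sense of the ceremony involved, here is a sketch of the conventional dispose pattern with a finalizer as a fallback. The `UnmanagedBuffer` type and its `IsDisposed` property are hypothetical; the shape (a public `Dispose()`, a protected `Dispose(bool)`, and a finalizer) is the commonly recommended one.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical resource holder following the conventional dispose pattern.
public class UnmanagedBuffer : IDisposable
{
    private IntPtr _buffer;
    private bool _disposed;

    public UnmanagedBuffer(int byteCount)
    {
        _buffer = Marshal.AllocHGlobal(byteCount);
    }

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        Dispose(disposing: true);
        // The resource is already released, so the finalizer need not run.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
            return; // Dispose must tolerate being called more than once.

        if (disposing)
        {
            // Dispose other managed IDisposable fields here (none in this sketch).
        }

        // Unmanaged resources are released on both paths.
        Marshal.FreeHGlobal(_buffer);
        _buffer = IntPtr.Zero;
        _disposed = true;
    }

    // Safety net for callers that never call Dispose.
    ~UnmanagedBuffer()
    {
        Dispose(disposing: false);
    }
}
```

Even in this small case the author has to reason about double disposal, the two call paths, and finalizer suppression, which is the difficulty the paragraph above describes.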

Solution: Destructible Types

To address this, we could add the notion of “destructible types” to C#, which would enable the compiler to ensure that resources are deterministically freed. The syntax for creating a destructible type, which could be either a struct or a class, would be straightforward: annotate the type as ‘destructible’ and then use the ‘~’ (the same character used to name finalizers) to name the destructor.

public destructible struct OutputMessageOnDestruction(string message)
{
    string m_message = message;

    ~OutputMessageOnDestruction() // destructor
    {
        if (m_message != null)
            Console.WriteLine(m_message);
    }
}

An instance of this type may then be constructed, and the compiler guarantees that the resource will be destructed when the instance goes out of scope:

public void Example()
{
    var omod = new OutputMessageOnDestruction("Destructed!");
    SomeMethod();
} // 'omod' destructed here

No matter what happens in SomeMethod, regardless of whether it returns successfully or throws an exception, the destructor of ‘omod’ will be invoked as soon as the ‘omod’ variable goes out of scope at the end of the method, guaranteeing that “Destructed!” will be written to the console.

Note that it’s possible for a destructible value type to be initialized to a default value, and as such the destructor could run when none of the fields have been initialized. Destructible value type destructors need to be coded to handle this, as was done in the ‘OutputMessageOnDestruction’ type previously by checking whether the message was non-null before attempting to output it.

public void Example()
{
    OutputMessageOnDestruction omod = default(OutputMessageOnDestruction);
    SomeMethod();
} // default 'omod' destructed here

Now, back to the original example, consider what would happen if ‘omod’ were stored into another variable. We’d then end up with two variables effectively wrapping the same resource, and if both variables were then destructed, our resource would effectively be destructed twice (in our example resulting in “Destructed!” being written twice), which is definitely not what we want. Fortunately, the compiler would ensure this can’t happen. The following code would fail to compile:

OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
OutputMessageOnDestruction omod2 = omod1; // Error: can't copy destructible type

The compiler would prevent such situations from occurring by guaranteeing that there will only ever be one variable that effectively owns the underlying resource. If you want to assign to another variable, you can do that, but you need to use the ‘move’ keyword (#160) to transfer the ownership from one to the other; this effectively performs the copy and then zeroes out the previous value so that it’s no longer usable. In compiler speak, a destructible type would be a “linear type,” guaranteeing that destructible values are never inappropriately “aliased”.

OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
OutputMessageOnDestruction omod2 = move omod1; // Ok, 'omod1' now uninitialized; won't be destructed
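
The "copy, then zero out the source" behavior can be approximated in today's C# with a helper method. This is only a sketch: `Ownership.Move` is a hypothetical name, and unlike the proposed ‘move’ keyword, nothing here stops a caller from making a plain copy instead.

```csharp
using System;

public static class Ownership
{
    // Hypothetical helper approximating 'move': copy the value out, then
    // reset the source to its default so it no longer refers to the resource.
    public static T Move<T>(ref T source)
    {
        T moved = source;
        source = default(T);
        return moved;
    }
}
```

A call such as `var second = Ownership.Move(ref first);` leaves `first` holding its default value, which is why destructors of destructible value types would need to tolerate default instances.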

This applies to passing destructible values into method calls as well. In order to pass a destructible value into a method, it must be ‘move’d, and when the method’s parameter goes out of scope when the method returns, the value will be destructed:

void SomeMethod(OutputMessageOnDestruction omod2)
{
    ...
} // 'omod2' destructed here
...
OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(move omod1); // Ok, 'omod1' now uninitialized; won't be destructed

In this case, the value needs to be moved into SomeMethod so that SomeMethod can take ownership of the destruction. If you want to be able to write a helper method that works with a destructible value but that doesn’t assume ownership for the destruction, the value can be passed by reference:

void SomeMethod(ref OutputMessageOnDestruction omod2)
{
   ...
} // 'omod2' not destructed here
…
OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(ref omod1); // Ok, 'omod1' still valid

In addition to being able to destructively read a destructible instance using ‘move’ and being able to pass a destructible instance by reference to a method, you can also access fields of or call instance methods on destructible instances. You can also store destructible instances in fields of other types, but those other types must also be destructible types, and the compiler guarantees that these fields will get destructed when the containing type is destructed.

destructible struct WrapperData(SensitiveData data)
{
    SensitiveData m_data = move data; // 'm_data' will be destructed when 'this' is destructed
    …
}
destructible struct SensitiveData { … }

There would be a well-defined order in which destruction happens when destructible types contain other destructible types. Destructible fields would be destructed in the reverse order from which the fields are declared on the containing type. The fields of a derived type are destructed before the fields of a base type. And user-defined code runs in a destructor before the type’s fields are destructed.
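
The ordering for fields can be imitated by hand with IDisposable, which also shows what the compiler would be automating. In this sketch the `Part` and `Container` types are hypothetical, and the order is enforced only by how `Dispose` happens to be written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical field type that logs its own disposal.
public sealed class Part : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public Part(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose() => _log.Add(_name);
}

public sealed class Container : IDisposable
{
    private readonly List<string> _log;
    private readonly Part _a; // declared first
    private readonly Part _b; // declared second

    public Container(List<string> log)
    {
        _log = log;
        _a = new Part("a", log);
        _b = new Part("b", log);
    }

    public void Dispose()
    {
        _log.Add("container"); // user-defined code runs first
        _b.Dispose();          // then fields, in reverse declaration order
        _a.Dispose();
    }
}
```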

Similarly, there’d be a well-defined order for how destruction happens with locals. Destructible locals are destructed at the end of the scope in which they are created, in reverse declaration order. Further, destructible temporaries (destructible values produced as the result of an expression and not immediately stored into a storage location) would behave exactly like a destructible local declared at the same position, except that the scope of a destructible temporary is the full expression in which it is created.
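
The same reverse-declaration ordering can be observed in current C# with using declarations (available since C# 8, which postdates this proposal). The sketch below uses a hypothetical `Tracked` type that records when it is disposed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable that records when it is disposed.
public sealed class Tracked : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public Tracked(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose() => _log.Add(_name);
}

public static class ScopeOrder
{
    public static List<string> Run()
    {
        var log = new List<string>();
        {
            using var first = new Tracked("first", log);
            using var second = new Tracked("second", log);
        } // 'second' is disposed here, then 'first'
        return log;
    }
}
```

Unlike the proposed destructible locals, a using declaration is opt-in, and the compiler does not prevent the variable from being copied or escaping the scope.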

Destructible locals may also be captured into lambdas. Doing so results in the closure instance itself being destructible (since it contains destructible fields resulting from capturing destructible locals), which in turn means that the delegate to which the lambda is bound must also be destructible. Just capturing a local by reference into a closure would be problematic, as it would result in a destructible value being accessible both to the containing method and to the lambda. To deal with this, closures may capture destructible values, but only if an explicit capture list (#117) is used to ‘move’ the destructible value into the lambda (such support would also require destructible delegate types):

OutputMessageOnDestruction omod = new OutputMessageOnDestruction("Destructed!");
DestructibleAction action = [var localOmod = move omod]() => {
    Console.WriteLine("Action!");
};

The destructible types feature would enable a developer to express some intention around how something should behave, enabling the compiler to then do a lot of heavy lifting for the developer in making sure that the program is as correct-by-construction as possible. Developers familiar with C++ should feel right at home using destructible types, as it provides a solid Resource Acquisition Is Initialization (RAII) approach to ensuring that resources are properly destructed and that resource leaks are avoided.

Issue Analytics

Created 9 years ago
Reactions: 35
Comments: 83 (39 by maintainers)

Top GitHub Comments

1 reaction

bbarry commented, Nov 27, 2015

@drauch [SuppressMessage("Acme.RequiresUsing", "AR....", Justification = "Disposed via reflection.")]

1 reaction

govert commented, May 12, 2015

I would like to suggest a poor-man’s version of this feature, where the existing IDisposable / using mechanism is extended by compiler help and syntactic sugar to assist with some of the specific problems in current use. This would be a bit like the FxCop rules, but built into the compiler, improving the use of the current feature but not providing hard guarantees. For example:

Add an attribute, say [RequiresUsing] to indicate that a class which implements IDisposable should only be constructed in a using block, or in an assignment to a field in another [RequiresUsing] type. (An alternative would be an interface that extends IDisposable.)
Creating a [RequiresUsing] object outside a using block or some assignment to a field in a [RequiresUsing] type generates a compiler warning.
In an IDisposable type, a field of a type that implements IDisposable can be marked as [Dispose]. The compiler will auto-generate a Dispose() (maybe with an existing Dispose() being called by the compiler-generated method).
Some syntax like use x = new A(); is just shorthand for using (x = new A()) {...} where the block extent is as small as possible in the method - until just after the last use of x. Features like async / await and exception handling already work correctly with using.
Add any flow analysis that the compiler can easily do, to provide warnings for misuse, like cases where an object might be used after disposal - e.g. if it is passed to a method from inside a using block that stores the reference and would allow the reference to live beyond the call lifetime.

This does not address the move / ref ownership issues comprehensively, so it provides no guarantee around deterministic disposal. But it has the great advantage of not adding a tricky new language concept, instead making the compiler more helpful in using the existing paradigm for deterministic disposal.