24 February 2012

Memory Leaks in .NET Applications - Don't let them slip through your eyes.

Before getting to the main topic, I would like to briefly go through the CLR garbage collection mechanism. When a .NET application starts, the CLR reserves a block of memory called the managed heap. The managed heap is logically divided into three generations: Gen0, Gen1 and Gen2. Gen0 usually contains newly created, short-lived objects. Here is a quick view of what happens during garbage collection:
  1. Whenever Gen0 gets full, a garbage collection occurs. The garbage collector starts from the application's roots (static fields, local variables on the stack, and similar references) and marks every Gen0 object that is reachable from a root, which simply means the object is still in use. It then frees the memory occupied by the unreachable objects, and the objects that survived the collection are moved to Gen1. At the end of the collection, Gen0 is empty.
  2. Now suppose the application needs more memory than a Gen0 collection can free. The garbage collector must then collect both Gen0 and Gen1. It again determines which Gen1 objects are still reachable from a root. All unreachable objects are collected and their memory is reclaimed, and the survivors are moved to Gen2. Gen0 is collected as explained in step 1.
  3. Later, the application might require more memory than collecting Gen0 and Gen1 together can free. The garbage collection must then cover all three generations. Unreachable Gen2 objects are collected and their memory is freed; the survivors stay in Gen2, since there is no higher generation. Gen1 and Gen0 are collected as explained in steps 1 and 2.
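The promotion described in the steps above can be observed with a small experiment. This is only a sketch: the class and method names are made up for illustration, and forcing collections with GC.Collect() is done purely for demonstration. On a typical run the object starts in Gen0 and moves up one generation per collection, but the runtime does not guarantee the exact numbers.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one reachable object after each forced collection.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();              // newly allocated, normally in Gen0
        int[] seen = new int[GC.MaxGeneration + 1];
        seen[0] = GC.GetGeneration(survivor);

        for (int i = 1; i <= GC.MaxGeneration; i++)
        {
            GC.Collect();                            // 'survivor' is still referenced, so it survives
            seen[i] = GC.GetGeneration(survivor);
        }

        GC.KeepAlive(survivor);                      // keep the reference alive up to this point
        return seen;
    }

    public static void Main()
    {
        int[] seen = ObserveGenerations();
        for (int i = 0; i < seen.Length; i++)
        {
            Console.WriteLine("After " + i + " collection(s): Gen" + seen[i]);
        }
    }
}
```

An object that stays referenced never moves back to a lower generation, which is why long-lived objects end up in Gen2.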
What is a Memory Leak?

What we can see from the above explanation is that as long as an object is reachable from a root, it remains in memory and is promoted to a higher generation each time it survives a collection. With that said, let's see what a memory leak is.

Suppose you have a class that holds an unmanaged handle, as shown below.

class UnmanagedClass
{
         IntPtr handle;
         Int32[] someBigArray = new Int32[200000];  // a dummy array that occupies a sufficiently large amount of memory

         public UnmanagedClass()
         {
                 handle = GetUnmanagedHandle(); // assume this method returns an unmanaged handle
         }
}

Then I create instances of the above class in a loop, as below:

private void MemoryLeakTest()
{
        for(int i = 0;  i < 1000000; i++)
        {
                String str = new String('x', 10);  // String has no parameterless constructor
                UnmanagedClass uc = new UnmanagedClass();  //doesn't
        }
}

In the above function, after every iteration, both str and uc become eligible for garbage collection. Suppose that after 5 iterations Gen0 becomes full and the GC must run. By then, 5 String objects and 5 UnmanagedClass objects have been created on the heap. The garbage collector sees that none of the five String objects is reachable from a root, so it frees the memory they occupy. Then it examines the UnmanagedClass objects. They are not reachable either, so the GC is free to reclaim their managed memory as well. But the GC cares only about managed objects, not unmanaged resources: to the GC, the handle field is just a number, and it does nothing to release the resource that number refers to. Once a uc object is collected, the only copy of its handle is gone, and the unmanaged resource behind it can never be released.

Gen0 is now empty and the loop continues. After another 5 iterations Gen0 fills up again, the GC runs again, and 5 more unmanaged resources are orphaned in the same way. The managed heap looks healthy after every collection, yet the memory used by the process keeps growing, because the unmanaged allocations pile up outside the managed heap where the GC cannot see them.

The same kind of growth happens inside the managed heap when objects that are no longer needed stay reachable, for example from a static collection. Such objects survive a Gen0 collection and are moved to Gen1, survive the next Gen1 collection and are moved to Gen2, and then remain in Gen2 because every Gen2 collection still finds them reachable. Gen2 keeps filling up, and eventually there is no room left to move more objects into it.

In both cases, you can say that the application is leaking memory. If the application continues to run, at some point there will be no memory left to allocate new objects, the CLR will throw an OutOfMemoryException, and the process will usually terminate.
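A minimal sketch for experimenting with an unreleased handle. The names LeakyWrapper and LeakDemo are hypothetical, and Marshal.AllocHGlobal is used here only as a stand-in for GetUnmanagedHandle(), which the article does not define. The weak reference lets you check whether the managed wrapper itself was collected; on a typical run it is, while the unmanaged blocks stay allocated because nothing in the code ever frees them.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for UnmanagedClass.
public class LeakyWrapper
{
    public const int BlockSize = 1024 * 1024;   // 1 MB of unmanaged memory per instance
    private IntPtr handle;

    public LeakyWrapper()
    {
        // Assumption: AllocHGlobal plays the role of GetUnmanagedHandle().
        handle = Marshal.AllocHGlobal(BlockSize);
        // No Dispose and no finalizer: FreeHGlobal is never called.
    }

    public IntPtr Handle { get { return handle; } }
}

public static class LeakDemo
{
    // Creates 'count' wrappers and drops every reference to them.
    // Returns a weak reference to the last wrapper so the caller can
    // check later whether the GC has collected it.
    public static WeakReference CreateAndDrop(int count)
    {
        WeakReference last = null;
        for (int i = 0; i < count; i++)
        {
            last = new WeakReference(new LeakyWrapper());
        }
        return last;
    }

    public static void Main()
    {
        WeakReference last = CreateAndDrop(50);   // about 50 MB of unmanaged memory

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Reports whether the managed wrapper survived the collection.
        Console.WriteLine("Wrapper still alive: " + last.IsAlive);

        // The managed heap stays small; the unmanaged blocks are not counted here.
        Console.WriteLine("Managed heap bytes: " + GC.GetTotalMemory(true));
    }
}
```

Comparing the managed heap size printed here with the process memory shown by a tool such as Task Manager makes the gap between managed and unmanaged memory visible.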

How to avoid Memory Leaks?
  • If your class holds an unmanaged handle, implement the Finalize and Dispose pattern to release it. This article gives you an overview of the Dispose pattern and this page shows you how to dispose of an unmanaged handle.
  • If you are a consumer of a class that implements IDisposable, you as the developer are responsible for calling Dispose on it. Ensure that all disposable objects are disposed, or at the very least those that do not have finalizers.
  • Avoid static collections. As you might know, a type that has been loaded is not unloaded until the application shuts down (more precisely, until its AppDomain is unloaded). Since static members belong to the type, everything they reference stays in memory for that whole time. So, use static members carefully.
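The first two points can be sketched as follows. This is one common shape of the Finalize and Dispose pattern, not the only one. The class name UnmanagedResource and the IsDisposed property are made up for illustration, and Marshal.AllocHGlobal / Marshal.FreeHGlobal again stand in for whatever calls acquire and release your real handle.

```csharp
using System;
using System.Runtime.InteropServices;

public class UnmanagedResource : IDisposable
{
    private IntPtr handle;
    private bool disposed;

    public UnmanagedResource()
    {
        // Assumption: AllocHGlobal stands in for GetUnmanagedHandle().
        handle = Marshal.AllocHGlobal(1024);
    }

    public bool IsDisposed { get { return disposed; } }

    // Called by consumers, directly or through a 'using' block.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // the finalizer has nothing left to do
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;        // makes repeated calls harmless

        // When 'disposing' is true, managed members that implement
        // IDisposable would be disposed here as well.

        if (handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(handle);   // release the unmanaged resource
            handle = IntPtr.Zero;
        }
        disposed = true;
    }

    // Safety net: runs only if the consumer forgot to call Dispose.
    ~UnmanagedResource()
    {
        Dispose(false);
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        // 'using' calls Dispose at the end of the block, even if an exception is thrown.
        using (UnmanagedResource res = new UnmanagedResource())
        {
            Console.WriteLine("Working with the resource...");
        }
    }
}
```

The finalizer is only a fallback: it runs at some unspecified time after the object becomes unreachable, so consumers should still call Dispose, or use a using block, as soon as they are done with the object.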
I hope you enjoyed reading this article. Happy Programming.